Introduction: Set Up Web Content Filtering in 4 Steps With Ubuntu

As an IT guy, one of the most common things coworkers ask me is how they can control which sites their kids can access online. This is easy to do, and free, using Ubuntu Linux, Dansguardian and Tinyproxy.

Step 1: Install the Software

In Ubuntu's terminal, issue this command:

$ sudo apt-get install tinyproxy dansguardian

When prompted, enter your password (sudo asks for your own account password, not a separate root password) and confirm the download.

Step 2: Configure the Applications

You will need to configure both of these new applications before they will work, but that's pretty easy. From a terminal:

$ sudo nano -c /etc/dansguardian/dansguardian.conf

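Comment out line 3 (place a # in front of the word "UNCONFIGURED"). Line numbers can differ between package versions, so go by the setting names if yours do not line up.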

Line 62 should read:
filterport = 8080

and line 65 should read:
proxyport = 3128

Press Ctrl+X to exit, answer Y to save, and press Enter to keep the original file name.
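
If you'd rather not hunt for line numbers, the same three edits can be scripted. Treat this as a sketch only: configure_dansguardian is a hypothetical helper name, it assumes GNU sed (the default on Ubuntu), and it assumes the settings appear at the start of a line as "name = value". Try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch: apply the Step 2 edits to a DansGuardian config file with sed.
# configure_dansguardian is a hypothetical helper, not part of DansGuardian.
configure_dansguardian() {
    conf="$1"
    # Keep a backup next to the file before changing anything.
    cp "$conf" "$conf.bak" || return 1
    # 1) comment out the UNCONFIGURED line
    # 2) listen for clients on 8080
    # 3) forward approved requests to the proxy on 3128
    sed -i \
        -e 's/^UNCONFIGURED/#&/' \
        -e 's/^filterport[[:space:]]*=.*/filterport = 8080/' \
        -e 's/^proxyport[[:space:]]*=.*/proxyport = 3128/' \
        "$conf"
}

# Example (run as root):
#   configure_dansguardian /etc/dansguardian/dansguardian.conf
```

The backup ends up next to the original with a .bak suffix, so a bad edit is easy to undo.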

Now we'll edit tinyproxy.conf (in a terminal):

$ sudo nano -c /etc/tinyproxy/tinyproxy.conf

Line 15 should read (Tinyproxy separates the setting name and value with a space, not an equals sign):
Port 3128
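
The one thing that has to line up between the two files is the port: Dansguardian's proxyport must be the port Tinyproxy listens on. A rough way to double-check that is sketched below. The function names are hypothetical helpers, and the parsing assumes each setting starts at the beginning of a line.

```shell
#!/bin/sh
# Sketch: confirm DansGuardian forwards to the port Tinyproxy listens on.
# dg_proxyport, tp_port and ports_match are hypothetical helper names.

# Print the proxyport value from a dansguardian.conf-style file.
dg_proxyport() {
    sed -n 's/^proxyport[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Print the Port value from a tinyproxy.conf-style file.
tp_port() {
    sed -n 's/^Port[[:space:]=]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only if both values were found and are equal.
ports_match() {
    dg=$(dg_proxyport "$1")
    tp=$(tp_port "$2")
    [ -n "$dg" ] && [ "$dg" = "$tp" ]
}

# Example:
#   ports_match /etc/dansguardian/dansguardian.conf /etc/tinyproxy/tinyproxy.conf \
#       && echo "ports agree" || echo "ports differ"
```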

Step 3: Start the Services

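Finally, we need to start the services. Again in a terminal, issue the following commands:
$ sudo /etc/init.d/dansguardian start
$ sudo /etc/init.d/tinyproxy start
If either service was already running (the packages may start them at install time), use restart in place of start so it rereads its configuration.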
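
To see whether both services actually came up, you can look for listeners on ports 8080 and 3128. The sketch below assumes the ss tool is available (on older systems netstat fills the same role); has_listener is a hypothetical helper that just searches `ss -ltn` output fed to it on standard input.

```shell
#!/bin/sh
# Sketch: check that something is listening on a given TCP port.
# has_listener is a hypothetical helper; it reads `ss -ltn` output on stdin.
has_listener() {
    # Match the port at the end of the local-address column, e.g. 0.0.0.0:8080
    grep -Eq "LISTEN.*:$1[[:space:]]"
}

# Example:
#   ss -ltn | has_listener 8080 && echo "filter port 8080 is open"
#   ss -ltn | has_listener 3128 && echo "proxy port 3128 is open"
```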

Step 4: Configure Your Client Computers

All that's left to do now is configure your clients to connect through your proxy. Using administrator accounts and some registry edits, you can prevent these changes from being undone once they're set. That also lets you completely disable internet access for your client computers by stopping one service on the Ubuntu box. I'll let you figure out how to lock the settings down, since it varies from operating system to operating system. Here's how to configure your client web browsers for proxies:


In Firefox (Windows):
Tools --> Options --> Advanced --> Network tab. Click the "Settings" button in the Connection area.
Click "Manual Proxy Configuration". In "HTTP Proxy", enter the IP address or hostname of your proxy server. In the "Port" field, enter 8080. Click "Use this proxy server for all protocols". Click OK to apply the settings, and request a page.

Internet Explorer 7:
Tools --> Internet Options --> Connections tab --> "LAN Settings" button. Check the box labeled "Use a proxy server....", then click "Advanced". In the HTTP field, type in the IP address or hostname of your proxy server, and in the Port field, type in 8080. Click "OK" 3 times and test your connection.


To test whether we did everything right, try going to www.google.com. If you're allowed through, great. Now try going to www.badboys.com. By default this site is blocked, and makes a good test.
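
The same test can be run from a terminal on any machine that has curl, without touching browser settings. This is a sketch: fetch_status is a hypothetical helper, and 192.168.1.10 is a placeholder for your proxy server's address. How a blocked page shows up (which status code, or a redirect) depends on how Dansguardian is set to report blocks, so compare the output for an allowed site and a blocked one rather than expecting a particular number.

```shell
#!/bin/sh
# Sketch: request a page through the filter and print the HTTP status code.
# fetch_status is a hypothetical helper name.
fetch_status() {
    proxy="$1"   # e.g. http://192.168.1.10:8080 (placeholder address)
    url="$2"
    # -x sends the request via the proxy, -o /dev/null discards the page body,
    # -m 10 gives up after 10 seconds, -w prints only the status code.
    curl -s -o /dev/null -m 10 -x "$proxy" -w '%{http_code}\n' "$url"
}

# Example:
#   fetch_status http://192.168.1.10:8080 http://www.google.com
#   fetch_status http://192.168.1.10:8080 http://www.badboys.com
```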

Step 5: Explanation of Steps and Advanced Configuration

I hate it when how-tos leave you without a good understanding of what you just did. That said, here's a basic explanation:

Step 1 installed the two apps we'll use. Dansguardian does the web filtering. It is a highly configurable filter that uses several methods to allow or deny access to websites. You can run a default-deny (whitelist) setup where only a select few sites are allowed, or the less restrictive default-allow (blacklist) model where sites are blocked by URL or by a weighted word list. This one piece of software has sold my company on open source; it is very well written and reliable. Tinyproxy provides the proxy server functionality, acting as the intermediary between Dansguardian and the internet.

In step 2 we told Dansguardian which port to listen on for requests from your clients (8080) and, when a request is approved, which port to pass it along to Tinyproxy on (3128). Also in step 2, we made sure that Tinyproxy is listening on port 3128. We started both services for the first time in step 3 and configured the clients in step 4.

Advanced configuration of Dansguardian:

dansguardian.conf - From here you set global variables such as port numbers, adapters to bind to, etc.
dansguardianf1.conf - This file holds the settings for filter group 1; copy it and alter the copies if you use multiple filter groups. This is also where you change the "naughtynesslimit" setting of the default filter group. The recommended values are 50 for young children, 100 for older children, and 160 for young adults. The default is 50.
bannedsitelist - where you will go to ban entire sites like example.com
bannedurllist - where you will go to ban specific URLs like example.com/~user/index.htm
bannedphraselist - lets you specify phrases that will be scanned for in each requested page, e.g. "Potty Humor". This is useful if specific things still make it through after the filter is set up.
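bannediplist - blocks client computers rather than websites: list the IP addresses of machines on your network that should get no web access at all. To ban a site by its IP address, put the address in bannedsitelist instead, keeping in mind that this could have unintended consequences as some sites share IPs with other sites.
exceptioniplist - the counterpart of bannediplist: client computers at these IP addresses bypass filtering (an administrator's machine, for example)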
exceptionsitelist - for whitelist configurations - allows specific sites like example.com
exceptionurllist - for whitelist configurations - allows specific urls, but not entire sites, like example.com/~user/index.htm
exceptionphraselist - exempts specific phrases if they are blocked needlessly
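
Since most of these lists are plain text files with one entry per line, adding to them is easy to script. Here is a minimal sketch that appends an entry only if it is not already there. add_to_list is a hypothetical helper, not a Dansguardian command, and the /etc/dansguardian/lists/ path in the example is an assumption about where the Ubuntu package keeps the lists, so check your own install. Dansguardian will most likely need a restart before it notices list changes.

```shell
#!/bin/sh
# Sketch: add an entry to a list file only if it is not already present.
# add_to_list is a hypothetical helper name.
add_to_list() {
    entry="$1"
    list="$2"
    # -x matches whole lines only, -F treats the entry as a literal string.
    grep -qxF -- "$entry" "$list" 2>/dev/null || printf '%s\n' "$entry" >> "$list"
}

# Example (run as root; the path is an assumption):
#   add_to_list example.com /etc/dansguardian/lists/bannedsitelist
```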

Using these files you can tweak your filter to suit your needs. You can also edit the access denied page to show your company logo, or to display a personalized message telling your kids to get back to work!

There are many alternate configurations of Dansguardian that extend its functionality greatly. Its extensible, standards-based nature makes it a very versatile, adaptable and scalable product, and third-party software exists to graph statistics, analyze log files and make management easier. I encourage you to go to www.dansguardian.org and look into all the possibilities of this wonderful software. Please message me or comment on this instructable if you have any questions or comments.