2

From reading lots and lots of web pages I've come to the conclusion that:

to filter adult content (not block it completely) from https sites like Twitter I would need to set up something like dansguardian/squid etc. on my Ubuntu proxy box to perform a man-in-the-middle interception of SSL.

Does anybody know if it's possible to do something like the following on my Ubuntu proxy:

  1. use a language such as bash/php to grab the 'contents' of specific https sites like Twitter that the user has requested (i.e. not breaking SSL). This would be like pressing ctrl-a / ctrl-c in a browser.

  2. write the contents to a file that is passed on to dansguardian.

  3. dansguardian then either allows or blocks the page depending upon its keyword filtering.

I'm already using OpenDNS / hosts file redirection which does a good job but not for sites like Twitter.

I've looked into technologies such as Untangle/K9 Web Protection but I am ideally searching for a free solution that can sit on a proxy. If I can leave SSL alone it seems like it would be easier, more secure, and mean fewer support calls.

Thanks!

3 Answers

1

It is possible to MITM without hacking, exploiting, or breaking SSL (this is how Cloudflare works). The trick is to use two connections.

You host a web server (on your proxy box) that accepts all requests/connections (and therefore has access to unencrypted content from the user).

When a user who is using your proxy makes a request:
Your web server accepts the request and establishes an SSL session with the user.
Within your code (e.g. PHP) you create a client instance that opens a second SSL connection to the requested server.
Your client instance gets the response on this second connection.
Your web server then sends that response (received by the client instance) back to the user on the first SSL connection.

If you require code examples of how to do this, I suspect that would make the question too broad (too many long, good answers are possible).
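
Without attempting a full proxy, the second connection and the filtering step can still be sketched briefly. The following is only an illustration in Python rather than PHP, and every name in it (`fetch_upstream`, `is_allowed`, `relay`) is made up for this example. It assumes the first SSL session with the user has already been set up elsewhere, using a certificate the user's browser trusts, and it uses a plain substring check as a stand-in for whatever dansguardian would actually do.

```python
import ssl
import urllib.request
from typing import Callable, Iterable, Tuple


def fetch_upstream(url: str, timeout: float = 10.0) -> str:
    """Second connection: the proxy acts as an ordinary HTTPS client."""
    # The default context verifies the real server's certificate.
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def is_allowed(body: str, banned_phrases: Iterable[str]) -> bool:
    """Hypothetical stand-in for keyword filtering: simple substring match."""
    lowered = body.lower()
    return not any(phrase.lower() in lowered for phrase in banned_phrases)


def relay(
    url: str,
    banned_phrases: Iterable[str],
    fetch: Callable[[str], str] = fetch_upstream,
) -> Tuple[int, str]:
    """Fetch the page on the second connection, then decide what to send
    back to the user on the first connection (status code, body)."""
    body = fetch(url)
    if is_allowed(body, banned_phrases):
        return 200, body
    return 403, "Blocked by content filter"
```

The `fetch` parameter is there so the decision logic can be tried without touching the network, for example `relay("https://example.invalid/", ["badword"], fetch=lambda u: "hello")`. The hard part in practice is the piece not shown here: terminating the user's SSL session on the proxy.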

NGRhodes
0

use a language such as bash/php to grab the 'contents' of specific https sites like Twitter that the user has requested (i.e. not breaking SSL)

With HTTPS the full URL is encrypted and only the hostname is sent in the clear inside the TLS handshake. This means that even to learn the URL, and thus to know what the user has requested, you have to be a man in the middle for the SSL connection.
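
To make the split concrete, here is a small Python sketch that separates a URL into the part a proxy can see without intercepting SSL and the part it cannot. The function names and the sample URL are invented for illustration, and the port is included on the assumption that the proxy can observe the TCP connection itself.

```python
from urllib.parse import urlsplit


def visible_without_mitm(url: str) -> dict:
    """What a non-intercepting proxy can learn: the hostname from the
    TLS handshake and the port from the TCP connection."""
    parts = urlsplit(url)
    return {"hostname": parts.hostname, "port": parts.port or 443}


def hidden_without_mitm(url: str) -> dict:
    """What stays inside the encrypted channel: path and query string."""
    parts = urlsplit(url)
    return {"path": parts.path, "query": parts.query}


example = "https://twitter.com/some_user/status/123?lang=en"
print(visible_without_mitm(example))
print(hidden_without_mitm(example))
```

So a filter that leaves SSL alone can decide per hostname (allow or block all of twitter.com) but has nothing page-specific to hand to a keyword filter.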

0

Thanks all for your responses. I am a novice regarding filtering https and probably asked a confusing question. It's a pity that sites like Twitter don't offer safe URL versions of their content like Google SafeSearch. SSL seems great except when you're trying to filter dodgy content :)