CPSC 441: Computer Networks

Winter 2022

Assignment 1: Clown Proxy (40 marks)

Due: Friday, January 28, 2022 (4:00pm)

Learning Objectives

The purpose of this assignment is to learn about the HyperText Transfer Protocol (HTTP) used by the World Wide Web. In particular, you will design and implement a Web proxy using HTTP to demonstrate your understanding of this application-layer protocol. Along the way, you will also learn a lot about socket programming, TCP/IP, network debugging, and more.

Preamble

With the New Year upon us, and the pandemic still a part of our lives, it is time for some frivolity to lighten things up. After all, April Fool's Day is less than three months away, and we need to be ready. Writing a Web proxy to brighten someone's day seems like a good strategy for this. To keep the assignment simple, we will restrict ourselves only to plain-text HTTP sites, not secure HTTP (HTTPS) sites.

Background

A Web proxy is a piece of software that functions as an intermediary between a Web client (browser) and a Web server. The Web proxy intercepts Web requests from clients and determines whether they should be transmitted to a Web server or not. If the request is blocked, the proxy informs the client directly. If the request is forwarded to the Web server, then any response that the proxy receives from the Web server is forwarded back to the client. From the server's point of view, the proxy is the client, since that is where the request comes from. Similarly, from the client's point of view, the proxy is the server, since that is where the response comes from. A Web proxy thus provides a single point of control to regulate Web access between clients and servers. A lot of Calgary schools use Web proxies to limit the types of Web sites that students are allowed to access. Net Nanny and Barracuda are examples of commercially available Web proxies.

Technical Requirements

In this assignment, you will implement your very own clown proxy, in either C or C++. The goals of the assignment are to build a properly functioning Web proxy for simple Web pages, and then use your proxy to alter certain content items before they are delivered to the browser.

There are two main pieces of functionality needed in your proxy. The first is the ability to handle HTTP requests and responses, while still forwarding them between client and server. This is called a transparent proxy. The second is the ability to parse, and possibly modify, HTTP requests and responses. This could involve: (a) rewriting some of the content in an HTTP response before that content is displayed by your Web browser; and (b) rewriting some of the HTTP requests so that they request a different object than is normally retrieved on a Web page.
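As a rough illustration of the first piece (plain forwarding), here is a minimal sketch of a relay loop in C. The helper names write_all and relay are hypothetical, not part of any required interface. The sketch assumes the sender closes its end of the connection when it is done, which is not true of persistent HTTP connections; in that case you would need to use Content-Length to know where the message ends.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, retrying on short
   writes. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy bytes from one descriptor to another until
   end-of-file. Returns the number of bytes relayed, or -1 on error. */
long relay(int from, int to)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(from, buf, sizeof(buf));
        if (n == 0)
            break;                  /* sender closed its end */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(to, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
    return total;
}
```

Since read and write work on sockets as well as on files and pipes, the same loop can carry a server response back to the browser. Any content rewriting would happen on buf between the read and the write.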

In the spirit of silliness, your proxy should be able to do two specific things. First, it should replace all occurrences of the word "Happy" with the word "Silly" in an HTTP response. Second, it should replace all JPG image files on a given Web page with an image of a happy clown instead. (Please don't use an evil clown, since some people really do suffer from coulrophobia, which is the fear of clowns.)
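One convenient property of the text substitution is that "Happy" and "Silly" are both five bytes long, so the replacement can be done in place without changing the length of the response. The following is a minimal sketch under that observation; the function name replace_happy is hypothetical. It assumes the whole text body is already in one buffer, so it does not handle a word that is split across two separate reads, and it should only be applied to text content, not to headers or image data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite every "Happy" in buf[0..len) with "Silly".
   Both words are 5 bytes, so the buffer length stays the same.
   Returns the number of replacements made. */
size_t replace_happy(char *buf, size_t len)
{
    static const char from[] = "Happy";
    static const char to[]   = "Silly";
    const size_t n = sizeof(from) - 1;   /* 5, excluding the terminating NUL */
    size_t count = 0;
    size_t i = 0;

    while (i + n <= len) {
        if (memcmp(buf + i, from, n) == 0) {
            memcpy(buf + i, to, n);
            count++;
            i += n;                      /* skip past the word just replaced */
        } else {
            i++;
        }
    }
    return count;
}
```

Note that a blind byte-level replacement like this will also change "Happy" inside HTML tags, such as the file name in an image link, which may or may not be what you want.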

The most important HTTP command for your Web proxy to handle is the "GET" request, which specifies the URL for an object to be retrieved. In the basic operation of your proxy, it should be able to parse, understand, and forward to the Web server a (possibly modified) version of the client HTTP request. Similarly, the proxy should be able to parse, understand, and return to the client a (possibly modified) version of the HTTP response that the Web server provided to the proxy. Please give some careful thought to how your proxy handles commonly occurring HTTP response codes, such as 200 (OK), 206 (Partial Content), 301 (Moved Permanently), 302 (Found), 304 (Not Modified), 403 (Forbidden), and 404 (Not Found).
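To make the parsing step more concrete, here is a minimal sketch in C of picking apart a request line and a status line. All names in it (struct request_line, parse_request_line, is_jpg_url, parse_status_code) are hypothetical, and the fixed buffer sizes are arbitrary assumptions. It relies on the fact that a browser configured to use a proxy normally sends the full URL in the request line, as in "GET http://example.com/pic.jpg HTTP/1.1".

```c
#define _POSIX_C_SOURCE 200809L   /* for strncasecmp */
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Hypothetical container for the three fields of an HTTP request line. */
struct request_line {
    char method[16];
    char url[2048];
    char version[16];
};

/* Split "METHOD URL VERSION" into its fields.
   Returns 0 on success, -1 if the line does not have three fields. */
int parse_request_line(const char *line, struct request_line *out)
{
    if (sscanf(line, "%15s %2047s %15s",
               out->method, out->url, out->version) != 3)
        return -1;
    return 0;
}

/* Return 1 if the URL (ignoring any query string or fragment) ends in
   .jpg or .jpeg, compared case-insensitively; otherwise return 0. */
int is_jpg_url(const char *url)
{
    size_t len = strcspn(url, "?#");

    if (len >= 4 && strncasecmp(url + len - 4, ".jpg", 4) == 0)
        return 1;
    if (len >= 5 && strncasecmp(url + len - 5, ".jpeg", 5) == 0)
        return 1;
    return 0;
}

/* Extract the numeric code from a status line such as
   "HTTP/1.1 404 Not Found". Returns -1 if the line is malformed. */
int parse_status_code(const char *status_line)
{
    int code;

    if (sscanf(status_line, "HTTP/%*d.%*d %3d", &code) != 1)
        return -1;
    return code;
}
```

A proxy built this way could check is_jpg_url on each GET request and, when it matches, substitute the clown image URL before forwarding the request. Checking the file extension is only one possible approach; inspecting the Content-Type header of the response is another.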

You will need at least one TCP socket (i.e., SOCK_STREAM) for client-proxy communication, and at least one additional TCP socket for each Web server that your proxy talks to during proxy-server communication. If you want your proxy to support multiple concurrent HTTP transactions, you may need to fork child processes or create threads for request handling. Each child process or thread will use its own socket instances for its communications with the client and with the server.
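The socket setup described above might look roughly like the following sketch, which uses the standard POSIX socket calls. The function names (make_listen_socket, connect_to_server, serve_forever) and the backlog value of 16 are assumptions made for illustration. The sketch uses fork for concurrency; threads would work as well.

```c
#define _POSIX_C_SOURCE 200809L   /* for getaddrinfo and friends */
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port (0 lets the OS choose).
   Returns the socket descriptor, or -1 on failure. */
int make_listen_socket(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;   /* allow quick restarts of the proxy on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Open a TCP connection to a Web server, e.g. ("example.com", "80").
   Returns the connected socket descriptor, or -1 on failure. */
int connect_to_server(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {   /* try each address in turn */
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Accept clients forever, handling each one in its own child process. */
void serve_forever(int listen_fd, void (*handle_client)(int client_fd))
{
    signal(SIGCHLD, SIG_IGN);   /* finished children are reaped automatically */
    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd < 0)
            continue;
        if (fork() == 0) {      /* child: owns this client connection */
            close(listen_fd);
            handle_client(client_fd);
            close(client_fd);
            _exit(0);
        }
        close(client_fd);       /* parent (or failed fork): drop its copy */
    }
}
```

Here handle_client stands for whatever function you write to read the request, call connect_to_server, and pass data between the two sockets.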

When implementing your proxy, feel free to compile and run your Web proxy on any suitable department machine, or even your home machine or laptop, but please be aware that you will ultimately have to demo your proxy to your TA on campus at some point. You should try to access your proxy from your favourite Web browser (e.g., Edge, Firefox, Chrome, Safari) and computer (either on campus or at home). To test the proxy, you will have to configure your Web browser to use your specific Web proxy (e.g., look for menu selections like Tools, Internet Options, Proxies, Advanced, LAN Settings). Make sure that you only tamper with HTTP, and not HTTPS.

As you design and build your Web proxy, give careful consideration to how you will debug and test it. For example, you may want to print out information about requests and responses received, processed, forwarded, redirected, or altered. Once you become confident with the basic operation of your Web proxy, you can toggle off the verbose debugging output. If you are testing on your home network, you can also use tools like Wireshark to collect network packet traces. By studying the HTTP messages and TCP/IP packets going to and from your proxy, you might be able to figure out what is working, what isn't working, and why.
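One simple way to get debugging output that can be switched off is a small logging function guarded by a flag. The sketch below is only one possible design; the names verbose and debug_log, and the "[proxy]" prefix, are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global flag; it could be set from a command-line option. */
int verbose = 1;

/* Print a printf-style message to stderr when verbose output is enabled.
   Returns 1 if a message was printed, 0 if output is switched off. */
int debug_log(const char *fmt, ...)
{
    va_list ap;

    if (!verbose)
        return 0;
    va_start(ap, fmt);
    fputs("[proxy] ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);
    return 1;
}
```

A call such as debug_log("forwarding %s to %s", method, host) could then be left in the code permanently, with verbose set to 0 once the proxy is behaving.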

When you are finished, please submit your solution in electronic form to your TA via D2L. Your submission should include the source code for your Web proxy, a brief user manual describing how to compile and use your proxy, and a description of the testing done with your proxy. Please remember that assignments are to be done individually, and submitted to your assigned TA on time. You should also plan to give a brief demo of your proxy to your TA during a tutorial time slot just after the assignment deadline.

Testing

During your demo, your proxy will be tested on the following test cases:

Good luck, and have fun!

Grading Rubric

The grading scheme for the assignment is as follows:

Bonus (optional)

Up to 4 bonus marks will be given for a clown proxy that can randomly choose from a small set of different clown images when doing image substitution, resulting in a page that varies every time you load it. Make sure to show this bonus feature to your TA during the demo.
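A minimal sketch of the random selection, using the standard C rand function, is shown below. The URLs are hypothetical placeholders (example.com does not host clown images), and the names seed_clown_picker and pick_clown_url are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* for getpid */
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical placeholder URLs; substitute real plain-HTTP clown images. */
static const char *const clown_urls[] = {
    "http://example.com/clowns/clown1.jpg",
    "http://example.com/clowns/clown2.jpg",
    "http://example.com/clowns/clown3.jpg",
};
#define NUM_CLOWNS (sizeof(clown_urls) / sizeof(clown_urls[0]))

/* Seed the random number generator. Mixing in the process ID means that
   child processes forked within the same second get different seeds. */
void seed_clown_picker(void)
{
    srand((unsigned)time(NULL) ^ (unsigned)getpid());
}

/* Return one of the clown image URLs, chosen pseudo-randomly. */
const char *pick_clown_url(void)
{
    return clown_urls[(size_t)rand() % NUM_CLOWNS];
}
```

If your proxy forks a child per request, call seed_clown_picker in the child rather than once in the parent. Otherwise every child inherits the same generator state and picks the same image.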

Tips