CPSC 441: Computer Communications

Professor Carey Williamson

Winter 2012

Assignment 2: TCP Traffic Analysis (25 marks)

Due: Thursday, February 16, 2012 (11:59pm)

The purpose of this assignment is to learn about the Transmission Control Protocol (TCP). In particular, you will write a C or C++ program to analyze a specially formatted network traffic trace file, in order to assess and understand the TCP/IP protocol, including its handshaking behaviour and its protocol states.

The file trace.txt (210 KB ASCII text file) shows some TCP/IP packet traffic collected using a network traffic analyzer on a research network at the University of Calgary. This trace contains 2,168 TCP/IP packets, and lasts about 3.5 minutes. During the period traced, a single Web client was downloading Web pages from different Web sites on the Internet. This trace is to be used for your TCP traffic analysis, and for answering the questions given below.

Each line of data in the trace file represents one TCP/IP packet. There are multiple columns of data on each line, separated by spaces. The columns, from left to right, represent:

An example line from this trace is:
1916.911715 192.168.1.9 -> 216.239.39.99 44 TCP 1026 80 20948 : 20948 0 win: 32768 S
This TCP packet traveled from IP source address 192.168.1.9 (port 1026) to IP destination address 216.239.39.99 (port 80) at time 1916.911715 sec. It was a SYN packet of size 44 bytes (including TCP/IP protocol headers). The proposed starting TCP sequence number was 20948. This packet carried no actual TCP data bytes. The acknowledgement field was invalid, and initialized to 0. The flow control window size advertised was 32 KB.
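
As a starting point for parsing, the example line above can be read with a single sscanf call. The sketch below is only one possible approach, not the required one. The struct packet type, its field names, and the parse_packet function are hypothetical names chosen for illustration. It also assumes that the two numbers around the ":" are the starting and ending sequence numbers, and that every line has the same field layout as the example; check both assumptions against the actual trace.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one trace line; all names are illustrative. */
struct packet {
    double time;             /* timestamp in seconds */
    char src_ip[16];         /* dotted-quad source address */
    char dst_ip[16];         /* dotted-quad destination address */
    int size;                /* packet size in bytes, headers included */
    char proto[8];           /* protocol name, e.g. "TCP" */
    int src_port;
    int dst_port;
    unsigned long seq_start; /* assumed: first number before the ':' */
    unsigned long seq_end;   /* assumed: second number after the ':' */
    unsigned long ack;       /* acknowledgement field */
    int window;              /* advertised window in bytes */
    char flags[8];           /* flag text, e.g. "S" for SYN */
};

/* Returns 1 if all 12 fields were parsed, 0 otherwise. */
int parse_packet(const char *line, struct packet *p)
{
    int n;

    assert(line != NULL && p != NULL);
    /* Field widths (%15s, %7s) keep sscanf from overflowing the buffers. */
    n = sscanf(line,
               "%lf %15s -> %15s %d %7s %d %d %lu : %lu %lu win: %d %7s",
               &p->time, p->src_ip, p->dst_ip, &p->size, p->proto,
               &p->src_port, &p->dst_port, &p->seq_start, &p->seq_end,
               &p->ack, &p->window, p->flags);
    return n == 12;
}
```

A caller would typically read each line with fgets and skip (or report) any line for which parse_packet returns 0.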

You need to write a program (20 marks) for parsing and processing trace files in this format, and tracking TCP connection state information. In particular, the program processes the trace file and computes summary information about TCP connections. Note that a TCP connection is identified by a 4-tuple (IP source address, source port, IP destination address, destination port), and packets can flow in both directions on a connection (i.e., from host A to host B, and from host B to host A). Also note that the packets from different connections can be arbitrarily interleaved with each other in time, so your program will need to extract packets and associate them with the correct connection. Your program should be written in C or C++.
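
Because packets flow in both directions, a packet from A to B and a packet from B to A must map to the same connection. One way to arrange this, sketched below, is to store the two endpoints of the 4-tuple in a fixed order before comparing. The names struct endpoint, struct conn_key, make_conn_key, and conn_key_equal are hypothetical, and ordering endpoints by string comparison of the IP address is just one convenient choice.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical types; names are illustrative only. */
struct endpoint {
    char ip[16];   /* dotted-quad address */
    int port;
};

struct conn_key {
    struct endpoint a;   /* the "smaller" endpoint */
    struct endpoint b;   /* the "larger" endpoint */
};

/* Orders endpoints by IP string, then by port number. */
static int endpoint_cmp(const struct endpoint *x, const struct endpoint *y)
{
    int c = strcmp(x->ip, y->ip);
    if (c != 0)
        return c;
    return (x->port > y->port) - (x->port < y->port);
}

/* Builds a key that is the same for both directions of a connection. */
void make_conn_key(const char *src_ip, int src_port,
                   const char *dst_ip, int dst_port,
                   struct conn_key *key)
{
    struct endpoint s, d;

    assert(src_ip != NULL && dst_ip != NULL && key != NULL);
    memset(&s, 0, sizeof s);
    memset(&d, 0, sizeof d);
    strncpy(s.ip, src_ip, sizeof s.ip - 1);   /* stays NUL-terminated */
    strncpy(d.ip, dst_ip, sizeof d.ip - 1);
    s.port = src_port;
    d.port = dst_port;

    if (endpoint_cmp(&s, &d) <= 0) {
        key->a = s;
        key->b = d;
    } else {
        key->a = d;
        key->b = s;
    }
}

/* Returns 1 if both keys name the same connection, 0 otherwise. */
int conn_key_equal(const struct conn_key *k1, const struct conn_key *k2)
{
    return endpoint_cmp(&k1->a, &k2->a) == 0 &&
           endpoint_cmp(&k1->b, &k2->b) == 0;
}
```

With a key like this, a simple array or linked list of connection records searched with conn_key_equal is enough for a trace of this size. Note that the normalized key discards direction, so the connection record itself would need to remember which endpoint sent the first SYN if client and server must be told apart.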

The summary information to be computed for each TCP connection includes:

Use your program, and the trace file, to answer as many of the following questions as you can (1 mark each, total of 5 marks):

  1. How many complete TCP connections are observed in the trace?
  2. What are the minimum, mean, and maximum time durations of the complete TCP connections that you observed?
  3. What are the minimum, mean, and maximum number of packets sent on the complete TCP connections that you observed?
  4. What are the minimum, mean, and maximum number of data bytes sent on the complete TCP connections that you observed?
  5. How many reset TCP connections are observed in the trace?

Bonus (3 marks): Find in the trace the complete TCP connection that downloaded the most TCP data bytes from the server. What is the IP address of this Web server? Approximately how many objects were downloaded? What was the average throughput (in bits per second) for the entire connection?
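
For the throughput part of the bonus, average throughput in bits per second is the number of data bytes times 8, divided by the connection duration in seconds. The sketch below assumes a hypothetical struct conn_totals holding the first and last packet timestamps and a data byte count; which bytes to count (for example, only those sent by the server) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection totals; names are illustrative. */
struct conn_totals {
    double first_time;          /* timestamp of first packet, in seconds */
    double last_time;           /* timestamp of last packet, in seconds */
    unsigned long data_bytes;   /* TCP data bytes counted by the caller */
};

/* Returns average throughput in bits per second, or 0.0 if the
   duration is not positive. */
double throughput_bps(const struct conn_totals *c)
{
    double duration;

    assert(c != NULL);
    duration = c->last_time - c->first_time;
    if (duration <= 0.0)
        return 0.0;
    return 8.0 * (double)c->data_bytes / duration;
}
```

For instance, 1000 data bytes transferred over 2 seconds gives 8000 bits / 2 s = 4000 bits per second.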

When you are finished, please submit your assignment solution as a single file in electronic form to your TA on or before the stated deadline. Make sure your name and identification are on everything that you submit. Your submission should include your source code, the output produced by your program on the trace file, and a text file with your answers to the questions above. Make sure to show your work for any calculations.

TIPS:

Parsing the input file might look a bit intimidating, but it is doable. If you need some help getting started, take a look at tcpreader.c and try it out.

For testing your program, it is best to work with some small example traces. Here are a few that you might find useful. The trace example1.txt (17 TCP packets) contains a single complete TCP connection. The trace example2.txt (12 TCP packets) contains a single TCP connection that is reset. The trace example3.txt (100 TCP packets) contains 8 TCP connections (5 complete, 2 reset, and 1 still in progress when the trace ended, as evidenced in the example3 sample output). When you have your program working properly, you can run it on the large trace file for this assignment.