Low Latency TCP Protocol for Beowulf Clusters
The cluster protocol is a new network protocol intended to improve network
performance on a secure Beowulf cluster. Two versions of the cluster protocol
are available, named version A and version B.
The version A protocol doesn't use the TCP send queue, doesn't use ACKs,
windows, or sequence numbers, and doesn't checksum. It uses a ring buffer
to store messages that are being sent; the length of the ring is set by the
kernel parameter /proc/sys/net/ipv4/cluster_output_ring.
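As a rough illustration of the idea of a fixed-length output ring, here is a minimal user-space sketch in C. It is not the kernel code from this project. The names out_ring, ring_msg, ring_init, ring_push, ring_pop and ring_free are hypothetical, and rejecting a push when the ring is full is an assumption, since this page does not say how the real protocol handles a full ring. The capacity argument plays the role that cluster_output_ring would play in the kernel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical message slot: a pointer to the payload and its length. */
struct ring_msg {
    const void *data;
    size_t len;
};

/* Hypothetical fixed-length ring of outgoing messages. */
struct out_ring {
    struct ring_msg *slots;
    size_t capacity;   /* number of slots (cf. cluster_output_ring) */
    size_t head;       /* index of the oldest stored message */
    size_t count;      /* number of messages currently stored */
};

/* Allocate a ring with the given number of slots. Returns 0 on success. */
int ring_init(struct out_ring *r, size_t capacity)
{
    if (capacity == 0)
        return -1;
    r->slots = calloc(capacity, sizeof *r->slots);
    if (r->slots == NULL)
        return -1;
    r->capacity = capacity;
    r->head = 0;
    r->count = 0;
    return 0;
}

/* Store a message at the tail. Returns -1 if the ring is full
   (an assumption made for this sketch). */
int ring_push(struct out_ring *r, const void *data, size_t len)
{
    size_t tail;

    if (r->count == r->capacity)
        return -1;
    tail = (r->head + r->count) % r->capacity;
    r->slots[tail].data = data;
    r->slots[tail].len = len;
    r->count++;
    return 0;
}

/* Remove the oldest message. Returns -1 if the ring is empty. */
int ring_pop(struct out_ring *r, struct ring_msg *out)
{
    if (r->count == 0)
        return -1;
    *out = r->slots[r->head];
    r->head = (r->head + 1) % r->capacity;
    r->count--;
    return 0;
}

/* Release the slot array. */
void ring_free(struct out_ring *r)
{
    free(r->slots);
    r->slots = NULL;
    r->capacity = r->head = r->count = 0;
}
```

Because the indices wrap around modulo the capacity, the ring never moves stored messages; a larger ring length allows more messages to be in flight at the cost of more memory.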
The version B protocol is very similar to TCP except that it doesn't
checksum and doesn't support many of the TCP/IP options. To use the protocols,
the user needs to build a new kernel using the sources provided here. Two
user-level header files also need to be modified. The user can then use the
SOCK_CLUSTER option instead of SOCK_STREAM in calls to socket.
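The change on the user side might look like the following sketch. It assumes that the modified user-level headers expose SOCK_CLUSTER as a preprocessor macro and that the protocol is used with AF_INET; neither detail is confirmed on this page. CLUSTER_SOCK_TYPE and open_cluster_socket are hypothetical names. On a system without the modified headers the sketch falls back to SOCK_STREAM so that it still compiles.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* SOCK_CLUSTER comes from the modified user-level headers (assumed to be
   a macro). Without them, fall back to ordinary TCP. */
#ifdef SOCK_CLUSTER
#define CLUSTER_SOCK_TYPE SOCK_CLUSTER
#else
#define CLUSTER_SOCK_TYPE SOCK_STREAM
#endif

/* Create a socket of the cluster type where available.
   Returns the descriptor, or -1 on error with errno set by socket(). */
int open_cluster_socket(void)
{
    return socket(AF_INET, CLUSTER_SOCK_TYPE, 0);
}
```

The returned descriptor is then used with the usual calls (connect, bind, listen, accept, send, recv, close), so an existing TCP program would only need the socket type changed.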
The following subdirectories contain the relevant code.
kernelA - This contains the code required to build
a cluster protocol A compatible kernel. (Or get the Protocol A Kernel patches)
kernelB - This contains the code required to build
a cluster protocol B compatible kernel. (Or get the Protocol B Kernel patches)
user - Header files required to build user level
programs using the cluster protocol.
tests - Simple test code, timing the differences
between the cluster protocol and TCP/IP.
This work was done as a Master's project in Computer Science by Ira Burton
under the supervision of Amit Jain.
The entire project report is available here in
PostScript [476KB].