diff --git a/3_Implementation/implementation.tex b/3_Implementation/implementation.tex index 63f7576..f72a97e 100644 --- a/3_Implementation/implementation.tex +++ b/3_Implementation/implementation.tex @@ -14,7 +14,7 @@ Implementation of the proxy is in two parts: software that provides a multipath layer 3 tunnel between two hosts, and the system configuration necessary to utilise this tunnel as a proxy. An overview of the software and system is presented in figure \ref{fig:dataflow-overview}. -This chapter will detail this implementation in three sections. The software will be described in sections \ref{section:implementation-software-structure} and \ref{section:implementation-producer-consumer}. Section \ref{section:implementation-software-structure} explains the software's structure and dataflow. Section \ref{section:implementation-producer-consumer} details the implementation of both TCP and UDP methods of transporting the tunnelled packets between the hosts. The system configuration will be described in section \ref{section:implementation-system-configuration}, along with a discussion of some of the oddities of multipath routing, such that a reader would have enough knowledge to implement the proxy. +This chapter will detail this implementation in three sections. The software will be described in sections \ref{section:implementation-packet-transport} and \ref{section:implementation-software-structure}. Section \ref{section:implementation-packet-transport} details the implementation of both TCP and UDP methods of transporting the tunnelled packets between the hosts. Section \ref{section:implementation-software-structure} explains the software's structure and dataflow. The system configuration will be described in section \ref{section:implementation-system-configuration}, along with a discussion of some of the oddities of multipath routing, such that a reader would have enough knowledge to implement the proxy given the software. 
Figure \ref{fig:dataflow-overview} shows the path of packets within the proxy. As each section discusses an element of the program, where it fits within this diagram is detailed. \begin{sidewaysfigure} \includegraphics[width=\textheight]{overview.png} @@ -22,6 +22,122 @@ This chapter will detail this implementation in three sections. The software wil \label{fig:dataflow-overview} \end{sidewaysfigure} +% -------------------------------------------------------------------------- % +% -------------------------- Packet Transport ------------------------------ % +% -------------------------------------------------------------------------- % +\section{Packet Transport} +\label{section:implementation-packet-transport} + +As shown in figure \ref{fig:dataflow-overview} and described in section \ref{section:implementation-software-structure}, the interfaces through which transport for packets is provided between the two hosts are producers and consumers. A transport pair is then created between a consumer on one host and a producer on the other, where packets enter the consumer and exit the corresponding producer. Two methods for producers and consumers are implemented: TCP and UDP. As the greedy load balancing of this proxy relies on congestion control, TCP provided a base for a proof of concept, while UDP expands on this proof of concept to produce a usable solution. This section discusses, in section \ref{section:implementation-tcp}, the method of transporting discrete packets across the continuous byte stream of a TCP flow. Then, in section \ref{section:implementation-udp}, it goes on to discuss adding congestion control to UDP datagrams, while avoiding retransmissions. + +\subsection{TCP} +\label{section:implementation-tcp} + +The base implementation for producers and consumers takes advantage of TCP. The requirements for the load balancing given above to function are simple: flow control and congestion control. 
TCP provides both of these, so it was an obvious initial solution. However, TCP also introduces unnecessary overhead, which is discussed later in this section. + +TCP is a stream-oriented connection, while the packets to be sent are discrete datagrams. That is, a TCP flow cannot be connected directly to a TUN adapter, as the TUN adapter expects discrete and formatted IP packets while the TCP connection sends a stream of bytes. To resolve this, each packet sent across a TCP flow is prefixed with the length of the packet. On the sending side, this involves writing the 32-bit length of the packet, followed by the packet itself. On the receiving side, the first 4 bytes are read to recover the length of the next packet, after which that many bytes are read. This successfully delimits discrete packets within the stream-oriented connection. + +However, using TCP to tunnel TCP packets (known as TCP-over-TCP) can cause a degradation in performance in non-ideal circumstances \citep{honda_understanding_2005}. Further, using TCP to tunnel IP packets provides a superset of the required guarantees, in that reliable delivery and ordering are guaranteed. Reliable delivery can cause a decrease in performance for tunnelled flows which do not require reliable delivery, such as a live video stream: a live stream has no use for a retransmitted packet from a portion that has already been played, and thus spends longer buffering than if it had received the up-to-date packets instead. Ordering can limit performance when tunnelling multiple streams, as a packet for a phone call could already have been received, but has to wait in a buffer for a packet for a download to arrive, increasing latency unnecessarily. + +Although the TCP implementation provides an excellent proof of concept and basic implementation, work moved to a second UDP implementation, aiming to solve some of these problems.
However, the TCP implementation is functionally correct, so it is left as an option, furthering the idea of flexibility maintained throughout this project. In cases where a connection that suffers particularly high packet loss is combined with one which is more stable, TCP could be employed on the high-loss connection to limit overall packet loss. The effectiveness of such a solution would be implementation-specific, so is left for the architect to decide. + +% --------------------------------- UDP ------------------------------------ % +\subsection{UDP} +\label{section:implementation-udp} + +To resolve the issues seen with TCP, an implementation using UDP was built as an alternative. UDP differs from TCP in that it provides almost no guarantees, and is based on sending discrete datagrams as opposed to a stream of bytes. However, UDP datagrams do not provide the congestion control or flow control required, so this must be built on top of the protocol. As the flow itself can be managed in userspace, as opposed to the TCP flow which is managed in kernel space, more flexibility is available in implementation. This allows received packets to be immediately dispatched, with little regard for ordering. + +\subsection{Congestion Control} + +Congestion control is most commonly applied in the context of reliable delivery. This provides a significant benefit to TCP congestion control protocols: cumulative acknowledgements. As all of the bytes should always arrive eventually, unless the connection has faulted, the acknowledgement number (ACK) can simply be set to the highest byte received in order. Some adaptations are therefore necessary for TCP congestion control algorithms to apply in an unreliable context. Firstly, for a packet-based connection, ACKing specific bytes makes little sense - a packet is atomic, and is lost as a whole unit. To account for this, sequence numbers and their respective acknowledgements will be for entire packets, as opposed to per byte.
Secondly, for an unreliable protocol, cumulative acknowledgements are not as simple. As a packet may now legitimately never arrive during correct operation of the flow, a packet that is never received would cause deadlock if the ACK were simply set to the highest sequence number received in order, as demonstrated in figure \ref{fig:sequence-ack-discontinuous}. Neither side can progress once the window is full, as the sender will not receive an ACK to free up space within the window, and the receiver will not receive the missing packet to increase the ACK. + +\begin{figure} + \hfill + \begin{subfigure}[t]{0.3\textwidth} + \centering + \begin{tabular}{|c|c|} + SEQ & ACK \\ + 1 & 0 \\ + 2 & 0 \\ + 3 & 2 \\ + 4 & 2 \\ + 5 & 2 \\ + 6 & 5 \\ + 6 & 6 + \end{tabular} + \caption{ACKs only responding to in order sequence numbers} + \label{fig:sequence-ack-continuous} + \end{subfigure}\hfill + \begin{subfigure}[t]{0.3\textwidth} + \centering + \begin{tabular}{|c|c|} + SEQ & ACK \\ + 1 & 0 \\ + 2 & 0 \\ + 3 & 2 \\ + 5 & 3 \\ + 6 & 3 \\ + 7 & 3 \\ + 7 & 3 + \end{tabular} + \caption{ACKs only responding to a missing sequence number} + \label{fig:sequence-ack-discontinuous} + \end{subfigure}\hfill + \begin{subfigure}[t]{0.35\textwidth} + \centering + \begin{tabular}{|c|c|c|} + SEQ & ACK & NACK \\ + 1 & 0 & 0 \\ + 2 & 0 & 0 \\ + 3 & 2 & 0 \\ + 5 & 2 & 0 \\ + 6 & 2 & 0 \\ + 7 & 6 & 4 \\ + 7 & 7 & 4 + \end{tabular} + \caption{ACKs and NACKs responding to a missing sequence number} + \label{fig:sequence-ack-nack-discontinuous} + \end{subfigure} + \caption{Congestion control responding to correct and missing sequence numbers of packets.} + \label{fig:sequence-ack-nack-comparison} + \hfill +\end{figure} + +I present a solution based on Negative Acknowledgements (NACKs). When the receiver believes that it will never receive a packet, it increases the NACK to the highest missing sequence number, and sets the ACK to one above the NACK.
The ACK algorithm is then performed to grow the ACK as high as possible. For the sender, this simplifies to treating any change in the NACK as at least one lost packet, a signal to which each congestion control algorithm can react. Though this usage of the NACK appears to provide a close approximation to ACKs on reliable delivery, the choice of how to use the ACK and NACK fields is delegated to the congestion controller implementation, allowing for different implementations if they better suit the method of congestion control. + +Given the decision to use ACKs and NACKs, the packet structure for UDP datagrams can now be designed. The chosen structure is given in figure \ref{fig:udp-packet-structure}. The congestion control header consists of the ACK, the NACK and the sequence number, each a 32-bit unsigned integer. + +\begin{figure} + \centering + \begin{bytefield}[bitwidth=0.6em]{32} + \bitheader{0-31} \\ + \begin{rightwordgroup}{UDP\\Header} + \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\ + \bitbox{16}{Length} & \bitbox{16}{Checksum} + \end{rightwordgroup} \\ + \begin{rightwordgroup}{CC\\Header} + \bitbox{32}{Acknowledgement number} \\ + \bitbox{32}{Negative acknowledgement number} \\ + \bitbox{32}{Sequence number} + \end{rightwordgroup} \\ + \wordbox[tlr]{1}{Proxied IP packet} \\ + \skippedwords \\ + \wordbox[blr]{1}{} \\ + \begin{rightwordgroup}{Security\\Footer} + \wordbox[tlr]{1}{Security footer} \\ + \wordbox[blr]{1}{$\cdots$} + \end{rightwordgroup} + \end{bytefield} + \caption{UDP packet structure} + \label{fig:udp-packet-structure} +\end{figure} + +\subsubsection{New Reno} + +The first algorithm to be implemented for UDP Congestion Control is based on TCP New Reno. TCP New Reno is a well-understood and powerful congestion control protocol. RTT estimation is performed by applying the weighted moving average $RTT_{AVG} = RTT_{AVG}*(1-x) + RTT_{SAMPLE}*x$ for each newly received packet, where $x$ is the weight given to the newest sample.
Packet loss is detected in two ways: negative acknowledgements, sent when a receiver receives a later packet than expected and has not received the preceding packet for $0.5*RTT$, and a sender timeout of $3*RTT$. The sender timeout exists to ensure that even if the only packet containing a NACK is dropped, the sender does not deadlock, though this case should be rare with a busy connection. + +To achieve the same curve as New Reno, there are two phases: exponential growth and congestion avoidance. On flow start, using a technique known as slow start, for every packet that is acknowledged, the window size is increased by one. When a packet loss is detected (using either of the two aforementioned methods), slow start ends, and the window size is halved. Now in congestion avoidance, the window size is increased by one for every full window of packets acknowledged without loss, instead of each individual packet. When a packet loss is detected, the window size is halved, and congestion avoidance continues. + % -------------------------------------------------------------------------- % % ------------------------- Software Structure ----------------------------- % % -------------------------------------------------------------------------- % @@ -167,7 +283,6 @@ A directory tree of the repository is provided in figure \ref{fig:repository-str .2 code\DTcomment{Go code for the project}. .3 config\DTcomment{Configuration management}. .3 crypto\DTcomment{Cryptographic methods}. - .4 exchanges\DTcomment{Cryptographic exchange FSMs}. .4 sharedkey\DTcomment{Shared key MACs}. .3 mocks\DTcomment{Mocks to enable testing}. .3 proxy\DTcomment{The central proxy controller}.
@@ -185,122 +300,6 @@ A directory tree of the repository is provided in figure \ref{fig:repository-str \label{fig:repository-structure} \end{figure} -% -------------------------------------------------------------------------- % -% -------------------------- Packet Transport ------------------------------ % -% -------------------------------------------------------------------------- % -\section{Packet Transport} -\label{section:implementation-producer-consumer} - -As shown in figure \ref{fig:dataflow-overview} and described in section \ref{section:implementation-software-structure}, the interfaces through which transport for packets is provided between the two hosts are producers and consumers. A transport pair is then created between a consumer on one host and a producer on the other, where packets enter the consumer and exit the corresponding producer. Two methods for producers and consumers are implemented: TCP and UDP. As the greedy load balancing of this proxy relies on congestion control, TCP provided a base for a proof of concept, while UDP expands on this proof of concept to produce a usable solution. This section discusses, in section \ref{section:implementation-tcp}, the method of transporting discrete packets across the continuous byte stream of a TCP flow. Then, in section \ref{section:implementation-udp}, it goes on to discuss adding congestion control to UDP datagrams, while avoiding retransmissions. - -\subsection{TCP} -\label{section:implementation-tcp} - -The base implementation for producers and consumers takes advantage of TCP. The requirements for the load balancing given above to function are simple: flow control and congestion control. TCP provides both of these, so was an obvious initial solution. However, TCP also provides unnecessary overhead, which will go on to be discussed further. - -TCP is a stream-oriented connection, while the packets to be sent are discrete datagrams. 
That is, a TCP flow cannot be connected directly to a TUN adapter, as the TUN adapter expects discrete and formatted IP packets while the TCP connection sends a stream of bytes. To resolve this, each packet sent across a TCP flow is prefixed with the length of the packet. On the sending side, this involves writing the 32-bit length of the packet, followed by the packet itself. For the receiver, first 4 bytes are read to recover the length of the next packet, after which that many bytes are read. This successfully punctuates the stream-oriented connection into a packet-carrying connection. - -However, using TCP to tunnel TCP packets (known as TCP-over-TCP) can cause a degradation in performance in non-ideal circumstances \citep{honda_understanding_2005}. Further, using TCP to tunnel IP packets provides a superset of the required guarantees, in that reliable delivery and ordering are guaranteed. Reliable delivery can cause a decrease in performance for tunnelled flows which do not require reliable delivery, such as a live video stream - a live stream does not wish to wait for a packet to be redelivered from a portion that is already played, and thus will spend longer buffering than if it received the up to date packets instead. Ordering can limit performance when tunnelling multiple streams, as a packet for a phone call could already be received, but instead has to wait in a buffer for a packet for a download to arrive, increasing latency unnecessarily. - -Although the TCP implementation provides an excellent proof of concept and basic implementation, work moved to a second UDP implementation, aiming to solve some of these problems. However, the TCP implementation is functionally correct, so is left as an option, furthering the idea of flexibility maintained throughout this project. 
In cases where a connection that suffers particularly high packet loss is combined with one which is more stable, TCP could be employed on the high loss connection to limit overall packet loss. The effectiveness of such a solution would be implementation specific, so is left for the architect to decide. - -% --------------------------------- UDP ------------------------------------ % -\subsection{UDP} -\label{section:implementation-udp} - -To resolve the issues seen with TCP, an implementation using UDP was built as an alternative. UDP differs from TCP in that it provides almost no guarantees, and is based on sending discrete datagrams as opposed to a stream of bytes. However, UDP datagrams don't provide the congestion control or flow control required, so this must be built on top of the protocol. As the flow itself can be managed in userspace, opposed to the TCP flow which is managed in kernel space, more flexibility is available in implementation. This allows received packets to be immediately dispatched, with little regard for ordering. - -\subsection{Congestion Control} - -Congestion control is most commonly applied in the context of reliable delivery. This provides a significant benefit to TCP congestion control protocols: cumulative acknowledgements. As all of the bytes should always arrive eventually, unless the connection has faulted, the acknowledgement number (ACK) can simply be set to the highest received byte. Therefore, some adaptations are necessary for TCP congestion control algorithms to apply in an unreliable context. Firstly, for a packet based connection, ACKing specific bytes makes little sense - a packet is atomic, and is lost as a whole unit. To account for this, sequence numbers and their respective acknowledgements will be for entire packets, as opposed to per byte. Secondly, for an unreliable protocol, cumulative acknowledgements are not as simple. 
As packets are now allowed to never arrive within the correct function of the flow, a situation where a packet is never received would cause deadlock with an ACK that is simply set to the highest received sequence number, demonstrated in figure \ref{fig:sequence-ack-discontinuous}. Neither side can progress once the window is full, as the sender will not receive an ACK to free up space within the window, and the receiver will not receive the missing packet to increase the ACK. - -\begin{figure} - \hfill - \begin{subfigure}[t]{0.3\textwidth} - \centering - \begin{tabular}{|c|c|} - SEQ & ACK \\ - 1 & 0 \\ - 2 & 0 \\ - 3 & 2 \\ - 4 & 2 \\ - 5 & 2 \\ - 6 & 5 \\ - 6 & 6 - \end{tabular} - \caption{ACKs only responding to in order sequence numbers} - \label{fig:sequence-ack-continuous} - \end{subfigure}\hfill - \begin{subfigure}[t]{0.3\textwidth} - \centering - \begin{tabular}{|c|c|} - SEQ & ACK \\ - 1 & 0 \\ - 2 & 0 \\ - 3 & 2 \\ - 5 & 3 \\ - 6 & 3 \\ - 7 & 3 \\ - 7 & 3 - \end{tabular} - \caption{ACKs only responding to a missing sequence number} - \label{fig:sequence-ack-discontinuous} - \end{subfigure}\hfill - \begin{subfigure}[t]{0.35\textwidth} - \centering - \begin{tabular}{|c|c|c|} - SEQ & ACK & NACK \\ - 1 & 0 & 0 \\ - 2 & 0 & 0 \\ - 3 & 2 & 0 \\ - 5 & 2 & 0 \\ - 6 & 2 & 0 \\ - 7 & 6 & 4 \\ - 7 & 7 & 4 - \end{tabular} - \caption{ACKs and NACKs responding to a missing sequence number} - \label{fig:sequence-ack-nack-discontinuous} - \end{subfigure} - \caption{Congestion control responding to correct and missing sequence numbers of packets.} - \label{fig:sequence-ack-nack-comparison} - \hfill -\end{figure} - -I present a solution based on Negative Acknowledgements (NACKs). When the receiver believes that it will never receive a packet, it increases the NACK to the highest missing sequence number, and sets the ACK to one above the NACK. The ACK algorithm is then performed to grow the ACK as high as possible. 
This is simplified to any change in NACK representing at least one lost packet, which can be used by the specific congestion control algorithms to react. Though this usage of the NACK appears to provide a close approximation to ACKs on reliable delivery, the choice of how to use the ACK and NACK fields is delegated to the congestion controller implementation, allowing for different implementations if they better suit the method of congestion control. - -Given the decision to use ACKs and NACKs, the packet structure for UDP datagrams can now be designed. The chosen structure is given in figure \ref{fig:udp-packet-structure}. The congestion control header consists of the sequence number and the ACK and NACK, each 32-bit unsigned integers. - -\begin{figure} - \centering - \begin{bytefield}[bitwidth=0.6em]{32} - \bitheader{0-31} \\ - \begin{rightwordgroup}{UDP\\Header} - \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\ - \bitbox{16}{Length} & \bitbox{16}{Checksum} - \end{rightwordgroup} \\ - \begin{rightwordgroup}{CC\\Header} - \bitbox{32}{Acknowledgement number} \\ - \bitbox{32}{Negative acknowledgement number} \\ - \bitbox{32}{Sequence number} - \end{rightwordgroup} \\ - \wordbox[tlr]{1}{Proxied IP packet} \\ - \skippedwords \\ - \wordbox[blr]{1}{} \\ - \begin{rightwordgroup}{Security\\Footer} - \wordbox[tlr]{1}{Security footer} \\ - \wordbox[blr]{1}{$\cdots$} - \end{rightwordgroup} - \end{bytefield} - \caption{UDP packet structure} - \label{fig:udp-packet-structure} -\end{figure} - -\subsubsection{New Reno} - -The first algorithm to be implemented for UDP Congestion Control is based on TCP New Reno. TCP New Reno is a well understood and powerful congestion control protocol. RTT estimation is performed by applying $RTT_{AVG} = RTT_{AVG}*(1-x) + RTT_{SAMPLE}*x$ for each newly received packet. 
Packet loss is measured in two ways: negative acknowledgements when a receiver receives a later packet than expected and has not received the preceding for $0.5*RTT$, and a sender timeout of $3*RTT$. The sender timeout exists to ensure that even if the only packet containing a NACK is dropped, the sender does not deadlock, though this case should be rare with a busy connection. - -To achieve the same curve as New Reno, there are two phases: exponential growth and congestion avoidance. On flow start, using a technique known as slow start, for every packet that is acknowledged, the window size is increased by one. When a packet loss is detected (using either of the two aforementioned methods), slow start ends, and the window size is halved. Now in congestion avoidance, the window size is increased by one for every full window of packets acknowledged without loss, instead of each individual packet. When a packet loss is detected, the window size is half, and congestion avoidance continues. - % -------------------------------------------------------------------------- % % ------------------------ System Configuration ---------------------------- % % -------------------------------------------------------------------------- %