%*******************************************************************************
|
|
%****************************** Third Chapter **********************************
|
|
%*******************************************************************************
|
|
\chapter{Implementation}
|
|
|
|
% **************************** Define Graphics Path **************************
|
|
\ifpdf
|
|
\graphicspath{{3_Implementation/Figs/Raster/}{3_Implementation/Figs/PDF/}{3_Implementation/Figs/}}
|
|
\else
|
|
\graphicspath{{3_Implementation/Figs/Vector/}{3_Implementation/Figs/}}
|
|
\fi
|
|
|
|
% --------------------------- Introduction --------------------------------- %
|
|
|
|
\begin{sidewaysfigure}
|
|
\includegraphics[width=\textheight]{overview.png}
|
|
\caption{Diagram of the dataflow within the proxy.}
|
|
\label{fig:dataflow-overview}
|
|
\end{sidewaysfigure}
|
|
|
|
Implementation of the proxy is in two parts: software that provides a multipath layer 3 tunnel between two hosts, and the system configuration necessary to proxy as described. An overview of the system is presented in figure \ref{fig:dataflow-overview}.
|
|
|
|
This chapter details the implementation in three sections. The software is described in sections \ref{section:implementation-software-structure} and \ref{section:implementation-producers-consumers}: section \ref{section:implementation-software-structure} explains the software's structure and dataflow, while section \ref{section:implementation-producers-consumers} details the implementation of both the TCP and UDP methods of transporting the tunnelled packets between the hosts. The system configuration is described in section \ref{section:implementation-system-configuration}, along with a discussion of some of the oddities of multipath routing, such that a reader would have enough knowledge to implement the proxy.
|
|
|
|
% ------------------------- Software Structure ----------------------------- %
|
|
\section{Software Structure}
|
|
\label{section:implementation-software-structure}
|
|
|
|
\mynote{TODO}
|
|
|
|
% -------------------------- Packet Transport ------------------------------ %
|
|
\section{Packet Transport}
|
|
\label{section:implementation-producers-consumers}
|
|
|
|
\mynote{TODO}
|
|
|
|
|
|
% ------------------------ System Configuration ---------------------------- %
|
|
\section{System Configuration}
|
|
\label{section:implementation-system-configuration}
|
|
|
|
The software portion of this proxy is entirely symmetric, as can be seen in figure \ref{fig:dataflow-overview}. However, the system configuration differs between the two hosts, giving each side of the proxy a distinct role.
|
|
|
|
\mynote{TODO}
|
|
|
|
% ------------------------------ Summary ----------------------------------- %
|
|
\section{Summary}
|
|
|
|
\mynote{TODO: Move repository overview here.}
|
|
|
|
\mynote{TODO}
|
|
|
|
% ------------------------------- Proxy ------------------------------------ %
|
|
\section{Proxy}
|
|
\label{section:implementation-proxy}
|
|
|
|
The central structure in the operation of the software is the \verb'Proxy' struct. A proxy is defined by its packet source and sink, and provides \verb'AddConsumer' and \verb'AddProducer' methods. The proxy coordinates the dispatch of sourced packets to consumers and the delivery of produced packets to the sink, following the packet data path shown in figure \ref{fig:proxy-start-data-flow}.
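
To make this concrete, a minimal sketch of how such a struct might be laid out is given below. The \verb'Packet', \verb'Source', \verb'Sink' and MAC types are those discussed later in this chapter; the field names themselves are illustrative rather than the exact ones used in the code.

\begin{minted}{go}
// A sketch of the central Proxy struct; field names are illustrative.
type Proxy struct {
    source Source // the packet source, e.g. a TUN adapter
    sink   Sink   // the packet sink, e.g. the same TUN adapter

    sourceQueue chan Packet // sourced packets awaiting dispatch to a consumer
    sinkQueue   chan Packet // produced packets awaiting delivery to the sink

    generator MacGenerator // security methods handed to consumers and producers
    verifier  MacVerifier
}
\end{minted}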
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{tikzpicture}[
|
|
rednode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center},
|
|
bluenode/.style={rectangle, draw=black!60, fill=blue!5, very thick, minimum size=5mm, align=center},
|
|
]
|
|
|
|
% Nodes
|
|
\node[rednode] at (0,1.25) (source) {Packet\\Source};
|
|
\node[bluenode] at (2,1.25) (sourcechan) {Source\\Queue};
|
|
\node[rednode] at (5,2.5) (consumer1) {Consumer\\1};
|
|
\node[rednode] at (5,0) (consumern) {Consumer\\n};
|
|
|
|
\node[rednode] at (8,2.5) (producer1) {Producer\\1};
|
|
\node[rednode] at (8,0) (producern) {Producer\\n};
|
|
\node[bluenode] at (11,1.25) (sinkchan) {Sink\\Queue};
|
|
\node[rednode] at (13,1.25) (sink) {Packet\\Sink};
|
|
|
|
% Ellipses
|
|
\path (consumer1) -- (consumern) node [black, font=\huge, midway, sloped] {$\dots$};
|
|
\path (producer1) -- (producern) node [black, font=\huge, midway, sloped] {$\dots$};
|
|
|
|
% Arrows
|
|
\draw[->] (source.east) -- (sourcechan.west);
|
|
\draw[->] (sourcechan.east) -- (consumer1.west);
|
|
\draw[->] (sourcechan.east) -- (consumern.west);
|
|
|
|
\draw[->] (producer1.east) -- (sinkchan.west);
|
|
\draw[->] (producern.east) -- (sinkchan.west);
|
|
\draw[->] (sinkchan.east) -- (sink.west);
|
|
|
|
\end{tikzpicture}
|
|
\caption{Packet flow within proxy start method.}
|
|
\label{fig:proxy-start-data-flow}
|
|
\end{figure}
|
|
|
|
The proxy takes a fixed source and sink, while accepting consumers and producers that vary over its lifetime. This reflects the nature of producers and consumers: each may be either ephemeral or persistent, depending on the configuration. Consider, for example, a device that accepts inbound TCP connections and makes outbound UDP connections. In such a case, the TCP producers and consumers would be ephemeral, existing only until they are closed by the far side, while the UDP producers and consumers would be persistent, as reconnection is controlled by this proxy. As the configuration is deliberately flexible, both kinds can exist within the same proxy instance.
|
|
|
|
The structure of the proxy is built around the flow graph in figure \ref{fig:proxy-start-data-flow}, which shows the four transfers of data that occur: packet source to source queue, source queue to consumer, producer to sink queue, and sink queue to packet sink. The first and last of these exist once per proxy instance, while the middle two run once for each consumer or producer respectively. Basic examples of the logic applied for each flow are given in figure \ref{fig:proxy-loops}.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{subfigure}[b]{0.45\textwidth}
|
|
\begin{minted}{python}
|
|
while True:
    packet = source.source()
    source_queue.addOrBlock(packet)
|
|
\end{minted}
|
|
\caption{Loop for packet sourcing.}
|
|
\end{subfigure}
|
|
\begin{subfigure}[b]{0.45\textwidth}
|
|
\begin{minted}{python}
|
|
while is_alive(consumer):
    packet = source_queue.popOrBlock()
    consumer.consume(packet)
|
|
\end{minted}
|
|
\caption{Loop for each consumer.}
|
|
\end{subfigure}
|
|
\begin{subfigure}[b]{0.45\textwidth}
|
|
\begin{minted}{python}
|
|
while True:
    packet = sink_queue.popOrBlock()
    sink.sink(packet)
|
|
\end{minted}
|
|
\caption{Loop for packet sinking.}
|
|
\end{subfigure}
|
|
\begin{subfigure}[b]{0.45\textwidth}
|
|
\begin{minted}{python}
|
|
while is_alive(producer):
    packet = producer.produce()
    sink_queue.addOrBlock(packet)
|
|
\end{minted}
|
|
\caption{Loop for each producer.}
|
|
\end{subfigure}
|
|
\caption{Pseudocode for the four flows of data around the central proxy.}
|
|
\label{fig:proxy-loops}
|
|
\end{figure}
|
|
|
|
Although the pseudocode given in figure \ref{fig:proxy-loops} is extremely simple, aside from error handling it is exactly what the Go code implements. Go's scheduler and lightweight goroutines make this an efficient implementation, although, given that the expected number of simultaneously connected consumers and producers is low, heavier OS threads would also be effective. The queues are trivial to implement in Go, as channels provide all of the necessary functionality, but they could equally be built in other languages. The lifetimes of producers and consumers are controlled by the lifetimes of these loops: each is referenced only within its loop, so the garbage collector can collect any producer or consumer whose loop has exited.
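
As an illustration, the consumer loop of figure \ref{fig:proxy-loops} might be wrapped in a goroutine by \verb'AddConsumer' roughly as follows, using the illustrative fields from the earlier struct sketch. This is a sketch of the shape of the code rather than the exact implementation; error handling is reduced to dropping the consumer.

\begin{minted}{go}
// AddConsumer launches the consumer loop in its own goroutine, fed from the
// source queue channel.
func (p *Proxy) AddConsumer(c Consumer) {
    go func() {
        for c.IsAlive() {
            packet := <-p.sourceQueue // blocks until a packet is sourced
            if err := c.Consume(packet, p.generator); err != nil {
                return // the loop exits, so the consumer can be collected
            }
        }
    }()
}
\end{minted}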
|
|
|
|
Finally, there is the aforementioned ability of the central proxy to restart consumers or producers that support it (thus far, those initiated by the proxy in question). This wraps the loops shown in figure \ref{fig:proxy-loops} in an additional layer. Pseudocode for the expanded consumer loop is shown in figure \ref{fig:proxy-loops-restart}, with producers expanded similarly.
|
|
|
|
\begin{figure}
|
|
\begin{minted}{python}
|
|
while True:
    while is_reconnectable(consumer) and not is_alive(consumer):
        reconnect(consumer)

    while is_alive(consumer):
        packet = source_queue.popOrBlock()
        consumer.consume(packet)

    if not is_reconnectable(consumer):
        break
|
|
\end{minted}
|
|
\caption{Pseudocode for a consumer, supporting reconnection.}
|
|
\label{fig:proxy-loops-restart}
|
|
\end{figure}
|
|
|
|
% ------------------------- Builder / Config ------------------------------- %
|
|
\section{Configuration}
|
|
|
|
The configuration format chosen was INI, extended to permit duplicate section names. A configuration contains a single Host section, followed by multiple Peer sections, each specific to a method of communicating with the other side. Processing the configuration file is split into three parts: loading the configuration file into a Go struct, validating the loaded configuration, and building a proxy from it.
|
|
|
|
Validation of the configuration is included to discover configuration errors before an invalid proxy is built. Firstly, it ensures that no part of the program built from the configuration is handed a value that is invalid in context yet easily verifiable, such as a TCP port above 65,535. Secondly, catching errors before attempting to build the proxy confines the consequences of an invalid configuration to a single location. For a user, this means that an error such as \verb'Peer[1].LocalPort invalid: max 65535; given 74523' is shown, as opposed to \verb'tcp: invalid address', making clear that the fault lies in the configuration rather than in the code.
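
A sketch of the kind of check the validation stage performs is given below. The \verb'Peer' struct and function name are illustrative, and only the field relevant to the example error above is shown.

\begin{minted}{go}
import "fmt"

// Peer mirrors a Peer section of the INI file (other fields omitted).
type Peer struct {
    LocalPort int
}

// validatePeer rejects ports outside the valid TCP/UDP range, producing an
// error of the form shown above.
func validatePeer(index int, peer Peer) error {
    if peer.LocalPort < 0 || peer.LocalPort > 65535 {
        return fmt.Errorf("Peer[%d].LocalPort invalid: max 65535; given %d",
            index, peer.LocalPort)
    }
    return nil
}
\end{minted}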
|
|
|
|
Once a configuration is validated, the proxy is built. This is a simple case of creating the proxy from the given data and adding the producers and consumers required for it to run.
|
|
|
|
This builder structure is also useful in a Go project, as it helps avoid the circular imports that Go forbids. An example is a TCP flow implementing \verb'proxy.Consumer': the proxy package cannot import the TCP package to create flows, so this must be delegated to another package. A builder package bridges the gap, while maintaining a close link between the configuration and the single place where it is built.
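
The shape of that arrangement is sketched below. The import paths and the \verb'tcp.Dial' constructor are assumed for illustration only; the point is that only the builder package needs to import both \verb'proxy' and \verb'tcp'.

\begin{minted}{go}
// Package builder depends on both the proxy and tcp packages, so neither has
// to import the other.
package builder

import (
    "code/proxy" // assumed import paths, for illustration only
    "code/tcp"
)

// buildTCPConsumer constructs a TCP flow and registers it with the proxy.
func buildTCPConsumer(p *proxy.Proxy, remoteAddr string) error {
    flow, err := tcp.Dial(remoteAddr) // assumed constructor; the flow implements proxy.Consumer
    if err != nil {
        return err
    }
    p.AddConsumer(flow)
    return nil
}
\end{minted}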
|
|
|
|
% ------------------------- Sources and Sinks ------------------------------ %
|
|
\section{Sourcing and Sinking Packets}
|
|
|
|
The packet source and packet sink are the names given to the means by which packets enter the proxy for tunnelling and the means by which received packets leave it. In the current implementation, both are provided by a TUN adapter: a virtual network interface, offered by the kernels of several Unix-like operating systems, that allows userspace code to interact with layer 3 networking.
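
One possible shape for these abstractions, matching the \verb'source.source()' and \verb'sink.sink(packet)' calls in the pseudocode of figure \ref{fig:proxy-loops}, is sketched below; the exact signatures in the code may differ.

\begin{minted}{go}
// Source blocks until the next packet is available from the local network.
type Source interface {
    Source() (Packet, error)
}

// Sink delivers a packet received from the far side to the local network.
type Sink interface {
    Sink(Packet) error
}
\end{minted}

A TUN adapter satisfies both roles at once: reading from the device sources packets, and writing to it sinks them.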
|
|
|
|
% ---------------------- Providers and Consumers --------------------------- %
|
|
\section{Producers and Consumers}
|
|
\label{section:implementation-prod-cons}
|
|
|
|
Producers and consumers were designed with flexibility in mind. A producer produces packets received from the far side of the proxy; a consumer sends packets to the far side. As mentioned above, each can be either ephemeral or persistent, which is achieved by optionally implementing an additional interface, \verb'Reconnectable'. The three relevant interfaces are given in figure \ref{fig:producer-consumer-interfaces}. A basic consumer or producer must simply provide a method to consume or produce a packet, given the relevant security method, and a method to decide whether it is still alive. This interface can be applied to both TCP and UDP based connections, with others also feasible.
|
|
|
|
\begin{figure}
|
|
\begin{minted}{go}
|
|
type Producer interface {
    IsAlive() bool
    Produce(MacVerifier) (Packet, error)
}

type Consumer interface {
    IsAlive() bool
    Consume(Packet, MacGenerator) error
}

type Reconnectable interface {
    Reconnect() error
}
|
|
\end{minted}
|
|
\caption{Interfaces for producers and consumers.}
|
|
\label{fig:producer-consumer-interfaces}
|
|
\end{figure}
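
For instance, a TCP-backed consumer (the transport detailed in section \ref{section:implementation-tcp}) might satisfy both \verb'Consumer' and \verb'Reconnectable' along the following lines. This is an illustrative skeleton only: it assumes \verb'Packet' is a byte slice and that \verb'MacGenerator' exposes a \verb'Generate' method, and it omits the length-prefix framing described in the next section.

\begin{minted}{go}
import "net"

type tcpConsumer struct {
    remote string   // address of the far side
    conn   net.Conn // nil once the flow has died
}

func (c *tcpConsumer) IsAlive() bool { return c.conn != nil }

// Consume appends the MAC and writes the packet to the TCP flow.
func (c *tcpConsumer) Consume(p Packet, g MacGenerator) error {
    data := append([]byte(p), g.Generate(p)...) // assumed MacGenerator method
    if _, err := c.conn.Write(data); err != nil {
        c.conn = nil // mark the flow dead so IsAlive reports false
        return err
    }
    return nil
}

// Reconnect re-dials the far side, making this consumer persistent.
func (c *tcpConsumer) Reconnect() error {
    conn, err := net.Dial("tcp", c.remote)
    if err != nil {
        return err
    }
    c.conn = conn
    return nil
}
\end{minted}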
|
|
|
|
% --------------------------------- TCP ------------------------------------ %
|
|
\section{TCP}
|
|
\label{section:implementation-tcp}
|
|
|
|
The base implementation of producers and consumers takes advantage of TCP. The requirements for the load balancing described above are simple: flow control and congestion control. TCP provides both, so it was an obvious initial solution. However, TCP also introduces unnecessary overhead, which is discussed further below.
|
|
|
|
TCP is a stream-oriented protocol, while the packets to be sent are discrete datagrams. That is, a TCP flow cannot be connected directly to a TUN adapter, as the TUN adapter expects discrete, well-formed IP packets while the TCP connection carries an undelimited stream of bytes. To resolve this, each packet sent across a TCP flow is prefixed with its length. On the sending side, this involves writing the 32-bit length of the packet, followed by the packet itself; on the receiving side, 4 bytes are first read to recover the length of the next packet, after which that many bytes are read. This punctuates the stream-oriented connection into a packet-based one.
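
A sketch of this framing is given below, using Go's \verb'encoding/binary' package. Big-endian byte order is assumed here; any order works provided both sides agree.

\begin{minted}{go}
import (
    "encoding/binary"
    "io"
)

// writePacket prefixes each packet with its 32-bit length before writing it
// to the stream.
func writePacket(w io.Writer, packet []byte) error {
    var length [4]byte
    binary.BigEndian.PutUint32(length[:], uint32(len(packet)))
    if _, err := w.Write(length[:]); err != nil {
        return err
    }
    _, err := w.Write(packet)
    return err
}

// readPacket reads the 4-byte length, then exactly that many bytes.
func readPacket(r io.Reader) ([]byte, error) {
    var length [4]byte
    if _, err := io.ReadFull(r, length[:]); err != nil {
        return nil, err
    }
    packet := make([]byte, binary.BigEndian.Uint32(length[:]))
    _, err := io.ReadFull(r, packet)
    return packet, err
}
\end{minted}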
|
|
|
|
However, using TCP to tunnel TCP packets (known as TCP-over-TCP) can degrade performance in non-ideal circumstances \citep{honda_understanding_2005}. Further, using TCP to tunnel IP packets provides a superset of the required guarantees, in that both reliable delivery and ordering are guaranteed. Reliable delivery can reduce performance for tunnelled flows which do not require it, such as a live video stream: the stream gains nothing from waiting for the retransmission of a packet belonging to a portion that has already played, and so spends longer buffering than if it had simply received the most recent packets instead. Ordering can limit performance when tunnelling multiple streams, as a packet for a phone call that has already arrived may have to wait in a buffer for an earlier packet belonging to a download, increasing latency unnecessarily.
|
|
|
|
Although the TCP implementation provides an excellent proof of concept and basic implementation, work moved to a second UDP implementation, aiming to solve some of these problems. However, the TCP implementation is functionally correct, so is left as an option, furthering the idea of flexibility maintained throughout this project. In cases where a connection that suffers particularly high packet loss is combined with one which is more stable, TCP could be employed on the high loss connection to limit overall packet loss. The effectiveness of such a solution would be implementation specific, so is left for the architect to decide.
|
|
|
|
% --------------------------------- UDP ------------------------------------ %
|
|
\section{UDP}
|
|
\label{section:implementation-udp}
|
|
|
|
To resolve the issues seen with TCP, an alternative implementation using UDP was built. UDP differs from TCP in that it provides almost no guarantees and is based on sending discrete datagrams rather than a stream of bytes. However, UDP does not provide the congestion control or flow control required, so these must be built on top of the protocol. As the flow is managed in userspace, as opposed to a TCP flow which is managed in kernel space, more flexibility is available in the implementation. This allows received packets to be dispatched immediately, with little regard for ordering.
|
|
|
|
\subsection{Congestion Control}
|
|
|
|
Congestion control is most commonly applied in the context of reliable delivery, which gives TCP congestion control protocols a significant benefit: cumulative acknowledgements. As all of the bytes should eventually arrive, unless the connection has faulted, the acknowledgement number (ACK) can simply be set to the highest contiguously received byte. Some adaptations are therefore necessary for TCP congestion control algorithms to apply in an unreliable context. Firstly, for a packet based connection, acknowledging specific bytes makes little sense: a packet is atomic, and is lost as a whole unit. To account for this, sequence numbers and their respective acknowledgements are applied to entire packets rather than bytes. Secondly, for an unreliable protocol, cumulative acknowledgements are not as simple. As packets may now never arrive even when the flow is functioning correctly, an ACK that is simply set to the highest received sequence number would deadlock whenever a packet is lost, as demonstrated in figure \ref{fig:sequence-ack-discontinuous}. Neither side can progress once the window is full: the sender will not receive an ACK to free space within the window, and the receiver will not receive the missing packet to increase the ACK.
|
|
|
|
\begin{figure}
|
|
\hfill
|
|
\begin{subfigure}[t]{0.3\textwidth}
|
|
\centering
|
|
\begin{tabular}{|c|c|}
|
|
SEQ & ACK \\
|
|
1 & 0 \\
|
|
2 & 0 \\
|
|
3 & 2 \\
|
|
4 & 2 \\
|
|
5 & 2 \\
|
|
6 & 5 \\
|
|
6 & 6
|
|
\end{tabular}
|
|
\caption{ACKs responding only to in-order sequence numbers}
|
|
\label{fig:sequence-ack-continuous}
|
|
\end{subfigure}\hfill
|
|
\begin{subfigure}[t]{0.3\textwidth}
|
|
\centering
|
|
\begin{tabular}{|c|c|}
|
|
SEQ & ACK \\
|
|
1 & 0 \\
|
|
2 & 0 \\
|
|
3 & 2 \\
|
|
5 & 3 \\
|
|
6 & 3 \\
|
|
7 & 3 \\
|
|
7 & 3
|
|
\end{tabular}
|
|
\caption{ACKs only responding to a missing sequence number}
|
|
\label{fig:sequence-ack-discontinuous}
|
|
\end{subfigure}\hfill
|
|
\begin{subfigure}[t]{0.35\textwidth}
|
|
\centering
|
|
\begin{tabular}{|c|c|c|}
|
|
SEQ & ACK & NACK \\
|
|
1 & 0 & 0 \\
|
|
2 & 0 & 0 \\
|
|
3 & 2 & 0 \\
|
|
5 & 2 & 0 \\
|
|
6 & 2 & 0 \\
|
|
7 & 6 & 4 \\
|
|
7 & 7 & 4
|
|
\end{tabular}
|
|
\caption{ACKs and NACKs responding to a missing sequence number}
|
|
\label{fig:sequence-ack-nack-discontinuous}
|
|
\end{subfigure}
|
|
\caption{Congestion control responding to correct and missing sequence numbers of packets.}
|
|
\label{fig:sequence-ack-nack-comparison}
|
|
\hfill
|
|
\end{figure}
|
|
|
|
I present a solution based on negative acknowledgements (NACKs). When the receiver decides that it will never receive a packet, it increases the NACK to the highest missing sequence number, and sets the ACK to one above the NACK. The usual ACK algorithm is then performed to grow the ACK as far as possible. This simplifies to any change in the NACK representing at least one lost packet, which the specific congestion control algorithm can use to react. Though this usage of the NACK appears to provide a close approximation to ACKs over reliable delivery, the choice of how to use the ACK and NACK fields is delegated to the congestion controller implementation, allowing for different interpretations if they better suit a particular method of congestion control.
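
A sketch of the receiver-side bookkeeping this implies is given below, assuming for simplicity that received sequence numbers are tracked in a set; a real implementation must bound this state, but the update rules follow the description above.

\begin{minted}{go}
type receiverState struct {
    received map[uint32]bool // sequence numbers seen so far
    ack      uint32          // every packet up to and including ack has arrived
    nack     uint32          // highest sequence number given up on
}

func newReceiverState() *receiverState {
    return &receiverState{received: make(map[uint32]bool)}
}

// onReceive records a packet and grows the ACK across contiguous packets.
func (r *receiverState) onReceive(seq uint32) {
    r.received[seq] = true
    for r.received[r.ack+1] {
        r.ack++
    }
}

// onGiveUp is called when the receiver decides seq will never arrive: the
// NACK moves to the highest missing sequence number and the ACK jumps to one
// above it, before being grown as far as possible again.
func (r *receiverState) onGiveUp(seq uint32) {
    r.nack = seq
    r.ack = seq + 1
    for r.received[r.ack+1] {
        r.ack++
    }
}
\end{minted}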
|
|
|
|
Given the decision to use ACKs and NACKs, the packet structure for UDP datagrams can now be designed. The chosen structure is given in figure \ref{fig:udp-packet-structure}. The congestion control header consists of the acknowledgement number, the negative acknowledgement number and the sequence number, each a 32-bit unsigned integer.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{bytefield}[bitwidth=0.6em]{32}
|
|
\bitheader{0-31} \\
|
|
\begin{rightwordgroup}{UDP\\Header}
|
|
\bitbox{16}{Source port} & \bitbox{16}{Destination port} \\
|
|
\bitbox{16}{Length} & \bitbox{16}{Checksum}
|
|
\end{rightwordgroup} \\
|
|
\begin{rightwordgroup}{CC\\Header}
|
|
\bitbox{32}{Acknowledgement number} \\
|
|
\bitbox{32}{Negative acknowledgement number} \\
|
|
\bitbox{32}{Sequence number}
|
|
\end{rightwordgroup} \\
|
|
\wordbox[tlr]{1}{Proxied IP packet} \\
|
|
\skippedwords \\
|
|
\wordbox[blr]{1}{} \\
|
|
\begin{rightwordgroup}{Security\\Footer}
|
|
\wordbox[tlr]{1}{Security footer} \\
|
|
\wordbox[blr]{1}{$\cdots$}
|
|
\end{rightwordgroup}
|
|
\end{bytefield}
|
|
\caption{UDP packet structure}
|
|
\label{fig:udp-packet-structure}
|
|
\end{figure}
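
Marshalling this header is straightforward; a sketch is given below. Network (big-endian) byte order is assumed here, and the field order follows figure \ref{fig:udp-packet-structure}.

\begin{minted}{go}
import "encoding/binary"

// ccHeader mirrors the congestion control header shown above.
type ccHeader struct {
    ack  uint32
    nack uint32
    seq  uint32
}

// marshal prepends the 12-byte header to the proxied packet.
func (h ccHeader) marshal(packet []byte) []byte {
    out := make([]byte, 12+len(packet))
    binary.BigEndian.PutUint32(out[0:4], h.ack)
    binary.BigEndian.PutUint32(out[4:8], h.nack)
    binary.BigEndian.PutUint32(out[8:12], h.seq)
    copy(out[12:], packet)
    return out
}
\end{minted}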
|
|
|
|
\subsubsection{New Reno}
|
|
|
|
The first algorithm implemented for UDP congestion control is based on TCP New Reno, a well understood and powerful congestion control protocol. RTT estimation is performed by applying $RTT_{avg} \leftarrow (1-x) \cdot RTT_{avg} + x \cdot RTT_{sample}$ for each newly received packet. Packet loss is detected in two ways: a negative acknowledgement, sent when the receiver receives a later packet than expected and the preceding packet has not arrived within $0.5 \times RTT$, and a sender timeout of $3 \times RTT$. The sender timeout exists to ensure that, even if the only packet carrying a NACK is dropped, the sender does not deadlock, though this case should be rare on a busy connection.
|
|
|
|
To achieve the same curve as New Reno, the algorithm has two phases: exponential growth and congestion avoidance. On flow start, using the technique known as slow start, the window size is increased by one for every packet acknowledged. When packet loss is detected (by either of the two aforementioned methods), slow start ends and the window size is halved. Now in congestion avoidance, the window size is increased by one for every full window of packets acknowledged without loss, instead of for each individual packet. When further packet loss is detected, the window size is again halved, and congestion avoidance continues.
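
The window arithmetic this describes is small enough to sketch directly. The code below shows only the two phases and the halving on loss; RTT estimation, timers and the interaction with the window of in-flight packets are omitted, and the real controller holds more state than this.

\begin{minted}{go}
// newReno is a sketch of the window update rules described above.
type newReno struct {
    window    float64 // congestion window, in packets
    slowStart bool
    acked     int // packets acknowledged in the current window (congestion avoidance)
}

func (n *newReno) onAck() {
    if n.slowStart {
        n.window++ // exponential growth: one extra packet per ACK
        return
    }
    n.acked++
    if float64(n.acked) >= n.window {
        n.window++ // additive increase: one extra packet per full window
        n.acked = 0
    }
}

func (n *newReno) onLoss() {
    n.slowStart = false // a loss always ends slow start
    n.window /= 2
    if n.window < 1 {
        n.window = 1 // assumed floor of one packet in flight
    }
    n.acked = 0
}
\end{minted}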
|
|
|
|
% ------------------------------- Routing ---------------------------------- %
|
|
\section{System Configuration}
|
|
\label{section:implementation-systems}
|
|
|
|
\mynote{Write this.}
|
|
|
|
\subsection{Routing}
|
|
\label{section:implementation-systems-routing}
|
|
|
|
\mynote{Left for now. I have a bit of an idea about why this might/might not be necessary, and want to discuss it with slightly more context (it differs massively between use cases).}
|
|
|
|
\subsection{Buffers}
|
|
|
|
\mynote{Left for now. I have some specific debugging I want to do on the real connection to see if I can come up with some better reasons for this. Currently it's along the lines of "setting it to 0 solves one problem but might make another worse".}
|
|
|
|
% ----------------------------- Security ----------------------------------- %
|
|
\section{Security}
|
|
\label{section:implementation-security}
|
|
|
|
The security implementation (beyond layered security, see section \ref{section:layered-security}) consists of three parts: the \verb'Exchange' interface for initial exchanges across a flow, the \verb'RepeatBlocker' interface for rejecting repeated packets, and the \verb'MacGenerator' and \verb'MacVerifier' interfaces for generating and verifying message authentication codes.
|
|
|
|
\subsection{Exchanges}
|
|
|
|
Cryptographic exchanges are represented by the interface given in figure \ref{fig:crypto-exchanges-interface}. The interface was chosen to be as flexible as possible while remaining simple. The \verb'Next' method returns two byte slices: the first contains the message that should be sent back to the far side, if there is one, and the second contains any data carried by the incoming packet. This allows for exchanges such as the TCP three-way handshake, in which the final message contains data as well as setup. The first exchange implemented is the \verb'None' exchange, which performs no exchange at all, and is shown in figure \ref{fig:crypto-exchanges-none}.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{minted}{go}
|
|
type Exchange interface {
    IsDone() bool
    First() (out []byte, err error)
    Next(in []byte) (out []byte, data []byte, err error)
}
|
|
\end{minted}
|
|
\caption{Cryptographic exchanges interface.}
|
|
\label{fig:crypto-exchanges-interface}
|
|
\end{figure}
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{minted}{go}
|
|
type None struct{}
|
|
|
|
func (None) IsDone() bool { return true }
|
|
func (None) First() ([]byte, error) { return nil, nil }
|
|
func (None) Next(b []byte) ([]byte, []byte, error) { return nil, nil, nil }
|
|
\end{minted}
|
|
\caption{Empty cryptographic exchange implementation.}
|
|
\label{fig:crypto-exchanges-none}
|
|
\end{figure}
|
|
|
|
The exchanges are designed to be applicable to any packet transport method. The flow of using an exchange is shown as pseudocode in figure \ref{fig:crypto-exchanges-pseudocode}. The transport manages all packets exchanged over the connection until the exchange reports that it is done, or fails. The exchange can also produce data while it is in progress, which is handed off to the normal packet handling mechanism, represented in the pseudocode by the \verb'yield' keyword.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{minted}{python}
|
|
if not exchange.IsDone():
    if this_side_goes_first:
        send(try(exchange.First()))

    while not exchange.IsDone():
        packet = receive()
        reply, data = try(exchange.Next(packet))
        if reply != nil:
            send(reply)
        if data != nil:
            yield data
|
|
\end{minted}
|
|
\caption{Pseudocode for implementing the cryptographic exchange interface for a transport.}
|
|
\label{fig:crypto-exchanges-pseudocode}
|
|
\end{figure}
|
|
|
|
\mynote{Reference implemented exchange from preparation once finalised.}
|
|
|
|
\subsection{Repeated Packets}
|
|
\label{section:implementation-repeated-packets}
|
|
|
|
As discussed in section \ref{section:preparation-repeated-packets}, the algorithm used to prevent repeated packets is the \emph{IPsec Anti-Replay Algorithm without Bit Shifting} \citep{tsou_ipsec_2012}. In the referenced work, a \verb'C' implementation is provided, which could be easily adapted to Go.
|
|
|
|
Although the repeated packet protection is primarily targeted at repeated IP packets, it can also be used elsewhere, as mentioned for exchanges. As such, the repeat protection is made safe for concurrent use, so that it can be applied anywhere in the program it is deemed useful. Given that initial flow exchanges occur exceedingly rarely compared to packet flow, this does not significantly affect the effective size of the repeated packet window.
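
A sketch of the algorithm, adapted from the C given in the referenced work and guarded by a mutex for the concurrent use described above, is shown below; the block and window sizes are illustrative.

\begin{minted}{go}
import "sync"

const (
    blockBits  = 32
    numBlocks  = 8                           // illustrative; gives a window of 224 packets
    windowSize = blockBits * (numBlocks - 1) // one block is always partially in use
)

// replayWindow implements the anti-replay check without bit shifting: the
// window is a ring of fixed bitmap blocks, and advancing it only ever clears
// whole blocks.
type replayWindow struct {
    mu      sync.Mutex
    blocks  [numBlocks]uint32
    highest uint32 // highest sequence number accepted so far
}

// Check returns true if seq has not been seen before, recording it if so.
func (w *replayWindow) Check(seq uint32) bool {
    w.mu.Lock()
    defer w.mu.Unlock()

    if seq > w.highest {
        // Advance the window, clearing every block between the old and new top.
        cur, top := w.highest/blockBits, seq/blockBits
        diff := top - cur
        if diff > numBlocks {
            diff = numBlocks
        }
        for i := uint32(1); i <= diff; i++ {
            w.blocks[(cur+i)%numBlocks] = 0
        }
        w.highest = seq
    } else if w.highest-seq >= windowSize {
        return false // too old to track: reject
    }

    block := (seq / blockBits) % numBlocks
    bit := uint32(1) << (seq % blockBits)
    if w.blocks[block]&bit != 0 {
        return false // already seen: a repeat
    }
    w.blocks[block] |= bit
    return true
}
\end{minted}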
|
|
|
|
\mynote{Discuss the specific implementation of this algorithm once finalised.}
|
|
|
|
\subsection{Message Authentication Codes}
|
|
|
|
The message authentication interfaces, shown in figure \ref{fig:message-authenticator-interface}, are designed to support any MAC generator, including digital signatures. The MAC algorithm implemented at this point is BLAKE2s \citep{hutchison_blake2_2013}, a fast and secure keyed hash that supports prefix-MAC generation. BLAKE2s is available in OpenSSL\footnote{\url{https://openssl.org}} and the Go \verb'crypto'\footnote{\url{https://github.com/golang/crypto}} library, making it accessible, and its low performance overhead makes it a good fit while still providing the necessary guarantees for this software. Any other MAC that can be generated in a similar fashion could also be used.
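
As an illustration, a shared-key generator and verifier over BLAKE2s might look as follows, using the keyed mode of \verb'golang.org/x/crypto/blake2s'. The \verb'Generate' and \verb'Verify' signatures are assumptions for this sketch rather than copies of the interfaces in figure \ref{fig:message-authenticator-interface}.

\begin{minted}{go}
import (
    "crypto/subtle"

    "golang.org/x/crypto/blake2s"
)

// blake2sMac is a sketch of a shared-key MAC.
type blake2sMac struct {
    key []byte // shared key, at most 32 bytes for BLAKE2s
}

func (m blake2sMac) Generate(data []byte) ([]byte, error) {
    h, err := blake2s.New256(m.key) // keyed BLAKE2s acts as a prefix-MAC
    if err != nil {
        return nil, err
    }
    h.Write(data)
    return h.Sum(nil), nil
}

func (m blake2sMac) Verify(data, mac []byte) bool {
    expected, err := m.Generate(data)
    if err != nil {
        return false
    }
    return subtle.ConstantTimeCompare(expected, mac) == 1
}
\end{minted}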
|
|
|
|
\begin{figure}
|
|
\inputminted{go}{3_Implementation/Samples/mac.go}
|
|
\caption{Message authenticator interface}
|
|
\label{fig:message-authenticator-interface}
|
|
\end{figure}
|
|
|
|
\subsection{Hierarchy}
|
|
|
|
The security features presented in this section form a hierarchy within the data flow of a packet. Figure \ref{fig:udp-packet-dataflow} gives an example of the growth of a packet with a specific configuration of UDP, replay protection, and a message authentication code. The process of sending a packet via a consumer reads right to left, and the reverse process of receiving a packet from a producer reads left to right. For a packet to consume, the first element added is the data sequence number, which sits closest to the proxy as it is global rather than unique to a flow. This is followed by the congestion control header, added before the MAC such that it receives integrity and authenticity protection, preventing tampering. Finally, the UDP header is prepended before the packet is dispatched. The same process is applied in reverse in the other direction. This figure represents the flow of data through the software and, importantly, where each portion of the security must be implemented to achieve its goals.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{tikzpicture}[
|
|
onenode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center},
|
|
twonode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center, rectangle split, rectangle split parts=2},
|
|
threenode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center, rectangle split, rectangle split parts=3},
|
|
fournode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center, rectangle split, rectangle split parts=4},
|
|
fivenode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm, align=center, rectangle split, rectangle split parts=5},
|
|
bluenode/.style={rectangle, draw=black!60, fill=blue!5, very thick, minimum size=5mm, align=center},
|
|
]
|
|
% Nodes
|
|
\node[fivenode] at (0,0) (udp) {\nodepart{one} UDP Header \nodepart{two} Congestion\\Control\\Header \nodepart{three} Packet\\Data \nodepart{four} Data\\Sequence\\Number \nodepart{five} MAC};
|
|
|
|
\node[fournode] at (3,0) (mac) {\nodepart{one} Congestion\\Control\\Header \nodepart{two} Packet\\Data \nodepart{three} Data\\Sequence\\Number \nodepart{four} MAC};
|
|
|
|
\node[threenode] at (6,0) (cc) {\nodepart{one} Congestion\\Control\\Header \nodepart{two} Packet\\Data \nodepart{three} Data\\Sequence\\Number};
|
|
|
|
\node[twonode] at (9,0) (sequence) {\nodepart{one} Packet\\Data \nodepart{two} Data\\Sequence\\Number};
|
|
|
|
\node[onenode] at (12,0) (data) {Packet\\Data};
|
|
|
|
% Edges
|
|
\draw[<->] (udp.east) -- (mac.west);
|
|
\draw[<->] (mac.east) -- (cc.west);
|
|
\draw[<->] (cc.east) -- (sequence.west);
|
|
\draw[<->] (sequence.east) -- (data.west);
|
|
\end{tikzpicture}
|
|
\caption{Data flow of a UDP packet through the application.}
|
|
\label{fig:udp-packet-dataflow}
|
|
\end{figure}
|
|
|
|
% ----------------------------- Testing ------------------------------------ %
|
|
\section{Testing and Evaluation}
|
|
|
|
The project has a particular focus on automatic evaluation. Although some testing can be performed with mocked connections, evaluation of the success criteria revolves around virtual hardware. The benefit of virtual hardware is the ability to spin up and tear down entire testing environments on demand. As such, automatic evaluation software can be built to create the required environments and gather the necessary measurements, allowing each code change to be verified by confirming that the resulting graphs show the required trends.
|
|
|
|
The evaluation application is split into two parts: data gathering and data processing. There are two reasons for this. Firstly, data gathering takes significantly longer than data processing, so being able to gather data once and then process it multiple times benefits rapid development. Secondly, the choice of language differs between the two stages, as discussed in section \ref{section:preparation-language-choices-evaluation}. To complete the full process of automatic evaluation, the data is first updated by running the data gathering application, after which the data processing is run. As the only item that needs to cross the language barrier is test output data, this is achieved through the file system; any loss of speed from using files instead of memory is irrelevant for testing purposes, so it is an acceptable trade-off.
|
|
|
|
The data gathering is discussed in this section, while the data processing is discussed in the evaluation chapter, in section \ref{INVALID}. The data gathering grew to be a significant part of this project, and was built alongside the implementation to collect the requisite data. In fact, it developed from a custom script created to ease testing of this project into a larger Java library that can be used for a much wider variety of application testing.
|
|
|
|
\subsection{Data Gathering}
|
|
|
|
The system for automated data gathering is built in Java. It involves three layers: the Java application itself, the \verb'virtualtests'\footnote{\url{https://github.com/JakeHillion/virtual-tests}} library, and the \verb'proxmox'\footnote{\url{https://github.com/JakeHillion/proxmox-java}} library on which \verb'virtualtests' depends. Each of these elements was developed by me within the constraints of this project.
|
|
|
|
The data gathering system was designed to make adding data points as simple as possible. As such, the program's main method follows the pseudocode in figure \ref{fig:data-gathering-pseudocode}, which succinctly demonstrates that each testing environment exists only for as long as tests are running within it. The majority of tests are standard tests, with the network structure shown in figure \ref{fig:data-gathering-standard-network-structure}: the local portal is connected to the remote portal via $n$ connections. Each test then specifies the initial rate limit for each connection, changes to the rate of each connection over time, and changes to the liveness of each connection. Once this structure is created, tests can be run between the speed test server and the proxied client.
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{minted}{python}
|
|
with environment_1.build() as env:
    test(env, test1a)
    test(env, test1b)

with environment_2.build() as env:
    test(env, test2a)
    test(env, test2b)
|
|
\end{minted}
|
|
\caption{Pseudocode for data gathering main method.}
|
|
\label{fig:data-gathering-pseudocode}
|
|
\end{figure}
|
|
|
|
\begin{figure}
|
|
\centering
|
|
\begin{tikzpicture}[
|
|
squarednode/.style={rectangle, draw=black!60, fill=red!5, very thick, minimum size=5mm},
|
|
]
|
|
|
|
% Nodes
|
|
\node[squarednode] at (0,0) (speedtest) {Speed Test Server};
|
|
\node[squarednode] at (4,0) (remoteportal) {Remote Portal};
|
|
\node[squarednode] at (8,0) (localportal) {Local Portal};
|
|
\node[squarednode] at (11,0) (client) {Client};
|
|
|
|
% Edges
|
|
\draw[->] ([yshift=6mm]speedtest.north) -- (speedtest.north);
|
|
\draw[->] ([yshift=6mm]remoteportal.north) -- (remoteportal.north);
|
|
\draw[->] ([xshift=-7mm,yshift=6mm]localportal.north) -- ([xshift=-7mm]localportal.north);
|
|
\draw[->] ([yshift=6mm]localportal.north) -- (localportal.north);
|
|
\draw[->] ([xshift=7mm,yshift=6mm]localportal.north) -- ([xshift=7mm]localportal.north);
|
|
\draw[->] ([yshift=6mm]client.north) -- (client.north);
|
|
|
|
\draw[-] ([yshift=6mm]speedtest.north) -- ([yshift=6mm]localportal.north);
|
|
\draw[-] ([xshift=7mm,yshift=6mm]localportal.north) -- ([yshift=6mm]client.north);
|
|
|
|
% Edge Label
|
|
\node at ([xshift=-3.5mm,yshift=9mm]localportal.north) {0 .. N};
|
|
\end{tikzpicture}
|
|
|
|
\caption{The network structure of standard tests}
|
|
\label{fig:data-gathering-standard-network-structure}
|
|
\end{figure}
|
|
|
|
The tests proposed here are expensive to compute: they involve creating virtual machines, installing the required software, and, in most cases, running speed tests that take multiple seconds. As such, measures are taken to keep the number of repeats of each test small while maintaining sufficiently small uncertainty to demonstrate effects conclusively. To achieve this, each test is run a minimum of 5 times, then repeated until the coefficient of variation ($CV = \sigma/\mu$) falls below a provided threshold. Further, the tests support a draft mode, which runs each test fewer times in order to produce results more quickly, albeit with greater uncertainty.
|
|
|
|
After performing the required analysis to confirm the level of uncertainty of each result, the result can be saved directly to a file, which for the majority of tests is a JSON file. To ensure correct communication between the Java and Python programs, a matching structure exists on both sides that is compiled to a directory name; within this directory, each repeat of the test data is stored under its numeric index.
|
|
|
|
% ------------------------ Repository Overview ----------------------------- %
|
|
\section{Repository Overview}
|
|
|
|
A directory tree of the repository is provided in figure \ref{fig:repository-structure}. The top level is split between \verb'code' and \verb'evaluation', where \verb'code' is compiled into the application binary, and \verb'evaluation' is used to verify the performance characteristics and generate graphs.
|
|
|
|
\begin{figure}
|
|
\dirtree{%
|
|
.1 /.
|
|
.2 code\DTcomment{Go code for the project}.
|
|
.3 config\DTcomment{Configuration management}.
|
|
.3 crypto\DTcomment{Cryptographic methods}.
|
|
.4 exchanges\DTcomment{Cryptographic exchange FSMs}.
|
|
.4 sharedkey\DTcomment{Shared key MACs}.
|
|
.3 mocks\DTcomment{Mocks to enable testing}.
|
|
.3 proxy\DTcomment{The central proxy controller}.
|
|
.3 shared\DTcomment{Shared errors}.
|
|
.3 tcp\DTcomment{TCP flow transport}.
|
|
.3 tun\DTcomment{TUN adapter}.
|
|
.3 udp\DTcomment{UDP datagram transport}.
|
|
.4 congestion\DTcomment{Congestion control methods}.
|
|
.3 utils\DTcomment{Common data structures}.
|
|
.2 evaluation\DTcomment{Result gathering and graph generation}.
|
|
.3 java\DTcomment{Java automated result gathering}.
|
|
.3 python\DTcomment{Python graph generation}.
|
|
}
|
|
\caption{Repository folder structure.}
|
|
\label{fig:repository-structure}
|
|
\end{figure}
|
|
|
|
% ------------------------------ Summary ----------------------------------- %
|
|
\section{Summary}
|
|
|
|
The program is structured for future growth and to give network architects the flexibility to deploy it as they see fit. This chapter has shown how that is achieved through flexible interfaces, before detailing the concrete implementations. TCP provided a proof of concept with less implementation effort, albeit with varying performance outside of ideal environments, allowing the structure to be thoroughly tested before the more complex UDP implementation was developed. Security allows for either external or internal solutions, with support built in for more complex initial exchanges, enabling measures such as digital signatures. Overall, this chapter has described a highly flexible solution for a multipath proxy; in the next chapter, it is shown to be highly performant as well.
|