Update on Overleaf.

2021-01-29 22:43:23 +00:00
commit 503caed578
parent 99c4cfd050
3 changed files with 108 additions and 39 deletions
--- a/LanguageSamples/languagesamples.tex
+++ b/LanguageSamples/languagesamples.tex
@@ -5,24 +5,24 @@
 \label{appendix:language-samples}

 \begin{figure}
-    \inputminted[firstline=1,lastline=48]{cpp}{Preparation/Samples/main.cpp}
+    \inputminted[firstline=1,lastline=48]{cpp}{LanguageSamples/Samples/main.cpp}
    \caption{A sample script written in C++ to collect packets from a TUN interface and print them from multiple threads}
    \label{fig:cpp-tun-sample}
 \end{figure}

 \begin{figure}
    \ContinuedFloat
-    \inputminted[firstline=49]{cpp}{Preparation/Samples/main.cpp}
+    \inputminted[firstline=49]{cpp}{LanguageSamples/Samples/main.cpp}
 \end{figure}

 \begin{figure}
-    \inputminted{rust}{Preparation/Samples/main.rs}
+    \inputminted{rust}{LanguageSamples/Samples/main.rs}
    \caption{A sample script written in Rust to collect packets from a TUN interface and print them from multiple threads}
    \label{fig:rust-tun-sample}
 \end{figure}

 \begin{figure}
-    \inputminted{go}{Preparation/Samples/main.go}
+    \inputminted{go}{LanguageSamples/Samples/main.go}
    \caption{A sample script written in Go to collect packets from a TUN interface and print them from multiple threads}
    \label{fig:go-tun-sample}
 \end{figure}
--- a/Preamble/preamble.tex
+++ b/Preamble/preamble.tex
@@ -84,6 +84,7 @@
 \usepackage{graphicx}
 \usepackage{bytefield}
 \usepackage{rotating}
+\usepackage{dpfloat}

 \usepackage{tikz}
 \usetikzlibrary{positioning}
--- a/Preparation/preparation.tex
+++ b/Preparation/preparation.tex
@@ -12,6 +12,11 @@

 Proxying packets is the process of taking packets that arrive at one location and transporting them to leave at another. This chapter focuses on the preparatory work to achieve this practically and securely, given the design outlined in the previous chapter, in which the proxy consolidates multiple connections to appear as one to both the wider Internet and devices on the local network. In sections \ref{section:risk-analysis} to \ref{section:preparation-security}, I discuss the security risks and plans to confront them. In section \ref{section:language-selection}, I present three languages: Go, Rust and C++, and provide context for choosing Go as the implementation language.  Finally, in sections \ref{section:requirements-analysis} and \ref{section:engineering-approach}, I present a requirements analysis and a description of the engineering approach for the project.

+% ----------------------------- Threat Model ------------------------------- %
+\section{Threat Model}
+
+The security of this application will be considered under the Dolev-Yao threat model.
+
 % ---------------------------- Risk Analysis ------------------------------- %
 \section{Risk Analysis}
 \label{section:risk-analysis}
@@ -136,7 +141,7 @@ $B$ has multiple checks to perform before replying to $A$'s request. If the mess

 Finally, $A$ responds to $B$ with both their identities, $B$'s chosen nonce and the current time. Prior to sending this message, $A$ can be confident that $B$'s identity is correct. $A$ performs the same checks as $B$ previously before responding, terminating the flow in the case of a bad response. The timestamp and MAC are checked, the values from the previous message verified to be equal, and $B$'s identity compared to that expected. Once these are verified, $A$ is confident that it is talking to $B$, so responds to the message with enough information to confirm to $B$ that it knows who it is talking to, $B$'s nonce, and the timestamp and key. This concludes the exchange.

-It is possible to create shorter cryptographic exchanges, but for this project it is not necessary. The flows generated by this project are very long lived, and as such, the constant length of the initial exchange is amortised. Therefore, keeping the exchange clear and simple is most important.
+It is possible to create shorter cryptographic exchanges, but for this project it is not necessary. The flows generated by this project are very long lived, and as such, the constant length of the initial exchange is amortised. Therefore, keeping the exchange clear and simple is reasonable.

 \subsubsection{Message Passing}

@@ -158,38 +163,106 @@ To avoid this case, additional measures must be taken to avoid proxying repeated

 The sliding window technique requires each packet to have a strictly increasing sequence number. This takes advantage of the composable structure mentioned above - the sequence number can be placed within the packet sent. The sequence number here must be globally unique within the connection, and thus is not equivalent to the independent sequence number of TCP or UDP flows. This is similar to the issue given in congestion control for multipath TCP, where a second sequence number must be added, named the data sequence number. The data sequence number provides a separation between the loss control of individual subflows and the data transfer of the flow as a whole \citep[p. 11]{wischik_design_2011}.

-\subsubsection{Transparent Security}
+\subsubsection{Layered Security}

-It was previously mentioned that this solution focuses on providing transparent security for the proxied packets. Further to this, this solution provides transparent security in the other direction. Consider the case of a satellite office that employs both a whole network corporate VPN and this solution. The network can be configured in each of two cases: the multipath proxy runs behind the VPN, or the VPN runs behind the multipath proxy. These two examples are given in figure \ref{fig:whole-network-vpn-transparency}.
+It was previously mentioned that this solution focuses on providing transparent security for the proxied packets. Further to this, this solution provides transparent security in the other direction. Consider the case of a satellite office that employs both a whole network corporate VPN and this solution. The network can be configured in each of two cases: the multipath proxy runs behind the VPN, or the VPN runs behind the multipath proxy.
+
+These two examples are given in figures \ref{fig:whole-network-vpn-behind} and \ref{fig:whole-network-vpn-infront}, for the VPN Wireguard \citep{donenfeld_wireguard_2017}. In this setup, it is assumed that the portals are only accessible via the VPN protected network. It can be seen that the packet in figure \ref{fig:whole-network-vpn-infront} is shorter, given the removal of the message authentication code and the data sequence number. The data sequence number is unnecessary, given that Wireguard uses the same anti-replay algorithm, and thus replayed packets would have been caught entering the secure network. Further, the message authentication code is unnecessary, as the authenticity of packets is now guaranteed by Wireguard.
+
+Supporting and encouraging this layering of protocols provides a second benefit: if the security in this solution breaks with time, there are two options to repair it. One can either fix the open source application, or compose it with a security solution that is not broken, but perhaps provides extraneous security guarantees and therefore causes reduced performance. To this end, the security features mentioned will all be configurable. This allows for flexibility in implementation.

 \begin{figure}
-    \centering
-    \begin{subfigure}[b]{.49\textwidth}
-        \includegraphics[width=\textwidth]{VPNConfig1.jpg}
-        \caption{A VPN client behind the multipath proxy.}
-    \end{subfigure}
-    \begin{subfigure}[b]{.49\textwidth}
-        \includegraphics[width=\textwidth]{VPNConfig2.jpg}
-        \caption{A VPN client in front of the multipath proxy.}
-    \end{subfigure}
-    \caption{Two network architectures running a whole network VPN and this solution}
-    \label{fig:whole-network-vpn-transparency}
+    \begin{leftfullpage}
+        \centering
+        \begin{bytefield}[bitwidth=0.6em]{32}
+            \bitheader{0-31} \\
+            \wordbox[tlr]{1}{IPv4 Header} \\
+            \wordbox[blr]{1}{$\cdots$} \\
+            \begin{rightwordgroup}{UDP\\Header}
+                \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\
+                \bitbox{16}{Length} & \bitbox{16}{Checksum}
+            \end{rightwordgroup} \\
+            \begin{rightwordgroup}{CC\\Header}
+                \bitbox{32}{Acknowledgement number} \\
+                \bitbox{32}{Negative acknowledgement number} \\
+                \bitbox{32}{Sequence number}
+            \end{rightwordgroup} \\
+            \begin{rightwordgroup}{Proxied\\Wireguard\\Packet}
+                \wordbox[tlr]{1}{IPv4 Header} \\
+                \wordbox[blr]{1}{$\cdots$} \\
+                \begin{leftwordgroup}{UDP Header}
+                    \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\
+                    \bitbox{16}{Length} & \bitbox{16}{Checksum}
+                \end{leftwordgroup} \\
+                \begin{leftwordgroup}{Wireguard\\Header}
+                    \bitbox{8}{type} & \bitbox{24}{reserved} \\
+                    \wordbox{1}{receiver} \\
+                    \wordbox{2}{counter}
+                \end{leftwordgroup} \\
+                \wordbox[tlr]{1}{Proxied IP packet} \\
+                \skippedwords\\
+                \wordbox[blr]{1}{}
+            \end{rightwordgroup} \\
+            \begin{rightwordgroup}{Security\\Footer}
+                \bitbox{32}{Data sequence number} \\
+                \wordbox[tlr]{1}{Message authentication code} \\
+                \wordbox[blr]{1}{$\cdots$}
+            \end{rightwordgroup}
+        \end{bytefield}
+        
+        \caption{A Wireguard client behind the multipath proxy.}
+        \label{fig:whole-network-vpn-behind}
+    \end{leftfullpage}
 \end{figure}

-Both of these setups have their merits. If you have little control over the VPN, it might be necessary to use the first case. However, if the VPN is under your control, the second case is likely a better choice. The security efforts, detailed above, become redundant if the same guarantees are provided at a higher layer. If the overlying VPN connection provides the required security guarantees, there is little point reimplementing them at the proxying layer, which provides a significant increase in per-packet efficiency. For this reason, all of the security features mentioned above will be configurable, such that this gain in efficiency is realisable.
-
-Supporting and encouraging this layering of protocols provides a second benefit: if the security in this solution breaks with time, there are two options to repair it. One can either fix the open source application, or compose it with a security solution that has not broken with time, but perhaps provides extraneous security guarantees and therefore causes reduced performance.
+\begin{figure}
+    \begin{fullpage}
+        \centering
+        \begin{bytefield}[bitwidth=0.6em]{32}
+            \bitheader{0-31} \\
+            \wordbox[tlr]{1}{IPv4 Header} \\
+            \wordbox[blr]{1}{$\cdots$}\\
+            \begin{rightwordgroup}{UDP\\Header}
+                \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\
+                \bitbox{16}{Length} & \bitbox{16}{Checksum}
+            \end{rightwordgroup} \\
+            \begin{rightwordgroup}{Wireguard\\Header}
+                \bitbox{8}{type} & \bitbox{24}{reserved} \\
+                \wordbox{1}{receiver} \\
+                \wordbox{2}{counter}
+            \end{rightwordgroup} \\
+            \begin{rightwordgroup}{Tunnelled\\Proxy\\Packet}
+                \wordbox[tlr]{1}{IPv4 Header} \\
+                \wordbox[blr]{1}{$\cdots$}\\
+                \begin{leftwordgroup}{UDP Header}
+                    \bitbox{16}{Source port} & \bitbox{16}{Destination port} \\
+                    \bitbox{16}{Length} & \bitbox{16}{Checksum}
+                \end{leftwordgroup} \\
+                \begin{leftwordgroup}{CC\\Header}
+                    \bitbox{32}{Acknowledgement number} \\
+                    \bitbox{32}{Negative acknowledgement number} \\
+                    \bitbox{32}{Sequence number}
+                \end{leftwordgroup} \\
+                \wordbox[tlr]{1}{Proxied IP packet} \\
+                \skippedwords\\
+                \wordbox[blr]{1}{}
+            \end{rightwordgroup}
+        \end{bytefield}
+        
+        \caption{A Wireguard client in front of the multipath proxy.}
+        \label{fig:whole-network-vpn-infront}
+    \end{fullpage}
+\end{figure}

 % -------------------------- Language Selection ---------------------------- %
 \section{Language Selection}
 \label{section:language-selection}

-In this section, I evaluate three potential languages (C++, Rust and Go) for the implementation of this software. To support this evaluation, I have provided a sample program in each language. The sample program is intended to be a minimal example of reading packets from a TUN interface, placing them in a queue from a single thread, and consuming the packets from the queue with multiple threads. These examples are given in figures \ref{fig:cpp-tun-sample} through \ref{fig:go-tun-sample}.
+In this section, I evaluate three potential languages (C++, Rust and Go) for the implementation of this software. To support this evaluation, I have provided a sample program in each language. The sample program is intended to be a minimal example of reading packets from a TUN interface, placing them in a queue from a single thread, and consuming the packets from the queue with multiple threads. These examples are given in figures \ref{fig:cpp-tun-sample} through \ref{fig:go-tun-sample}. The primary considerations will be the performance of the language, the clarity of code written in the style this software requires, and the ecosystem of the language. This culminates in choosing Go for the implementation language.

-TODO: The section then concludes with an evaluation of two languages (Python and Java) for the evaluation of the software.
+Alongside the implementation language, a language is chosen to evaluate the implementation. Two potential languages are considered here, Python and Java. Though Python was initially chosen for rapid development and better ecosystem support, the final result is a combination of both Python and Java - Python for data processing, and Java for systems interaction.

 \subsection{Implementation Languages}
-
 \subsubsection{C++}

 There are two primary advantages to completing this project in C++: speed of execution, and C++ being low level enough to achieve these goals. The negatives of using C++ are demonstrated in the sample script, given in figure \ref{fig:cpp-tun-sample}. It is immediately obvious that to achieve even the base of this project, the code in C++ is multiple times the length of equivalent code in either Rust or Go.
@@ -204,21 +277,22 @@ For the purposes of this project, the downsides of Rust come from its youthfulne

 \subsubsection{Go}

-The final language to evaluate is Go, often written as GoLang. It is the language of choice for this project, with a sample provided in figure \ref{fig:go-tun-sample}. Go is significantly higher level than the other two languages mentioned, and provides a memory management model that is both simpler than C++ and more standardised than Rust.
+The final language to evaluate is Go, often written as GoLang. The primary difference between Go and the other two evaluated languages is the presence of a runtime. Regardless, it is the language of choice for this project, with a sample provided in figure \ref{fig:go-tun-sample}. Go is significantly higher level than the other two languages mentioned, and provides a memory management model that is both simpler than C++ and more standard than Rust.

 For the greedy structure of this project, Go's focus on concurrency is extremely beneficial. Go has channels in the standard runtime, which support any number of both producers and consumers. In this project, both SPMC (Single Producer Multi Consumer) and MPSC (Multi Producer Single Consumer) queues are required, so having these provided as a first class feature of the language is beneficial.

 Garbage collection and first order concurrency come together to make the code produced for this project highly readable. The downside of this runtime is that the speed of execution is negatively affected. However, for the purposes of this first production, that compromise is acceptable. By producing code that makes the functionality of the application clear, future implementations could more easily be built to mirror it. Given the sample of speeds displayed in section (Ref Needed: Introduction Comments on Speed), and the performance shown in section \ref{section:performance-evaluation}, the compromise of using a well-suited high-level language is one worth taking. 

 \subsection{Evaluation Languages}
-
 \subsubsection{Python}

-TODO
+Python is dynamically typed, and was chosen as the initial implementation language. The first reason for this is \verb'matplotlib'\footnote{\url{https://matplotlib.org/}}, a widely used graphing library that can produce the graphs needed for this evaluation. The second reason is \verb'proxmoxer'\footnote{\url{https://github.com/proxmoxer/proxmoxer}}, a fluent API for interacting with a Proxmox server.
+
+Having the required modules available allowed for a swift initial development sprint. This showed that the method of evaluation was viable and effective. However, the requirements of evaluation changed with the growth of the software, and an important part of an agile process is adapting to changing requirements. The lack of static typing limits the refactorability of Python.

 \subsubsection{Java}

-TODO
+Java is statically typed, and became the implementation language for all external interaction. One of the initial reasons for not choosing Java was the lack of a suitable equivalent to \verb'proxmoxer'. Although two libraries to interact with Proxmox are available for Java, one was released under an incompatible license, and the other does not have adequate type safety. Therefore, to develop in Java, I would need to develop my own Proxmox library.

 % ------------------------- Requirements Analysis -------------------------- %
 \section{Requirements Analysis}
@@ -232,26 +306,20 @@ The requirements of the project are detailed in the Success Criteria of the Proj

 \subsubsection{Software Development Model}

-The development of this software used the iterative model, with the initial iteration following a waterfall model. The core deliverable of this project is large, such that much programming was required before systems testing became a possibility. The waterfall model best suited this - building the software in separately tested parts, then putting significant focus on systems testing.
-
-As many of the requirements laid out in the project proposal's success criteria are quantitative system performance tests, I developed a system to automate this as part of the initial waterfall. This allowed frequent evaluation of the software against the success criteria. 
-
-The rest of the iterations were much smaller than the first, with each focusing on improving a specific factor. These iterations were continued until the success criteria were satisfied, meaning that the software had met its intended use.
-
-% --
-
 The development of this software followed the agile methodology. Work was organised into 2-7 day sprints, aiming for increased functionality in the software each time. By focusing on sufficient but not excessive planning, a minimum viable product was quickly established. From there, the remaining features could be extracted in the correct sized segments. Examples of these sprints are: initial build including configuration, TUN adapters and main program; TCP transport, enabling an end-to-end connection between the two parts; repeatable testing, providing the data to evaluate each iteration of the project against its success criteria; UDP for performance and control.

-One of the most important features of any agile methodology is welcoming changing requirements \citep{beck_manifesto_2001}. As the project grew, it became clear where shortcomings existed, and these could be fixed in short sprints. An example is given in figure \ref{fig:changing-requirements}, in which the type of a variable was changed from \mintinline{go}{string} to \mintinline{go}{func() string}. This allowed for lazy evaluation, when it became clear that configuring fixed IP addresses or DNS names could be impractical with certain setups. The static typing in the chosen language enables refactors like this to be completed with ease, particularly with the development tools mentioned in the next section.
+One of the most important features of any agile methodology is welcoming changing requirements \citep{beck_manifesto_2001}. As the project grew, it became clear where shortcomings existed, and these could be fixed in short sprints. An example is given in figure \ref{fig:changing-requirements}, in which the type of a variable was changed from \mintinline{go}{string} to \mintinline{go}{func() string}. This allowed for lazy evaluation, when it became clear that configuring fixed IP addresses or DNS names could be impractical with certain setups. The static typing in the chosen language enables refactors like this to be completed with ease, particularly with the development tools mentioned in the next section, reducing the incidental complexity of the agile methodology.

 \begin{figure}
    \centering
-    \begin{subfigure}[b]{0.3\textwidth}
+    \begin{subfigure}[t]{0.45\textwidth}
         \centering
+         \inputminted{go}{Preparation/Samples/string.go}
         \caption{The structure with a fixed local address.}
     \end{subfigure}
-     \begin{subfigure}[b]{0.3\textwidth}
+     \begin{subfigure}[t]{0.45\textwidth}
         \centering
+         \inputminted{go}{Preparation/Samples/funcstring.go}
         \caption{The structure with a dynamic local address.}
     \end{subfigure}
    \caption{An example of refactoring for changing requirements.}