Update on Overleaf.

2021-02-02 22:45:05 +00:00 · 2021-02-02 22:45:05 +00:00 · bae7a36d6b
commit bae7a36d6b
parent 41b40610d0
2 changed files with 43 additions and 18 deletions
--- a/Preamble/preamble.tex
+++ b/Preamble/preamble.tex
@ -157,7 +157,7 @@
 % ********************** TOC depth and numbering depth *************************

 \setcounter{secnumdepth}{2}
-\setcounter{tocdepth}{2}
+\setcounter{tocdepth}{1}


 % ******************************* Nomenclature *********************************
--- a/Preparation/preparation.tex
+++ b/Preparation/preparation.tex
@ -112,16 +112,40 @@ This threat is based on an attacker wishing to force cost upon you. In the examp

 This section provides means of confronting the threats given in section \ref{section:threat-model}, in order to alleviate the additional risk of proxying traffic.

-\subsection{IP Authentication Header}
-
-The security requirements for this project are equivalent to those of the IP Authentication Header \citep{kent_ip_2005}
-
 \subsection{Message Authentication}

 To provide integrity and authentication for each message, I evaluate two choices: Message Authentication Codes (MACs) or Digital Signatures. A MAC combines the data with a shared key using a specific method, before using a one-way hash function to generate a message authentication code, and thus the result is only verifiable by someone with the same private key \citep[pp. 352]{menezes_handbook_1997}. Producing a digital signature for a message uses the private key in public/private keypair to produce a digital signature for a message, proving that the message was produced by the owner of the private key, which can be verified by anyone with the public key \citep[pp. 147-149]{anderson_security_2008}. In both cases, the message authentication code is appended to the message, such that the integrity and authenticity of the message can be verified.

 The comparison is as such: signatures provide non-repudiation, while MACs do not - one can know the owner of which private key signed a message, while anyone with the shared key could have produced an MAC for a message. The second point is that digital signatures are much more computationally complex than MACs, and thus, given that the control of both ends lies with the same party, MAC is the message authentication of choice for this project.

+\subsection{IP Authentication Header}
+
+The security requirements for this project are equivalent to those provided by the IP Authentication Header \citep{kent_ip_2005}. The IP authentication header operates between IP and the transport layer, using IP protocol number 51. The authentication header uses a hash function and a secret shared key to provide an Integrity Check Value. This check value covers all immutable parts of the IP header, the authentication header itself, and the data below the authentication header. Combined, this provides connectionless integrity and authenticity, as the IP header is authenticated. Further, the header contains a sequence number, which is used to prevent replay attacks.
+
+Unfortunately, there are two reasons why this solution cannot be used: difficulties with NAT traversal, and inaccessibility for user-space programs. As the IP packet provides integrity for the source and destination addresses, any NAT that alters these addresses violates the integrity of the packet. Although NAT traversal is not an explicit success criteria for this project, it is implicit, as the flexibility of the project for different network structures is a priority, including those where NAT is unavoidable. The second is that IP authentication headers, being an IP protocol and not transport layer, would cause issues interacting with user-space programs. Given that the first implementation of transport is completed using TCP, having IP Authentication Headers would require the user-space program to handle the TCP connection without the aid of the kernel, complicating multiplexing and being an unsupported setup.
+
+Overall, using the IP authentication header would function similarly to running over a VPN, described in section \ref{section:layered-security}. Although this will be a supported configuration, the shortfalls mean that it will not be the base implementation. However, inspiration can be taken from the header structure, shown in figure \ref{fig:ip-auth-header-structure}.
+
+\begin{figure}
+    \centering
+    \begin{bytefield}[bitwidth=0.8em]{32}
+        \bitheader{0-31} \\
+        \bitbox{8}{Next Header} & \bitbox{8}{Payload Len} & \bitbox{16}{Reserved} \\
+        \wordbox{1}{Security Parameters Index} \\
+        \wordbox{1}{Sequence Number} \\
+        \wordbox[tlr]{1}{Integrity Check Value} \\
+        \wordbox[blr]{1}{$\cdots$}
+    \end{bytefield}
+    \caption{IP authentication header structure}
+    \label{fig:ip-auth-header-structure}
+\end{figure}
+
+It is first important to note the differences between the use of IP authentication headers and the security footers used in this application. Firstly, the next header field is unnecessary, given that headers are not being chained. Secondly, given the portals having a fixed security configuration by static configuration, the payload length field is unnecessary - the payloads will always be of a predetermined length. Similarly, the security parameters index is unnecessary, as the parameters will be equal.
+
+The difference in security arises from the lack of integrity given to the fields above the application layer. That is, the IP header itself, and the TCP or UDP header. However, there is an important distinction between the TCP and UDP cases: TCP congestion control will not be covered by any application provided security, while the UDP congestion control will. That is, this application can do nothing to authenticate the ACKs of a TCP connection, as these are created outside of the control of the application. As such, the TCP implementation provided by the solution should be used in one of two ways: as a baseline test for the performance of other algorithms, or taking advantage of layered security as given in section \ref{section:layered-security}. The rest of this section will therefore focus on securing the UDP transport.
+
+Further differences arising from the lack of integrity above the application layer still apply to UDP transport. Although the congestion control layer and therefore packet flow is authenticated, the source and destination of packets are not.
+
 \subsubsection{Initial Exchange}

 An initial exchange at the beginning of a flow must take place for the security of this application. However, one must be wary of the implications of reflection attacks, mentioned in section \ref{section:reflection-attacks}. Given is the chosen authentication exchange for this project:
@ -162,7 +186,8 @@ To avoid this case, additional measures must be taken to avoid proxying repeated

 The sliding window technique requires each packet to have a strictly increasing sequence number. This takes advantage of the composable structure mentioned above - the sequence number can be placed within the packet sent. The sequence number here must be globally unique within the connection, and thus is not equivalent to the independent sequence number of TCP or UDP flows. This is similar to the issue given in congestion control for multipath TCP, where a second sequence number must be added, named the data sequence number. The data sequence number provides a separation between the loss control of indvidual subflows and the data transfer of the flow as a whole \citep[pp. 11]{wischik_design_2011}.

-\subsubsection{Layered Security}
+\subsection{Layered Security}
+\label{section:layered-security}

 It was previously mentioned that this solution focuses on providing transparent security for the proxied packets. Further to this, this solution provides transparent security in the other direction. Consider the case of a satellite office that employs both a whole network corporate VPN and this solution. The network can be configured in each of two cases: the multipath proxy runs behind the VPN, or the VPN runs behind the multipath proxy.

@ -264,15 +289,15 @@ Alongside the implementation language, a language is chosen to evaluate the impl
 \subsection{Implementation Languages}
 \subsubsection{C++}

-There are two primary advantages to completing this project in C++: speed of execution, and C++ being low level enough to achieve these goals. The negatives of using C++ are demonstrated in the sample script, given in figure \ref{fig:cpp-tun-sample}. It is immediately obvious that to achieve even the base of this project, the code in C++ is multiple times the length of equivalent code in either Rust or Go.
+There are two primary advantages to completing this project in C++: speed of execution, and C++ being low level enough to achieve these goals. The negatives of using C++ are demonstrated in the sample script, given in figure \ref{fig:cpp-tun-sample}, where it is immediately obvious that to achieve even the base of this project, the code in C++ is multiple times the length of equivalent code in either Rust or Go, at 93 lines compared to 34 for Rust or 48 for Go. This difference arises from the need to manually implement the required thread safe queue, while it is available as a library for both Rust and Go, and can be handled by the respective package managers. This manual implementation gives rise to additional risk of incorrect implementation, specifically with regards to thread safety, that could cause undefined behaviour and great difficulty debugging.

-The lack of memory safety in C++ is a significant negative of the language. Although C++ would likely be more performant, it is avoided due to the massive incidental complexity of manual memory management.
+The lack of memory safety in C++ is a significant negative of the language. Although C++ would provide increased performance over a language such as Go with a runtime, it is avoided due to the massive incidental complexity of manual memory management and the difficulty of manual thread safety.

 \subsubsection{Rust}

-Rust provides advantages over C++ while still maintaining the speed. It is also memory safe, significantly reducing the programming load of searching for memory errors. The Rust sample is given in figure \ref{fig:rust-tun-sample}, and it is pleasantly concise.
+Rust is memory safe and thread safe, solving the latter issues with C++. Rust also has no runtime, allowing for similar execution speed, comparable to C or C++. The Rust sample is given in figure \ref{fig:rust-tun-sample}, and it is pleasantly concise.

-For the purposes of this project, the downsides of Rust come from its youthfulness. This is two-faceted: IDE support and Crate stability. Firstly, the IDE support for Rust in my IDEs of choice is provided via a plugin to IntelliJ, and is not as well supported as many other languages. Secondly, the crate available for TUN support (tun-tap\footnote{\url{https://docs.rs/tun-tap/}}) does not yet provide a stable API, which was noticed during the development of even this test program. Between writing the program initially and re-testing it to place in this document, the API of the Crate had changed to the point where my script no longer type checked. Further, the old version had disappeared, and thus I was left with a program that didn't compile or function. Although writing the API for TUN interaction is not an issue, it would reduce portability, as I cannot test on as many systems and platforms as the open source community.
+For the purposes of this project, the downsides of Rust come from its youthfulness. This is two-faceted: IDE support and Crate stability. Firstly, the IDE support for Rust in my IDEs of choice is provided via a plugin to IntelliJ, and is not as well supported as many other languages. Secondly, the crate available for TUN support (tun-tap\footnote{\url{https://docs.rs/tun-tap/}}) does not yet provide a stable API, which was noticed during the development of even this test program. Between writing the program initially and re-testing it to place in this document, the API of the Crate had changed to the point where my script no longer type checked. Further, the old version had disappeared, and thus I was left with a program that didn't compile or function. Although writing the API for TUN interaction is not an issue, the safety benefits of Rust would be less pronounced, as the direct systems interaction would require unsafe code, leading to an increased potential for bugs.

 \subsubsection{Go}

@ -287,11 +312,13 @@ Garbage collection and first order concurrency come together to make the code pr

 Python is a dynamically typed, and was chosen as the initial implementation language. The first reason for this is \verb'matplotlib'\footnote{\url{https://matplotlib.org/}}, a widely used graphing library that can produce the graphs needed for this evaluation. The second reason is \verb'proxmoxer'\footnote{\url{https://github.com/proxmoxer/proxmoxer}}, a fluent API for interacting with a Proxmox server.

-Having the required modules available allowed for a swift initial development sprint. This showed that the method of evaluation was viable and effective. However, the requirements of evaluation changed with the growth of the software, and an important part of an agile process is adapting to changing requirements. The lack of static typing limits the refactorability of Python, 
+Having the required modules available allowed for a swift initial development sprint. This showed that the method of evaluation was viable and effective. However, the requirements of evaluation changed with the growth of the software, and an important part of an agile process is adapting to changing requirements. The lack of static typing limits the refactorability of Python, and becomes increasingly challenging as the project grows. Therefore, after the initial proof of concept, it became necessary to explore another language for the Proxmox interaction.

 \subsubsection{Java}

-Java is statically typed, and became the implementation language for all external interaction. One of the initial reasons for not choosing Java was the availability of an equivalent library to \verb'proxmoxer'. Although two libraries to interact with Proxmox are available for Java, one was released under an incompatible license, and the other does not have adequate type safety. To this end, to develop in Java, I would need to develop my own Proxmox library.
+Java is statically typed, and became the implementation language for all external interaction. One of the initial reasons for not choosing Java was the availability of an equivalent library to \verb'proxmoxer'. Although two libraries to interact with Proxmox are available for Java, one was released under an incompatible license, and the other does not have adequate type safety. To this end, to develop in Java, I would need to develop my own Proxmox library. However, after the initial development in Python, it became clear that this was a valuable use of time, and thus development began. By developing a type safe Proxmox API library, and having learnt from the initial development in Python, a clear path to producing the appropriate Java libraries was available.
+
+However, as Python is an incredibly popular language for data processing, the solution was not to use purely Java. Given the graphing existed already in Python and worked perfectly well, a combined solution with Java gathering the data and Python processing it was chosen.

 % ------------------------- Requirements Analysis -------------------------- %
 \section{Requirements Analysis}
@ -299,6 +326,8 @@ Java is statically typed, and became the implementation language for all externa

 The requirements of the project are detailed in the Success Criteria of the Project Proposal (Appendix \ref{appendix:project-proposal}), and are the primary method of evaluation for project success. They are split into three categories: success criteria, extended goals and stretch goals.

+The three categories of success criteria can be summarised as follows. The success criteria, or must have elements, are to provide a multi-path proxy that is functional, secure and improves speed and resilience in specific cases. The extended goals, or should have elements, are focused on increasing the performance and flexibility of the solution. The stretch goals, or could have elements, are aimed at increasing performance by reducing overheads, and supporting IPv6 alongside IPv4.
+
 % ------------------------- Engineering Approach --------------------------- %
 \section{Engineering Approach}
 \label{section:engineering-approach}
@ -331,15 +360,11 @@ A large part of the language choice focused on development tools. As discussed i

 I used Git version control, with a self-hosted Gitea\footnote{\url{https://gitea.com/}} server as the remote. My repositories have a multitude of on- and off-site backups at varying frequencies (2xUSB + 2xDistinct Cloud Storage + NAS + Multiple Computers).

-Alongside my self-hosted Gitea server, I have a self hosted Drone by Harness\footnote{\url{http://drone.io/}} server for continuous integration. This made it simple to add a Drone file to the repository, allowing for the Go tests to be run, and using a script with the gofmt\footnote{\url{https://golang.org/cmd/gofmt/}} tool.
-
-\mint{shell-session}`bash -c "gofmt -l . | wc -l | cmp -s <(echo 0) || (gofmt -l . && exit 1)"`
-
-This script, run by Drone, rejects any pushes to the Git repository that do not conform to the formatting specified by the gofmt tool. Ensuring that all branches are consistently formatted can significantly reduce merge issues.
+Alongside my self-hosted Gitea server, I have a self hosted Drone by Harness\footnote{\url{http://drone.io/}} server for continuous integration. This made it simple to add a Drone file to the repository, allowing for the Go tests to be run, formatting verified, and artifacts built. On a push, after the verification, each artefact is built and uploaded to a central repository, where it is saved for the branch name. This is particularly useful for automated testing, as the relevant artefact can be downloaded automatically from a known location for the branch under test. Further, artefacts can be built for multiple architectures, particularly useful when performing real world testing spread between x86\_64 and ARMv7 architectures.

 \subsubsection{Licensing}

-I have chosen to license this software under the MIT license. The MIT license is simple and permissive.
+I have chosen to license this software under the MIT license. The MIT license is simple and permissive, enabling reuse and modification of the code, subject to including the license. Alongside the hopes that the code will receive updated pull requests over time, a permissive license allows others to build upon the given solution. A potential example of a solution that could build from this is a company employing a SaaS (Software as a Service) model to configure remote portals on your behalf, perhaps including the hardware required to convert this fairly involved solution into a plug-and-play option.

 % ---------------------------- Starting Point ------------------------------ %
 \section{Starting Point}