|
|
|
@ -103,16 +103,16 @@ The benefits of using a VPN tunnel between the two proxies are shown in Figure \
|
|
|
|
|
\section{Language Selection}
|
|
|
|
|
\label{section:language-selection}
|
|
|
|
|
|
|
|
|
|
In this section, I evaluate three potential languages (C++, Rust and Go) for the implementation of this software. To support this evaluation, I have provided a sample program in each language. The sample program is intended to be a minimal example of reading packets from a TUN interface, placing them in a queue from a single thread, and consuming the packets from the queue with multiple threads. These examples are given in figures \ref{fig:cpp-tun-sample} through \ref{fig:go-tun-sample}, in Appendix \ref{appendix:language-samples}. The first test was whether the small example was possible, which passed for all three languages. I then considered the performance of the language, clarity of code of the style needed to complete this software, and the ecosystem of the language. This culminated in choosing Go for the implementation language.
|
|
|
|
|
In this section, I evaluate three potential languages (C++, Rust and Go) for the implementation of this software. To support this evaluation, I have provided a sample program in each language. The sample program is a minimal example of reading packets from a TUN interface, placing them in a queue from a single thread, and consuming the packets from the queue with multiple threads. These examples are given in Figures \ref{fig:cpp-tun-sample} through \ref{fig:go-tun-sample}, in Appendix \ref{appendix:language-samples}. For each language, I considered the performance, code clarity, and the language ecosystem. This culminated in choosing Go for the implementation language.
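The shape of the sample program described above (one reader filling a queue, several workers draining it) can be sketched in Go as follows. This is an illustrative sketch and not the appendix listing: it assumes the packet source can be modelled as any \mintinline{go}{io.Reader}, so an in-memory reader stands in for the TUN device, and the names \mintinline{go}{readPackets} and \mintinline{go}{consume} are hypothetical.

\begin{minted}{go}
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// readPackets reads chunks of at most mtu bytes from src and places each
// one on the queue. In the real samples src would be a TUN device; here
// it is any io.Reader (an assumption made to keep the sketch runnable).
func readPackets(src io.Reader, mtu int, queue chan<- []byte) {
	defer close(queue) // signal the consumers that no more packets will arrive
	for {
		buf := make([]byte, mtu)
		n, err := src.Read(buf)
		if n > 0 {
			queue <- buf[:n]
		}
		if err != nil {
			return
		}
	}
}

// consume starts the given number of worker goroutines, each draining
// the shared queue, and returns the total number of bytes they handled.
func consume(queue <-chan []byte, workers int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for packet := range queue {
				mu.Lock()
				total += len(packet)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	src := bytes.NewReader(make([]byte, 4000)) // stand-in for the TUN device
	queue := make(chan []byte, 16)             // the channel is the thread-safe queue
	go readPackets(src, 1500, queue)
	fmt.Println("bytes consumed:", consume(queue, 4))
}
\end{minted}

The buffered channel plays the role of the thread-safe queue, which is why no separate queue implementation appears in the sketch.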
|
|
|
|
|
|
|
|
|
|
Alongside the implementation language, a language is chosen to evaluate the implementation. Two potential languages are considered here, Python and Java. Though Python was initially chosen for rapid development and better ecosystem support, the final test suite is a combination of both Python and Java - Python for data processing, and Java for systems interaction.
|
|
|
|
|
I similarly evaluated two languages for the test suite: Python and Java. Though Python was initially chosen for rapid development and better ecosystem support, the final test suite combines both: Python for data processing, and Java for systems interaction.
|
|
|
|
|
|
|
|
|
|
\subsection{Implementation Languages}
|
|
|
|
|
\subsubsection{C++}
|
|
|
|
|
|
|
|
|
|
There are two primary advantages to completing this project in C++: speed of execution, and C++ being low level enough to achieve this project's goals (which turned out to be true for all considered languages). The negatives of using C++ are demonstrated in the sample script, given in Figure \ref{fig:cpp-tun-sample}, where it is immediately obvious that to achieve even the base functionality of this project, the code in C++ is multiple times the length of equivalent code in either Rust or Go, at 93 lines compared to 34 for Rust or 48 for Go. This difference arises from the need to manually implement the required thread safe queue, while it is available as a library for Rust, and included in the Go runtime. This manual implementation gives rise to additional risk of incorrect implementation, specifically with regards to thread safety, that could cause undefined behaviour, security vulnerabilities, and great difficulty debugging. Further, although open source queues are available, they are not handled by a package manager, and thus security updates would have to be manual, leaving opportunity for unfound bugs.
|
|
|
|
|
There are two primary advantages to completing this project in C++: speed of execution, and C++ being low-level enough to achieve this project's goals (which turned out to be true for all considered languages).
|
|
|
|
|
|
|
|
|
|
The lack of memory safety in C++ is a significant negative of the language. Although C++ would provide increased performance over a language such as Go with a more feature-rich runtime, it is avoided due to the incidental complexity of manual memory management and the difficulty of manual thread safety.
|
|
|
|
|
The negatives of using C++ are demonstrated in the sample script, given in Figure \ref{fig:cpp-tun-sample}: achieving even the base functionality of this project requires several times as much code as Rust or Go (93 lines compared to 34 for Rust or 48 for Go). This arises from the need to manually implement the required thread-safe queue, which is available as a library for Rust, and included in the Go runtime. This manual implementation increases the risk of an incorrect implementation, specifically with regard to thread safety, that could cause undefined behaviour, security vulnerabilities, and great difficulty debugging. Further, although open source queues are available, they are not handled by a package manager, and thus security updates would have to be manual, risking bugs going unpatched. Finally, C++ does not provide any memory safety guarantees.
|
|
|
|
|
|
|
|
|
|
\subsubsection{Rust}
|
|
|
|
|
|
|
|
|
@ -151,7 +151,7 @@ The requirements of the project are detailed in the Success Criteria of the Proj
|
|
|
|
|
|
|
|
|
|
The three categories of success criteria can be summarised as follows. The success criteria, or must have elements, are to provide a multi-path proxy that is functional, secure and improves speed and resilience in specific cases. The extended goals, or should have elements, are focused on increasing the performance and flexibility of the solution. The stretch goals, or could have elements, are aimed at increasing performance by reducing overheads, and supporting IPv6 alongside IPv4.
|
|
|
|
|
|
|
|
|
|
Beyond the success criteria, a requirement of the software produced is platform compatibility. As the proxy is expected to run on networking hardware, platforms such as Windows and MacOS will not be supported. However, networking hardware runs a wide variety of operating systems. The testing process will run on Linux and FreeBSD, but the software should be designed in such a way that more operating systems could be supported with minimal difficulty.
|
|
|
|
|
Beyond the success criteria, I wanted to demonstrate the practicality of my software on prototypic networking equipment; therefore, continuous integration testing and evaluation will run on Linux and FreeBSD.
|
|
|
|
|
|
|
|
|
|
% ------------------------- Engineering Approach --------------------------- %
|
|
|
|
|
\section{Engineering Approach}
|
|
|
|
@ -159,9 +159,9 @@ Beyond the success criteria, a requirement of the software produced is platform
|
|
|
|
|
|
|
|
|
|
\subsubsection{Software Development Model}
|
|
|
|
|
|
|
|
|
|
The development of this software followed the agile methodology. Work was organised into weekly sprints, aiming for increased functionality in the software each time. By focusing on sufficient but not excessive planning, a minimum viable product was quickly established. From there, the remaining features could be implemented in the correct sized segments. Examples of these sprints are: initial build including configuration, TUN adapters and main program; TCP transport, enabling an end-to-end connection between the two parts; repeatable testing, providing the data to evaluate each iteration of the project against its success criteria; UDP transport for performance and control.
|
|
|
|
|
The development of this software followed the agile methodology. Work was organised into weekly sprints, each aiming to increase the functionality of the software. By focusing on sufficient but not excessive planning, a minimum viable product was quickly established. From there, the remaining features could be implemented in appropriately sized segments. Examples of these sprints are: initial build including configuration, TUN adaptors and main program; TCP transport, enabling an end-to-end connection between the two parts; repeatable testing, providing the data to evaluate each iteration of the project against its success criteria; UDP transport for performance and control.
|
|
|
|
|
|
|
|
|
|
One of the most important features of any agile methodology is welcoming changing requirements \citep{beck_manifesto_2001}. As the project grew, it became clear where shortcomings existed, and these could be fixed in very quick pull requests. An example is given in Figure \ref{fig:changing-requirements}, in which the type of a variable was changed from \mintinline{go}{string} to \mintinline{go}{func() string}. This allowed for lazy evaluation, when it became clear that configuring fixed IP addresses or DNS names could be impractical with certain setups. The static typing in the chosen language enables refactors like this to be completed with ease, particularly with the development tools mentioned in the next section, reducing the incidental complexity of the agile methodology.
|
|
|
|
|
The agile methodology welcomes changing requirements \citep{beck_manifesto_2001}. As the project grew, it became clear where shortcomings existed, and these could be fixed in quick pull requests. An example is given in Figure \ref{fig:changing-requirements}, in which the type of a variable was changed from \mintinline{go}{string} to \mintinline{go}{func() string}. This allowed for lazy evaluation, when it became clear that configuring fixed IP addresses or DNS names could be impractical. Static typing enables refactors like this to be completed with ease, particularly with the development tools mentioned in the next section, reducing the incidental complexity of the agile methodology.
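The general pattern behind a \mintinline{go}{string} to \mintinline{go}{func() string} change can be sketched as below. This is a hedged illustration rather than the project's code: the \mintinline{go}{Config} struct, its \mintinline{go}{RemoteHost} field, and the \mintinline{go}{Static} and \mintinline{go}{dial} helpers are all hypothetical names, chosen only to show how a fixed value and a lazily evaluated one share a single type.

\begin{minted}{go}
package main

import "fmt"

// Config is a hypothetical configuration struct for illustration only.
type Config struct {
	// Previously: RemoteHost string
	// Now evaluated each time it is needed, so the value may change.
	RemoteHost func() string
}

// Static wraps a fixed value, so configurations that do have a constant
// address keep working after the type change.
func Static(s string) func() string {
	return func() string { return s }
}

// dial stands in for code that needs the host at connection time.
func dial(c Config) string {
	return "connecting to " + c.RemoteHost()
}

func main() {
	// A fixed address, as before the change.
	fmt.Println(dial(Config{RemoteHost: Static("192.0.2.1")}))

	// A lazily evaluated address: the function runs on every call,
	// so each connection attempt can observe a different result.
	calls := 0
	lookup := func() string {
		calls++
		return fmt.Sprintf("host-%d.example", calls)
	}
	fmt.Println(dial(Config{RemoteHost: lookup}))
	fmt.Println(dial(Config{RemoteHost: lookup}))
}
\end{minted}

Because the field type changes, the compiler flags every site that still treats the value as a plain \mintinline{go}{string}, which is what makes such a refactor cheap in a statically typed language.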
|
|
|
|
|
|
|
|
|
|
\begin{figure}
|
|
|
|
|
\centering
|
|
|
|
@ -181,17 +181,17 @@ One of the most important features of any agile methodology is welcoming changin
|
|
|
|
|
|
|
|
|
|
\subsubsection{Development Tools}
|
|
|
|
|
|
|
|
|
|
A large part of the language choice focused on development tools. As discussed in Section \ref{section:language-selection}, IDE support is important for programming productivity. My preferred IDEs are those supplied by JetBrains,\footnote{\url{https://jetbrains.com/}} generously provided for education and academic research free of charge. As such, I used GoLand for the Go development of this project, IntelliJ for the Java evaluation development, and PyCharm for the Python evaluation program. Using an intelligent IDE, particularly with the statically typed Go and Java, can significantly increase programming productivity. They provide intelligent code suggestions and automated code generation for repetitive sections to reduce keystrokes, syntax highlighting for ease of reading, near-instant type checking without interaction, and many other features. Each reduces incidental complexity.
|
|
|
|
|
A large part of the language choice focused on development tools, particularly IDE support. I used GoLand (Go), IntelliJ (Java), and PyCharm (Python). Using intelligent IDEs, particularly with the statically typed Go and Java, significantly increases programming productivity. They provide code suggestions and automated code generation for repetitive sections to reduce keystrokes, syntax highlighting for ease of reading, near-instant type checking without interaction, and many other features. Each reduces incidental complexity.
|
|
|
|
|
|
|
|
|
|
I used Git version control, with a self-hosted Gitea\footnote{\url{https://gitea.com/}} server as the remote. The repository contains over 180 commits, committed at regular intervals while programming. My repositories have a multitude of on- and off-site backups at varying frequencies (Multiple Computers + Git Remote + NAS + 2xCloud + 2xUSB). The Git remote was updated with every commit, the NAS and Cloud providers daily, with one USB updated every time significant work was added and the other a few days after. Having some automated and some manual backups, along with a wide variety of backup locations, ensures that the potential data loss in the event of any failure is minimal. The backups are regularly checked for consistency, to ensure no data loss goes unnoticed.
|
|
|
|
|
I used Git version control, with a self-hosted Gitea\footnote{\url{https://gitea.com/}} server as the remote. The repository contains over 180 commits, committed at regular intervals while programming. I maintained several on- and off-site backups (Multiple Computers + Git Remote + NAS + 2xCloud + 2xUSB). The Git remote was updated with every commit, the NAS and Cloud providers daily, with one USB updated every time significant work was added and the other a few days after. Having some automated and some manual backups, along with a variety of backup locations, minimises any potential data loss in the event of any failure. The backups are regularly checked for consistency, to ensure no data loss goes unnoticed.
|
|
|
|
|
|
|
|
|
|
Alongside my self-hosted Gitea server, I have a self-hosted Drone\footnote{\url{http://drone.io/}} server for continuous integration. This made it simple to add a Drone file to the repository, allowing for the Go tests to be run, formatting verified, and artefacts built. On a push, after the verification, each artefact is built and uploaded to a central repository, where it is saved under the branch name. This is particularly useful for automated testing, as the relevant artefact can be downloaded automatically from a known location for the branch under test. Further, artefacts are built for multiple architectures, particularly useful when performing real world testing spread between \texttt{AMD64} and \texttt{ARM64} architectures.
|
|
|
|
|
Alongside my Gitea server, I have a self-hosted Drone\footnote{\url{http://drone.io/}} server for continuous integration: running Go tests, verifying formatting, and building artefacts. On a push, after verification, each artefact is built, uploaded to a central repository, and saved under the branch name. This dovetailed with my automated testing, which downloaded the relevant artefact automatically for the branch under test. I also built artefacts for multiple architectures to support real-world testing on \texttt{AMD64} and \texttt{ARM64} architectures.
|
|
|
|
|
|
|
|
|
|
Continuous integration and Git are used in tandem to ensure that all code in a pull request meets certain standards. By ensuring that tests are automatically run before merging, all code that is merged must be formatted correctly and able to pass the tests. This removes the possibility of accidentally causing an already tested for regression to occur during a merge by forgetting to run the tests. Pull requests also provide an opportunity to review submitted code, even with the same set of eyes, in an attempt to detect any glaring errors. Twenty-four pull requests were submitted to the repository for this project.
|
|
|
|
|
Continuous integration and Git are used in tandem to ensure that each pull request meets certain standards before merging, reducing the possibility of accidentally causing performance regressions. Pull requests also provide an opportunity to review submitted code, even with the same set of eyes, in an attempt to detect any glaring errors. Twenty-four pull requests were submitted to the repository for this project.
|
|
|
|
|
|
|
|
|
|
\subsubsection{Licensing}
|
|
|
|
|
|
|
|
|
|
I have chosen to license this software under the MIT license. The MIT license is simple and permissive, enabling reuse and modification of the code, subject to including the license. Alongside the hopes that the code will receive updated pull requests over time, a permissive license allows others to build upon the given solution. A potential example of a solution that could build from this is a company employing a Software as a Service (SaaS) model to configure a remote proxy on your behalf, perhaps including the hardware required to convert this fairly involved solution into a plug-and-play option.
|
|
|
|
|
I chose to license this software under the MIT license, which is simple and permissive.
|
|
|
|
|
|
|
|
|
|
% ---------------------------- Starting Point ------------------------------ %
|
|
|
|
|
\section{Starting Point}
|
|
|
|
|