commit 1041ab6d874d05072aac47d9a009f09fad90802f Author: Jake Hillion Date: Tue Oct 20 09:41:24 2020 +0100 First overseer draft diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..d3eb8c8 --- /dev/null +++ b/.gitignore @@ -0,0 +1,4 @@ +project-proposal-annotated.pdf +project-proposal.aux +project-proposal.log +project-proposal.pdf diff --git a/cover-annotated.pdf b/cover-annotated.pdf new file mode 100644 index 0000000..b14154d Binary files /dev/null and b/cover-annotated.pdf differ diff --git a/cover.pdf b/cover.pdf new file mode 100644 index 0000000..f0309cd Binary files /dev/null and b/cover.pdf differ diff --git a/phase1.txt b/phase1.txt new file mode 100644 index 0000000..3b34d2e --- /dev/null +++ b/phase1.txt @@ -0,0 +1,46 @@ +Subject: Phase 1 - Hillion: A Multi-Path Bidirectional Layer 3 Proxy + +Phase 1 Project Selection Status Report + +Name: Jake Hillion + +College: Queens' + +User Identifier: jsh77 + +Director of Studies: Neil Lawrence + +Please complete 1, 2 and 3 below. + +1. Please write 100 words on your current project ideas. + +The project aims to use multiple, heterogeneous, congestion controlled network connections to a single high bandwidth server, creating a single virtual link between the client and wider network, which can make efficient use of the combined bandwidth of the individual connections. Congestion control will allow the system to adapt to changing network conditions, for example with wireless links. It will also allow failover without a visible change of IP address to peers, making the failover transparent to higher layers of the communications stack. + +2. Please list names of potential project supervisors. + +Mike Dodson + +3. 
Is there any chance that your project will involve any +computing resources other than the Computing Service's MCS and +software that is already installed there, for example: your own +machine, machines in College, special peripherals, imported +software packages, special hardware, network access, substantial +extra disc space on the MCS. + +If so indicate below what, and what it is needed for. + +Personal Computer (AMD R9 3950X, 16GB RAM) +Personal Laptop (Intel i7-8550U, 16GB RAM) + +Used for development without requiring the lab. Testing this application will require extended capabilities, which would not be readily available on shared systems. + +Virtualisation Server (2x Intel Xeon X5667, 12GB RAM) +Backup Virtualisation Server (2x Intel Xeon X5570, 48GB RAM) + +A virtualisation server allows controlled testing of the application, without any packets leaving the physical interfaces of the server. +If both of these servers were to fail, it is possible to perform testing with a cloud provider, albeit in a less controlled environment. + +I accept full responsibility for the above 4 machines and I have made contingency plans to protect myself against hardware and/or software failure. + +Go(Lang) code written will use a version later than that available on the MCS, as the version currently on the MCS (1.10) does not support Go Modules. +Rust is not available on the MCS at the time of writing. diff --git a/project-proposal.tex b/project-proposal.tex new file mode 100644 index 0000000..c308f5a --- /dev/null +++ b/project-proposal.tex @@ -0,0 +1,303 @@ +% Document Setup +\documentclass[12pt]{article} +\usepackage[a4paper, margin=15mm, tmargin=1in]{geometry} + +\usepackage{pdfpages} +\usepackage{graphicx} + +\begin{document} +\includepdf{cover.pdf} + +\section*{Introduction and Description of the Work} + +This project aims to provide a method of combining Internet connections without modifying existing devices or infrastructure. 
Rather, by inserting a Local Server to consolidate multiple, heterogeneous connections and a Remote Server to act as a common connection to the wider network, both speed and resilience improvements may be possible. While there are existing solutions that combine multiple connections, they prioritise one at the expense of the other; this project will attempt to show that this trade-off can be avoided. + +The resilience focus of this software should allow a TCP flow\footnote{The term flow is used slightly outside its normal meaning in this document. It describes an uninterrupted channel of communication between two devices.} or UDP flow to continue, given all but one of the connections between the Local Server and the Remote Server being lost. As an example, this would allow a SIP call to continue without a redial. The speed focus is achieved by providing a single virtual connection that aggregates the speed of all connections. An example of this is allowing a single flow video stream to exceed the bandwidth available on a single connection. Further, the system is integrated in such a way that the details of the load balancing are hidden from both the applications behind the Local Server and peers on the wider network. + +This system is useful in areas where multiple low bandwidth connections are available, but not a single higher bandwidth connection. This is often the case in rural areas in the UK. It will also be useful in areas with diverse connections of varying reliability, such as a home with both DSL and wireless connections, which may become more common with the advent of 5G and LEO systems such as Starlink. The lack of requirement for vendor support allows for this mixture of connections to be supported. + +Some existing attempts to solve these problems, and the shortfalls of each solution, are summarised below: + +\begin{itemize} + + \item Failover: All existing flows must be restarted when failover occurs. There is no speed benefit over having a single connection. 
+ + \item Session Based Load Balancing: All flows on a failed connection must be restarted. Speed benefit varies between applications, but is excellent in ideal circumstances. This solution is less effective when parameters of the connections vary with time, as with wireless connections. Further, advanced policies can be required on an application level to achieve the best speed. + + \item Application Support: Many modern protocols that are designed with mobile devices in mind can already handle IP changes (e.g. switching from WiFi to 4G). This allows these applications to handle situations such as Failover (above), as they treat it like any other network change. The downside of requiring application support is older protocols, such as SIP, for which resilience needs to be gained at a higher level. + + \item MultiPath TCP: MPTCP works best with multiple interfaces on each device that is using it, e.g. a 4G and WiFi connection on a mobile device. This is due to a device on a NAT with access to two WAN connections having no direct knowledge of this. It also requires support on both ends, which isn't common yet (MPTCP is not yet mainlined in the Linux kernel). Further, many modern applications are moving away from TCP in favour of lighter UDP protocols, which wouldn't benefit from MPTCP support. + + \item OpenVPN over MultiPath TCP: This allows both non-TCP based protocols, and clients that don't support MPTCP to benefit (if it's implemented network wide). Head of line blocking becomes more of an issue when passing multiple entirely different applications over a VPN, as any application can block any other. OpenVPN also adds a lot of unnecessary overhead if a network wide VPN would not otherwise be used. + +\end{itemize} + +By providing congestion control over each interface and therefore being able to share packets without bias between connections, this project should provide a superior solution for load balancing across heterogeneous and volatile network connections. 
An example of a client using this is shown in Figure \ref{fig:sample-network}. It's worth noting that this solution is highly flexible, allowing the client to be a NAT Router with more devices behind it, or the flows from the Local Server to the Remote Server being tunnelled over a VPN. + +\begin{figure} + \small + \begin{verbatim} + Uncontrolled + On Premises Off Premises Devices ++-------------------------------------------------+ +----------+ +----------+ + + +----------+ + | Web | + +->+ Server | + +----------+ via | | | ++-------------------------------+ | | ISP A | +----------+ +| | +-->+ Modem A +--+ | +| +----------+ +----------+ | | | | | +----------+ | +----------+ +| | | | Local +----+ +----------+ +---->+ Remote +--+ | VoIP | +| | Client +---->+ Server | | | Server |---->+ Server | +| | | | +----+ +----------+ +---->+ +--+ | | +| +----------+ +----------+ | | | | | +----------+ | +----------+ +| | +-->+ Modem B +--+ | ++-------------------------------+ | | via | +----------+ + +----------+ ISP B | |Corporate | + +->+ VPN | + | | + +----------+ + \end{verbatim} + \caption{A network applying this proxy} + \label{fig:sample-network} +\end{figure} + +\section*{Starting Point} + +I have spent some time looking into the shortfalls and benefits of the available methods for combining multiple Internet connections. The Part IB course \emph{Computer Networking} has provided the background information for this project. I have significant experience with Go, though none with lower level networking. I have no experience with Rust, and my C++ experience is limited to the Part IB course \emph{Programming in C and C++}. + +I have some knowledge of the Wireguard project, though as a user instead of a programmer. I intend for this to inspire the interaction between the user and the project, though this will not be in the form of code. 
+ +\section*{Substance and Structure of the Project} + +The system will involve load balancing multiple congestion controlled flows between the Local Server and the Remote Server. The Local Server will receive packets from the client, and use load balancing and congestion control algorithms to send individual packets along one of the multiple available connections to the Remote Server, which will extract the original packets and forward them along a high bandwidth connection to the wider network. + +To achieve this congestion control, I will initially use TCP flows, which include congestion control. However, TCP also provides other guarantees, which will not benefit this task. For this reason, the application should be structured in such a way that it can support alternative protocols to TCP. An improved alternative is using UDP datagrams with a custom congestion control protocol, that only guarantees congestion control as opposed to packet delivery. Another alternative solution would be a custom IP packet with modified source and destination addresses and a custom preamble. Having a variety of techniques available would be very useful, as each of these has less overhead than the last, while also being less likely to work with more complicated network setups. + +When the Local Server has a packet it wishes to send outbound, it will place the packet and some additional security data in a queue. The multiple congestion controlled links will each be consuming from this queue when they are not congested. This will cause greedy load balancing, where each connection takes all that it can get from the packet queue. As congestion control algorithms adapt to the present network conditions, this load balancing will alter the balance between links as the capacity of each link changes. + +To make integration of this solution as simple as possible, the Local Server will endeavour to provide a DHCP server on the client interface. 
This will allow the client to automatically configure its IP, as is often the case with an ISP. + +Security is an important consideration in this project. Creating a multipath connection and proxies in general can create additional attack vectors, so I will perform a review of some existing security literature for each of these. However, as the tunnel created here transports entire IP packets, any security added by the application or transport layer will be maintained by my solution. + +The structure of the Wireguard project is also a good fit for this project. The elements are presented as follows: + +\begin{itemize} + + \item To manage the tunnel, a C kernel codebase and a Go user space codebase. + + \item Existing $ip(8)$ and $ifconfig(8)$ tools for the configuration that they can manage. + + \item A $wg(8)$ tool for configuration that can't be handled by existing tools. + + \item A $wg-quick(8)$ tool for persistent configuration. + +\end{itemize} + +Although I only plan to implement a user space codebase as part of this project, I will endeavour to produce the three parts listed above. That is, allowing all configuration that can be handled by the existing tools $ip(8)$ or $ifconfig(8)$ to be completed by them, an additional tool for bespoke configuration elements, and a separate script that uses both of these for persistent configuration. + +Examples are provided showing the path of a packet with standard session based load balancing, and with this solution applied: + +\subsubsection*{Session Based Load Balancing} +A sample network is provided in Figure \ref{fig:sample-network-session-based}. + +\begin{enumerate} + \item NAT Router receives the packet from the client. + \item NAT Router uses packet details and Layer 4 knowledge in an attempt to find an established connection. If there is an established connection, the NAT Router allocates this packet to that WAN interface. Else, it selects one using a defined load balancing algorithm. 
+ \item NAT Router masquerades the source IP of the packet as that of the selected WAN interface. + \item NAT Router dispatches the packet via the chosen WAN interface. + \item Destination server receives the packet. +\end{enumerate} + +\begin{figure} + \small + \begin{verbatim} + Uncontrolled + On Premises Devices ++-------------------------------------------------+ +----------+ + + +----------+ + | Web | + +-->+ Server | + +----------+ | | | ++-------------------------------+ | +--+ +----------+ +| | +-->+ Modem A | +| +----------+ +----------+ | | | +--+ +----------+ +| | | | NAT +----+ +----------+ | | VoIP | +| | Client +---->+ Router | | +-->+ Server | +| | | | +----+ +----------+ | | +| +----------+ +----------+ | | | | +----------+ +| | +-->+ Modem B +--+ ++-------------------------------+ | | | +----------+ + +----------+ | |Corporate | + +-->+ VPN | + | | + +----------+ + \end{verbatim} + \caption{A network with a NAT Router and two modems} + \label{fig:sample-network-session-based} +\end{figure} + +\subsubsection*{This Solution} +A sample network is provided in Figure \ref{fig:sample-network}. + +\begin{enumerate} + \item Local Server receives the packet from the client. + \item Local Server wraps the packet with additional information. + \item Local Server sends the wrapped packet along whichever connection has available capacity. + \item Wrapped packet travels across the Internet to the Remote Server. + \item Remote Server receives the packet. + \item Remote Server dispatches the unwrapped packet via its high speed WAN interface. + \item Destination receives the packet. +\end{enumerate} + +\pagebreak +\section*{Success Criteria} +\begin{enumerate} + + \item Allow either a TCP flow or a UDP flow to continue if one or more (but not all) of the connections between the Local Server and the Remote Server are lost. + + \item Any and all performance gains stated below should function bidirectionally (inbound/outbound to/from the client). 
+ + \item Allow the network client behind the main client to treat its IP address on the link to the Local Server as the IP of the Remote Server. + + \item Provide security that is no worse than not using this solution at all. + + \item Demonstrate that more bandwidth is available over two connections of equal bandwidth with this solution than is available over one connection without. + + \item Demonstrate that a flow can be maintained over two connections of equal bandwidth with this solution if one of the connections becomes unavailable. + + \item Provide full support for both IPv4 and IPv6. This includes reaching the Remote Server over IPv6 but proxying IPv4 packets, and vice versa. + +\end{enumerate} + +\subsection*{Extended Goals} +\begin{enumerate} + + \item Demonstrate that more bandwidth is available over two connections of unequal bandwidth than is available over two connections of equal bandwidth, where the equal bandwidth is the lower of the two unequal bandwidths. + + \item Demonstrate that more bandwidth is available over four connections of equal bandwidth than is available over three connections of equal bandwidth. + + \item Demonstrate that if the bandwidth of one of two connections increases/decreases, the bandwidth available adapts accordingly. + + \item Demonstrate that if one of two connections is lost and then regained, the bandwidth available reaches the levels of before the connection was lost. + + \item My initial design requires the Remote Server to have two interfaces: one for communicating with the Local Server, and one for communicating with the wider network. This criterion is achieved by supporting both of these actions over one interface. + + \item Support a metric value for connections, such that connections with higher metrics are only used for load balancing if no connection with a lower metric is available. 
+ +\end{enumerate} + +\subsection*{Stretch Goals} +\begin{enumerate} + + \item Provide a UDP based solution of tunnelling the IP packets which exceeds the performance of the TCP solution in the above bandwidth tests. + + \item Provide an IP based solution of forwarding the IP packets which exceeds the performance of the UDP solution in the above bandwidth tests. + +\end{enumerate} + +\pagebreak +\section*{Timetable and Milestones} + +\subsection*{12/10/2020 - 1/11/2020} + +Study Go, Rust and C++'s abilities to read all packets from an interface and place them into some form of concurrent queue. Research the positives and negatives of each language's SPMC and MPSC queues. + +\noindent \\ +Milestone: Example programs in each language that read all packets from a specific interface and place them into a queue, or a reason why this isn't feasible. A decision of which language to use for the rest of the project, based on these code segments and the status of SPMC queues in the language. + +\subsection*{02/11/2020 - 15/11/2020} + +Set up the infrastructure to effectively test any produced work from this point onwards. + +\noindent \\ +Milestone: A virtual router acting as a virtual Internet for these tests. 3 standard VMs below this level for each: the Local Server, the Remote Server and a speed test server to host iPerf3. Behind the Local Server should be another virtual machine, acting as the client to test the speed from. Backups of this setup should also have been made. + +\subsection*{16/11/2020 - 29/11/2020} + +This section should focus on the security of the application. This would include the ability for someone to maliciously use a Remote Server to perform a DoS attack. + +\noindent \\ +Milestone: An analysis of how the security of this solution compares, both with other multipath solutions and a network without any multipath solution applied. 
+ +\subsection*{30/11/2020 - 20/12/2020} + +Implementation of the transport aspect of the Local Server and Remote Server. The first data structure for transport should also be created. This does not include the load sharing between connections - it is for a single connection. To enable testing, this will also require the setup of configuration options for each side. At this stage, it would be reasonable for the Remote Server to require two different IPs - one for server communication, and one as the public IP of the Local Router. The initial implementation should use TCP, but if time is available, UDP with a custom datagram should be explored for reduced overhead. + +\noindent \\ +Milestone: A piece of software that can act either as the Local Server or Remote Server based on configuration. Any IP packets sent to the Local Server should emerge from the Remote Server. + +\subsection*{21/12/2020 - 10/01/2021} + +Create mock connections for tests that support variable speeds, a list of packet numbers to lose and a packet count after which to stop handling packets. Produce the first draft of the preparation chapter. + +\noindent \\ +Milestone: Mock connections and tests for the existing single transport. A draft of the preparation chapter. + +\subsection*{11/01/2021 - 07/02/2021} + +Implement the load balancing between multiple connections for both servers. At this point, connection losses should be tested too. + +The progress report is due soon after this work segment, so it should be completed in this period. + +\noindent \\ +Milestone: The Local Server and Remote Server are capable of balancing load between multiple connections. They can also suffer a network failure of all but one connection with minimal packet loss. The progress report should be prepared. + +\subsection*{08/02/2021 - 21/02/2021} + +Implementation of a DHCP server for the Local Server. This should create a DHCP pool with a single address, that of the Remote IP. 
+ +As well as code in this period, it is important to complete the preparation chapter, and complete a solid draft of the implementation chapter of the dissertation. + +\noindent \\ +Milestone: The router behind the Local Server should automatically receive the Public IP of the Remote Server from DHCP. The dissertation should have a completed preparation chapter, and an acceptably drafted implementation chapter. + +\subsection*{22/02/2021 - 21/03/2021} + +Complete dissertation. + +\noindent \\ +Milestone: Benchmarks and graphs for non-extended success criteria complete and added. Complete dissertation draft handed to DoS and supervisor for feedback. + +\subsection*{22/03/2021 - 25/04/2021} + +Flexible time: divide between re-drafting dissertation and adding additional extended success criteria features, with priority given to re-drafting the dissertation. + +\noindent \\ +Milestone: A finished dissertation and any extended success criteria that have been completed. + +\subsection*{26/04/2021 - 09/05/2021} + +New additions freeze. Nothing new should be added to either the dissertation or code at this point. + +\noindent \\ +Milestone: Bug fixes and polishing. + +\subsection*{10/05/2021 - 14/05/2021} + +The project should already be submitted a week clear of the deadline, so this week has no planned activity. + +\section*{Resources Required} + +\begin{itemize} + \item Personal Computer (AMD R9 3950X, 16GB RAM) + \item Personal Laptop (Intel i7-8550U, 16GB RAM) +\end{itemize} + +Used for development without requiring the lab. Testing this application will require extended capabilities, which would not be readily available on shared systems. + +\begin{itemize} + \item Virtualisation Server (2x Intel Xeon X5667, 12GB RAM) + \item Backup Virtualisation Server (2x Intel Xeon X5570, 48GB RAM) +\end{itemize} + +A virtualisation server allows controlled testing of the application, without any packets leaving the physical interfaces of the server. 
+ +I accept full responsibility for the above 4 machines and I have made contingency plans to protect myself against hardware and/or software failure. All resources will be backed up according to the 3-2-1 rule. This would allow me to migrate development and/or testing to the cloud if needed. + +Go(Lang) code written will use a version later than that available on the MCS, as the version currently on the MCS (1.10) does not support Go Modules. +Rust is not available on the MCS at the time of writing. This can be managed by using personal machines or cloud machines accessed via the MCS. + +\end{document}