%!TEX root = ../thesis.tex
%*******************************************************************************
%*********************************** First Chapter *****************************
%*******************************************************************************
\chapter{Introduction} %Title of the First Chapter
\ifpdf
\graphicspath{{1_Introduction/Figs/Raster/}{1_Introduction/Figs/PDF/}{1_Introduction/Figs/}}
\else
\graphicspath{{1_Introduction/Figs/Vector/}{1_Introduction/Figs/}}
\fi
The advertised broadband download speed of most UK residences lies between 30Mbps and 100Mbps \citep{ofcom_performance_2020}, which is often the highest speed available. However, in many cases, additional connections can be installed, at a cost that scales linearly with the number of connections. More generally, a wider variety of Internet connections for fixed locations is becoming available with time. These include: DSL, Fibre To The Premises, 4G, 5G, Wireless ISPs such as LARIAT\footnote{\url{http://lariat.net}} and Low Earth Orbit ISPs such as Starlink.\footnote{\url{https://starlink.com}}
Though multiple low-bandwidth, low-cost connections may be accessible, no mechanism exists to combine them into a single high-speed, highly available connection for the user. This work focuses on providing such a mechanism: taking multiple, distinct connections and providing a single, aggregate connection via a proxy.
Using a proxy to combine connections provides three significant benefits: immediate failover of a single flow, exceeding the bandwidth of each individual connection with a single flow, and balancing of inbound connections. For failover, this means that a flow between a user of this proxy and an external user, such as a SIP call, is maintained when one Internet connection is lost. Exceeding the bandwidth of a single connection means that an application which utilises a single flow can take advantage of higher bandwidth than is available over any one connection. This is useful in cases such as a CCTV system, where the increased bandwidth allows a live stream from a camera to be viewed remotely at a higher resolution. Finally, although methods such as load balancing routers can balance outgoing flows effectively in many cases, inbound flows cannot be handled so simply; balancing them requires complex solutions that rely on client support. The proxy presented here returns control to the network architect, and hides the complexity from the client and server on either side of the proxy, providing a practical mechanism for obtaining all three benefits.
\section{Existing Work}
Three pieces of existing work that will be examined for their usefulness are MultiPath TCP (MPTCP), Wireguard, and Cloudflare. Multipath TCP is an effort to expand TCP (Transmission Control Protocol) connections to multiple paths, and is implemented at the kernel layer such that applications which already use TCP can immediately take advantage of the multipath benefits. Wireguard is a state-of-the-art Virtual Private Network (VPN), providing an excellent example for transmitting packets securely over the Internet. Finally, Cloudflare shows examples of how a high bandwidth network can be used to supplement multiple smaller networks, but in a different context to this project. This section focuses on how these examples do not satisfy the aims of this project, and on the useful initial steps and considerations they provide.
\subsection{MultiPath TCP (MPTCP)}
MultiPath TCP \citep{handley_tcp_2020} is an extension to the regular Transmission Control Protocol, allowing for the creation of subflows. MultiPath TCP was designed with two purposes: increasing resiliency and throughput for multi-homed mobile devices, and providing multi-homed servers with better control over balancing flows between their interfaces. Initially, MultiPath TCP seems like a solution to the aims of this project. However, it falls short for three reasons: the rise of User Datagram Protocol (UDP) -based protocols, device knowledge of interfaces, and legacy devices.
Although many UDP-based protocols have been around for a long time, replacing TCP-based application protocols with UDP-based ones is a newer effort. An example of an older UDP-based protocol is SIP \citep{schooler_sip_2002}, still widely used for VoIP calls, which would benefit particularly from increased resilience to single Internet connection outages. Among the more recent UDP-based protocols intended to replace a TCP-based protocol, HTTP/3 \citep{bishop_hypertext_2021}, also known as HTTP-over-QUIC, is one of the most significant. HTTP/3 is enabled by default in Google Chrome \citep{govindan_enabling_2020} and its derivatives, soon to be enabled by default in Mozilla Firefox \citep{damjanovic_quic_2021}, and available behind an experimental flag in Apple's Safari \citep{kinnear_boost_2020}. Previously, HTTP requests have been sent over TCP connections, but HTTP/3 switches this to a UDP-based protocol. As such, HTTP requests are moving away from benefiting from MPTCP.
Secondly, devices using MPTCP must have knowledge of their network infrastructure. Consider the example of a phone with a WiFi and 4G interface reaching out to a voice assistant. The phone in this case can utilise MPTCP effectively, as it has knowledge of both Internet connections, and it can create subflows appropriately. However, consider instead a tablet with only a WiFi interface, but behind a router with two Wide Area Network (WAN) interfaces that is using Network Address Translation (NAT). In this case, the tablet will believe that it only has one connection to the Internet, while actually being able to take advantage of two. This is a problem that is difficult to solve at the client level, suggesting that solving the problem of combining multiple Internet connections is better suited to network infrastructure.
Finally, it is important to remember legacy devices. Often, these legacy devices will benefit the most from resilience improvements, and they are the least likely to receive updates to new networking technologies such as MPTCP. Although MPTCP can still provide a significant balancing benefit to the servers to which legacy devices connect, the legacy devices see little benefit from the availability of multiple connections. In contrast, providing an infrastructure-level solution, such as the proxy presented here, benefits all devices behind it equally, regardless of their legacy status.
\subsection{Wireguard}
Wireguard \citep{donenfeld_wireguard_2017} is a state-of-the-art VPN solution. Though Wireguard does not serve to combine multiple network connections, it is widely considered an excellent method of transmitting packets securely via the Internet, demonstrated by its inclusion in the Linux kernel \citep{torvalds_linux_2020}, use by commercial providers of overlay networks \citep{pennarun_how_2020}, a security audit \citep{donenfeld_wireguard_2020}, and ongoing efforts for formal verification \citep{donenfeld_formal_nodate,preneel_cryptographic_2018}.
For each Layer 3 packet that Wireguard transports, it generates and sends a single UDP datagram. This is a pattern that will be followed in the UDP implementation of my software. These UDP packets present many of the same challenges that will occur in my software, such as vulnerability to replay attacks, so the ways in which the Wireguard implementation overcomes these challenges will be considered throughout. Finally, Wireguard provides an implementation in Go, which will be a useful reference for the Layer 3 networking in Go used in my project.
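To illustrate the replay-attack defence mentioned above, the following is a simplified sketch in Go of a sliding-window anti-replay check, in the style of the algorithm Wireguard inherits from IPsec (RFC 6479). The \texttt{ReplayFilter} type, the 64-entry window, and the \texttt{Check} method are illustrative assumptions for exposition, not Wireguard's actual implementation, which uses a larger window and per-session state.

```go
package main

import "fmt"

// windowSize is the number of recent counters tracked.
// 64 fits in one word; real implementations use larger windows.
const windowSize = 64

// ReplayFilter rejects datagrams whose counter has already been
// accepted, or which fall behind the sliding window entirely.
type ReplayFilter struct {
	last   uint64 // highest counter accepted so far
	bitmap uint64 // bit i set => counter (last - i) was seen
}

// Check returns true if counter is fresh, marking it as seen.
func (f *ReplayFilter) Check(counter uint64) bool {
	switch {
	case counter > f.last:
		// Counter advances the window: shift the bitmap forward.
		shift := counter - f.last
		if shift >= windowSize {
			f.bitmap = 0 // whole window skipped
		} else {
			f.bitmap <<= shift
		}
		f.bitmap |= 1 // mark the new counter itself
		f.last = counter
		return true
	case f.last-counter >= windowSize:
		// Too old: fell off the back of the window.
		return false
	default:
		bit := uint64(1) << (f.last - counter)
		if f.bitmap&bit != 0 {
			return false // replay detected
		}
		f.bitmap |= bit // in-window, first sighting
		return true
	}
}

func main() {
	var f ReplayFilter
	fmt.Println(f.Check(1)) // true: first time this counter is seen
	fmt.Println(f.Check(2)) // true: counter advances the window
	fmt.Println(f.Check(2)) // false: replay detected
}
```

Because each encrypted datagram carries a monotonically increasing counter, a captured datagram resent by an attacker fails this check even if it arrives out of order, provided it is not older than the window.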
\subsection{Cloudflare}
Cloudflare is a company that uses a global network of servers to provide a variety of infrastructure products, mostly pertaining to websites and security \citep{cloudflare_cloudflare_nodate}. Two of the products offered by Cloudflare are of particular interest to this project: load balancing and Magic WAN.
Cloudflare provides the option to proxy HTTPS traffic via their global network of servers to your origin server. This layer 7 (application layer) proxy operates on the level of proxying HTTP requests themselves, and can take advantage of its knowledge of connections to provide effective load balancing between origin servers. Similarly to my project, Cloudflare can use knowledge of the responsiveness of origin servers to alter the load balancing. This is a similar use case to my proxy, where items (HTTP requests / IP packets) hit one high bandwidth server (one of Cloudflare's edge servers / the remote proxy), and this server decides the path through which to proxy the item (a chosen origin server / which connection to the local proxy).
Unlike Cloudflare load balancing, the proxy presented in this work operates on layer 3. Cloudflare receives a stream of HTTPS requests and forwards each on to a chosen origin server, while my remote proxy receives a stream of IP packets and forwards them via a chosen path to my local proxy. Though these achieve different goals, Cloudflare load balancing provides an example of using a high bandwidth edge server to manage balancing between multiple lower bandwidth endpoints.
The second product of Cloudflare's that shows some similarity to my project is Magic WAN. Cloudflare Magic WAN provides a fully software defined WAN over their global network. That is, their anycast infrastructure will accept traffic to your network at any of the edge servers in their global infrastructure before forwarding it to you. This provides some significant benefits, such as DDoS mitigation and firewalling at a far higher capacity than on your origin servers. When a DDoS attack occurs, or connections violate firewall policies, the offending traffic is cut off at the Cloudflare edge, without ever reaching the limited bandwidth of your local system.
Magic WAN demonstrates that there can be security benefits to moving your network edge to the cloud. By configuring the edge to block bad traffic, the limited-bandwidth connections at your origin are protected. It further demonstrates that WAN-as-a-Service, the class of product to which my proxy belongs, is possible at a large scale.
Though neither of these Cloudflare products solve the aims of my proxy, specifically the multipath problem, they show how cloud infrastructure can be leveraged to support the Internet connections of services in different capacities.
\section{Aims}
This project aims to produce proxy software that uses congestion control to manage transporting packets across multiple paths of data flow, including across discrete Internet connections. When combining Internet connections, there are three main measures that one can prioritise: throughput, resilience, and latency. This project aims to improve throughput and resilience at the cost of latency. By using a layer 3 proxy for entire IP packets, connections are combined in a way that is transparent to devices on both sides of the proxy, overcoming the throughput and availability limitations of each individual connection. The basic structure of this proxy system is shown in Figure \ref{fig:proxy-components}.
\begin{figure}
\centering
\begin{tikzpicture}
\draw (0.5, 3.5) node {Local Proxy};
\draw (0,0.14) rectangle (1,3.14);
\draw (5.5, 3.5) node {Remote Proxy};
\draw (5,0.14) rectangle (6,3.14);
\draw (3, 2.8) node {ISP A};
\draw [<->] (1, 2.5) -- (5, 2.5);
\draw (3, 1.8) node {ISP B};
\draw [<->] (1, 1.5) -- (5, 1.5);
\draw (3, 0.8) node {ISP C};
\draw [<->] (1, 0.5) -- (5, 0.5);
\draw (-1.7, 1.5) node {Client};
\draw [<->] (-1, 1.5) -- (0, 1.5);
\draw (7.9, 1.5) node {Internet};
\draw [<->] (6, 1.5) -- (7, 1.5);
\end{tikzpicture}
\caption{The basic components of this proxy.}
\label{fig:proxy-components}
\end{figure}
The approach presented in this work achieves throughput superior to a single connection by using congestion control to split packets appropriately between each available connection. Further, resilience increases: a connection loss results in decreased throughput, but does not lose any connection state. Latency, however, increases, as packets must travel via a proxy server. Fortunately, the wide availability of well-peered cloud servers allows for this latency increase to be kept minimal, affecting only the most latency sensitive applications.
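The splitting of packets between connections described above can be sketched in Go as a per-packet scheduling decision driven by congestion-control state. The \texttt{path} struct, its \texttt{cwnd} and \texttt{inFlight} fields, and the greedy \texttt{pickPath} policy are illustrative assumptions for exposition; a real implementation would update these values from acknowledgements on each connection.

```go
package main

import "fmt"

// path models one Internet connection between the two proxies.
// cwnd and inFlight are in bytes; these fields are illustrative,
// and would be maintained by the congestion controller in practice.
type path struct {
	name     string
	cwnd     int // congestion window granted by congestion control
	inFlight int // bytes sent but not yet acknowledged
}

// available returns how many more bytes this path may carry now.
func (p *path) available() int {
	if p.inFlight >= p.cwnd {
		return 0
	}
	return p.cwnd - p.inFlight
}

// pickPath chooses the path with the most spare congestion window,
// so packets naturally split in proportion to each connection's
// capacity. It returns nil if every path is congestion-limited.
func pickPath(paths []*path) *path {
	var best *path
	for _, p := range paths {
		if p.available() > 0 && (best == nil || p.available() > best.available()) {
			best = p
		}
	}
	return best
}

func main() {
	paths := []*path{
		{name: "ISP A", cwnd: 30000, inFlight: 28000},
		{name: "ISP B", cwnd: 80000, inFlight: 20000},
		{name: "ISP C", cwnd: 50000, inFlight: 49000},
	}
	packet := make([]byte, 1400) // one IP packet to forward
	if p := pickPath(paths); p != nil {
		p.inFlight += len(packet) // account for the send
		fmt.Println("sending via", p.name)
	}
}
```

Because the congestion window of each path grows and shrinks with that connection's measured capacity, a greedy policy of this shape distributes packets across the connections without any per-path configuration, and a failed connection simply stops receiving packets as its window collapses.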