Update on Overleaf.

This commit is contained in:
jsh77 2021-05-04 07:13:13 +00:00 committed by overleaf
parent 3d81a24271
commit 96191a5156
2 changed files with 46 additions and 91 deletions


@ -19,21 +19,13 @@ Proxying packets is the process of taking packets that arrive at one location an
\begin{figure}
\centering
\includegraphics{2_Preparation/Figs/security-zones.png}
\caption{A summary of the three different transportation zones in this proxy.}
\label{fig:security-zones}
\end{figure}
Any connection between two computers presents a set of security risks. A proxied connection carries both these risks and additional ones, which I present and discuss in this section. The first consideration is layered security: the Local Portal, the Remote Portal and everything in between are together viewed as a single Internet connection, and the question is how the risks of this composite connection compare to those of a standard Internet connection, and what guarantees must be provided so that a proxied connection carries no more risk than a standard one.
The second consideration is the connection between the Local Portal and the Remote Portal itself, primarily the risk of accepting and forwarding a packet that was not intended, or of sending packets to an unintended recipient.
These security problems will be considered in the context of the success criterion of providing security no worse than a standard connection. That is, in the first case the security should be equal to or stronger than that of a standard connection, and in the second case the portal to portal connection should provide no additional vectors of attack.
\subsection{Higher Layer Security}
This application proxies entire IP packets, so it is a layer 3 solution. As such, the goal is to maintain the same guarantees that one would normally expect at layer 3, for higher layers to build upon. At layer 3, none of anonymity, integrity, privacy or freshness are guaranteed, so it is up to each application to provide its own security guarantees. Maintaining the same level of security for applications can therefore be achieved by ensuring that the packets which leave one side of the proxy are a subset of the packets that entered the other side.
This ensures that guarantees managed by layers above layer 3 are maintained. Regardless of whether a user is accessing insecure websites over HTTP, running a corporate VPN connection or sending encrypted emails, the security of these applications will be unaltered. Further, this allows other guarantees to be managed, including reliable delivery with TCP.
The transportation of packets occurs in three zones, as shown in figure \ref{fig:security-zones}. The first is client to portal, which occurs physically in the local zone. The second is portal to portal, which occurs across the Internet. The third is portal to server, which also occurs across the Internet. With the goal of providing security equivalent to a standard connection, the client to portal communication can be considered safe: it is equivalent to connecting a client directly to a modem. The remainder of this section therefore focuses on portal to portal and portal to server communication.
\subsection{Portal to Portal Communication}
@ -44,106 +36,55 @@ This ensures that guarantees managed by layers above layer 3 are maintained. Reg
\label{fig:security-zones-attackers}
\end{figure}
\subsubsection{Denial of Service}
\label{subsubsection:threats-denial-of-service}
There are four locations at which packets can be inserted or removed in the transport between the two portals, as shown in figure \ref{fig:security-zones-attackers}. Alice can insert packets into the local portal to be sent to the client, Bob can insert packets into the remote portal to be sent out to the wider Internet, Charlie can steal packets from the local portal destined for the remote portal, and Dave can steal packets from the remote portal destined for the local portal. Each of these is examined for the impact it would cause.
The impact of Alice inserting packets of their choosing into the local portal is minimal: considering a client connected directly to a modem, any device on the Internet is already able to send packets to that modem, and Alice could achieve the same effect by sending the packets to the remote portal instead. As such, inserting packets destined for the client presents no additional risk.
The impact of Bob inserting packets of their choosing into the remote portal is a legal risk to the user, and potentially a financial cost. For example, Bob may insert packets of an illegal nature destined for Frank; as the machine running the remote portal is the user's responsibility, these packets would appear to have come from the user. Similarly, if the remote portal runs on a metered service, as with many cloud providers, all traffic inserted by Bob is billed to the user. It is therefore highly important to prevent attackers such as Bob from inserting packets that will be forwarded as if they came from the user.
Charlie and Dave stealing packets presents the same risk in either direction: denial of service. Even if only a small percentage of packets are stolen, the resulting increase in packet loss has a significant effect on any loss-based congestion control mechanism, whether for the tunnelled flows or for the congestion-controlled flows between the proxies themselves. Mitigations will focus on opportunities for stealing packets that are unique to this proxy setup; intercepting the wire at the last hop, for example, is feasible regardless of this solution, so it will not be covered.
Proxying packets in this way provides a new method of denial of service: if an attacker can convince either portal to send them a portion of the packets due for the other portal, the packet loss of the overall connection, as seen by the other portal, increases by an equivalent amount. For example, if a bad actor convinces the remote portal to send them $50\%$ of the good packets while the underlying packet loss is $0.2\%$, the observed packet loss rises to $50.1\%$.
This is of particular concern for flows carried by the proxy that use loss-based congestion control. In that example, a TCP flow loses on average one of every two packets it sends, so with NewReno congestion control the window would be unable to grow beyond one segment, and the performance of such flows would be severely degraded.
Even if only $25\%$ of the packets are stolen, NewReno would still fail to grow the window beyond three segments. This demonstrates that an attacker with a far slower connection than the user can still have a significant impact on connection performance.
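These window sizes can be estimated with the standard steady-state approximation for loss-based congestion control, used here purely for illustration: the average congestion window $\bar{W}$, in segments, as a function of the packet loss probability $p$ is approximately
\[ \bar{W} \approx \sqrt{\frac{3}{2p}}. \]
Diverting $50\%$ of packets on a path with $0.2\%$ underlying loss gives $p = 0.5 + 0.5 \times 0.002 = 0.501$, and hence $\bar{W} \approx 1.7$ segments; a loss rate of $25\%$ gives $\bar{W} \approx 2.4$ segments. In both cases the window is far too small for the affected flows to achieve useful throughput.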
\subsubsection{Privacy}
Though the packets leaving a modem have no reasonable expectation of privacy, having the packets enter the Internet at two points does present more locations at which a packet can be read. For example, if the remote portal lies in a data center, the content and metadata of packets can be sniffed either within the data center or at its physical connections. However, this is equivalent to the packets taking a longer route through the Internet, with more hops, so it is comparatively no worse.
Further, if an attacker convinces the Remote Portal that they are a valid connection from the Local Portal, a portion of packets will be sent to them. However, as a fortunate side effect, this method of sniffing would cause a significant denial of service to any links using loss-based congestion control, as shown in section \ref{subsubsection:threats-denial-of-service}. Therefore, as long as it is ensured that each packet is not sent to multiple destinations, privacy should be maintained at a level similar to simple Internet access, given that an eavesdropper using this active method would be very easy to detect.
\subsubsection{Cost}
In many cases, the remote portal will run on a cloud instance, taking advantage of the extremely high bandwidth and well-peered connections available at a reasonable price. Cloud instances are often billed per unit of outbound traffic, so an attacker could impose a high cost burden on the user by causing their remote portal to transmit more outbound data. This is avoided by ensuring that transmitted packets both originate from the local portal and are fresh.
\subsection{Portal to Server Communication}
Packets between the remote portal and the server are transmitted openly across the Internet. As this proxy transports entire IP packets at layer 3, no security guarantees need be maintained once an IP packet has left the remote portal; it is the responsibility of each application to provide its own security guarantees. Maintaining the same level of security for applications can therefore be achieved by ensuring that the packets which leave one side of the proxy are a subset of the packets that entered the other side.
% ------------------------------- Security --------------------------------- %
\section{Security Solutions}
\label{section:preparation-security}
This section provides means of alleviating the risks given in section \ref{section:risk-analysis}. To achieve this goal, the authenticity of packets will be verified. Authenticity in this context means two properties of the object hold: integrity and freshness \citep[pp. 14]{anderson_security_2008}. Integrity guarantees that any modification between the sending and receiving portals can be detected, while freshness guarantees that reuse of a previously transmitted packet can be detected.
\subsection{Message Authentication}
To provide integrity and freshness for each message, I evaluate two options: Message Authentication Codes (MACs) and digital signatures. A MAC is a hash digest generated from the concatenation of the data and a secret key; the digest is appended to the data before transmission, and anyone sharing the secret key can perform the same operation to verify the digest and, therefore, the integrity of the data \citep[pp. 352]{menezes_handbook_1997}. A digital signature is instead produced with the private key of a public/private keypair, proving that the message was signed by the owner of that private key, and can be verified by anyone holding the corresponding public key \citep[pp. 147-149]{anderson_security_2008}. In each case, a short code is appended to the message, allowing its integrity and authenticity to be verified.
Signatures provide non-repudiation, while MACs do not: one can prove which private key signed a message, whereas anyone holding the shared key could have produced a MAC for a message. Further, digital signatures are generally far more computationally expensive than MACs. Given that both ends of the connection are controlled by the same party, non-repudiation adds nothing, so MACs are the message authentication method of choice for this project.
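To make the chosen construction concrete, the sketch below shows one way of appending and verifying a MAC on each packet. Go and HMAC-SHA256 are assumed purely for illustration, and the function names are hypothetical rather than taken from the implementation.
\begin{verbatim}
package security

import (
    "crypto/hmac"
    "crypto/sha256"
)

// appendMAC returns the packet with an HMAC-SHA256 tag appended.
// Both portals must be configured with the same shared key.
func appendMAC(packet, key []byte) []byte {
    mac := hmac.New(sha256.New, key)
    mac.Write(packet) // hash.Hash.Write never returns an error
    return mac.Sum(packet)
}

// verifyMAC splits a received buffer into payload and tag, recomputes
// the tag and compares it in constant time, returning the payload and
// whether it was authentic.
func verifyMAC(buf, key []byte) ([]byte, bool) {
    if len(buf) < sha256.Size {
        return nil, false
    }
    payload, tag := buf[:len(buf)-sha256.Size], buf[len(buf)-sha256.Size:]
    mac := hmac.New(sha256.New, key)
    mac.Write(payload)
    return payload, hmac.Equal(tag, mac.Sum(nil))
}
\end{verbatim}
Using \texttt{hmac.Equal} rather than a byte-wise comparison avoids leaking timing information during verification.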
\subsection{IP Authentication Header}
The security requirements for this project are equivalent to those provided by the IP Authentication Header \citep{kent_ip_2005}. The IP Authentication Header operates between IP and the transport layer, using IP protocol number 51. It uses a hash function and a secret shared key to produce an Integrity Check Value, which covers all immutable parts of the IP header, the authentication header itself, and the data below the authentication header. Combined, this provides connectionless integrity and authenticity, as the IP header itself is authenticated. Further, the header contains a sequence number, which is used to prevent replay attacks.
Unfortunately, there are two reasons why this solution cannot be used directly: difficulties with NAT traversal, and inaccessibility for user-space programs. As the authentication header protects the source and destination addresses, any NAT that alters these addresses violates the integrity of the packet. Although NAT traversal is not an explicit success criterion for this project, it is implicit, as flexibility across different network structures, including those where NAT is unavoidable, is a priority. Secondly, the IP Authentication Header, being an IP protocol rather than a transport-layer protocol, interacts poorly with user-space programs: given that the first transport implementation uses TCP, using authentication headers would require the user-space program to handle the TCP connection without the aid of the kernel, complicating multiplexing and resulting in an unsupported setup.
Overall, using the IP Authentication Header would function similarly to running over a VPN, described in section \ref{section:layered-security}. Although this will be a supported configuration, these shortfalls mean that it will not be the base implementation. However, inspiration can be taken from the header structure, shown in figure \ref{fig:ip-auth-header-structure}.
\begin{figure}
\centering
\begin{bytefield}[bitwidth=0.8em]{32}
\bitheader{0-31} \\
\bitbox{8}{Next Header} & \bitbox{8}{Payload Len} & \bitbox{16}{Reserved} \\
\wordbox{1}{Security Parameters Index} \\
\wordbox{1}{Sequence Number} \\
\wordbox[tlr]{1}{Integrity Check Value} \\
\wordbox[blr]{1}{$\cdots$}
\end{bytefield}
\caption{IP authentication header structure}
\label{fig:ip-auth-header-structure}
\end{figure}
It is first important to note the differences between the IP Authentication Header and the security footers used in this application. Firstly, the next header field is unnecessary, given that headers are not being chained. Secondly, given that the portals share a fixed, statically configured security configuration, the payload length field is unnecessary: the footers will always be of a predetermined length. Similarly, the security parameters index is unnecessary, as the parameters are identical at both ends.
The remaining difference in security arises from the lack of integrity given to the fields above the application layer: that is, the IP header itself, and the TCP or UDP header. There is, however, an important distinction between the TCP and UDP cases: TCP congestion control will not be covered by any application-provided security, while UDP congestion control will. That is, this application can do nothing to authenticate the ACKs of a TCP connection, as these are created outside of the application's control. As such, the TCP transport provided by this solution should be used in one of two ways: as a baseline for testing the performance of other algorithms, or in combination with the layered security described in section \ref{section:layered-security}. The rest of this section therefore focuses on securing the UDP transport.
Further differences arising from the lack of integrity above the application layer still apply to the UDP transport: although the congestion control layer, and therefore the packet flow, is authenticated, the source and destination of packets are not.
\subsection{Connection Authentication}
Beyond authenticating the messages themselves, the connection built between the two portals must also be authenticated. Consider a man-in-the-middle attack in which an attacker initially forwards packets between the two portals as if they were the genuine peer, then stops forwarding and instead black-holes them. This creates the denial of service described in section \ref{subsubsection:threats-denial-of-service}.
To prevent such forwarding attacks, the connection itself must be authenticated. I present two methods for this: address whitelists, and authenticating the IP address and port of each sent packet. The first solution is static: the listening portal may only respond to new communications when the peer's IP address, or a DNS record resolving to it, is present in an approved set.
The second is a more dynamic solution. The IP Authentication Header \citep{kent_ip_2005} achieves connection authenticity by protecting all immutable parts of the IP header with an authentication code. In the case of this software, including the source IP address, source port, destination IP address and destination port in the authentication code ensures connection authenticity, presuming the absence of an on-the-wire attack (an attack that is feasible regardless of the presence of this software). By authenticating these addresses, which can be checked easily at both ends, both devices can confirm with whom they are communicating; that is, an owner of the shared key authorised this communication path.
However, both of these solutions have shortfalls when NAT is involved. The second solution, authenticating addresses, struggles with any form of NAT (Network Address Translation), as the IP addresses, and often the ports, of a packet as seen at the far end are not those known when it is sent, so they cannot be authenticated naively. The first solution, providing a set of approved addresses, fails under CG-NAT (Carrier-Grade NAT), as many users share one IP address and hence anyone behind the same address could perform an attack. In most cases one of these solutions will work; otherwise, one can fall back on the security layering presented in section \ref{section:layered-security}.
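As an illustrative sketch of how the security footer and the address binding might look (again assuming Go; the type and function are hypothetical, not the project's actual structures), the MAC is computed over the payload, the predicted on-the-wire addresses and ports, and the data sequence number used for replay protection:
\begin{verbatim}
package security

import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/binary"
    "net"
)

// SecurityFooter is appended to each proxied packet. Unlike the IP
// Authentication Header, no next header, payload length or security
// parameters index fields are needed, as both portals share a single
// static security configuration.
type SecurityFooter struct {
    DataSeq uint64 // sequence number, shared across all flows
    MAC     []byte // HMAC over payload, addresses, ports and DataSeq
}

// computeMAC binds the payload to the source and destination expected
// on the wire between the two portals, and to the sequence number.
// Behind NAT, src and dst are the predicted post-translation addresses.
func computeMAC(key, payload []byte, seq uint64, src, dst *net.UDPAddr) []byte {
    mac := hmac.New(sha256.New, key)
    mac.Write(payload)
    mac.Write(src.IP.To16())
    mac.Write(dst.IP.To16())
    var buf [8]byte
    binary.BigEndian.PutUint16(buf[0:2], uint16(src.Port))
    binary.BigEndian.PutUint16(buf[2:4], uint16(dst.Port))
    mac.Write(buf[:4])
    binary.BigEndian.PutUint64(buf[:], seq)
    mac.Write(buf[:])
    return mac.Sum(nil)
}
\end{verbatim}
The receiving portal recomputes the MAC with the addresses it expects the packet to have carried between the two NATs, so a packet is only accepted if a holder of the shared key authorised this exact communication path.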
\subsubsection{Adapting for NAT}
To authenticate the source and destination addresses of a packet, as the IP Authentication Header does, both ends must know these addresses. However, the addresses may be altered by NAT devices in transit between the local and remote portals. In the case of source NAT, the source of an outgoing packet is masqueraded to the public address of the NAT router, likely altering the outgoing port as well. For destination NAT, an inbound packet to a NAT router will have its destination address changed to the internal destination, possibly changing the destination port too.
However, each of these address translations is predictable at the time of packet sending and the time of packet receiving. For a packet that will go through source NAT, the eventual source address is predictable, in that it will be altered from the internal address to the public address of the router. Likewise, with destination NAT, the destination address of the packet will be predictable as the public address of the router that receives it. An example of this is shown in figure \ref{fig:network-address-translations}.
\begin{figure}
\centering
\begin{tikzpicture}
\draw (0, 0) rectangle (3, 1.5) node[midway,align=center] {Host B \\10.172.14.6/24};
\draw [->] (1.5, 2.5) -- (1.5, 1.5);
\draw (0, 2.5) rectangle (3, 4) node[midway,align=center] {Dest. NAT \\192.168.1.8/24};
\draw [->] (1.5, 5) -- (1.5, 4);
\draw (0, 5) rectangle (3, 6.5) node[midway,align=center] {Source NAT \\192.168.1.9/24};
\draw [->] (1.5, 7.5) -- (1.5, 6.5);
\draw (0, 7.5) rectangle (3, 9) node[midway,align=center] {Host A \\172.19.72.12/24};
\draw[dashed] (1.5, 2) -- (4, 2);
\draw[dashed] (4, 0.5) rectangle (7.5, 3.5) node[midway,align=left] {SA: 192.168.1.9\\SP: 31602\\DA: 10.172.14.6\\DP: 2048};
\draw[dashed] (1.5, 4.5) -- (-1, 4.5);
\draw[dashed] (-1, 3) rectangle (-4.5, 6) node[midway,align=left] {SA: 192.168.1.9\\SP: 31602\\DA: 192.168.1.8\\DP: 1024};
\draw[dashed] (1.5, 7) -- (4, 7);
\draw[dashed] (4, 5.5) rectangle (7.5, 8.5) node[midway,align=left] {SA: 172.19.72.12\\SP: 21941\\DA: 192.168.1.8\\DP: 1024};
\end{tikzpicture}
\caption{UDP packet passing through source and destination network address translation, and the addresses and ports at each point.}
\label{fig:network-address-translations}
\end{figure}
Therefore, to authenticate the message's source and destination, the source and destination addresses from the segment between the two NATs are used. Host A can predict these by taking the destination address of the flow transporting the packets, together with knowledge of its own NAT's public IP as the source address. Similarly, Host B can predict them by taking the source address of the flow transporting the packets, together with knowledge of its own NAT's public IP as the destination address. Although this means that the authentication applies equally to any other device behind both NATs, this is an acceptable compromise for NATs controlled by the user. Achieving sufficient security under a CG-NAT is left to the implementer, who can apply the techniques described in section \ref{section:layered-security}.
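A sketch of this prediction is given below (hypothetical Go helper, not taken from the implementation): each portal combines the far end of the flow, which is already a post-translation address, with its own statically configured NAT public address for the near end.
\begin{verbatim}
package security

import "net"

// predictWireAddrs returns the source and destination IPs to bind into
// the MAC, as they will appear on the segment between the two NATs.
// flowRemote is the address of the flow's far end; ownNATPublic is the
// configured public address of this portal's own NAT (or the portal's
// own address if it is not behind NAT).
func predictWireAddrs(flowRemote, ownNATPublic net.IP, outbound bool) (src, dst net.IP) {
    if outbound {
        // Sending: our source will be rewritten to our NAT's public
        // address; the address we send to is already post-NAT.
        return ownNATPublic, flowRemote
    }
    // Receiving: the observed source is already post-NAT; the original
    // destination was our NAT's public address.
    return flowRemote, ownNATPublic
}
\end{verbatim}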
\subsection{Freshness}
To ensure the freshness of received packets, an anti-replay algorithm is employed. Replay protection in the IP Authentication Header is achieved with a sequence number on each packet, which is monotonically and strictly increasing. The algorithm I have chosen to implement for this purpose is the \emph{IPsec Anti-Replay Algorithm without Bit Shifting} \citep{tsou_ipsec_2012}, also employed in Wireguard \citep{donenfeld_wireguard_2017}.
A specific requirement of the multipath nature of this application is a sequence number space that is shared between flows. This is similar to the design of multipath TCP's congestion control, in which the sequence numbers of individual subflows are separated from the sequence number of the data transport as a whole \citep[pp. 11]{wischik_design_2011}. Replay is thus treated as a problem shared between all flows, while each flow's congestion control remains independent of the others. This is important: an attacker wishing to replay a packet could otherwise simply create a new flow each time they wished to replay it.
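A minimal sketch of such a windowed anti-replay check is given below (Go is assumed, and the constants and names are illustrative rather than drawn from the implementation). In the spirit of the algorithm cited above, the window is a fixed ring of 64-bit blocks, and advancing it zeroes whole blocks rather than shifting individual bits.
\begin{verbatim}
package security

const (
    blockBits  = 64                       // bits per block
    numBlocks  = 16                       // 16 x 64 = 1024-bit bitmap
    windowSize = (numBlocks - 1) * blockBits // usable anti-replay window
)

// replayWindow is a simplified bitmap-based anti-replay check, shared
// by every flow of the proxy so that a packet cannot be replayed on a
// newly created flow.
type replayWindow struct {
    blocks  [numBlocks]uint64
    highest uint64 // highest sequence number accepted so far
}

// check returns true if seq is fresh and marks it as seen; it returns
// false for replayed packets and packets older than the window.
func (w *replayWindow) check(seq uint64) bool {
    if seq+windowSize < w.highest {
        return false // too old: outside the window
    }
    if seq > w.highest {
        // Advance the window by zeroing whole blocks between the old
        // and new positions instead of shifting bits one at a time.
        cur := w.highest / blockBits
        top := seq / blockBits
        if top-cur >= numBlocks {
            for i := range w.blocks {
                w.blocks[i] = 0
            }
        } else {
            for b := cur + 1; b <= top; b++ {
                w.blocks[b%numBlocks] = 0
            }
        }
        w.highest = seq
    }
    bit := uint64(1) << (seq % blockBits)
    block := &w.blocks[(seq/blockBits)%numBlocks]
    if *block&bit != 0 {
        return false // already seen: replay
    }
    *block |= bit
    return true
}
\end{verbatim}
A single instance of this window, keyed by the shared data sequence number, is consulted for packets arriving on every flow, so opening a new flow gives an attacker no fresh sequence space in which to replay captured packets.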
\subsection{Layered Security}
\label{section:layered-security}
It was previously mentioned that this solution focuses on maintaining the higher-layer security of proxied packets. Further, this solution provides transparent security in the other direction. Consider the case of a satellite office that employs both a whole-network corporate VPN and this solution. The network can be configured in one of two ways: the multipath proxy runs behind the VPN, or the VPN runs behind the multipath proxy.
These two configurations are shown in figures \ref{fig:whole-network-vpn-behind} and \ref{fig:whole-network-vpn-infront}, using Wireguard \citep{donenfeld_wireguard_2017} as the VPN. In figure \ref{fig:whole-network-vpn-infront}, the portals are only accessible via the VPN-protected network, and the packet shown is shorter, owing to the removal of the message authentication code and the data sequence number. The data sequence number is unnecessary because Wireguard uses the same anti-replay algorithm, so replayed packets would already have been caught entering the secure network; the message authentication code is unnecessary because the authenticity of packets is now guaranteed by Wireguard.
Supporting and encouraging this layering of protocols provides a second benefit: if the security in this solution is found to be broken with time, there are two options to repair it. One can either fix the open source application, or compose it with a security solution that is not broken, but perhaps provides extraneous security guarantees and therefore causes reduced performance. To this end, the security features mentioned will all be configurable. This allows for flexibility in implementation.
\begin{figure}
\centering
\includegraphics{2_Preparation/Figs/security-zones-vpn.png}
@ -151,11 +92,7 @@ A specific of the multipath nature of this application is requiring the use of a
\label{fig:security-zones-vpn}
\end{figure}
The benefits of using a VPN tunnel between the two proxies are shown in figure \ref{fig:security-zones-vpn}. Whereas in figure \ref{fig:security-zones} the portal to portal communication is across the unprotected Internet, in figure \ref{fig:security-zones-vpn} this communication occurs across a secure overlay network. This allows the packets transported there to be trusted, and avoids the need for additional verification. Further, it allows the application to remain secure in any situation where a VPN will work.
\begin{figure}
\begin{leftfullpage}


@ -2,6 +2,24 @@
\begin{proforma}
\mynote{Fill in closer to the time.}
\begin{tabular}{ll}
Candidate Number: & 2373A \\
Project Title: & A Multi-Path Bidirectional Layer 3 Proxy \\
Examination: & Computer Science Tripos - Part II, 2021 \\
Word Count: & 12740 \\
Line Count: & 2705 \\
Project Originator: & The dissertation author \\
Supervisor: & Mike Dodson
\end{tabular}
\section*{Original Aims of the Project}
\section*{Work Completed}
\section*{Special Difficulties}
None.
\mynote{Update and complete this.}
\end{proforma}