mirror of
https://git.overleaf.com/6227c8e96fcdc06e56454f24
synced 2024-12-23 02:33:06 +00:00
Update on Overleaf.
This commit is contained in:
parent
bca80966df
commit
067b66cd91
@ -188,7 +188,7 @@ I present a summary of the privilege separation features in modern Linux, the sy
|
||||
|
||||
\section{Introduction}
|
||||
|
||||
Void processes take advantage of modern Linux name spaces to attempt to run applications without exposing them to the system itself. Void processes use a mixture of Linux name spaces and file descriptive based capabilities to allow running purpose-built applications without expecting the support of the standard Linux system. During the process of building such a system, gaps in the kernel were exposed, given that this work is at the edge of what main spaces can achieve. This work will go onto detail the process of creating void processes themselves, re-adding features that these processes need to do useful work, and the learnings of what features are missing in the user-space kernel APIs to succeed in creating processes this way.
|
||||
Void processes take advantage of modern Linux namespaces to attempt to run applications without exposing them to the system itself. They combine Linux namespaces with file-descriptor-based capabilities to run purpose-built applications that do not expect the support of the standard Linux system. Building such a system exposed gaps in the kernel, given that this work is at the edge of what namespaces can achieve. This work goes on to detail the process of creating void processes themselves, re-adding the features that these processes need to do useful work, and the lessons learned about which features are missing from the kernel's user-space APIs for creating processes this way.
|
||||
|
||||
This work explores the question of what an operating system is by taking a novel approach to running applications, in which the system is exposed in an entirely different way. Rather than limiting the access of a process or set of processes to the operating system, as containers do, we instead limit access to the operating system with more explicit methods per process. Interaction between processes is allowed by specifying such interaction statically at compile time, removing any separation between the application developer and the system controlling access to the application, unlike solutions such as SELinux.
|
||||
|
||||
@ -214,11 +214,42 @@ I present a threat model in which application binaries are trusted absolutely. T
|
||||
|
||||
\subsection{Mount Namespaces}
|
||||
|
||||
Mount namespaces were by far the most challenging part of this project. When adding new features, they continuously raised problems in both API description, expected behaviour, and performance of the tools given. A comparison will be given in this section to two other namespaces, network and UTS, to show the significant differences in the design goals of mount namespaces. Much of the programming issue here comes from a fundamental lack of consistency between the namespaces in Linux, which will be discussed further later in this section.
|
||||
Mount namespaces were by far the most challenging part of this project. When adding new features, they continuously raised problems in API description, expected behaviour, and performance of the tools given. This section compares them with two other namespaces, network and UTS, to show the significant differences in the design goals of mount namespaces. Many of the programming issues here stem from a fundamental lack of consistency between mount namespaces and other namespaces in Linux, which is discussed further in this section.
|
||||
|
||||
Comparing to network name spaces, a slightly more modern name space [Table \ref{tab:namespaces}], we see a huge difference in what occurs when a new name space is created. When creating a new network namespace, one is immediately placed into a void, a network namespace containing only a loopback adapter. That is, the process has no ability to interact with the outside network, and no immediate relation to the parent network namespace. To interact with alternative namespaces, one must explicitly create a connection between the two, or move a physical adapter into the new (empty) namespace. Further to this, sockets continue to exist in their initial namespace, allowing for regular file-descriptor passing semantics \citep{biederman_re_2007}. Extending upon this socket behaviour is Wireguard, which creates adapters that may be freely moved between namespaces while continuing to connect externally from their initial parent \citep[§7.3]{donenfeld_wireguard_2017}. Mount namespaces, rather than creating a new and empty namespace, made the choice to create a copy of the parent namespace, in a copy-on-write fashion. That is, after creating a new mount namespace, the mount hierarchy appears much the same as before.
|
||||
\subsubsection{Copy-on-Write}
|
||||
|
||||
While some other namespaces are copy-on-write, for example UTS namespaces, they do not suffer from the same problem as mount namespaces. Although UTS namespaces are copy-on-write, it is trivial to create a void by setting the hostname of the machine to a constant. This removes any relation to the parent namespace and to the outside machine. Mount namespaces instead maintain a shared pointer with most filesystems, more akin to not creating a new namespace than a copy-on-write namespace.
|
||||
Compared with network namespaces, a slightly more modern namespace [Table \ref{tab:namespaces}], there is a huge difference in what occurs when a new mount namespace is created. When creating a new network namespace, one is immediately placed into a void: a network namespace containing only a loopback adapter. That is, the process has no ability to interact with the outside network, and no immediate relation to the parent network namespace. To interact with alternative namespaces, one must explicitly create a connection between the two, or move a physical adapter into the new (empty) namespace. Furthermore, sockets continue to exist in their initial namespace, allowing for regular file-descriptor passing semantics \citep{biederman_re_2007}. WireGuard extends this socket behaviour by creating adapters that may be freely moved between namespaces while continuing to connect externally from their initial parent \citep[§7.3]{donenfeld_wireguard_2017}. Mount namespaces, rather than creating a new and empty namespace, made the choice to create a copy of the parent namespace, in a copy-on-write fashion. That is, after creating a new mount namespace, the mount hierarchy appears much the same as before.
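The void behaviour of network namespaces can be observed directly. The following is a minimal sketch, assuming a Linux system with glibc and a caller holding \texttt{CAP\_SYS\_ADMIN}; the function name \texttt{count\_interfaces\_in\_new\_netns} is illustrative and not part of the system described here. On a typical kernel one would expect only the loopback adapter to be listed, although some configurations add fallback tunnel devices.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <net/if.h>
#include <sched.h>
#include <stdio.h>

/* Illustrative helper (hypothetical name): enter a fresh network namespace
 * and print every interface visible inside it.
 * Returns the number of interfaces, or -1 with errno set on failure
 * (for example EPERM when the caller lacks CAP_SYS_ADMIN). */
int count_interfaces_in_new_netns(void)
{
    /* Detach from the parent's network namespace. */
    if (unshare(CLONE_NEWNET) == -1)
        return -1;

    /* Enumerate the interfaces of the namespace we are now in. */
    struct if_nameindex *ifs = if_nameindex();
    if (ifs == NULL)
        return -1;

    int count = 0;
    for (struct if_nameindex *i = ifs; i->if_index != 0; i++) {
        printf("%u: %s\n", i->if_index, i->if_name);
        count++;
    }
    if_freenameindex(ifs);
    return count;
}
```

Nothing from the parent namespace has to be removed after the call; connectivity must instead be added explicitly, which is the property that mount namespaces lack.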
|
||||
|
||||
\subsubsection{Shared Subtrees}
|
||||
|
||||
While some other namespaces are copy-on-write, for example UTS namespaces, they do not present the same problem as mount namespaces. Although UTS namespaces are copy-on-write, it is trivial to create a void by setting the hostname of the machine to a constant. This removes any relation to the parent namespace and to the outside machine. Mount namespaces instead maintain a shared pointer with most filesystems, which is more akin to not creating a new namespace at all than to creating a copy-on-write one.
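Voiding a UTS namespace in this way takes two calls. The sketch below assumes a Linux system with glibc and a caller holding \texttt{CAP\_SYS\_ADMIN}; the function name \texttt{enter\_void\_uts} and the constant passed to it are illustrative choices rather than those of the implementation described here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

/* Illustrative helper (hypothetical name): move the calling process into a
 * new UTS namespace and overwrite the hostname copied from the parent.
 * Returns 0 on success, or -1 with errno set on failure. */
int enter_void_uts(const char *name)
{
    /* The new namespace starts as a copy of the parent's values. */
    if (unshare(CLONE_NEWUTS) == -1)
        return -1;

    /* Replacing the copied hostname with a constant severs the relation. */
    return sethostname(name, strlen(name));
}
```

After a successful call such as \texttt{enter\_void\_uts("void")}, \texttt{gethostname(2)} inside the process reports the constant, while the parent namespace is unaffected.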
|
||||
|
||||
Shared subtrees were introduced to provide a consistent view of the unified hierarchy between namespaces. Consider a
|
||||
|
||||
\texttt{systemd} made the choice to mount \texttt{/} as a shared subtree [CN]. This means that when creating a new namespace, mounts and unmounts are propagated into it by default. Further, it means that mounts and unmounts made inside are propagated out of the namespace. This can be highly confusing behaviour, and the \texttt{unshare(1)} utility considers it inconsistent with the goals of unsharing: it immediately calls \texttt{mount("none", "/", NULL, MS\_REC|MS\_PRIVATE, NULL)} after \texttt{unshare(CLONE\_NEWNS)}.
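The two-call sequence quoted above can be written out as a small C function. This is a minimal sketch, assuming a Linux system with glibc and a caller holding \texttt{CAP\_SYS\_ADMIN}; the function name \texttt{enter\_private\_mount\_namespace} is illustrative only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/mount.h>

/* Illustrative helper (hypothetical name): create a new mount namespace and
 * stop mount events from propagating in either direction.
 * Returns 0 on success, or -1 with errno set on failure. */
int enter_private_mount_namespace(void)
{
    /* The new namespace begins as a copy of the parent's mount hierarchy. */
    if (unshare(CLONE_NEWNS) == -1)
        return -1;

    /* Recursively mark every mount under / as private. The source and
     * filesystem type arguments are ignored when only the propagation
     * type is being changed. */
    return mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL);
}
```

Note that this only disables propagation. The copied mounts themselves remain visible in the new namespace, so the result is still far from a void.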
|
||||
|
||||
\begin{figure*}
|
||||
\begin{minipage}{.45\textwidth}
|
||||
|
||||
\begin{lstlisting}[caption=code 1,frame=tlrb]{Name}
|
||||
void code()
|
||||
{
|
||||
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
\end{minipage}\hfill
|
||||
\begin{minipage}{.45\textwidth}
|
||||
|
||||
\begin{lstlisting}[caption=code 2,frame=tlrb]{Name}
|
||||
void code()
|
||||
{
|
||||
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
\end{minipage}
|
||||
\end{figure*}
|
||||
|
||||
\section{System Design}
|
||||
|
||||
@ -365,7 +396,7 @@ While virtual machines and containers provide a strong isolation at the applicat
|
||||
|
||||
\subsection{Capsicum}
|
||||
|
||||
Capsicum \citep{watson_capsicum_2010} extends UNIX file descriptors in FreeBSD to reflect the rights on the object they hold. These capabilities may be shared between processes as other file descriptors (§\ref{section:file-descriptor-passing}). The goals of both software are the same: make privilege separated software better. However, we take quite different approaches. Multi-entrypoint applications focus on building a static definition really close to the code, while Capsicum allows processes to dynamically privilege separate. This allows applying static analysis to the policies, while also keeping the definition close to the code.
|
||||
Capsicum \citep{watson_capsicum_2010} extends UNIX file descriptors in FreeBSD to reflect the rights on the object they hold. These capabilities may be shared between processes in the same way as other file descriptors. Both systems share the same goal: making privilege-separated software better. However, they take quite different approaches. Multi-entrypoint applications focus on building a static definition very close to the code, while Capsicum allows processes to privilege separate dynamically. The static approach allows static analysis to be applied to the policies, while also keeping the definition close to the code.
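Sharing a descriptor between processes uses the standard \texttt{SCM\_RIGHTS} ancillary message on a UNIX domain socket. The sketch below shows that general mechanism on a POSIX system and is not Capsicum-specific; the helper names \texttt{send\_fd} and \texttt{recv\_fd} are illustrative.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Illustrative helper (hypothetical name): send descriptor fd over the
 * UNIX domain socket sock. Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char byte = 0; /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { /* the union guarantees cmsghdr alignment for the buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    memset(&ctl, 0, sizeof ctl);

    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Illustrative helper (hypothetical name): receive one descriptor from
 * sock. Returns the new descriptor, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    memset(&ctl, 0, sizeof ctl);

    struct msghdr msg = { 0 };
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver obtains a new descriptor number referring to the same open file description, so whatever rights are attached to the descriptor travel with it.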
|
||||
|
||||
\section{Future Work}
|
||||
|
||||
|
@ -1,4 +1,11 @@
|
||||
|
||||
@misc{pai_shared_nodate,
|
||||
title = {Shared {Subtrees}},
|
||||
url = {https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt},
|
||||
urldate = {2022-04-15},
|
||||
author = {Pai, Ram and Viro, Al},
|
||||
}
|
||||
|
||||
@misc{biederman_re_2007,
|
||||
type = {Mailing {List}},
|
||||
title = {Re: netns : close all sockets at unshare ?},
|
||||
@ -447,16 +454,6 @@ Volume: 7},
|
||||
year = {2014},
|
||||
}
|
||||
|
||||
@misc{kerrisk_mount_namespaces7_2021,
|
||||
title = {mount\_namespaces(7)},
|
||||
url = {https://man7.org/linux/man-pages/man7/mount_namespaces.7.html},
|
||||
urldate = {2022-02-03},
|
||||
journal = {Linux manual pages},
|
||||
author = {Kerrisk, Michael and Biederman, Eric W.},
|
||||
month = aug,
|
||||
year = {2021},
|
||||
}
|
||||
|
||||
@article{yasunori_kernel-based_2011,
|
||||
title = {Kernel-based {Virtual} {Machine} {Technology}},
|
||||
volume = {47},
|
||||
|