linux-hardened sets kernel.unprivileged_userns_clone=0 by default; see
anthraxx/linux-hardened@104f44058f.
This allows the Nix sandbox to function while reducing the attack
surface posed by user namespaces, which allow unprivileged code to
exercise lots of root-only code paths and have lead to privilege
escalation vulnerabilities in the past.
We can safely leave user namespaces on for privileged users, as root
already has root privileges, but if you're not running builds on your
machine and really want to minimize the kernel attack surface then you
can set security.allowUserNamespaces to false.
Note that Chrome's sandbox requires either unprivileged CLONE_NEWUSER or
setuid, and Firefox's silently reduces the security level if it isn't
allowed (see about:support), so desktop users may want to set:
boot.kernel.sysctl."kernel.unprivileged_userns_clone" = true;
As discussed in https://github.com/NixOS/nixpkgs/pull/73763, prevailing
consensus is to revert that commit. People use the hardened profile on
machines and run nix builds, and there's no good reason to use
unsandboxed builds at all unless you're in a platform that doesn't
support them.
This reverts commit 00ac71ab19.
Disables the build sandbox by default to avoid incompatibility with
defaulting user namespaces to false. Ideally there would be some kind of
linux kernel feature that allows us to trust nix-daemon builders to
allow both nix sandbox builds and disabling untrusted naemspaces at the
same time.
The rationale for this is that old filesystems have recieved little scrutiny
wrt. security relevant bugs.
Lifted from OpenSUSE[1].
[1]: 8cb42fb665
Co-Authored-By: Renaud <c0bw3b@users.noreply.github.com>
systemd provides two sysctl snippets, 50-coredump.conf and
50-default.conf.
These enable:
- Loose reverse path filtering
- Source route filtering
- `fq_codel` as a packet scheduler (this helps to fight bufferbloat)
This also configures the kernel to pass coredumps to `systemd-coredump`.
These sysctl snippets can be found in `/etc/sysctl.d/50-*.conf`,
and overridden via `boot.kernel.sysctl`
(which will place the parameters in `/etc/sysctl.d/60-nixos.conf`.
Let's start using these, like other distros already do for quite some
time, and remove those duplicate `boot.kernel.sysctl` options we
previously did set.
In the case of rp_filter (which systemd would set to 2 (loose)), make
our overrides to "1" more explicit.
slab_nomerge may reduce surface somewhat
slub_debug is used to enable additional sanity checks and "red zones" around
allocations to detect read/writes beyond the allocated area, as well as
poisoning to overwrite free'd data.
The cost is yet more memory fragmentation ...
For the hardened profile disable symmetric multi threading. There seems to be
no *proven* method of exploiting cache sharing between threads on the same CPU
core, so this may be considered quite paranoid, considering the perf cost.
SMT can be controlled at runtime, however. This is in keeping with OpenBSD
defaults.
TODO: since SMT is left to be controlled at runtime, changing the option
definition should take effect on system activation. Write to
/sys/devices/system/cpu/smt/control
For the hardened profile enable flushing whenever the hypervisor enters the
guest, but otherwise leave at kernel default (conditional flushing as of
writing).
Introduces the option security.protectKernelImage that is intended to control
various mitigations to protect the integrity of the running kernel
image (i.e., prevent replacing it without rebooting).
This makes sense as a dedicated module as it is otherwise somewhat difficult
to override for hardened profile users who want e.g., hibernation to work.
A module for security options that are too small to warrant their own module.
The impetus for adding this module is to make it more convenient to override
the behavior of the hardened profile wrt user namespaces.
Without a dedicated option for user namespaces, the user needs to
1) know which sysctl knob controls userns
2) know how large a value the sysctl knob needs to allow e.g.,
Nix sandbox builds to work
In the future, other mitigations currently enabled by the hardened profile may
be promoted to options in this module.
This eliminates a theoretical risk of ASLR bypass due to the fixed address
mapping used by the legacy vsyscall mechanism. Modern glibc use vdso(7)
instead so there is no loss of functionality, but some programs may fail
to run in this configuration. Programs that fail to run because vsyscall
has been disabled will be logged to dmesg.
For background on virtual syscalls see https://lwn.net/Articles/446528/
Closes https://github.com/NixOS/nixpkgs/pull/25289
The idea is to provide a convenient way to enable most vanilla hardening
features in one go. The hardened profile, then, will serve as a place for
features that enhance security but cannot be enabled for all deployments
because they interfere with legitimate use cases (e.g., using ptrace to
debug problems in an already running process).
Closes https://github.com/NixOS/nixpkgs/pull/24680