This effectively fixes the majority of all VM tests which were broken
because `/dev/vda` (or any other block device) wasn't mountable:
machine # mounting /dev/vda on /...
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device[ 2.820976] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 2.821757] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
machine # [ 2.821757] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
machine # [ 2.821757] Call Trace:
machine # [ 2.821757] dump_stack+0x6b/0x83
machine # [ 2.821757] panic+0x101/0x2c8
machine # [ 2.821757] do_exit.cold+0x14/0xb3
machine # [ 2.821757] do_group_exit+0x33/0xa0
machine # [ 2.821757] __x64_sys_exit_group+0x14/0x20
machine # [ 2.821757] do_syscall_64+0x33/0x40
machine # [ 2.821757] entry_SYSCALL_64_after_hwframe+0x44/0xa9
machine # [ 2.821757] RIP: 0033:0x7f67ec2800f6
machine # [ 2.821757] Code: 00 4c 8b 0d 2c 5d 11 00 eb 19 66 2e 0f 1f 84 00 00 00 00 00 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 64 41 89 01 eb da 66 2e 0f 1f 84 00
machine # [ 2.821757] RSP: 002b:00007fff8f5a71d8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
machine # [ 2.821757] RAX: ffffffffffffffda RBX: 0000000000699704 RCX: 00007f67ec2800f6
machine # [ 2.821757] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
machine # [ 2.821757] RBP: 0000000000000004 R08: 00000000000000e7 R09: ffffffffffffff80
machine # [ 2.821757] R10: 00007f67ec33f3e0 R11: 0000000000000202 R12: 000000000000000b
machine # [ 2.821757] R13: 00007fff8f5a75a8 R14: 0000000000000000 R15: 00000000004fc198
machine # [ 2.821757] Kernel Offset: 0x31e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
machine # [ 2.821757] Rebooting in 1 seconds..
This happened because the kernel failed to load modules such as `ext4`
from `boot.initrd.availableKernelModules`[1] on e.g. a `mount(2)` syscall.
The problem is that `kmod` isn't linked against `libpthread.so.0`
anymore because it got merged into `libc.so.6` (however, the .so still
exists), but still needs it:
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # writev(2, [{iov_base="/nix/store/kdc9n48ksdc1a8y8w512w"..., iov_len=69}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libpthread.so.0", iov_len=15}, {iov_base=": ", iov_len=2}, {iov_base="cy
machine # ) = 184
machine # exit_group(127) = ?
machine # +++ exited with 127 +++
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device
machine # [ 19.167180] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 19.167711] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
This is not a problem
* inside stage-1 because `LD_LIBRARY_PATH` points to `$out/lib` of
extra-utils where `libpthread.so.6` also exists.
* on a running system because `${pkgs.glibc}/lib` is part of kmod's
rpath.
However this is a problem inside the kernel which calls `modprobe` (in
our case `kmod`) to load modules and doesn't know about
`LD_LIBRARY_PATH`. Also, the rpath-reference was nuked.
To work around this, the kernel's `modprobe`
(i.e. `/proc/sys/kernel/modprobe`) now points to a wrapper which
explicitly declares `LD_LIBRARY_PATH`. We can't use `makeWrapper` here
because `modprobe` itself must not be renamed. Otherwise, `kmod` (which
is the link-target of `modprobe`) won't work because it expects
`argv[0] == "modprobe"` to perform modprobe's tasks.
[1] https://nixos.org/manual/nixos/stable/options.html#opt-boot.initrd.availableKernelModules
systemd needs this so special characters (like the ones in wireguard
units that appear because they are part of base64) can be escaped using
the \x syntax.
Root of the issue is that `glob()` handles the backslash internally
which is obviously not what we want here.
Also add a test case and fix some perlcritic issues in the subroutine.
wtmp and btmp are created by systemd, so the rules are more appropriate there.
They can be disabled explicitly with something like
services.ogrotate.paths = {
"/var/log/btmp".enable = false;
"/var/log/wtmp".enable = false;
};
if required.
This is accomplished by comparing the hashes that the unit files
contain. By filtering for a special key `X-Reload-Triggers` in the
`[Unit]` section, we can differentiate between reloads and restarts.
Since activation scripts can request reloads of units as well, more
checking of this behaviour is implemented. If a unit is to be restarted,
it's never reloaded as well which would make no sense.
Also removes a useless subroutine and perl dependencies that are
nowadays handled by the propagated build inputs feature of
`perl.withPackages`.
The mount options need to be passed as a comma-separated list of options so that they
end up one a single Options line in the resulting mount unit.
The current code passed the options as a list, resulting in several Options lines in
the mount unit, all but the first of these were ignored by systemd however.
This behaviour is not clearly defined in the systemd man page.
The `nix.*` options, apart from options for setting up the
daemon itself, currently provide a lot of setting mappings
for the Nix daemon configuration. The scope of the mapping yields
convience, but the line where an option is considered essential
is blurry. For instance, the `extra-sandbox-paths` mapping is
provided without its primary consumer, and the corresponding
`sandbox-paths` option is also not mapped.
The current system increases the maintenance burden as maintainers have to
closely follow upstream changes. In this case, there are two state versions
of Nix which have to be maintained collectively, with different options
avaliable.
This commit aims to following the standard outlined in RFC 42[1] to
implement a structural setting pattern. The Nix configuration is encoded
at its core as key-value pairs which maps nicely to attribute sets, making
it feasible to express in the Nix language itself. Some existing options are
kept such as `buildMachines` and `registry` which present a simplified interface
to managing the respective settings. The interface is exposed as `nix.settings`.
Legacy configurations are mapped to their corresponding options under `nix.settings`
for backwards compatibility.
Various options settings in other nixos modules and relevant tests have been
updated to use structural setting for consistency.
The generation and validation of the configration file has been modified to
use `writeTextFile` instead of `runCommand` for clarity. Note that validation
is now mandatory as strict checking of options has been pushed down to the
derivation level due to freeformType consuming unmatched options. Furthermore,
validation can not occur when cross-compiling due to current limitations.
A new option `publicHostKey` was added to the `buildMachines`
submodule corresponding to the base64 encoded public host key settings
exposed in the builder syntax. The build machine generation was subsequently
rewritten to use `concatStringsSep` for better performance by grouping
concatenations.
[1] - https://github.com/NixOS/rfcs/blob/master/rfcs/0042-config-option.md
This option behaves exactly like `boot.extraModprobeConfig`, except that it also includes the generated modprobe.d file in the initrd.
Many years ago, someone tried to include the normal modprobe.d/nixos.conf file generated by `boot.extraModprobeConfig` in the initrd: 0aa2c1dc46. This file contains a reference to a directory with firmware files inside. Including firmware in the initrd made it too big, so the commit was reverted again in 4a4c051a95.
The `boot.extraModprobeConfig` option not changing the initrd caused me much confusion because I tried to set the maximum cache size for ZFS and it didn't work.
Closes https://github.com/NixOS/nixpkgs/issues/25456.
Modules that do not depend on e.g. toplevel should not have to include it just to set
things in `system.build`. As a general rule, this keeps tests simple, usage flexible
and evaluation fast. While one module is insignificant, consistency and good practices
are.
If the Nix daemon has never been enabled (nix.enable has always been
set to false), the gcroots directory won't exist. If the Nix daemon
is later enabled, the GC roots for booted-system and current-system
will be missing, and they might end up being garbage collected. Since
it's cheap to add GC roots even if the daemon will never be enabled,
let's just always add them so we're okay in the case where the daemon
is enabled later.
- Fully get rid of `parseKeyValues` and use systemctl features for that
- Add some regex modifiers recommended by perlcritic
- Get rid of a postfix if
- Sort units when showing their status
- Clean the logic for showing what failed from `elif` to `next`
- Switch from `state` to `substate` for `auto-restart` because that's
actually where the value is stored
- Show status of units with one single systemctl call and get rid of
COLUMNS in favor of --full
- Add a test for failing units
This replaces the naive K=V unit parser with a proper INI parser from a
library and adds proper support for override files. Also adds a bunch of
comments about parsing, I hope this makes it easier to understand and
maintain in the future.
There are multiple reasons to do so, the first one is just general
correctness with is nice imo. But to get to more serious reasons (I
didn't put in all that effort for nothing) is that this is the first
step torwards more clever restart/reload handling. By using a library
like Data::Compare a future PR could replace the current way of
fingerprinting units (which is to compare store paths) by comparing the
hashes. This is more precise because units won't get restarted because
the order of the options change, comments are added, some dependency of
writeText changes, .... Also this allows us to add a feature like
`X-Reload-Triggers` so the unit can either be reloaded when these change
or restarted when everything else changes, giving module authors the
ability to have their services reloaded without having to fear that
updates are not applied because the service doesn't get restarted.
Another reason why this feature is nice is that now that the unit files
are parsed correctly (and values are just extracted from one section),
potential future rewrites can just rely on some INI library without
having to implement their own weird parser that is compatible with this
script.
This also comes with a new subroutine to handle systemd booleans because
I thought the current way of handling it was just ugly. This also allows
overriding values this script reads in an override file.
Apart from making this script more compatible with the world around it,
this also fixes two issues I saw bugging exactly 0 (zero) people. First
is that this script now supports multiple override files, also ones that
are not called override.conf and the second one is that `1` and `on` are
treated as bools by systemd but were previously not parsed as such by
switch-to-configuration.
This removes `/run/nixos/activation-reload-list` (which we will need in
the future when reworking the reload logic) and makes
`/run/nixos/activation-restart-list` honor `restartIfChanged` and
`reloadIfChanged`. This way activation scripts don't have to bother with
choosing between reloading and restarting.