This accomplishes multiple things:
- Allows us to start systemd without stage-2-init.sh. This was not
possible before because the environment would have been wrong
- `systemctl daemon-reexec` also changes the environment, giving us
newer tools for the fs packages
- Starts systemd in a fully clean environment, making everything more
consistent and pure
At some point, I'd like to make another attempt at
71f1f4884b ("openssl: stop static binaries referencing libs"), which
was reverted in 195c7da07d. One problem with my previous attempt is
that I moved OpenSSL's libraries to a lib output, but many dependent
packages were hardcoding the out output as the location of the
libraries. This patch fixes every such case I could find in the tree.
It won't have any effect immediately, but will mean these packages
will automatically use an OpenSSL lib output if it is reintroduced in
future.
This patch should cause very few rebuilds, because it shouldn't make
any change at all to most packages I'm touching. The few rebuilds
that are introduced come from when I've changed a package builder not
to use variable names like openssl.out in scripts / substitution
patterns, which would be confusing since they don't hardcode the
output any more.
I started by making the following global replacements:
${pkgs.openssl.out}/lib -> ${lib.getLib pkgs.openssl}/lib
${openssl.out}/lib -> ${lib.getLib openssl}/lib
Then I removed the ".out" suffix when part of the argument to
lib.makeLibraryPath, since that function uses lib.getLib internally.
Then I fixed up cases where openssl was part of the -L flag to the
compiler/linker, since that unambigously is referring to libraries.
Then I manually investigated and fixed the following packages:
- pycurl
- citrix-workspace
- ppp
- wraith
- unbound
- gambit
- acl2
I'm reasonably confindent in my fixes for all of them.
For acl2, since the openssl library paths are manually provided above
anyway, I don't think openssl is required separately as a build input
at all. Removing it doesn't make a difference to the output size, the
file list, or the closure.
I've tested evaluation with the OfBorg meta checks, to protect against
introducing evaluation failures.
We can perform most of the mkdir/ln/rm using systemd-tmpfiles
instead which cleans up the script.
/bin and /home are created by their activation script snippets
usbfs is deprecated and unused.
hwclock seems to be automatically executed by systemd on startup.
The mkswap to prevent hibernation cycles seems to be executed by systemd
as well since the provided regression tests succeeds.
Currently it is only possible to add upstream _system_ units. The option
systemd.additionalUpstreamSystemUnits can be used for this.
However, this was not yet possible for systemd.user. In a similar
fashion this was added to systemd-user.nix.
This is intended to have other modules add upstream units.
Use a quoted heredoc to inject installBootLoader safely into the script,
and restore the previous invocation of `system` with a single argument so
that shell commands keep working.
As of systemd/systemd@e908434458,
systemd-networkd now automatically configures routes to addresses
specified in AllowedIPs unless explicitly disabled with
"RouteTable=off".
This bug is so obscure and unlikely that I was honestly not able to
properly write a test for it. What happens is that we are calling
handleModifiedUnit() with $unitsToStart=\%unitsToRestart. We do this to
make sure that the unit is stopped before it's started again which is
not possible by regular means because the stop phase is already done
when calling the activation script.
recordUnit() still gets $startListFile, however which is the wrong file.
The bug would be triggered if an activation script requests a service
restart for a service that has `stopIfChanged = true` and
switch-to-configuration is killed before the restart phase was run. If
the script is run again, but the activation script is not requesting
more restarts, the unit would be started instead of restarted.
When initializing a system (e.g. first boot / livecd) we have no good
reference source for time. systemd-timesyncd however would revert back
to its configured fallback time (in our case 01.01.1980). Since we
probably don't want to hardcode a specific date as fallback we are now
using the current system time (wherever that might have come from) to
initialize the reference clock file.
The only systems that might be remotely affected by this change are
machines that have highly unreliable RTCs or those where the battery
that backs the RTC is running empty.
Historically these systems always had a tough time with anything time
related and likely required manual intervention.
For stateless systems (those that wipe / between reboots or our
installer CDs) this has the consequence that time will always be reset
to whatever the system comes up with on boot. This is likely the correct
time coming from an RTC. No harm done here the situation is likely
unchanged for them.
For stateful systems (those that retain the / partition across reboots)
there shouldn't be a change at all. They'll provide an initial clock
value once on their lifetime (during first boot / after installation).
From then onwards systemd-timesyncd will update the file with the newer
fallback time (that will be picked up on the next boot).
This effectively fixes the majority of all VM tests which were broken
because `/dev/vda` (or any other block device) wasn't mountable:
machine # mounting /dev/vda on /...
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device[ 2.820976] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 2.821757] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
machine # [ 2.821757] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
machine # [ 2.821757] Call Trace:
machine # [ 2.821757] dump_stack+0x6b/0x83
machine # [ 2.821757] panic+0x101/0x2c8
machine # [ 2.821757] do_exit.cold+0x14/0xb3
machine # [ 2.821757] do_group_exit+0x33/0xa0
machine # [ 2.821757] __x64_sys_exit_group+0x14/0x20
machine # [ 2.821757] do_syscall_64+0x33/0x40
machine # [ 2.821757] entry_SYSCALL_64_after_hwframe+0x44/0xa9
machine # [ 2.821757] RIP: 0033:0x7f67ec2800f6
machine # [ 2.821757] Code: 00 4c 8b 0d 2c 5d 11 00 eb 19 66 2e 0f 1f 84 00 00 00 00 00 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 64 41 89 01 eb da 66 2e 0f 1f 84 00
machine # [ 2.821757] RSP: 002b:00007fff8f5a71d8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
machine # [ 2.821757] RAX: ffffffffffffffda RBX: 0000000000699704 RCX: 00007f67ec2800f6
machine # [ 2.821757] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
machine # [ 2.821757] RBP: 0000000000000004 R08: 00000000000000e7 R09: ffffffffffffff80
machine # [ 2.821757] R10: 00007f67ec33f3e0 R11: 0000000000000202 R12: 000000000000000b
machine # [ 2.821757] R13: 00007fff8f5a75a8 R14: 0000000000000000 R15: 00000000004fc198
machine # [ 2.821757] Kernel Offset: 0x31e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
machine # [ 2.821757] Rebooting in 1 seconds..
This happened because the kernel failed to load modules such as `ext4`
from `boot.initrd.availableKernelModules`[1] on e.g. a `mount(2)` syscall.
The problem is that `kmod` isn't linked against `libpthread.so.0`
anymore because it got merged into `libc.so.6` (however, the .so still
exists), but still needs it:
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/x86_64", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # newfstatat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/lib", 0x7ffd951114c0, 0) = -1 ENOENT (No such file or directory)
machine # openat(AT_FDCWD, "/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-glibc-2.34-36/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
machine # writev(2, [{iov_base="/nix/store/kdc9n48ksdc1a8y8w512w"..., iov_len=69}, {iov_base=": ", iov_len=2}, {iov_base="error while loading shared libra"..., iov_len=36}, {iov_base=": ", iov_len=2}, {iov_base="libpthread.so.0", iov_len=15}, {iov_base=": ", iov_len=2}, {iov_base="cy
machine # ) = 184
machine # exit_group(127) = ?
machine # +++ exited with 127 +++
machine # mount: mounting /dev/vda on /mnt-root/ failed: No such device
machine # [ 19.167180] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
machine # [ 19.167711] CPU: 0 PID: 1 Comm: init Not tainted 5.10.72 #1-NixOS
This is not a problem
* inside stage-1 because `LD_LIBRARY_PATH` points to `$out/lib` of
extra-utils where `libpthread.so.6` also exists.
* on a running system because `${pkgs.glibc}/lib` is part of kmod's
rpath.
However this is a problem inside the kernel which calls `modprobe` (in
our case `kmod`) to load modules and doesn't know about
`LD_LIBRARY_PATH`. Also, the rpath-reference was nuked.
To work around this, the kernel's `modprobe`
(i.e. `/proc/sys/kernel/modprobe`) now points to a wrapper which
explicitly declares `LD_LIBRARY_PATH`. We can't use `makeWrapper` here
because `modprobe` itself must not be renamed. Otherwise, `kmod` (which
is the link-target of `modprobe`) won't work because it expects
`argv[0] == "modprobe"` to perform modprobe's tasks.
[1] https://nixos.org/manual/nixos/stable/options.html#opt-boot.initrd.availableKernelModules
systemd needs this so special characters (like the ones in wireguard
units that appear because they are part of base64) can be escaped using
the \x syntax.
Root of the issue is that `glob()` handles the backslash internally
which is obviously not what we want here.
Also add a test case and fix some perlcritic issues in the subroutine.
wtmp and btmp are created by systemd, so the rules are more appropriate there.
They can be disabled explicitly with something like
services.ogrotate.paths = {
"/var/log/btmp".enable = false;
"/var/log/wtmp".enable = false;
};
if required.
This is accomplished by comparing the hashes that the unit files
contain. By filtering for a special key `X-Reload-Triggers` in the
`[Unit]` section, we can differentiate between reloads and restarts.
Since activation scripts can request reloads of units as well, more
checking of this behaviour is implemented. If a unit is to be restarted,
it's never reloaded as well which would make no sense.
Also removes a useless subroutine and perl dependencies that are
nowadays handled by the propagated build inputs feature of
`perl.withPackages`.
The mount options need to be passed as a comma-separated list of options so that they
end up one a single Options line in the resulting mount unit.
The current code passed the options as a list, resulting in several Options lines in
the mount unit, all but the first of these were ignored by systemd however.
This behaviour is not clearly defined in the systemd man page.
The `nix.*` options, apart from options for setting up the
daemon itself, currently provide a lot of setting mappings
for the Nix daemon configuration. The scope of the mapping yields
convience, but the line where an option is considered essential
is blurry. For instance, the `extra-sandbox-paths` mapping is
provided without its primary consumer, and the corresponding
`sandbox-paths` option is also not mapped.
The current system increases the maintenance burden as maintainers have to
closely follow upstream changes. In this case, there are two state versions
of Nix which have to be maintained collectively, with different options
avaliable.
This commit aims to following the standard outlined in RFC 42[1] to
implement a structural setting pattern. The Nix configuration is encoded
at its core as key-value pairs which maps nicely to attribute sets, making
it feasible to express in the Nix language itself. Some existing options are
kept such as `buildMachines` and `registry` which present a simplified interface
to managing the respective settings. The interface is exposed as `nix.settings`.
Legacy configurations are mapped to their corresponding options under `nix.settings`
for backwards compatibility.
Various options settings in other nixos modules and relevant tests have been
updated to use structural setting for consistency.
The generation and validation of the configration file has been modified to
use `writeTextFile` instead of `runCommand` for clarity. Note that validation
is now mandatory as strict checking of options has been pushed down to the
derivation level due to freeformType consuming unmatched options. Furthermore,
validation can not occur when cross-compiling due to current limitations.
A new option `publicHostKey` was added to the `buildMachines`
submodule corresponding to the base64 encoded public host key settings
exposed in the builder syntax. The build machine generation was subsequently
rewritten to use `concatStringsSep` for better performance by grouping
concatenations.
[1] - https://github.com/NixOS/rfcs/blob/master/rfcs/0042-config-option.md
This option behaves exactly like `boot.extraModprobeConfig`, except that it also includes the generated modprobe.d file in the initrd.
Many years ago, someone tried to include the normal modprobe.d/nixos.conf file generated by `boot.extraModprobeConfig` in the initrd: 0aa2c1dc46. This file contains a reference to a directory with firmware files inside. Including firmware in the initrd made it too big, so the commit was reverted again in 4a4c051a95.
The `boot.extraModprobeConfig` option not changing the initrd caused me much confusion because I tried to set the maximum cache size for ZFS and it didn't work.
Closes https://github.com/NixOS/nixpkgs/issues/25456.
Modules that do not depend on e.g. toplevel should not have to include it just to set
things in `system.build`. As a general rule, this keeps tests simple, usage flexible
and evaluation fast. While one module is insignificant, consistency and good practices
are.
If the Nix daemon has never been enabled (nix.enable has always been
set to false), the gcroots directory won't exist. If the Nix daemon
is later enabled, the GC roots for booted-system and current-system
will be missing, and they might end up being garbage collected. Since
it's cheap to add GC roots even if the daemon will never be enabled,
let's just always add them so we're okay in the case where the daemon
is enabled later.
- Fully get rid of `parseKeyValues` and use systemctl features for that
- Add some regex modifiers recommended by perlcritic
- Get rid of a postfix if
- Sort units when showing their status
- Clean the logic for showing what failed from `elif` to `next`
- Switch from `state` to `substate` for `auto-restart` because that's
actually where the value is stored
- Show status of units with one single systemctl call and get rid of
COLUMNS in favor of --full
- Add a test for failing units
This replaces the naive K=V unit parser with a proper INI parser from a
library and adds proper support for override files. Also adds a bunch of
comments about parsing, I hope this makes it easier to understand and
maintain in the future.
There are multiple reasons to do so, the first one is just general
correctness with is nice imo. But to get to more serious reasons (I
didn't put in all that effort for nothing) is that this is the first
step torwards more clever restart/reload handling. By using a library
like Data::Compare a future PR could replace the current way of
fingerprinting units (which is to compare store paths) by comparing the
hashes. This is more precise because units won't get restarted because
the order of the options change, comments are added, some dependency of
writeText changes, .... Also this allows us to add a feature like
`X-Reload-Triggers` so the unit can either be reloaded when these change
or restarted when everything else changes, giving module authors the
ability to have their services reloaded without having to fear that
updates are not applied because the service doesn't get restarted.
Another reason why this feature is nice is that now that the unit files
are parsed correctly (and values are just extracted from one section),
potential future rewrites can just rely on some INI library without
having to implement their own weird parser that is compatible with this
script.
This also comes with a new subroutine to handle systemd booleans because
I thought the current way of handling it was just ugly. This also allows
overriding values this script reads in an override file.
Apart from making this script more compatible with the world around it,
this also fixes two issues I saw bugging exactly 0 (zero) people. First
is that this script now supports multiple override files, also ones that
are not called override.conf and the second one is that `1` and `on` are
treated as bools by systemd but were previously not parsed as such by
switch-to-configuration.
This removes `/run/nixos/activation-reload-list` (which we will need in
the future when reworking the reload logic) and makes
`/run/nixos/activation-restart-list` honor `restartIfChanged` and
`reloadIfChanged`. This way activation scripts don't have to bother with
choosing between reloading and restarting.
I was confused why I couldn't find a mention of udev.log_priority in
systemd-udevd.service(8). It turns out that it was renamed[1] to
udev.log_level. The old name is still accepted, but it'll avoid
further confusion if we use the new name in our documentation.
[1]: 64a3494c3d
most modules can be evaluated for their documentation in a very
restricted environment that doesn't include all of nixpkgs. this
evaluation can then be cached and reused for subsequent builds, merging
only documentation that has changed into the cached set. since nixos
ships with a large number of modules of which only a few are used in any
given config this can save evaluation a huge percentage of nixos
options available in any given config.
in tests of this caching, despite having to copy most of nixos/, saves
about 80% of the time needed to build the system manual, or about two
second on the machine used for testing. build time for a full system
config shrank from 9.4s to 7.4s, while turning documentation off
entirely shortened the build to 7.1s.
Add `shellDryRun` to the generic stdenv and substitute it for uses of
`${stdenv.shell} -n`. The point of this layer of abstraction is to add
the flag `-O extglob`, which resolves#126344 in a more direct way.
some options have default that are best described in prose, such as
defaults that depend on the system stateVersion, defaults that are
derivations specific to the surrounding context, or those where the
expression is much longer and harder to understand than a simple text
snippet.