nixpkgs/pkgs/build-support/kernel/make-initrd-ng
2022-03-22 07:02:22 -04:00

What is this for?

NixOS's traditional initrd is generated by listing the paths that should be included in the initrd and copying the full runtime closure of those paths into the archive. For almost any executable, this means copying the entirety of large packages such as glibc, when only a few files, such as the shared libraries, are actually needed. To work around this, NixOS does a variety of patchwork on the files being copied in so that they refer only to small, patched-up paths. For instance, executables and their shared library dependencies are copied into an extraUtils derivation, and every ELF file is patched to refer to files in that output.

The problem is that some things are difficult to patch correctly. For instance, systemd bakes the path to the mount command into the binary, so patchelf is no help. It is very often easier to simply copy the desired files to their original store locations in the initrd and not copy their entire runtime closure. This does put the burden on the developer to ensure that all necessary dependencies are copied in, as closures won't be consulted. However, full closures are rarely desirable, so in the traditional initrd the developer was likely to patch the dependencies by hand anyway.

How it works

This program is similar to its inspiration (find-libs from the traditional initrd), except that it also handles symlinks and directories according to certain rules. As input, it receives a sequence of pairs of paths. The first path is an object to copy into initrd. The second path (if not empty) is the path to a symlink that should be placed in the initrd, pointing to that object. How that object is copied depends on its type.

  1. A regular file is copied directly to the same absolute path in the initrd.

    • If it is also an ELF file, then all of its direct shared library dependencies are also listed as objects to be copied.

  2. A directory's direct children are listed as objects to be copied, and a directory at the same absolute path in the initrd is created.

  3. A symlink's target is listed as an object to be copied.
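The three rules above amount to a worklist traversal. The following is a minimal sketch of that idea, not the tool's actual code: it runs against a hypothetical in-memory `Node` model instead of the real filesystem, and the names `Node` and `plan` are assumptions made for illustration. Shared library dependencies are supplied directly in the model rather than read from ELF headers.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical in-memory stand-in for a filesystem entry (illustration only).
#[derive(Clone, Debug)]
enum Node {
    /// Regular file; `needed` holds the absolute paths of its direct shared
    /// library dependencies (empty for anything that is not an ELF file).
    File { needed: Vec<String> },
    /// Directory; `children` holds the names of its direct children.
    Dir { children: Vec<String> },
    /// Symlink; `target` is assumed to be an absolute path.
    Symlink { target: String },
}

/// Returns every object that would end up in the initrd for the given inputs.
/// Paths missing from the model are skipped silently in this sketch.
fn plan(fs: &BTreeMap<String, Node>, inputs: &[&str]) -> BTreeSet<String> {
    let mut queue: VecDeque<String> = inputs.iter().map(|s| s.to_string()).collect();
    let mut seen = BTreeSet::new();

    while let Some(path) = queue.pop_front() {
        if seen.contains(&path) {
            continue; // already processed, so cycles terminate
        }
        let node = match fs.get(&path) {
            Some(node) => node,
            None => continue,
        };
        seen.insert(path.clone());

        match node {
            // Rule 1: copy the file and list its direct library dependencies.
            Node::File { needed } => queue.extend(needed.iter().cloned()),
            // Rule 2: create the directory and list its direct children.
            Node::Dir { children } => {
                queue.extend(children.iter().map(|c| format!("{}/{}", path, c)))
            }
            // Rule 3: list the symlink's target.
            Node::Symlink { target } => queue.push_back(target.clone()),
        }
    }
    seen
}
```

Nothing is recursed into directly: each rule only appends to the queue, and the new entries are handled when their turn comes. Listing a file such as `/bin/mount` in this model never pulls in `/lib` as an object, only the specific libraries that the file names.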

There are a few quirks to mention here. First, the term "object" refers to the final file path that the developer intends to have copied into the initrd. This means a parent directory is not considered an object just because its child was listed as an object in the program input; instead, those intermediate directories are simply created in support of the target object. Second, shared libraries, directory children, and symlink targets aren't immediately recursed into, because they simply get listed as objects themselves and are therefore traversed when they themselves are processed. Finally, symlinks in the intermediate directories leading to an object are preserved, meaning an input object /a/symlink/b, where /a/symlink points to /target, will just result in the initrd containing /a/symlink -> /target and /target/b, even if /target has other children. Preserving symlinks in this manner is important for things like systemd.
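The symlink quirk can be illustrated with a small sketch. It assumes a hypothetical `symlinks` table mapping absolute symlink paths to absolute targets, and the function name `preserve_symlinks` is invented for this example. It deliberately ignores relative targets and chains of symlinks, which a real implementation would have to handle.

```rust
use std::collections::BTreeMap;

/// Walks `path` one component at a time. Whenever an intermediate prefix is a
/// symlink (according to the hypothetical `symlinks` table), the link is
/// recorded and the walk continues from the link's target. Returns the links
/// to recreate in the initrd and the resolved location of the object itself.
fn preserve_symlinks(
    symlinks: &BTreeMap<String, String>,
    path: &str,
) -> (Vec<(String, String)>, String) {
    let components: Vec<&str> = path.split('/').filter(|c| !c.is_empty()).collect();
    let mut links = Vec::new();
    let mut current = String::new();

    for (i, component) in components.iter().enumerate() {
        current.push('/');
        current.push_str(component);

        // Only intermediate components are checked here; a symlink in the
        // final position is an object in its own right and is left alone.
        let is_intermediate = i + 1 < components.len();
        if is_intermediate {
            if let Some(target) = symlinks.get(&current) {
                links.push((current.clone(), target.clone()));
                current = target.clone();
            }
        }
    }
    (links, current)
}
```

With a table containing only `/a/symlink -> /target`, calling `preserve_symlinks` on `/a/symlink/b` yields the single link `/a/symlink -> /target` and the object path `/target/b`. The other children of `/target` are never touched.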

These rules automate the most important and obviously necessary copying that needs to be done in most cases, allowing programs and configuration files to go unpatched, while keeping the content of the initrd to a minimum.

Why Rust?

  • A prototype of this logic was written in Bash, in keeping with its find-libs ancestor, but that program was difficult to write and ended up taking several minutes to run. This program runs in less than a second, and the code is substantially easier to work with.

  • This will not require end users to install a Rust toolchain to use NixOS, as long as this tool is cached by Hydra. And if you're bootstrapping NixOS from source, rustc is already required anyway.

  • Rust was favored over Python for its type system, and because if you want to go fast, why not go really fast?