We spent a whole afternoon debugging this, because upstream has very
bad software quality and the error messages were incredibly
misleading.
So let’s document it for the sanity of other people.
Btw, I think the implementation of our module is pretty brittle,
especially the part about diffing tokens to check whether they
changed. We should rather just request a new builder registration
every time, it’s not that much overhead, and always set `replace` so
it is idempotent.
This commit changes how we deal with the current token, i.e., the token
which may exist from a previous runner registration, and the configured
token, i.e., the path set for the respective NixOS configuration option.
Until now, we copied the configured and the current token (if any) to
the runtime directory to compare them. The path of the current token may
reference a file which is only accessible to specific users (even only
root). Therefore, we ran the copying of credentials with elevated
privileges by prefixing the `ExecStartPre=` script with a `+` (see
systemd.service(5)). In this script, we also changed the owner of the
files to the service user. Apparently, however, the user/group pair
sometimes did not exist because we use `DynamicUser=`.
To address this issue, we no longer change the owner of the file.
Instead, we change the file permissions to 0666 to allow the runner
configuration script (runs with full sandboxing) to read-write the file.
Due to the current permissions of the runtime directory (0755), this
would expose the token. Therefore, we process the tokens in the state
directory, which is only accessible to the service user.
If a new token file exists in the state directory, the configuration
script should trigger a new runner registration. Afterward, it deletes
the new token file. The token is still available using the path of the
current token which is inaccessible within the service's sandbox.
This addresses #120263 in part, by allowing users to override the
github-runner derivation that is bound to turn non-functional via the
self-update mechanism. (And it'll allow using a buildFHSUserEnv-based
derivation, if someone ends up building that!)
* github-runner: init at 2.277.1
* nixos/github-runner: initial version
* nixos/github-runner: add warning if tokenFile in Nix store
* github-runner: don't accept unexpected attrs
* github-runner: formatting nits
* github-runner: add pre and post hooks to checkPhase
* nixos/github-runner: update ExecStartPre= comment
* nixos/github-runner: adapt tokenFile option description
Also note that not only a change to the option value will trigger a
reconfiguration but also modifications to the file's content.
* nixos/github-runner: remove mkDefault for DynamicUser=
* nixos/github-runner: create a parent for systemd dirs
Adds a parent directory "github-runner/" to all of the systemd lifecycle
directories StateDirectory=, RuntimeDirectory= and LogDirectory=.
Doing this has two motivations:
1. Something like this would required if we want to support multiple
runners configurations. Please note that this is already possible
using NixOS containers.
2. Having an additional parent directory makes it easier to remap
any of the directories. Without a parent, systemd is going to
complain if, for example, the given StateDirectory= is a symlink.
* nixos/github-runner: use specifier to get abs runtime path
* nixos/github-runner: use hostname as default for option `name`
Until now, the runner registration did not set the `--name` argument if
the configuration option was `null`, the default for the option.
According to GitHub's documentation, this instructs the registration
script to use the machine's hostname.
This commit causes the registration to always pass the `--name` argument
to the runner configuration script. The option now defaults to
`networking.hostName` which should be always set on NixOS.
This change becomes necessary as the systemd service name includes the
name of the runner since fcfa809 and, hence, expects it to be set. Thus,
an unset `name` option leads to an error.
* nixos/github-runner: use types.str for `name` option
Forcing a `name` option to comply with a pattern which could also be
used as a hostname is probably not required by GitHub.
* nixos/github-runner: pass dir paths explicitly for ExecStartPre=
* nixos/github-runner: update variable and script naming
* nixos/github-runner: let systemd choose the user/group
User and group naming restrictions are a complex topic [1] that I don't
even want to touch. Let systemd figure out the username and group and
reference it in our scripts through the USER environment variable.
[1] https://systemd.io/USER_NAMES/
* Revert "nixos/github-runner: use types.str for `name` option"
The escaping applied to the subdirectory paths given to StateDirectory=,
RuntimeDirectory= and LogsDirectory= apparently doesn't use the same
strategy that is used to escape unit names (cf. systemd-escape(1)). This
makes it unreasonably hard to construct reliable paths which work for
StateDirectory=/RuntimeDirectory=/LogsDirectory= and ExecStartPre=.
Against this background, I decided to (re-)apply restrictions to the
name a user might give for the GitHub runner. The pattern for
`networking.hostName` seems like a reasonable choice, also as its value
is the default if the `name` option isn't set.
This reverts commit 193ac67ba337990c22126da24a775c497dbc7e7d.
* nixos/github-runner: use types.path for `tokenFile` option
* nixos/github-runner: escape options used as shell arguments
* nixos/github-runner: wait for network-online.target
* github-runner: ignore additional online tests