During the last update, `hydra-notify` was rewritten as a daemon which
listens to postgresql notifications for each build[1]. The module
uses the `hydra-notify.service` unit from upstream's Hydra module and
the VM test ensures that email notifications are sent properly.
Also updated `hydra-init.service` to install `pg_trgm` on a local
database if needed[2].
[1] c7861b85c4
[2] 8a0a5ec3a3
We ship `https://cache.nixos.org` as binary cache by default which
automatically substitutes the test derivation used inside the Hydra
test. However it needs to be built locally to confirm that
`hydra-queue-runner` works properly.
Also inherited the platform name for the test derivation from `system`
to ensure that the build can be tested on each supported platform.
ZHF #68361
Always enable the UART because the VirtualBug bug that required running without the UART was fixed in 6.0.10. Stop using an old kernel version because the tests work with the default kernel.
(cherry picked from commit ae93571e8d04cebd69491a789d902d6481e05d3f)
In c814d72b51, a bunch of packages were
changed to use the pname attribute, among them were the quake3-demodata
and quake3-pointrelease which we use for the quake3 test.
Fortunately, having pname available means that we no longer need to
match using a prefix, so fixing this eval error also simplifies our
matching.
I directly pushed this to master because the change is non-controversial
and we can't break things that are already broken :-)
Signed-off-by: aszlig <aszlig@nix.build>
* remove kinetic
* release note
* add johanot as maintainer
nixos/ceph: create option for mgr_module_path
- since the upstream default is no longer correct in v14
* fix module, default location for libexec has changed
* ceph: fix test
* maintain only one version
* ceph-client: init
* include ceph-volume python tool in output
nixos/ceph: extraConfig, fix test, wait for ceph-mgr to become active
* run ceph with disk group permission
* add extraConfig option for the global section
needed per cluster
* clear up how ceph.conf is generated
* fix ceph testcase
Since https://github.com/NixOS/nixpkgs/pull/61321, local-fs.target is
part of sysinit.target again, meaning units without
DefaultDependencies=no will automatically depend on it, and the manual
set dependencies can be dropped.
It turns out that checking for the last mount time of an ext4 file
system isn't a very reliable way to check whether the file system was
properly unmounted.
When creating that test in the first place (88530e02b6),
I was reluctant to inspect the file system when the VM is down and was
searching for a way to check for a clean unmount *after* the file system
was mounted again to make sure we don't need to create a 512 MB raw
image on the host.
Fortunately however, when converting from qcow2, qemu-img actually
writes a sparse file, so for most file systems (that is, file systems
supporting sparse files) this shouldn't waste a lot of disk space.
So when investigating the flakiness, I found that whenever the test is
failing, the unmount of /test-x-initrd-mount was done *before* the final
step during which systemd remounts+unmounts all the remaining file
systems.
I haven't investigated why this is the case, but the test is a
regression test for https://github.com/NixOS/nixpkgs/issues/35268, which
actually didn't unmount the file system *at* *all*, so really all we
need to take care here is whether the unmount has happened and not
*how*.
To make sure that checking the filesystem state is enough for this, I
temporarily replaced the $machine->shutdown call with $machine->crash
and verified that the file system state is "not clean".
Signed-off-by: aszlig <aszlig@nix.build>
Fixes: https://github.com/NixOS/nixpkgs/issues/67555
* nixos/acme: Fix ordering of cert requests
When subsequent certificates would be added, they would
not wake up nginx correctly due to target units only being triggered
once. We now added more fine-grained systemd dependencies to make sure
nginx always is aware of new certificates and doesn't restart too early
resulting in a crash.
Furthermore, the acme module has been refactored. Mostly to get
rid of the deprecated PermissionStartOnly systemd options which were
deprecated. Below is a summary of changes made.
* Use SERVICE_RESULT to determine status
This was added in systemd v232. we don't have to keep track
of the EXITCODE ourselves anymore.
* Add regression test for requesting mutliple domains
* Deprecate 'directory' option
We now use systemd's StateDirectory option to manage
create and permissions of the acme state directory.
* The webroot is created using a systemd.tmpfiles.rules rule
instead of the preStart script.
* Depend on certs directly
By getting rid of the target units, we make sure ordering
is correct in the case that you add new certs after already
having deployed some.
Reason it broke before: acme-certificates.target would
be in active state, and if you then add a new cert, it
would still be active and hence nginx would restart
without even requesting a new cert. Not good! We
make the dependencies more fine-grained now. this should fix that
* Remove activationDelay option
It complicated the code a lot, and is rather arbitrary. What if
your activation script takes more than activationDelay seconds?
Instead, one should use systemd dependencies to make sure some
action happens before setting the certificate live.
e.g. If you want to wait until your cert is published in DNS DANE /
TLSA, you could create a unit that blocks until it appears in DNS:
```
RequiredBy=acme-${cert}.service
After=acme-${cert}.service
ExecStart=publish-wait-for-dns-script
```
There ver very many conflicts, basically all due to
name -> pname+version. Fortunately, almost everything was auto-resolved
by kdiff3, and for now I just fixed up a couple evaluation problems,
as verified by the tarball job. There might be some fallback to these
conflicts, but I believe it should be minimal.
Hydra nixpkgs: ?compare=1538299