Systemd provides an option for allocating DynamicUsers
which we want to use in NixOS to harden service configuration.
However, we discovered that the user wasn't allocated properly
for services. After some digging this turned out to be, of course,
a cache inconsistency problem.
When a DynamicUser creation is performed, Systemd check beforehand
whether the requested user already exists statically. If it does,
it bails out. If it doesn't, systemd continues with allocating the
user.
However, by checking whether the user exists, nscd will store
the fact that the user does not exist in it's negative cache.
When the service tries to lookup what user is associated to its
uid (By calling whoami, for example), it will try to consult
libnss_systemd.so However this will read from the cache and tell
report that the user doesn't exist, and thus will return that
there is no user associated with the uid. It will continue
to do so for the cache duration time. If the service
doesn't immediately looks up its username, this bug is not
triggered, as the cache will be invalidated around this time.
However, if the service is quick enough, it might end up
in a situation where it's incorrectly reported that the
user doesn't exist.
Preferably, we would not be using nscd at all. But we need to
use it because glibc reads nss modules from /etc/nsswitch.conf
by looking relative to the global LD_LIBRARY_PATH. Because LD_LIBRARY_PATH
is not set globally (as that would lead to impurities and ABI issues),
glibc will fail to find any nss modules.
Instead, as a hack, we start up nscd with LD_LIBRARY_PATH set
for only that service. Glibc will forward all nss syscalls to
nscd, which will then respect the LD_LIBRARY_PATH and only
read from locations specified in the NixOS config.
we can load nss modules in a pure fashion.
However, I think by accident, we just copied over the default
settings of nscd, which actually caches user and group lookups.
We already disable this when sssd is enabled, as this interferes
with the correct working of libnss_sss.so as it already
does its own caching of LDAP requests.
(See https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/usingnscd-sssd)
Because nscd caching is now also interferring with libnss_systemd.so
and probably also with other nsss modules, lets just pre-emptively
disable caching for now for all options related to users and groups,
but keep it for caching hosts ans services lookups.
Note that we can not just put in /etc/nscd.conf:
enable-cache passwd no
As this will actually cause glibc to _not_ forward the call to nscd
at all, and thus never reach the nss modules. Instead we set
the negative and positive cache ttls to 0 seconds as a workaround.
This way, Glibc will always forward requests to nscd, but results
will never be cached.
Fixes#50273
Could also move kdc.conf, but this makes it inconvenient to use command line
utilities with heimdal, as it would require specifying --config-file with every
command.
Allow switching out kerberos server implementation.
Sharing config is probably sensible, but implementation is different enough to
be worth splitting into two files. Not sure this is the correct way to split an
implementation, but it works for now.
Uses the switch from config.krb5 to select implementation.
As cassandra start script hardcodes the location of logback
configuration to `CASSANDRA_CONF_DIR/logback.xml` there is no way to
pass an alternate file via `$JVM_OPTS` for example.
Also, without logback configuration DEBUG level is used which is not
necessary for standard usage.
With this commit a default logback configuration is set with log level
INFO.
Configuration borrowed from:
https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configLoggingLevels.html
Instead of setting User/Group only when DynamicUser is disabled, the
previous version of the code set it only when it was enabled. This
caused services with DynamicUser enabled to actually run as nobody, and
services without DynamicUser enabled to run as root.
Regression from fbb7e0c82f.
This cleans up the CockroachDB expression, with a few suggestions from
@aszlig.
However, it brought up the note of using systemd's StateDirectory=
directive, which is a nice feature for managing long-term data files,
especially for UID/GID assigned services. However, it can only manage
directories under /var/lib (for global services), so it has to introduce
a special path to make use of it at all in the case someone wants a path
at a different root.
While the dataDir directive at the NixOS level is _occasionally_ useful,
I've gone ahead and removed it for now, as this expression is so new,
and it makes the expression cleaner, while other kinks can be worked out
and people can test drive it.
CockroachDB's dataDir directive, instead, has been replaced with
systemd's StateDirectory management to place the data under
/var/lib/cockroachdb for all uses.
There's an included RequiresMountsFor= clause like usual though, so if
people want dependencies for any kind of mounted device at boot
time/before database startup, it's easy to specify using their own
mount/filesystems clause.
This can also be reverted if necessary, but, we can see if anyone ever
actually wants that later on before doing it -- it's a backwards
compatible change, anyway.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Currently there are two calls to curl in the reloadScript, neither which
check for errors. If something is misconfigured (like wrong authToken),
the only trace that something wrong happened is this log message:
Asking Jenkins to reload config
<h1>Bad Message 400</h1><pre>reason: Illegal character VCHAR='<'</pre>
The service isn't marked as failed, so it's easy to miss.
Fix it by passing --fail to curl.
While at it:
* Add $curl_opts and $jenkins_url variables to keep the curl command
lines DRY.
* Add --show-error to curl to show short error message explanation when
things go wrong (like HTTP 401 error).
* Lower-case the $CRUMB variable as upper case is for exported environment
variables.
The new behaviour, when having wrong accessToken:
Asking Jenkins to reload config
curl: (22) The requested URL returned error: 401
And the service is clearly marked as failed in `systemctl --failed`.
This also includes a full end-to-end CockroachDB clustering test to
ensure everything basically works. However, this test is not currently
enabled by default, though it can be run manually. See the included
comments in the test for more information.
Closes#51306. Closes#38665.
Co-authored-by: Austin Seipp <aseipp@pobox.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
As the comment notes, restarts/exits of dhcpcd generally require
restarting the NTP service since, if name resolution fails for a pool of
servers, the service might break itself. To be on the safe side, try
restarting Chrony in these instances, too.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Setting the server list to be empty is useful e.g. for hardware-only
or virtualized reference clocks that are passed through to the system
directly. In this case, initstepslew has no effect, so don't emit it.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
Part of #49783. NextCloud tracks in its `config.php` the application's
state which makes it hard for the module to modify configurations during
upgrades.
It will take time until the issue is properly fixed, therefore we
decided to warn about this in the manual.
This PR addresses two things:
* Adding a basic example for nextcloud. I figured it to be helpful to
add some basic usage instructions when adding a new manual entry.
Advanced documentation may follow later.
For now this document actively links to the service options, so users
are guided to the remaining options that can be helpful in certain
cases.
* Add a warning about upgrades and manual changes in
`/var/lib/nextcloud`. This will be fixed in the future, but it's
definetely helpful to document the current issues in the manual (as
proposed in https://github.com/NixOS/nixpkgs/issues/49783#issuecomment-439691127).
This allows, finally, proper detection when postgresql is ready to
accept connections. Until now, it was possible that services depending
on postgresql would fail in a race condition trying to connect
to postgresql.
When reworking the rspamd workers I disallowed `proxy` as a type and
instead used `rspamd_proxy` which is the correct name for that worker
type. That change breaks peoples existing config and so I have made this
commit which allows `proxy` as a worker type again but makes it behave
as `rspamd_proxy` and prints a warning if you use it.
This commit adds an assertion that checks that either `configFile` or
`configuration` is configured for alertmanager. The alertmanager config
can not be an empty attributeset. The check executed with `amtool` fails
before the service even has the chance to start. We should probably not
allow a broken alertmanager configuration anyway.
This also introduces a test for alertmanager configuration that piggy
backs on the existing prometheus tests.
The default of systemd is to kill the
the whole cgroup of a service. For slurmd
this means that all running jobs get killed
as well whenever the configuration is updated (and activated).
To avoid this behaviour we set "KillMode=process"
to kill only slurmd on reload. This is how
slurm configures the systemd service.
See:
https://bugs.schedmd.com/show_bug.cgi?id=2095#c24508f866ea1
Otherwise netdata will not find python modules.
To make sure netdata still pick up our setuid version of apps.plugin
we rename the original executable.
These days build systems are more robust w.r.t. to concurrency.
Most users will have at least two cores in their machines.
Therefore I suggest to increase the number of cores used for building.
fixes#50376
Based on reports X wouldn't start out of the box and seems OK now.
In case there are still some problems, we can improve later.
I checked that nixos.tests.virtualbox.* still succeed.
By using types.lines for 'config', we can specify monit configurations
in lots of modules and they can all be automatically combined together
with newlines. This is desireable because different modules might want
to each specify the small monitoring task specific to their service.
This commit also updates the module to use current idioms.
The `rmilter` module has options for configuring `postfix` to use it but
since that module is deprecated because rspamd now has a builtin worker
that supports the milter protocol this commit adds similar `postfix`
integration options directly to the `rspamd` module.
* Added license: GPLv2.
* Updated homepage and description.
* CFLAGS are no longer necessary as of version 2.2.0.
* Option '-a ::' is no longer necessary as of version 2.2.0.
While this seems silly at first (it's already given as start parameter
to mysqld), it seems like xtrabackup needs that sometimes.
Without it, a Galera cluster cannot be run using the xtrabackup
replication method.
The lines stored in `extraConfig` and `worker.<name?>.extraConfig`
should take precedent over values from included files but in order to do
this in rspamd UCL they need to be stored in a file that then gets
included with a high priority. This commit uses the overrides option to
store the value of the two `extraConfig` options in `extra-config.inc`
and `worker-<name?>.inc` respectively.
When the workers option for rspamd was originally implemented it was
based on a flawed understanding of how workers are configured in rspamd.
This meant that while rspamd supports configuring multiple workers of
the same type, so that different controller workers could have different
passwords, the NixOS module did not support this because it would write
an invalid configuration file if you tried.
Specifically a configuration like the one below:
```
workers.controller = {};
workers.controller2 = {
type = "controller";
};
```
Would result in a rspamd configuration of:
```
worker {
type = "controller";
count = 1;
.include "$CONFDIR/worker-controller.inc"
}
worker "controller2" {
type = "controller";
count = 1;
}
```
While to get multiple controller workers it should instead be:
```
worker "controller" {
type = "controller";
count = 1;
.include "$CONFDIR/worker-controller.inc"
}
worker "controller" {
type = "controller";
count = 1;
}
```
When implementing #49620 I included an enable option for both the
locals and overrides options but the code writing the files didn't
actually look at enable and so would write the file regardless of its
value. I also set the type to loaOf which should have been attrsOf
since the code was not written to handle the options being lists.
This fixes both of those issues.
With `promtool` we can check the validity of a configuration before
deploying it. This avoids situations where you would end up with a
broken monitoring system without noticing it - since the monitoring
broke down. :-)
Gluster's pidfile handling is bug-ridden.
I have fixed https://bugzilla.redhat.com/show_bug.cgi?id=1509340
in an attempt to improve it but that is far from enough.
The gluster developers describe another pidfile issue as
"our brick-process management is a total nightmare", see
f1071f17e0/xlators/mgmt/glusterd/src/glusterd-utils.c (L5907-L5924)
I have observed multiple cases where glusterd doesn't start correctly
and systemd doesn't notice because of the erroneous pidfile handling.
To improve the situation, we don't let glusterd daemonize itself any more
and instead use `--no-daemon` and the `Simple` service type.
Removes the old UI build tooling; it is no longer necessary
because as of 1.2.0 it's bundled into the server binary.
It doesn't even need to have JS built, because it's bundled into
the release commit's source tree (see #48714).
The UI is enabled by default, so the NixOS service is
updated to directly use `ui = webUi;` now.
Fixes#48714.
Fixes#44192.
Fixes#41243.
Fixes#35602.
Signed-off-by: Niklas Hambüchen <mail@nh2.me>
Setting this variable in the environment of systemd-timedated allows
'timedatectl' to tell if an NTP service is running.
Closes#48917.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
By default rspamd will look for multiple files in /etc/rspamd/local.d
and /etc/rspamd/override.d to be included in subsections of the merged
final config for rspamd. Most of the config snippets in the official
rspamd documentation are made to these files and so it makes sense for
NixOS to support them and this is what this commit does.
As part of rspamd 1.8.1 support was added for having custom Lua
rules stored in $LOCAL_CONFDIR/rspamd.local.lua which means that it is
now possible for NixOS to support such rules and so this commit also
adds support for this to the rspamd module.
hass will ignore the standard SIGTERM sent by systemd during stop/restart and we
then have to wait for the timeout after which systemd will forcefully kill the
process.
If instead if we send SIGINT, hass will shut down nicely.
There are many issues reported upstream about the inability to shut down/restart
and it is *supposed* to work with SIGTERM but doesn't.
* run as user 'slurm' per default instead of root
* add user/group slurm to ids.nix
* fix default location for the state dir of slurmctld:
(/var/spool -> /var/spool/slurmctld)
* Update release notes with the above changes
With this option enabled, before creating file/directories/symlinks in baseDir
according to configuration, old occurences of them are removed.
This prevents remainders of an old configuration (libraries, webapps, you name
it) from persisting after activating a new configuration.
The changes were found by executing the following in the strongswan
repo (https://github.com/strongswan/strongswan):
git diff 5.6.3..5.7.1 src/swanctl/swanctl.opt
This was overlooked on a rebase of mine on master, when I didn't realize
that in the time of me writing the znc changes this new option got
introduced.
On AMD hardware with Mesa 18, compton renders some colours incorrectly
when using the glx backend. This patch sets an environmental variable
for compton so colours are rendered correctly.
Topical bug: <https://bugs.freedesktop.org/show_bug.cgi?id=104597>
This breaks with networking backends enabled and
also creates large delays on boot when some services depends
on the network target. It is also not really required
because tinc does create those interfaces itself.
fixes#27070
Tor requires ``SOCKSPort 0`` when non-anonymous hidden services are
enabled. If the configuration doesn't enable Tor client features,
generate a configuration file that explicitly includes this disabling
to allow such non-anonymous hidden services to be created (note that
doing so still requires additional configuration). See #48622.
* nat/bind/dhcp.service:
Remove. Those services have nothing to do with a link-level service.
* sys-subsystem-net-devices-${if}.device:
Add as BindsTo dependency as this will make hostapd stop when the
device is unplugged.
* network-link-${if}.service:
Add hostapd as dependency for this service via requiredBy clause,
so that the network link is only considered to be established
only after hostapd has started.
* network.target:
Remove this from wantedBy clause as this is already implied from
dependencies stacked above hostapd. And if it's not implied than
starting hostapd is not required for this particular network
configuration.
This option represents the ZNC configuration as a Nix value. It will be
converted to a syntactically valid file. This provides:
- Flexibility: Any ZNC option can be used
- Modularity: These values can be set from any NixOS module and will be
merged correctly
- Overridability: Default values can be overridden
Also done:
Remove unused/unneeded options, mkRemovedOptionModule unfortunately doesn't work
inside submodules (yet). The options userName and modulePackages were never used
to begin with