50 Commits

Author SHA1 Message Date
Andrea Righi
0cbb8632d0 ci: enable verbose mode when testing schedulers
Always run schedulers in verbose mode to be able to catch the details of
potential BPF failures.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
2024-10-23 09:29:54 +02:00
Pat Somaru
5e4a7ac655
setup matrix job to run key paths of layered through verifier/stress test 2024-10-09 21:09:41 -04:00
Daniel Hodges
feab01dd44 scx_layered: Update CI to show stats
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-10-09 05:18:04 -07:00
Pat Somaru
59f35fcbec
update stress test settings to constants used in test_scheds 2024-10-08 22:08:13 -04:00
Daniel Hodges
e0ddff1403 scx_layered: Add verbose output on stress tests
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-10-08 08:20:59 -07:00
Daniel Hodges
b803d59e1e scx_layered: Add verbose output on CI logs
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-10-08 08:10:49 -07:00
likewhatevs
bf68679d35
Setup "debugging" and misc cleanup (#695)
* Fix a couple of misc errors in build scripts.
* Tweak scripts/kconfigs to make bpftrace work.
* Update how CI caching works to make builds faster (6 minute turnaround
  time)
* Update CI config to generate per-scheduler debug archives w/ guest
  dmesg/scheduler stdout, guest stdout, bpftrace script output,
  veristat output.

* Update build scripts to accept the following:
** VNG RW -- write to host filesystem (better caching, logging).
* For stress tests in particular (via ini config):
** QEMU Opts -- to facilitate reproducing bugs (i.e. high core count).
** bpftrace scripts -- specify bpftrace scripts to run during stress
tests.
2024-09-26 11:11:10 -04:00
Tejun Heo
540576ac30 scheds/c: Re-enable scx_flatcg and scx_pair
cgroup support is available again, re-enable scx_flatcg and scx_pair.
2024-09-25 12:38:45 -10:00
likewhatevs
2282a0af37
enable bpftrace when using stress tests (#688)
* enable bpftrace when using stress tests

update meson/stress test runner to enable
running bpftrace scripts while running
stress tests.

* disable layered stats output on ci
2024-09-25 17:20:36 -04:00
Tejun Heo
56bb963136 build: Use a single top-level rust workspace
Rust build was using two separate workspaces - rust/ and scheds/rust.
There's no reason to separate them and it makes doc generation tricky. Use
single top level workspace so that we can drive all rust building from
cargo.
2024-09-08 14:23:48 -10:00
patso
120211d731
split build and test jobs
split build and test jobs to reduce ci turnaround time
and make it clear what is failing when something fails.

also add virtiofsd to deps to make test compilation faster
(most test time is compilation) and remove all force 9ps.
2024-09-08 02:54:24 -04:00
Tejun Heo
4513dfbe4b
Merge pull request #565 from CachyOS/feature/scx-loader
scx_loader: Add scheduler loader via system DBUS interface
2024-09-04 06:34:59 -10:00
Andrea Righi
ac0cfa32de ci: bump up virtme-ng memory size from 1GB to 2GB
Recently, we have triggered some OOM conditions during stress tests,
particularly with the user-space schedulers. To avoid this issue and
prevent false positives, increase the memory size of the virtme-ng
instance from the default 1GB to 2GB.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 22:46:38 +02:00
Vladislav Nepogodin
4d770e1f84
scx_loader: Add scheduler loader via system DBUS interface 2024-08-30 00:56:27 +04:00
Tejun Heo
43950c65bd build: Use workspace to group rust sub-projects
meson build script was building each rust sub-project under rust/ and
scheds/rust/ separately. This means that each rust project is built
independently which leads to a couple problems - 1. There are a lot of
shared dependencies but they have to be built over and over again for each
project. 2. Concurrency management becomes sad - we either have to unleash
multiple cargo builds at the same time possibly thrashing the system or
build one by one.

We've been trying to solve this from meson side in vain. Thankfully, in
issue #546, @vimproved suggested using cargo workspace which makes the
sub-projects share the same target directory and be built together by the same
cargo instance while still allowing each project to behave independently for
development and publishing purposes.

Make the following changes:

- Create two cargo workspaces - one under rust/, the other under
  scheds/rust/. Each contains all rust projects underneath it.

- Don't let meson descend into rust/. These are libraries used by the rust
  schedulers. No need to build them from meson. Cargo will build them as
  needed.

- Change the rust_scheds build target to invoke `cargo build` in
  scheds/rust/ and let cargo do its thing.

- Remove per-scheduler meson.build files and instead generate custom_targets
  in scheds/rust/meson.build which invokes `cargo build -p $SCHED`.

- This changes rust binary directory. Update README and
  meson-scripts/install_rust_user_scheds accordingly.

- Remove per-scheduler Cargo.lock as scheds/rust/Cargo.lock is shared by all
  schedulers now.

- Unify .gitignore handling.

The following are build times on Ryzen 3975W:

Before:
  ________________________________________________________
  Executed in  165.93 secs    fish           external
     usr time   40.55 mins    2.71 millis   40.55 mins
     sys time    3.34 mins   36.40 millis    3.34 mins

After:
  ________________________________________________________
  Executed in   36.04 secs    fish           external
     usr time  336.42 secs    0.00 millis  336.42 secs
     sys time   36.65 secs   43.95 millis   36.61 secs

Wallclock time is reduced 5x and CPU time 7x.
2024-08-25 00:47:58 -10:00
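
The speedup figures quoted in commit 43950c65bd can be re-derived from the two timing tables. The short Python sketch below only redoes that arithmetic; the variable names are made up for illustration. It gives roughly 4.6x for wall-clock time and 7.1x for combined usr + sys CPU time, which the commit message rounds to 5x and 7x.

```python
# Timings copied from the commit message (fish `time` output).
before_wall_s = 165.93
before_cpu_s = (40.55 + 3.34) * 60   # usr + sys, reported in minutes
after_wall_s = 36.04
after_cpu_s = 336.42 + 36.65         # usr + sys, reported in seconds

wall_speedup = before_wall_s / after_wall_s
cpu_speedup = before_cpu_s / after_cpu_s

print(f"wall-clock speedup: {wall_speedup:.1f}x")
print(f"CPU-time speedup:   {cpu_speedup:.1f}x")
```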
Peter Jung
0faa0efe74
get_clang_ver: Fix regex for LLVM RC Versions
Signed-off-by: Peter Jung <admin@ptr1337.dev>
2024-08-25 00:49:00 +02:00
Tejun Heo
c0fcc9bdeb meson-scripts/test_sched: Enable scx_layered testing
scx_layered now can be run with a single command when `--run-example` is
specified. Update test_sched script to support per-sched arguments and
enable it for scx_layered.
2024-08-19 20:50:10 -10:00
Daniel Hodges
40bb003555 ci: fix merge veristat cache generation
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 18:30:32 -07:00
Daniel Hodges
1ff5e4fbed ci: fix veristat for PRs
Make sure veristat is available for CI for PRs.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 16:42:31 -07:00
Daniel Hodges
7c27f8067d ci: Fix veristat pull request workflow
See the [failure](https://github.com/sched-ext/scx/actions/runs/10389671253), which needs to have an action defined.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 12:08:59 -07:00
Daniel Hodges
f68bc82582 meson: Add github action to run veristat
Add a github action to run
[`veristat`](https://github.com/libbpf/veristat) during PRs and merges.
This uses a [action cache](https://github.com/actions/cache) to store
the veristat values when a PR is merged.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-14 07:36:16 -07:00
Daniel Hodges
d9d0f14a41 meson: Add veristat script
Add a meson script to run veristat. This can later be used to generate
reports for BPF program complexity at PR time.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-09 13:26:45 -07:00
Tejun Heo
759be0e406 ci: Use latest upstream kernel and exclude scx_flatcg and scx_pair from testing 2024-07-14 13:27:51 -10:00
Peter Jung
a7fa651bfc
install_user_scheds: Skip packaging of scx_mitosis
Signed-off-by: Peter Jung <admin@ptr1337.dev>
2024-07-14 20:21:14 +02:00
Andrea Righi
f98c35fd07
Merge pull request #388 from sched-ext/bpfland
scheds: introduce scx_bpfland
2024-06-28 21:27:43 +02:00
Andrea Righi
273728fd2b meson: check if commit exists in remote git repos
When fetching external git repositories (libbpf and bpftool) we don't
check if the target commit exists.

This can lead to issues such as #400, because we may silently use HEAD,
instead of the specified commit.

Prevent this by returning an error when the target SHA1 cannot be found.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-28 15:16:56 +02:00
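
The check described in commit 273728fd2b can be illustrated as follows. This is a minimal Python sketch, not the repository's fetch script: the function names and the injectable `run` parameter are hypothetical, and it assumes the repository has already been cloned locally. It relies on `git cat-file -e`, which exits non-zero when the named object does not exist.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _git_exit_code(argv: Sequence[str]) -> int:
    """Run a command quietly and return its exit status."""
    return subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
    ).returncode


def commit_exists(repo_dir: str, sha: str, run: Runner = _git_exit_code) -> bool:
    # `<sha>^{commit}` makes git verify that the object is (or peels to) a commit.
    return run(["git", "-C", repo_dir, "cat-file", "-e", f"{sha}^{{commit}}"]) == 0


def checkout_or_fail(repo_dir: str, sha: str, run: Runner = _git_exit_code) -> None:
    """Refuse to continue on HEAD when the pinned commit is missing."""
    if not commit_exists(repo_dir, sha, run):
        raise RuntimeError(f"commit {sha} not found in {repo_dir}")
    if run(["git", "-C", repo_dir, "checkout", "--quiet", sha]) != 0:
        raise RuntimeError(f"could not check out {sha} in {repo_dir}")
```

Raising instead of falling through is the point of the sketch: a missing SHA becomes a visible setup error rather than a silent build against HEAD.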
Andrea Righi
188b3d3bfc ci: enable stress tests for scx_bpfland
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-27 17:28:42 +02:00
David Vernet
3219d15e3d
Merge pull request #292 from hodgesds/stress-ng-ci
Add stress-ng to scheduler tests
2024-06-21 11:35:56 -05:00
Violet Purcell
2341b67971
Support LLVM_VERSION_SUFFIX in clang version parsing regex
If LLVM is compiled with the LLVM_VERSION_SUFFIX cmake option, then the
version may have an additional suffix, for example "18.1.7+libcxx".
Gentoo for example uses this to fend off ABI issues between libstdc++
and libc++.

Signed-off-by: Violet Purcell <vimproved@inventati.org>
2024-06-12 11:58:27 -04:00
Daniel Hodges
8dd8f3f5a6 Add stress-ng to scheduler tests
This change adds stress-ng as a load test for schedulers when running in
CI. It will run stress-ng while schedulers are being tested with a
reasonable amount of work. At the end of the run the stress-ng metrics
are collected for later analysis. However, since these results may be
running in a VM they may not be super robust.
2024-06-10 19:48:42 -07:00
Andrea Righi
0d26219fad ci: enable kvm support in the github workflow
Enable kvm acceleration and qemu microvm to speed up CI tests inside
virtme-ng.

Also adjust the regex to catch potential errors excluding a false
positive triggered by the new configuration.

Link: https://github.blog/changelog/2023-02-23-hardware-accelerated-android-virtualization-on-actions-windows-and-linux-larger-hosted-runners/
Link: https://github.blog/2024-01-17-github-hosted-runners-double-the-power-for-open-source/
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-05-03 23:20:10 +02:00
Daniel Hodges
0a587d63dd
Fix issue when CDPATH contains libbpf directory
When CDPATH is set the fetch_libbpf build script will cd into
the preferred CDPATH directory. This change removes the CDPATH
environment variable so any preferred CDPATH paths are ignored.
The issue can be reproduced with the following steps:

1) mkdir -p /tmp/libbpf
2) CDPATH=/tmp/ meson setup build --prefix /tmp

The build should fail at the fetch_libbpf step.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-05-01 08:43:58 -04:00
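
The fix in commit 0a587d63dd amounts to running the fetch step with `CDPATH` removed from the environment, so that a relative `cd` resolves against the current directory only. The real script is a shell script; the Python sketch below only illustrates the idea, and the helper name is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_cdpath(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with CDPATH removed."""
    cleaned = dict(os.environ if env is None else env)
    # With CDPATH set, a relative `cd libbpf` may land in <CDPATH entry>/libbpf.
    cleaned.pop("CDPATH", None)
    return cleaned
```

A caller would pass the result as `env=` to `subprocess.run` when launching the build step, leaving the parent process environment untouched.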
Tejun Heo
4a77c8f8fb meson.build: Update libbpf and bpftool version requirements
The recent compat additions require new libbpf and bpftool. Update the
requirements.

- libbpf >= 1.4
- bpftool >= 7.4
2024-04-04 13:16:08 -10:00
Tejun Heo
d62a15193e Doc & build: Warn about sccache
Build fails with sccache.

- Update meson-scripts/build_bpftool to support sccache. Unfortunately, this
  isn't enough.

- Update README to warn about sccache and add the instruction to disable it
  for building scx.

- Also add ⚠️ to make boot loader update step more prominent in arch
  installation instruction.
2024-04-03 10:18:45 -10:00
Breno Leitao
1745daea0f meson: support clang built from git
get_clang_ver fails if clang is built from scratch.

Teach get_clang_ver to recognize the clang version even for clang built
from git.

These are the tests I ran:

	# /usr/local/bin/clang --version
	clang version 18.0.0git (https://github.com/llvm/llvm-project.git c458f928fad7bbcf08ab1da9949eb2969fc9f89c)
	# meson-scripts/get_clang_ver /usr/local/bin/clang
	18.0.0

	# /usr/bin/clang --version
	clang version 17.0.6 (CentOS 17.0.6-5.el9)
	# meson-scripts/get_clang_ver /usr/bin/clang
	17.0.6

Signed-off-by: Breno Leitao <leitao@debian.org>
2024-03-14 03:39:09 -07:00
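
Several commits in this log adjust how the clang version string is parsed (distro prefixes, git builds, version suffixes). The Python sketch below shows one pattern that handles the sample strings quoted in those commit messages. It is an assumption made for illustration, not the regex used by meson-scripts/get_clang_ver, and the function name is hypothetical.

```python
import re

# Hypothetical pattern: the first dotted triple after the word "version".
# Anything before it ("Ubuntu ") or after it ("git", "+libcxx") is ignored.
_VERSION_RE = re.compile(r"version\s+(\d+)\.(\d+)\.(\d+)")


def clang_version(version_output: str) -> str:
    """Extract MAJOR.MINOR.PATCH from the first line of `clang --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = _VERSION_RE.search(first_line)
    if match is None:
        raise ValueError(f"cannot parse clang version from {first_line!r}")
    return ".".join(match.groups())
```

Failing loudly on an unparseable string avoids the empty-string case that produced the meson error `String '' cannot be converted to int`.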
Jordan Rome
ffc7b7dc4a Fetch and build bpftool by default
This pairs with the new default behavior to fetch and build libbpf
and is mostly being used so we can use the latest bpftool and libbpf.
2024-03-11 10:00:01 -07:00
Jordan Rome
1769dece7d Remove libbpf as a submodule
Instead clone the libbpf repo at a specific hash during setup.
This is to fix an issue whereby submodules are not included
in the tarball and therefore won't be updated/fetched during
setup after unzipping the tarball.
2024-03-07 18:31:09 -08:00
Jordan Rome
96fe285588 Libbpf - add BUILD_STATIC_ONLY flag 2024-03-05 15:11:51 -08:00
Tejun Heo
069c390ef2 meson-scripts/build_libbpf: Accommodate meson setting CC to "ccache $COMPILER"
Otherwise, we end up passing CC=ccache to libbpf's Makefile which triggers
an error as ccache invoked on its own can't act as a stand-in for the
compiler.
2024-03-04 10:04:25 -10:00
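
The failure mode behind commit 069c390ef2 is word splitting: when `CC` holds two words such as `ccache clang`, an unquoted expansion hands make only `CC=ccache`. The Python sketch below reproduces the effect with argument lists. Both function names are hypothetical, and whether the real script fixes it this way is an assumption.

```python
from typing import List


def make_argv_naive(cc: str) -> List[str]:
    """Mimic unquoted shell expansion: the compiler name is split off."""
    return f"make CC={cc}".split()


def make_argv(cc: str) -> List[str]:
    """Keep the whole CC value in one argument, wrapper included."""
    return ["make", f"CC={cc}"]


print(make_argv_naive("ccache clang"))
print(make_argv("ccache clang"))
```

In the naive form, make would see `CC=ccache` plus a stray `clang` argument, which matches the error described in the commit message.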
Jordan Rome
499924ead8 Add libbpf as a submodule
This is to potentially reduce issues with folks
using different versions of libbpf at runtime.

This also:
- makes static linking of libbpf the default
- adds steps in `meson setup` to fetch libbpf and make it
2024-03-01 12:39:35 -08:00
Andrea Righi
d1cfe1765d ci: exclude scx_qmap and scx_userland from testing
These two schedulers are provided mostly as examples / PoC, so we should
exclude them from our periodic testing, to prevent triggering false
positives in our CI.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-02-24 10:24:06 +01:00
Andrea Righi
67a53ba621 ci: detect errors only from stderr
Search for potential errors only in the kernel logs and the scheduler
stderr.

In this way we can use "error keywords" in the scheduler's output
without triggering false positives in the CI (see for example #127).

NOTE: this works, because virtme-ng, when executed in verbose mode,
sends the kernel messages to stderr (together with the command's stderr)
and it channels the command's stdout to the stdout of the host.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-02-05 17:59:11 +01:00
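
The approach in commit 67a53ba621 can be sketched as a filter that inspects only the stderr stream (kernel log plus scheduler stderr) and never the scheduler's stdout. The keyword list and function names below are hypothetical; the actual CI pattern is not shown in this log.

```python
import re
from typing import List

# Hypothetical keyword list, for illustration only.
_ERROR_RE = re.compile(r"\b(error|BUG|Oops|panic)\b", re.IGNORECASE)


def error_lines(stderr_text: str) -> List[str]:
    """Return the stderr lines that contain an error keyword."""
    return [line for line in stderr_text.splitlines() if _ERROR_RE.search(line)]


def ci_failed(stdout_text: str, stderr_text: str) -> bool:
    # stdout is deliberately ignored, so a scheduler may print words like
    # "error" in its statistics without failing the run.
    return bool(error_lines(stderr_text))
```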
Andrea Righi
05f5c69747 ci: use virtme-ng to test the schedulers
Use virtme-ng to run the schedulers after they're built; virtme-ng
allows to pick an arbitrary sched-ext enabled kernel and run it
virtualizing the entire user-space root filesystem, so we can basically
execute the recompiled schedulers inside such kernel.

This should allow catching potential run-time issues in advance (both in
the kernel and the schedulers).

The sched-ext kernel is taken from the Ubuntu ppa (ppa:arighi/sched-ext)
at the moment, since it is the easiest / fastest way to get a
precompiled sched-ext kernel to run inside the Ubuntu 22.04 testing
environment.

The schedulers are tested using the new meson target "test_sched", the
specific actions are defined in meson-scripts/test_sched.

By default each test has a timeout of 30 sec, after the virtme-ng
completes the boot (that should be enough to initialize the scheduler
and run the scheduler for some seconds), while the total lifetime of the
virtme-ng guest is set to 60 sec, after this time the guest will be
killed (this allows to catch potential kernel crashes / hangs).

If a single scheduler fails the test, the entire "test_sched" action
will be interrupted and the overall test result will be considered a
failure.

At the moment scx_layered is excluded from the tests, because it
requires a special configuration (we should probably pre-generate a
default config in the workflow actions and change the scheduler to use
the default config if it's executed without any argument).

Moreover, scx_flatcg is also temporarily excluded from the tests,
because of these known issues:
 - https://github.com/sched-ext/scx/issues/49
 - https://github.com/sched-ext/sched_ext/pull/101

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-29 15:54:10 +01:00
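
Two behaviours from commit 05f5c69747 lend themselves to a short sketch: a hard lifetime limit that kills a hung guest, and aborting the whole run at the first failing scheduler. The Python below is illustrative only. The real logic lives in meson-scripts/test_sched, and the function names and the injected `run_one` callback are hypothetical.

```python
import subprocess
import sys
from typing import Callable, Iterable, List, Sequence


def run_with_deadline(argv: Sequence[str], lifetime_sec: float) -> str:
    """Run a command; classify it as 'ok', 'failed', or 'timeout'."""
    try:
        proc = subprocess.run(argv, timeout=lifetime_sec, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires.
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"


def run_all(scheds: Iterable[str], run_one: Callable[[str], str]) -> List[str]:
    """Test schedulers in order and abort at the first one that is not 'ok'."""
    tested = []
    for name in scheds:
        tested.append(name)
        result = run_one(name)
        if result != "ok":
            raise RuntimeError(f"{name}: {result}; aborting remaining tests")
    return tested


if __name__ == "__main__":
    # Stand-in for a hung guest: sleeps longer than the allowed lifetime.
    hung = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_deadline(hung, lifetime_sec=0.5))
```

In the setup the commit describes, `lifetime_sec` would correspond to the 60 sec guest lifetime, with the 30 sec per-test timeout applied inside the guest.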
Jordan Rome
e9a9d32ab6 Restructure scheds folder names
- combine c and kernel-examples as it's confusing to have both
- rename 'rust-user' and 'c-user' to just 'rust' and 'c', which is simpler
- update and fix sync-to-kernel.sh
2023-12-17 13:14:31 -08:00
Tejun Heo
9e12238d64 Support offline compilation 2023-12-08 08:45:44 -10:00
Andrea Righi
00cd15a3ae build: properly detect clang version in Ubuntu
Some distros may add their own prefix to the version string of clang, for
example in Ubuntu:

 $ clang --version
 Ubuntu clang version 17.0.5 (1ubuntu1)
 ...

That triggers the following meson error during the setup phase:

 meson.build:25:44: ERROR: String '' cannot be converted to int

Change the regexp used to evaluate the clang version to avoid this
build failure.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-07 19:24:12 +01:00
Tejun Heo
44b811831a doc: README.md and OVERVIEW.md added and other minor updates 2023-12-03 11:48:55 -10:00
Tejun Heo
6b9c392bf0 build: "meson install" works now 2023-12-01 13:37:28 -10:00
Tejun Heo
6ec509b3b6 build, scx_utils: Misc improvements
- build: Check clang version like scx_utils does.

- scx_utils: Generate rerun-if-env-changed directives.
2023-12-01 10:20:06 -10:00
Tejun Heo
68b6d37800 scx: Initial repo setup and import of example schedulers from kernel tree 2023-11-27 14:47:04 -10:00