We should explicitly use u64 for hweight_gen to prevent the following
build failures on 32-bit architectures:
scheds/kernel-examples/scx_flatcg.p/scx_flatcg.bpf.skel.h: In function ‘scx_flatcg__assert’:
scheds/kernel-examples/scx_flatcg.p/scx_flatcg.bpf.skel.h:3523:9: error: static assertion failed: "unexpected size of \'hweight_gen\'"
3523 | _Static_assert(sizeof(s->data->hweight_gen) == 8, "unexpected size of 'hweight_gen'");
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
When printing scheduler statistics we use %lu to print u64 values, that
works well on 64-bit architectures, but on 32-bit architectures we get
errors like the following:
106 | printf("total :%10lu local:%10lu queued:%10lu lost:%10lu\n",
| ~~~~^
| |
| long unsigned int
| %10llu
107 | skel->bss->nr_total,
| ~~~~~~~~~~~~~~~~~~~
| |
| u64 {aka long long unsigned int}
Fix this by using the proper format %llu.
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Use compiler's built-in stack initialization instead of memset().
In this way we can get rid of the string.h include and make
cross-compilation easier in certain small environments (i.e., arm).
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
In scx_nest, we currently count the number of times that a core is
scheduled for compaction before we eventually just eagerly compact the
core. The idea is that the core could thrash between being scheduled and
then "de-scheduled" for compaction if there are a couple of tasks that
are bouncing between cores in the primary nest often enough to kick them
out of being compacted.
We're currently resetting schedulings when a core is eagerly compacted,
but to be precise we should probably also reset the count when a core
consumes a task from the fallback DSQ, at this indicates that the system
is overcommitted and that we likely won't benefit from compacting the
primary nest.
Signed-off-by: David Vernet <void@manifault.com>
The scx_nest scheduler seems to be behaving well. Let's merge it to the
scx repo so that CachyOS can package and use it more easily.
Signed-off-by: David Vernet <void@manifault.com>
We were assigning curr to prev stats, and vice versa, in calc_util().
This was causing the following crash on debug builds:
[void@maniforge scheds]$ sudo RUST_BACKTRACE=1 scx_rusty
00:00:56 [INFO] CPUs: online/possible = 32/32
00:00:56 [INFO] DOM[00] cpumask 0000000000FF00FF (16 cpus)
00:00:56 [INFO] DOM[01] cpumask 00000000FF00FF00 (16 cpus)
00:00:56 [INFO] Rusty Scheduler Attached
thread 'main' panicked at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/ops/arith.rs:217:1:
attempt to subtract with overflow
stack backtrace:
0: rust_begin_unwind
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/std/src/panicking.rs:597:5
1: core::panicking::panic_fmt
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/panicking.rs:72:14
2: core::panicking::panic
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/panicking.rs:127:5
3: <u64 as core::ops::arith::Sub>::sub
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/ops/arith.rs:217:1
4: <&u64 as core::ops::arith::Sub<&u64>>::sub
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/internal_macros.rs:55:17
5: scx_rusty::calc_util
at ./rust-user/scx_rusty/src/main.rs:216:29
6: scx_rusty::Tuner::step
at ./rust-user/scx_rusty/src/main.rs:444:38
7: scx_rusty::Scheduler::run
at ./rust-user/scx_rusty/src/main.rs:1198:17
8: scx_rusty::main
at ./rust-user/scx_rusty/src/main.rs:1261:5
9: core::ops::function::FnOnce::call_once
at /rustc/475c71da0710fd1d40c046f9cee04b733b5b2b51/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Flip them to avoid the crash. Rusty now runs fine.
Signed-off-by: David Vernet <void@manifault.com>
There's a fairly comprehensive README in the kernel's tools/sched_ext
directory which describes each of the example schedulers. Let's pull it
into this repository, and split it into the various subdirectories
containing the kernele-examples/ schedulers, and the rust-user/
schedulers.
Signed-off-by: David Vernet <void@manifault.com>
SCX_DSQ_GLOBAL now does not support vtime dispatching. scx_simple uses
it to do vtime scheduling, so let's update it to create and use a
separate DSQ that it can both FIFO and PRIQ dispatch to.
Signed-off-by: David Vernet <void@manifault.com>
tp_cgroup_attach_task() walks p->thread_group to visit all member threads
and set tctx->refresh_layer. However, the upstream kernel has removed
p->thread_group recently in 8e1f385104ac ("kill task_struct->thread_group")
as it was mostly a duplicate of p->signal->thread_head list which goes
through p->thread_node.
Switch to iterate via p->thread_node instead, add a comment explaining why
it's using the cgroup TP instead of scx_ops.cgroup_move(), and make
iteration failure non-fatal as the iteration is racy.
As in scx_layered, bpf_map_delete_elem() can fail due to recursion
protection triggering spuriously which can then lead to task_ctx creation
failure after PIDs wrap. Work around by dropping BPF_NOEXIST.
The scx repo is going to serve as the source of truth for sched_ext
schedulers. Reverse the sync direction and include syncing rust-user
schedulers too.