Commit Graph

211 Commits

Author SHA1 Message Date
Andrea Righi
d67dfe50f9 scx_rustland: treat the CPU running the user-space scheduler as idle
Considering the CPU where the user-space scheduler is running as busy
doesn't really provide any benefit, since the user-space scheduler is
constantly dispatching an amount of tasks equal to the amount of idle
CPUs and then yields (therefore its own CPU should be considered idle).

Considering the CPU where the user-space scheduler is running as busy
doesn't provide any benefit, as the scheduler consistently dispatches
tasks equal to the number of idle CPUs and then yields (therefore its
own CPU should be considered idle).

This also allows to reduce the overall user-space scheduler CPU
utilization, especially when the system is mostly idle, without
introducing any measurable performance regression.

Measuring the average CPU utilization of a (mostly) idle system over a
time period of 60 sec:

 - wihout this patch: 5.41% avg cpu util
 - with this patch:   2.26% avg cpu util

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-29 21:14:58 +01:00
Andrea Righi
6df4d7e0c6 scx_rustland: introduce an update_idle() callback
Move the logic to activate the userspace scheduler to an update_idle()
callback, which is called when the CPU is about to go idle.

This disables the built-in idle tracking mechanism, so it allows to rely
completely on the internal CPU ownership logic (via get_cpu_owner() and
set_cpu_owner()) and it also allows to share the idle state with the
user-space scheduler via the BPF_MAP_TYPE_ARRAY cpu_map.

Moreover, when the user-space scheduler is activated, kick the idle cpu
to trigger immediate dispatch and avoid bubbles in the scheduling
pipeline.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-28 14:41:08 +01:00
Andrea Righi
1baae38e7f Revert "scx_rustland: always dispatch kthreads on the local CPU"
This reverts commit 9237e1d ("scx_rustland: always dispatch kthreads on
the local CPU").

Do not always prioritize all kthreads, we may have unbound workqueue
workers that can consume a lot of CPU cycles (e.g., encryption workers),
so we definitely want to apply the scheduling for those.

Therefore, restore the old behavior to prioritize only per-CPU kthreads.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-28 14:40:03 +01:00
Andrea Righi
9237e1d835 scx_rustland: always dispatch kthreads on the local CPU
Adding extra overhead to any kthread can potentially slow down the
entire system, so make sure this never happens by dispatching all
kthreads directly on the same local CPU (not just the per-CPU kthreads),
bypassing the user-space scheduler.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-27 14:15:46 +01:00
Andrea Righi
f0ece7af6b scx_rustland: wake-up user-space scheduler when a CPU is released
Trigger the user-space scheduler only upon a task's CPU release event
(avoiding its activation during each enqueue event) and only if there
are tasks waiting to be processed by the user-space scheduler.

This should save unnecessary calls to the user-space scheduler, reducing
the overall overhead of the scheduler.

Moreover, rename nr_enqueues to nr_queued and store the amount of tasks
currently queued to the user-space scheduler (that are waiting to be
dispatched).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-27 14:15:46 +01:00
Andrea Righi
7d01be9568 scx_rustland: provide get/set_cpu_owner()
Provide the following primitives to get and set CPU ownership in the BPF
part. This improves code readability and these primitives can be used by
the BPF part as a baseline to implement a better CPU idle tracking in
the future.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-27 14:15:39 +01:00
Andrea Righi
cd7e1c6248 scx_rustland: clarify BPF / user-space interlocking
BPF doesn't have full memory model yet, and while strict atomicity might
not be necessary in this context, it is advisable to enhance clarity in
the interlocking model.

To achieve this, provide the following primitives to operate on
usersched_needed:

  static void set_usersched_needed(void)

  static bool test_and_clear_usersched_needed(void)

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-26 14:28:24 +01:00
Andrea Righi
e038a530ae scx_rustland: dispatch tasks in batch
Dispatch tasks in a batch equal to the amount of idle CPUs in the
system.

This allows to reduce the pressure on the dispatcher queues, improving
the effectiveness of the scheduler (by having more tasks sitting in the
scheduler task pool) and mitigating potential priority inversion issues.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-23 10:44:03 +01:00
Andrea Righi
4d98862674 scx_rustland: expose CPU information to the user-space scheduler
Provide an interface for the BPF dispatcher and user-space scheduler to
share CPU information. This information can empower the user-space
scheduler to make more informed decisions and enable the implementation
of a broader range of scheduling policies.

With this change the BPF dispatcher provides a CPU map (one entry per
CPU) that stores the pid that is running on each CPU (0 if the CPU is
idle). The CPU map is updated by the BPF dispatcher in the .running()
and .stopping() callbacks.

The dispatcher then sends to the user-space scheduler a suggestion of
the candidate CPU for each task that needs to run (that is always the
previously used CPU), along with all the task's information.

The user-space scheduler can decide to confirm the selected CPU or to
choose a different one, using all the shared CPU information.

Lastly, the selected CPU is communicated back to the dispatcher along
with all the task's information and the BPF dispatcher takes care of
executing the task on the selected CPU, eventually triggering a
migration.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-23 10:38:56 +01:00
Andrea Righi
968ac80a3f scx_rustland: handle graceful vs non-graceful exit
Do not report an exit error message if it's empty. Moreover, distinguish
between a graceful exit vs a non-graceful exit.

In general, try to follow the behavior of user_exit_info.h for the C
schedulers.

NOTE: in the future the whole exit handling probably can be moved to a
more generic place (scx_utils) to prevent code duplication across
schedulers and also to prevent small inconsistencies like this one.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-22 19:44:14 +01:00
Andrea Righi
f7f0e3236c scx_rustland: rename from scx_rustlite
Rename scx_rustlite to scx_rustland to better represent the mirroring of
scx_userland (in C), but implemented in Rust.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2023-12-22 00:20:14 +01:00