Commit Graph

350 Commits

Author SHA1 Message Date
Andrea Righi
191cc7fe8b scx_loader: tune scx_bpfland default options
With the recent rework of scx_bpfland the default options for the
different profiles in scx_loader are not valid anymore.

Update them with some appropriate options.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
2024-11-07 17:54:13 +01:00
Emil Tsalapatis
7d44511422 fix missing/extraneous newline 2024-11-06 12:52:10 -08:00
Emil Tsalapatis
42880404e1 Merge branch 'main' of https://github.com/sched-ext/scx into core_enums 2024-11-06 12:44:23 -08:00
Emil Tsalapatis
de5f2f9c8d regenerate autogen Rust file 2024-11-06 12:21:16 -08:00
Emil Tsalapatis
1cabed9d09 Autogenerate enums and BPF enum setters for Rust schedulers 2024-11-06 12:17:16 -08:00
Emil Tsalapatis
d500c50098 add autogenerated enum definitions for Rust schedulers 2024-11-06 12:17:16 -08:00
Tejun Heo
ad45727139 version: v1.0.6 2024-11-06 06:54:26 -10:00
Emil Tsalapatis
23f302cf13 add SCX_SLICE_* macros to scx_utils and use them for the Rust schedulers 2024-11-06 07:52:04 -08:00
Tejun Heo
386ae20ee7
Merge pull request #843 from CachyOS/feature/scx-loader-config
scx_loader: introduce configuration
2024-10-23 23:00:24 +00:00
Vladislav Nepogodin
f2980d69af
scx_loader: introduce configuration 2024-10-24 01:36:11 +04:00
Tejun Heo
6ea15f9f9f
Merge pull request #819 from minosfuture/vmlinux_per_arch
Use per-arch vmlinux.h v2
2024-10-21 19:36:52 +00:00
Ming Yang
1b5359ef4a Use per-arch vmlinux.h v2
Rework per-arch vmlinux solution
* have per-arch directory under sched/include/arch/, in which we
  maintain vmlinux.h symlink and real file
  vmlinux-{kernel_ver}-g{sha1}.h. The original sched/include/vmlinux/
  folder is removed.
* update meson build `-I` option to find the new vmlinux.h position
* update cargo build scripts to use the per-arch vmlinux.h for
  generating bindings
* keep the original ClangInfo refactoring changes

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-19 10:50:59 -07:00
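A minimal sketch of how a build script might locate the per-arch header described in 1b5359ef4a. The function names and the mapping from Rust target names to directory names are assumptions for illustration, not taken from the repository; only the `arch/` subdirectory and the `vmlinux.h` symlink name come from the commit message.

```rust
use std::path::PathBuf;

/// Assumed mapping from Rust target arch names to per-arch directory
/// names (hypothetical; unknown names are passed through unchanged).
pub fn arch_dir(target_arch: &str) -> &str {
    match target_arch {
        "x86_64" | "x86" => "x86",
        "aarch64" => "arm64",
        other => other,
    }
}

/// Build `<include_root>/arch/<arch>/vmlinux.h`, the symlink that points
/// at the versioned real file for that architecture.
pub fn vmlinux_path(include_root: &str, target_arch: &str) -> PathBuf {
    [include_root, "arch", arch_dir(target_arch), "vmlinux.h"]
        .iter()
        .collect()
}
```

In a Cargo build script the target architecture is available in the `CARGO_CFG_TARGET_ARCH` environment variable, which could be passed as `target_arch` here.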
Ming Yang
f3f4726c09 scx_layered: Read CPU topology for building CpuPool
Building CpuPool from cache-cpu topology did not work on arm, because the
`/sys/devices/system/cpu/cpu{}/cache/index{}/id` file is unavailable.

Read CPU topology instead.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-17 23:41:08 -07:00
Andrea Righi
a155ff2ada scx_rustland_core: update documentation about the new API
Update the documentation adding the new task statistics provided by
scx_rustland_core.

Fixes: be681c7 ("scx_rustland_core: pass nvcsw, slice and dsq_vtime to user-space")
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-17 19:07:51 +02:00
Andrea Righi
2ea47af4bc
Merge pull request #804 from sched-ext/rustland-fixes
scx_rustland fixes and improvements
2024-10-16 18:26:03 +00:00
Tejun Heo
84d8abf913 Revert "Use per-arch vmlinux.h"
This reverts commit a23f3566e3.
2024-10-16 06:42:28 -10:00
Andrea Righi
97629178e2 scx_rustland_core: bump up version to 2.2.2
Bump up the minor version to reflect the new backward-compatible
functionality added.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-16 14:06:00 +02:00
Andrea Righi
704fe95f51 scx_rustland_core: get rid of the SCX_ENQ_WAKEUP logic
With user-space scheduling we don't usually dispatch a task immediately
after selecting an idle CPU, so there's not much benefit in trying to
optimize the WAKE_SYNC scenario (when a task is waking up another task
and releasing the CPU) when picking an idle CPU.

Therefore, get rid of the WAKE_SYNC logic in select_cpu() and rely on
the user-space logic (that has access to the WAKE_SYNC information) to
handle this particular case.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-16 14:05:58 +02:00
Andrea Righi
67ec1af5cf scx_rustland_core: kick an idle CPU after global dispatch
Do not kick a CPU from rs_select_cpu() (called by the user-space
scheduler), since we may not immediately dispatch the task.

Instead, always try to wake up the task's assigned CPU after dispatching
to a global DSQ, ensuring it can be consumed immediately.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-16 14:05:33 +02:00
Andrea Righi
0a05f1f193 scx_rustland_core: keep CPUs alive with pending tasks
Prevent CPUs from going idle when the user-space scheduler has some
pending activities to complete.

Keeping the CPU alive allows tasks from the user-space scheduler to be
consumed more efficiently, preventing bubbles in the scheduling
pipeline.

To achieve this, trigger a CPU kick from ops.update_idle() and set a
flag in the CPU context to prevent it from going idle. Then keep kicking
the CPU from ops.dispatch() until the flag is cleared, which occurs when
no more tasks are pending or when the CPU exits idle as a task starts
running on it.

This fixes the performance regression introduced by the
put_prev_task_scx() behavior change in Linux 6.12 (see #788).

Link: https://lore.kernel.org/lkml/20241015111539.12136-1-andrea.righi@linux.dev/
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-16 10:43:43 +02:00
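The keep-alive flag described in 0a05f1f193 can be modeled as a small state machine. This is a user-space simulation under stated assumptions, not the BPF code: `CpuCtx`, its methods, and the `kicks` counter are hypothetical names, and a "kick" is only counted rather than performed.

```rust
/// Hypothetical per-CPU context; models only the flag from the commit.
#[derive(Debug, Default)]
pub struct CpuCtx {
    keep_alive: bool,
    /// Number of simulated CPU kicks issued so far.
    pub kicks: u32,
}

impl CpuCtx {
    /// Models ops.update_idle(): when the CPU is about to go idle while
    /// work is pending, set the flag and kick the CPU. When the CPU
    /// exits idle (a task starts running), clear the flag.
    pub fn on_update_idle(&mut self, going_idle: bool, pending: usize) {
        if going_idle && pending > 0 {
            self.keep_alive = true;
            self.kicks += 1;
        } else if !going_idle {
            self.keep_alive = false;
        }
    }

    /// Models ops.dispatch(): keep kicking while the flag is set, and
    /// clear the flag once no more tasks are pending.
    pub fn on_dispatch(&mut self, pending: usize) {
        if !self.keep_alive {
            return;
        }
        if pending == 0 {
            self.keep_alive = false;
        } else {
            self.kicks += 1;
        }
    }

    pub fn is_kept_alive(&self) -> bool {
        self.keep_alive
    }
}
```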
Andrea Righi
abfb4c53f5 scx_rustland_core: restart scheduler on hotplug events
User-space schedulers may still hit some stalls during CPU hotplugging
events.

There is no reason to overcomplicate the code by trying to handle
hotplug events within the scx_rustland_core framework; we can simply
handle a scheduler restart performed by the scx core.

This makes CPU hotplugging more reliable with scx_rustland_core-based
schedulers.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-15 23:11:43 +02:00
Andrea Righi
4432e64d85 scx_rustland_core: allow user-space scheduler to run indefinitely
Assign an infinite time slice to the user-space scheduler itself, so
that it can completely drain all the pending tasks and voluntarily
release the CPU when it's done.

This achieves more consistent performance and also lets us remove the
speculative user-space scheduler wakeup from ops.stopping().

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-15 23:11:43 +02:00
Andrea Righi
be681c731a scx_rustland_core: pass nvcsw, slice and dsq_vtime to user-space
Provide additional task metrics to user-space schedulers via QueuedTask:
 - nvcsw: total amount of voluntary context switches
 - slice: task time slice "budget" (from p->scx.slice)
 - dsq_vtime: current task vtime (from p->scx.dsq_vtime)

In this way user-space schedulers can quickly access these metrics to
implement better scheduling policies.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-15 23:11:43 +02:00
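As an illustration of how the metrics added in be681c731a might feed a policy, here is a sketch that picks the next task by lowest vtime. `TaskMetrics` and `pick_next` are hypothetical; the struct mirrors only the three fields named in the commit plus an assumed `pid`, and is not the real `QueuedTask` definition.

```rust
/// Hypothetical subset of per-task data passed to user space.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TaskMetrics {
    pub pid: i32,       // assumed identifier, for illustration only
    pub nvcsw: u64,     // total voluntary context switches
    pub slice: u64,     // task time slice "budget"
    pub dsq_vtime: u64, // current task vtime
}

/// Illustrative policy: the lowest vtime runs first; among equal vtimes,
/// prefer the task with more voluntary context switches, on the
/// assumption that it behaves more interactively.
pub fn pick_next(tasks: &[TaskMetrics]) -> Option<&TaskMetrics> {
    tasks.iter().min_by(|a, b| {
        a.dsq_vtime
            .cmp(&b.dsq_vtime)
            .then(b.nvcsw.cmp(&a.nvcsw))
    })
}
```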
Andrea Righi
1bbae64dc7 scx_rustland_core: update CPU idle selection logic
Re-align idle selection logic with some of the latest improvements done
in scx_bpfland.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-15 23:11:42 +02:00
Ming Yang
a23f3566e3 Use per-arch vmlinux.h
vmlinux.h is not compatible across archs.

Handle this compatibility issue by:
* adding arch info to the vmlinux.h real file name
* linking vmlinux.h to the target-arch real file at build time
* using the target-arch real file for scx_utils bindgen

Also refactor the clang-related logic into a new clang_info mod, which
is shared by bpf_builder.rs and builder.rs.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-13 07:57:12 -07:00
Andrea Righi
b3c5a23693 scx_rustland_core: use handle_mm_fault kprobe
The symbol __handle_mm_fault isn't available anymore in 6.12, so rely
on handle_mm_fault, which is available on both 6.12 and older kernels.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-10-11 15:39:34 +02:00
Ryan Wilson
fbdb6664ec [rusty] Fix load stats when host is under-utilized 2024-10-08 21:08:07 -07:00
Tejun Heo
4979cb8762
Merge pull request #739 from CachyOS/feature/scx-loader-switch-sched
scx_loader: Add SwitchScheduler methods to DBUS interface
2024-10-07 16:40:12 +00:00
Daniel Hodges
d86638ef0b
scx_layered: Add big cpumask
Add big cpumask to scx_layered and prefer selecting big idle cores when
using the BigLittle growth algo.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-10-06 14:05:12 -04:00
Vladislav Nepogodin
7bd61f4334
scx_loader: Add SwitchScheduler methods to DBUS interface
These methods allow switching between different schedulers without requiring manual stopping and starting.
2024-10-05 02:57:17 +04:00
Tejun Heo
7402895f4a version: v1.0.5 2024-10-02 08:34:57 -10:00
Tejun Heo
0dda8de2b0
Merge pull request #707 from CachyOS/scx-loader-dbus-introspection
scx_loader: Add D-Bus Introspection XML
2024-09-30 19:27:15 +00:00
Tejun Heo
04648bc511
Merge pull request #703 from minosfuture/main
scx_stats: Implement macro #stat_doc to autogen doc from stat desc
2024-09-30 17:58:56 +00:00
Vladislav Nepogodin
6df06c5569
scx_loader: Add D-Bus Introspection XML
The XML can be used for code generation with tools such as `gdbus-codegen` and `zbus-xmlgen`.

refs:
- https://dbus2.github.io/zbus/client.html
- https://docs.gtk.org/gio/migrating-gdbus.html#generating-code-and-docs
- https://dbus.freedesktop.org/doc/dbus-api-design.html
- https://dbus.freedesktop.org/doc/dbus-specification.html
2024-09-30 03:28:30 +04:00
Andrea Righi
302dadd1ae Revert "scx_rustland_core: prevent deadlock with per-CPU DSQs and CPU affinity"
It seems that with the latest kernel the per-CPU DSQ stall while
executing sched_setaffinity() doesn't happen anymore.

Therefore, get rid of the temporary workaround introduced by commit
86db45f ("scx_rustland_core: prevent deadlock with per-CPU DSQs and CPU
affinity") and restore the old behavior, which offers a fairer
scheduling policy.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-29 14:42:41 +02:00
Ming Yang
28bfd2986a scx_stats: Implement #stat_doc to autogen doc from stat desc
The doc of scx_layered `Opt` is out of sync.

Implement attribute macro #stat_doc to generate doc from the `desc`
property.

Apply #stat_doc to `LayerStats` and `SysStats` in scx_layered.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-09-28 09:32:48 -07:00
Vladislav Nepogodin
9b5e8da8e3
scx_loader: Add systemd service and on-DBUS launch
- add the ability to start the loader at system start as a service

- add the ability to launch automatically on a DBUS call whenever a
third party calls the interface

ref: "Example 7. DBus services" https://www.freedesktop.org/software/systemd/man/256/systemd.service.html
2024-09-27 23:51:28 +04:00
likewhatevs
bd2e90b0b6
run cargo +nightly-2024-09-10 fmt to fix lint err (#691)
2024-09-25 17:34:53 -04:00
Daniel Hodges
73d33e86e8
Merge pull request #686 from frelon/scx_utils-gpu-topo
scx_utils: Add gpu-topology crate feature
2024-09-25 15:09:27 -04:00
Tejun Heo
64b815b6a6
Merge pull request #676 from MitchellAugustin/scx_loader_automatic
scx_loader: Add initial automatic scheduler switching via --monitor-no-dbus
2024-09-25 08:03:51 -10:00
Fredrik Lönnegren
bbce932393 scx_utils: Add gpu-topology crate feature
The gpu-topology feature can be enabled to include GPUs when generating
a topology-map. Disabling the feature will remove the nvml-wrapper
dependency as well as GPU-specific code in topology.rs.

Most of the code was moved to a new module in rust/scx_utils/src/gpu.rs
but some of it was kept in topology.rs and hidden behind #[cfg(feature =
"gpu-topology")].

Signed-off-by: Fredrik Lönnegren <fredrik@frelon.se>
2024-09-25 14:54:52 +02:00
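The feature gating described in bbce932393 follows the usual `#[cfg(feature = ...)]` pattern. The sketch below uses hypothetical types (`TopologySummary`, `summarize`, `count_gpus`) to show how a field and its supporting code can disappear when the feature is off; it does not reproduce the actual topology.rs or gpu.rs code.

```rust
/// Hypothetical topology summary; field names are illustrative only.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct TopologySummary {
    pub cpus: usize,
    /// Compiled in only when the `gpu-topology` feature is enabled.
    #[cfg(feature = "gpu-topology")]
    pub gpus: usize,
}

/// Placeholder for GPU discovery; a real build would query a GPU
/// library here. Absent entirely when the feature is disabled.
#[cfg(feature = "gpu-topology")]
fn count_gpus() -> usize {
    0
}

/// Build the summary; the GPU field is filled in only when it exists.
pub fn summarize(cpus: usize) -> TopologySummary {
    TopologySummary {
        cpus,
        #[cfg(feature = "gpu-topology")]
        gpus: count_gpus(),
    }
}
```

With the feature disabled, neither `count_gpus` nor the `gpus` field is compiled, so an optional dependency used only inside the gated code is not needed.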
Andrea Righi
aea431c0c6
Merge pull request #678 from sched-ext/rustland-core-fix-mm-stall
scx_rustland_core: fix mm stall
2024-09-24 23:26:24 +02:00
likewhatevs
99d1179866
enable IDEs etc. to work on bpf.c files (#668)
* enable IDEs etc. to work on the bpf.c files
this makes it so that clangd and IDE tools which use clangd
can work on the bpf.c code.

nothing should actually be changed outside of that IDE/editor
environment; all the changes are ifdef'ed on LSP, which is set
in the added .clangd file.

* move intf include out of both sides of ifdef toggle
2024-09-24 16:55:02 -04:00
Mitchell Augustin
e4eaed07a8 Change monitor_no_dbus to auto, use enum-string conv, use log::info, use Tokio spawning 2024-09-24 09:36:27 -05:00
Mitchell Augustin
b7e82bdeba
Merge branch 'sched-ext:main' into scx_loader_automatic 2024-09-24 08:16:37 -05:00
I Hsin Cheng
61cb3f7fc5 scx_common_bpf: Append cast_mask()
Remove the cast_mask() copies duplicated across different schedulers
and add the function to common.bpf.h so every scheduler can reference
it when needed.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-24 16:01:19 +08:00
Andrea Righi
0a57b93846 scx_rustland_core: prevent mm stall
Bypass user-space scheduling for tasks currently handling a page fault,
preventing potential deadlock conditions involving VMA lock / mmap_lock
during user-space scheduling.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-24 08:46:14 +02:00
Andrea Righi
34bc6a2b64 Revert "scx_rustland_core: dispatch all kthreads directly from BPF"
This reverts commit 809d39aa7f.

Dispatching all kthreads directly doesn't really help much in preventing
stalls with the stress-ng fork stressor, so revert this commit. A better
workaround will be provided in the next commit.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-24 08:24:01 +02:00
Mitchell Augustin
d434ab4266 scx_loader: Add initial automatic scheduler switching via --monitor-no-dbus
Exposes an option --monitor-no-dbus in scx_loader that will monitor CPU
utilization and start scx_lavd when any CPU exceeds 90% for more than 5
seconds. scx_lavd will be terminated if all CPUs are below 90% for
more than 30 seconds. When this flag is specified, scx_loader's
dbus functionality is not utilized.
2024-09-23 17:07:43 -05:00
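The start/stop thresholds described in d434ab4266 amount to a hysteresis loop. A minimal sketch follows, assuming utilization is sampled periodically as per-CPU percentages; `AutoSwitch`, `Action`, and `observe` are hypothetical names, and only the 90%, 5-second, and 30-second values come from the commit message.

```rust
use std::time::Duration;

// Thresholds taken from the commit message.
const BUSY_THRESHOLD_PCT: f64 = 90.0;
const START_AFTER: Duration = Duration::from_secs(5);
const STOP_AFTER: Duration = Duration::from_secs(30);

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    None,
    Start,
    Stop,
}

/// Hypothetical monitor state; not the scx_loader implementation.
#[derive(Debug, Default)]
pub struct AutoSwitch {
    running: bool,      // whether the scheduler is considered started
    busy_for: Duration, // how long some CPU has stayed above threshold
    calm_for: Duration, // how long all CPUs have stayed at or below it
}

impl AutoSwitch {
    /// Feed one utilization sample covering `elapsed` time and return
    /// the action the loader should take.
    pub fn observe(&mut self, per_cpu_util: &[f64], elapsed: Duration) -> Action {
        let any_busy = per_cpu_util.iter().any(|&u| u > BUSY_THRESHOLD_PCT);
        if any_busy {
            self.busy_for += elapsed;
            self.calm_for = Duration::ZERO;
        } else {
            self.calm_for += elapsed;
            self.busy_for = Duration::ZERO;
        }
        if !self.running && self.busy_for > START_AFTER {
            self.running = true;
            Action::Start
        } else if self.running && self.calm_for > STOP_AFTER {
            self.running = false;
            Action::Stop
        } else {
            Action::None
        }
    }
}
```

The asymmetric timers (a short delay before starting, a longer one before stopping) keep the scheduler from flapping when load hovers around the threshold.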
Daniel Hodges
a9f3190b5f
scx_utils: Add extra ordering macros for topology
Add extra ordering macros for Core/CPU structs for ease of use with
Rust standard library features. This issue was hit when trying to sort
cores based on the CoreType. See this similar issue for details:
https://github.com/rust-lang/rust/issues/113550

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-20 11:41:23 -04:00
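Assuming the "ordering macros" in a9f3190b5f refer to derive attributes such as `PartialOrd` and `Ord`, the following sketch shows why they help when sorting cores by type. The `CoreType` and `Core` definitions here are simplified stand-ins, not the scx_utils types, and `big_cores_first` is a hypothetical helper.

```rust
/// Simplified stand-in; derived Ord follows declaration order, so
/// `Little < Big`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CoreType {
    Little,
    Big,
}

/// Simplified stand-in; derived Ord compares fields top to bottom.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Core {
    pub core_type: CoreType,
    pub id: usize,
}

/// Sort big cores ahead of little ones, breaking ties by ascending id.
/// Calling `cmp` on `CoreType` requires the derived `Ord`.
pub fn big_cores_first(mut cores: Vec<Core>) -> Vec<Core> {
    cores.sort_by(|a, b| b.core_type.cmp(&a.core_type).then(a.id.cmp(&b.id)));
    cores
}
```

Without `Ord` on both types, `sort()` on a `Vec<Core>` and the `cmp` calls above would not compile.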