Commit Graph

648 Commits

David Vernet
c187c65702
topology: Don't allocate on calls to span()
We're currently cloning cpumasks returned by calls to {Core, Cache,
Node, Topology}::span(). If a caller needs to clone it, they can. Let's
not penalize the callers that just want to query the underlying cpumask.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-23 22:59:42 -05:00
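
To illustrate the pattern this commit describes, here is a minimal Rust sketch (not the actual Topology crate API): span() returns a borrowed cpumask, so callers that only query it avoid an allocation and clone explicitly when they need ownership.

#[derive(Clone)]
struct Cpumask {
    bits: Vec<u64>,
}

impl Cpumask {
    // Query a single CPU bit without taking ownership of the mask.
    fn test_cpu(&self, cpu: usize) -> bool {
        self.bits
            .get(cpu / 64)
            .map_or(false, |word| (word & (1u64 << (cpu % 64))) != 0)
    }
}

struct Core {
    span: Cpumask,
}

impl Core {
    // Before: fn span(&self) -> Cpumask { self.span.clone() }
    // After: borrow; the caller clones only if it really needs ownership.
    fn span(&self) -> &Cpumask {
        &self.span
    }
}

fn main() {
    let core = Core { span: Cpumask { bits: vec![0b1011] } };
    assert!(core.span().test_cpu(0)); // query without allocating
    let _owned: Cpumask = core.span().clone(); // clone only when needed
}
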
Andrea Righi
a8226f0fde
Merge pull request #235 from sched-ext/rustland-preemption
rustland: enable preemption
2024-04-23 17:20:39 +02:00
Andrea Righi
f02e9b072c scx_rustland_core: use a separate field to store dispatch flags
Do not encode dispatch flags in the cpu field, but simply use a separate
"flags" field.

This makes the code much simpler and increases the size of
dispatched_task_ctx from 24 to 32 bytes, which is probably better in terms
of cacheline allocation / performance.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-23 16:10:56 +02:00
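
A hypothetical Rust sketch of the layout change described above: rather than packing dispatch flags into spare bits of `cpu`, give them a dedicated field. The field names are illustrative, not the exact scx_rustland_core definitions.

// Sketch only: field names other than `cpu` and `flags` are assumptions.
#[repr(C)]
struct DispatchedTask {
    pid: i32,
    cpu: i32,        // target CPU only; no flag bits packed in anymore
    flags: u64,      // dispatch flags (e.g. preemption) get their own field
    slice_ns: u64,   // assumed payload fields; the real struct differs
    cpumask_cnt: u64,
}

fn main() {
    // Adding the dedicated u64 grows the struct from 24 to 32 bytes.
    assert_eq!(std::mem::size_of::<DispatchedTask>(), 32);
}
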
Andrea Righi
fbe9a80af8 scx_rustland: introduce --no-preemption
Provide a run-time option to disable task preemption.

This option can be used to improve the throughput of CPU-intensive tasks
while still providing a good level of responsiveness in the system.

By default preemption is enabled, to provide a higher level of
responsiveness to interactive tasks.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-23 07:13:30 +02:00
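
A minimal sketch of how such a run-time switch could be declared, assuming a clap-based option struct like the one scx_rustland already uses; only the new switch is shown here.

use clap::Parser;

#[derive(Debug, Parser)]
struct Opts {
    /// Disable task preemption. Preemption is enabled by default to give
    /// interactive tasks better responsiveness; disabling it can improve
    /// the throughput of CPU-intensive tasks.
    #[clap(long)]
    no_preemption: bool,
}

fn main() {
    let opts = Opts::parse();
    if opts.no_preemption {
        println!("task preemption disabled");
    }
}
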
Andrea Righi
0ffaaac6db scx_rustland: enable preemption
Use the new scx_rustland_core dispatch flag RL_PREEMPT_CPU to allow
interactive tasks to preempt other tasks with scx_rustland.

If the built-in idle selection logic is enforced (option `-i`), the
scheduler prioritizes keeping tasks on the target CPU designated by this
logic. With preemption enabled, these tasks have a higher likelihood of
reusing their cached working set, potentially improving performance.

Alternatively, when tasks are dispatched to the first available CPU
(default behavior), interactive tasks benefit from running more promptly
by kicking out other tasks before their assigned time slice expires.

This potentially allows increasing the default time slice in the future to
improve the overall throughput of the system while still maintaining a good
level of responsiveness, because interactive tasks are now able to run
almost immediately, independently of the remaining time slice of the other
tasks contending for the CPUs in the system.

= Results =

Measuring the performance of the usual benchmark "playing a video game
while running a parallel kernel build in the background" shows around a
2-10% fps boost with preemption enabled, depending on the particular video
game.

Results were obtained running a `make -j32` kernel build on an AMD Ryzen
7 5800X (8 cores) with 16GB of RAM, while testing video games such as
Baldur's Gate 3 (a solid +10% fps), Counter-Strike 2 (around +5%) and Team
Fortress 2 (+2% boost).

Moreover, some WebGL applications (such as
https://webglsamples.org/aquarium/aquarium.html) seem to benefit even
more with preemption enabled, providing up to a +15% fps boost.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-23 07:13:30 +02:00
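
A sketch of the dispatch decision described in this commit, assuming a scheduler-side is_interactive classification; RL_CPU_ANY and RL_PREEMPT_CPU mirror the scx_rustland_core flag names from these commits, but the bit values here are made up.

const RL_CPU_ANY: u64 = 1 << 0;
const RL_PREEMPT_CPU: u64 = 1 << 1;

fn dispatch_flags(is_interactive: bool, idle_cpu: Option<i32>) -> (i32, u64) {
    let mut flags = 0u64;
    // Keep the CPU chosen by the built-in idle selection logic (option -i)
    // when available, so tasks are more likely to reuse their cached
    // working set...
    let cpu = idle_cpu.unwrap_or_else(|| {
        // ...otherwise let the task run on the first CPU that becomes free.
        flags |= RL_CPU_ANY;
        0
    });
    // Interactive tasks may kick out the current task before its assigned
    // time slice expires, so they run almost immediately.
    if is_interactive {
        flags |= RL_PREEMPT_CPU;
    }
    (cpu, flags)
}

fn main() {
    assert_eq!(dispatch_flags(true, None), (0, RL_CPU_ANY | RL_PREEMPT_CPU));
}
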
Andrea Righi
27c1f9c329 scx_rustland_core: introduce preemption
Introduce the new dispatch flag RL_PREEMPT_CPU that can be used to
dispatch tasks that can preempt others.

Tasks with this flag set will be dispatched by the BPF part using
SCX_ENQ_PREEMPT, so they can potentially preempt any other task running
on the target CPU.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-23 07:13:30 +02:00
Andrea Righi
6d2aac1591 scx_rustland_core: introduce dispatch flags
Reserve some bits of the `cpu` attribute of a task to store special
dispatch flags.

Initially, let's introduce just RL_CPU_ANY to replace the special value
NO_CPU, indicating that the task can be dispatched on any CPU,
specifically the first CPU that becomes available.

This allows keeping the CPU value assigned by the built-in idle selection
logic, which can potentially be used later for further optimizations.

Moreover, having the possibility to specify dispatch flags gives more
flexibility and allows mapping new scheduling features to such flags.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-23 07:13:30 +02:00
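
An illustrative Rust sketch of the bit reservation this commit introduced (later replaced by a dedicated flags field, per the commit further up): the high bits of the `cpu` value carry dispatch flags such as RL_CPU_ANY, while the low bits still hold the CPU picked by the built-in idle selection logic. The exact bit split is an assumption.

const RL_FLAG_SHIFT: u32 = 20; // assumed split, not the real one
const RL_CPU_MASK: u64 = (1 << RL_FLAG_SHIFT) - 1;
const RL_CPU_ANY: u64 = 1 << RL_FLAG_SHIFT;

// Pack dispatch flags into the bits reserved above the CPU id.
fn encode(cpu: u64, flags: u64) -> u64 {
    flags | (cpu & RL_CPU_MASK)
}

// Split a packed value back into (cpu, flags).
fn decode(value: u64) -> (u64, u64) {
    (value & RL_CPU_MASK, value & !RL_CPU_MASK)
}

fn main() {
    let v = encode(3, RL_CPU_ANY);
    assert_eq!(decode(v), (3, RL_CPU_ANY));
}
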
Tejun Heo
d3076de936
Merge pull request #236 from takase1121/scheds-rust/meson-build-seq
scheds-rust: build rust schedulers in sequence
2024-04-22 14:37:30 -10:00
takase1121
3e12676ca2
scheds-rust: add explanation for chaining schedulers 2024-04-23 08:30:38 +08:00
takase1121
5d20f89a87
scheds-rust: build rust schedulers in sequence 2024-04-23 08:06:27 +08:00
Tejun Heo
7b4d231af5
Merge pull request #233 from sched-ext/layer_init_task
layered: Fix init_task
2024-04-18 05:41:30 -10:00
David Vernet
5f1eac85ff
layered: Fix init_task
When I transitioned layered to using task local storage, I messed up
initializing the task ctx, not realizing we previously had a separate
variable that was initializing the hashmap entry. We need to initialize
the task's layer to -1, and also set refresh_layer to 1.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-18 09:44:32 -05:00
Changwoo Min
34b330c388
Merge pull request #232 from sched-ext/lavd_typos
lavd: Fix a few typos
2024-04-17 10:00:15 -07:00
David Vernet
45589cd0f7
lavd: Fix a few typos
Noticed a few typos. Let's fix 'em up

Signed-off-by: David Vernet <void@manifault.com>
2024-04-17 08:17:52 -05:00
Tejun Heo
2f5dd3d207
Merge pull request #231 from sched-ext/simple_switch_all
simple: Invoke __COMPAT_scx_bpf_switch_all();
2024-04-16 06:52:19 -10:00
David Vernet
eed338ef25
simple: Invoke __COMPAT_scx_bpf_switch_all();
scx_simple no longer supports running in "partial" mode, with only
certain tasks using scx_simple. When this option was removed, we also
removed the call to scx_bpf_switch_all();

While switching-all is the default behavior for newer kernels, let's add
__COMPAT_scx_bpf_switch_all() so that scx_simple can work on older
kernels as well.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-16 11:09:44 -05:00
David Vernet
b9d57e85b5
Merge pull request #230 from sched-ext/update-readme
README: update additional resources
2024-04-15 12:21:24 -05:00
Andrea Righi
411af2216e README: update additional resources
Add "Getting started with sched-ext development" blog post to the
"Additional Resources" section.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-15 19:18:06 +02:00
Tejun Heo
50391b807b
Merge pull request #229 from sched-ext/padding
rusty: Remove explicit padding
2024-04-12 09:10:45 -10:00
David Vernet
ffced1f615
rusty: Remove explicit padding
As of libbpf-rs 0.23.0 (which contains commit 9d9e979fcf), libbpf-rs now
generates Rust structs that honor padding. We can therefore remove the
custom padding in scx_rusty's struct pcpu_ctx.

For example, here is the generated pub struct pcpu_ctx:

pub struct pcpu_ctx {
    pub dom_rr_cur: u32,
    pub dom_id: u32,
    pub nr_node_doms: u32,
    pub node_doms: [u32; 64],
    pub __pad_268: [u8; 52],
}

And here is the matching struct in the BPF object file:

struct pcpu_ctx {
        u32                        dom_rr_cur;           /*     0     4 */
        u32                        dom_id;               /*     4     4 */
        u32                        nr_node_doms;         /*     8     4 */
        u32                        node_doms[64];        /*    12   256 */

        /* size: 320, cachelines: 5, members: 4 */
        /* padding: 52 */
} __attribute__((__aligned__(64)));

Signed-off-by: David Vernet <void@manifault.com>
2024-04-12 13:52:13 -05:00
David Vernet
5404303bc2
Merge pull request #228 from sched-ext/refactor
Small rusty cleanup
2024-04-12 10:02:09 -05:00
David Vernet
e032ee7cc0
rusty: Add lookup_pcpu_ctx() helper
Getting rid of more boilerplate

Signed-off-by: David Vernet <void@manifault.com>
2024-04-11 19:30:23 -05:00
David Vernet
885a9fd7da
rusty: Make lookup_task_ctx() static
It doesn't need to be a global prog. Let's make it static.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-11 19:30:23 -05:00
David Vernet
0ff73754cf
rusty: Add create_save_cpumask() helper
We have a lot of boilerplate code where we create a cpumask, initialize
it, and then bpf_kptr_xchg() it into the map. In an effort to slightly
reduce the amount of boilerplate, let's create a helper that can
alleviate some of it.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-11 19:30:21 -05:00
David Vernet
e27d5b4e67
rusty: Fix a few random issues
There are some random issues in the code, like unused variables and bad
print formatters. I'm not sure why the compiler isn't consistently
complaining, but let's fix them.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-11 19:21:02 -05:00
Tejun Heo
52d8a5d770
Merge pull request #226 from sched-ext/numa_dsq
rusty: Allocate DSQ on appropriate NUMA node
2024-04-11 09:05:42 -10:00
David Vernet
31cc2dccb9
rusty: Allocate DSQ on appropriate NUMA node
In scx_rusty, now that we have a complete view of the host's topology
thanks to the Topology crate, we can update our calls to
scx_bpf_create_dsq() to create each DSQ on the NUMA node of its domain.
It's unclear how much this will end up mattering for performance in the
typical case, but we might as well do the right thing given that host
topology is static, and we have the information.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-11 00:01:25 -05:00
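
For context, a Rust sketch under assumed types of the per-domain NUMA lookup this relies on; the DSQ creation itself happens on the BPF side via scx_bpf_create_dsq(dsq_id, node_id).

struct Domain {
    id: usize,
    numa_node: usize, // node owning this domain's CPUs, per the Topology crate
}

// Find which NUMA node a domain's DSQ should be created on.
fn dsq_node(domains: &[Domain], dom_id: usize) -> Option<usize> {
    domains.iter().find(|d| d.id == dom_id).map(|d| d.numa_node)
}

fn main() {
    let doms = [Domain { id: 0, numa_node: 0 }, Domain { id: 1, numa_node: 1 }];
    assert_eq!(dsq_node(&doms, 1), Some(1));
}
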
David Vernet
47ab9331f4
Merge pull request #225 from sched-ext/error-typo
Fix error typo
2024-04-10 15:05:13 -05:00
Dan Schatzberg
6eefc8c27f
Fix error typo
ENONET means "Machine is not on the network" - this was supposed to be ENOENT "No such file or directory"
2024-04-10 15:28:05 -04:00
Changwoo Min
f53c29759e
scx_lavd: support preemption (in some scenarios) (#224)
* scx-lavd: preemption of a lower-priority task using kick cpu

When a task is enqueued to the global queue, the scheduler checks if
there is a lower priority task than the enqueued task. If so, it kicks
out the lower-priority task, hoping the newly enqueued task or another
higher-priority task runs on the kicked CPU. Kicking another CPU is
expensive as an IPI is involved, so the scheduler judiciously kicks the
CPU when its benefit (i.e., priority gap) is clear enough.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-04-09 14:25:53 +09:00
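
A sketch of the "judicious kick" policy described above, with an assumed numeric priority (higher = more urgent) and threshold; the real scx_lavd logic runs in BPF and uses its own latency-criticality metric.

fn should_kick_cpu(enqueued_prio: u32, running_prio: u32, min_gap: u32) -> bool {
    // Kicking a CPU costs an IPI, so only preempt when the priority gap
    // between the newly enqueued task and the running one is clear enough.
    enqueued_prio.saturating_sub(running_prio) >= min_gap
}

fn main() {
    assert!(should_kick_cpu(90, 40, 30)); // large gap: worth the IPI
    assert!(!should_kick_cpu(55, 40, 30)); // small gap: not worth it
}
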
Tejun Heo
994d2a586c
Merge pull request #222 from jordalgo/header-paths
Sync libbpf_h and libbpf_local_h
2024-04-04 14:46:04 -10:00
Jordan Rome
c1925f07b4 Sync libbpf_h and libbpf_local_h
Make sure these include the same header directories.
2024-04-04 17:32:37 -07:00
Tejun Heo
1b897ae24b
Merge pull request #221 from sched-ext/htejun/misc
meson.build: Update libbpf and bpftool version requirements
2024-04-04 13:49:45 -10:00
Tejun Heo
4a77c8f8fb meson.build: Update libbpf and bpftool version requirements
The recent compat additions require new libbpf and bpftool. Update the
requirements.

- libbpf >= 1.4
- bpftool >= 7.4
2024-04-04 13:16:08 -10:00
Tejun Heo
8271bdd818
Merge pull request #220 from ThinkerYzu/uapi-path
meson: Add missing path to libbpf_local_h.
2024-04-04 11:57:48 -10:00
David Vernet
9a8ed8ab44
Merge pull request #218 from sched-ext/rusty_hotplug
Gracefully handle hotplug in scx_rusty
2024-04-04 16:03:59 -05:00
Tejun Heo
d2905e3328
Merge pull request #219 from sched-ext/rustland-core-0.2
scx_rustland_core: bump up version to 0.2
2024-04-04 11:03:34 -10:00
Kui-Feng Lee
b61dec6815 meson: Add missing path to libbpf_local_h.
The build system included linux/btf.h from the system even though there
is one in libbpf. By adding libbpf/include/uapi to libbpf_local_h, the
build system will include the linux/btf.h provided by libbpf.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
2024-04-04 14:01:59 -07:00
Andrea Righi
17a30bddc9 scx_rustland_core: bump up version to 0.2
Bump up the version of the crate and update dependencies.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-04 22:44:55 +02:00
David Vernet
622b61dd2f
rusty: Support restarting rusty on hotplug events
The scx_rusty scheduler does not support hotplug, and expects a static
host topology throughout its runtime. Though the kernel does have
support for detecting hotplug events, we currently don't detect this in
the kernel, nor surface it to user space when it happens. Now that we
have scx_bpf_exit(), we can gracefully exit the kernel in the event of a
hotplug, and communicate to user space that it should restart the
scheduler.

This patch adds that support to scx_rusty. Note that this assumes that
we're running on a recent enough kernel that has scx_bpf_exit(). If it
doesn't, then we instead just error out of the kernel scheduler and exit
the application.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-04 14:52:48 -05:00
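
A minimal sketch of the restart behavior this commit enables: the user-space binary keeps relaunching its BPF scheduler while it exits with a "restart" code. run_scheduler() and the exit-code value are placeholders, not the exact scx_rusty implementation.

const RESTART_EXIT_CODE: i64 = 1; // hypothetical value

fn run_scheduler() -> i64 {
    // Placeholder: attach the BPF scheduler, wait for it to exit (e.g. via
    // scx_bpf_exit() on a hotplug event), then return the exit code read
    // from UserExitInfo.
    0
}

fn main() {
    loop {
        let code = run_scheduler();
        if code != RESTART_EXIT_CODE {
            break; // clean shutdown or real error: don't restart
        }
        eprintln!("hotplug event detected, restarting scheduler");
    }
}
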
David Vernet
052ce428a3
uei: Export exit_code from UserExitInfo
Newer kernels also support exiting gracefully with an exit code. Let's
update the UserExitInfo struct to also read and export this value.

Signed-off-by: David Vernet <void@manifault.com>
2024-04-04 14:52:45 -05:00
Tejun Heo
1c6af78b79
Merge pull request #217 from sched-ext/fix-rustland-core-crate
scx_rustland_core: separate crate source code from assets
2024-04-04 09:51:43 -10:00
Andrea Righi
85a32a7b51 scx_rustland_core: separate crate source code from assets
scx_rustland_core needs to ship both a binary part and a source code
part, which will be used to build schedulers based on it.

To effectively publish the scx_rustland_core crate on crates.io we need
to properly separate the source code assets from the crate's main source
code.

To achieve this, move the assets into a separate directory and declare
them inside a [lib] section in Cargo.toml.

This allows publishing the crate on crates.io, while also providing a
clear separation between source code and assets.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-04 21:31:02 +02:00
Tejun Heo
939ffdf0f0
Merge pull request #216 from sirlucjan/systemd-reorder
systemd: Move services to separate directory
2024-04-04 08:51:36 -10:00
Piotr Gorski
ef60559cb5
systemd: Move services to separate directory
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
2024-04-04 20:29:54 +02:00
David Vernet
163e96db59
Merge pull request #215 from sched-ext/htejun/misc
scx_lavd: Add .gitignore
2024-04-04 12:32:08 -05:00
Tejun Heo
ba52cc131b scx_lavd: Add .gitignore 2024-04-04 07:15:37 -10:00
Andrea Righi
c6fda263bd
Merge pull request #214 from sched-ext/rustland-core-update-libbpf-rs-api
scx_rustland_core: re-introduce consume_raw() libbpf-rs API
2024-04-04 18:09:54 +02:00
Andrea Righi
fbccb161a3 scx_rustland_core: re-introduce consume_raw() libbpf-rs API
Now that libbpf-rs 0.23 has been officially released with the new
consume_raw() API (https://github.com/libbpf/libbpf-rs/pull/680), we can
re-introduce the change in rustland-core that uses this API to improve the
quality of the code and make it slightly more efficient when consuming
tasks from BPF to user space.

Fixes: bd2c18a ("Revert "scx_rustland_core: use new consume_raw() libbpf-rs API"")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-04-04 17:34:57 +02:00
Andrea Righi
bc1b0e99e6
Merge pull request #213 from sched-ext/fix-ci-build
ci: make test build more robust
2024-04-04 17:34:38 +02:00