Add an enum for the layer growth algo to the bpf layer config. This will
be useful for implementing topology-aware layer growth algorithms.
When selecting an idle CPU, the current logic tries to keep tasks
local to the LLC/NUMA node. However, for certain growth algorithms (e.g.,
RoundRobin) this is suboptimal. Adding the layer growth algorithm
will allow different CPU selection logic in the idle/preemption
paths.
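
As a rough illustration of how such an enum could steer CPU selection, here is a minimal sketch. Only RoundRobin is named above; the other variants, the `IdleSearch` type, and `idle_search_for()` are hypothetical names made up for this example and are not taken from the scheduler.

```rust
/// Hypothetical mirror of a layer growth algorithm enum. Only RoundRobin is
/// named in the commit message; the other variants are placeholders.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LayerGrowthAlgo {
    Sticky,
    Linear,
    RoundRobin,
}

/// Order in which idle CPU candidates are tried (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IdleSearch {
    /// Try the previous CPU's LLC first, then its NUMA node, then the rest.
    LocalFirst,
    /// Skip the locality passes and consider the whole layer cpumask.
    LayerWide,
}

/// Pick a search strategy from the growth algorithm. Mapping RoundRobin to a
/// layer-wide search is an assumption based on the "suboptimal" remark above.
pub fn idle_search_for(algo: LayerGrowthAlgo) -> IdleSearch {
    match algo {
        LayerGrowthAlgo::RoundRobin => IdleSearch::LayerWide,
        LayerGrowthAlgo::Sticky | LayerGrowthAlgo::Linear => IdleSearch::LocalFirst,
    }
}
```
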
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
When preempting, restrict preemption to the current layer's cpumask. This
may reduce the amount of preemption, but improves cache locality
of preempted tasks.
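
The restriction can be pictured as a mask intersection. The sketch below models cpumasks as plain `u64` bitmaps in user space, which is an assumption made for brevity; the BPF code works on kernel cpumasks, and `preempt_candidates()` is a hypothetical helper.

```rust
/// CPUs are modelled as bits in a u64 (bit n set means CPU n is in the mask).
/// Returns the CPUs that are in the layer, allowed for the task, and
/// currently running something preemptible.
pub fn preempt_candidates(layer_cpumask: u64, task_allowed: u64, preemptible: u64) -> Vec<u32> {
    // Anything outside the layer cpumask is dropped here, which is the
    // restriction described above.
    let mask = layer_cpumask & task_allowed & preemptible;
    (0..64u32).filter(|&cpu| (mask & (1u64 << cpu)) != 0).collect()
}
```

With a layer covering CPUs 0-3, a task allowed on CPUs 2-7, and preemptible work on CPUs 0, 2, 4 and 6, only CPU 2 remains a candidate.
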
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Remove the cast_mask() function duplicated across different schedulers
and add it to common.bpf.h so every scheduler can reference it when
needed.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
If a waker is more latency critical than a wakee, the wakee inherits the
waker's latency criticality. This allows the wakee to take into account
the context of the task that woke it up. For now, we limit such
inheritance to one hop and one schedule.
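
A minimal user-space sketch of the "one hop, one schedule" rule follows. The struct, field names, and hook names are illustrative assumptions, not the scheduler's actual task context.

```rust
/// Illustrative per-task context; higher values mean more latency critical.
#[derive(Debug, Default)]
pub struct TaskCtx {
    /// The task's own latency criticality.
    pub lat_cri: u32,
    /// Criticality borrowed from a waker; 0 means nothing is inherited.
    pub lat_cri_waker: u32,
}

impl TaskCtx {
    /// Called when `waker` wakes `self`.
    pub fn on_wakeup(&mut self, waker: &TaskCtx) {
        // One hop: only the waker's own value is used, never a value the
        // waker itself inherited from someone else.
        self.lat_cri_waker = if waker.lat_cri > self.lat_cri {
            waker.lat_cri
        } else {
            0
        };
    }

    /// Criticality used for the next scheduling decision.
    pub fn effective_lat_cri(&self) -> u32 {
        self.lat_cri.max(self.lat_cri_waker)
    }

    /// One schedule: the inherited value is dropped once the task has run.
    pub fn on_stopping(&mut self) {
        self.lat_cri_waker = 0;
    }
}
```
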
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Use the cast_mask helper to clean up some of the bpf cpumask conversion
code for preemption.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add topology-aware preemption that begins in the local LLC and attempts
to preempt from the CPUs nearest in the topology.
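
The search order can be sketched as a nearest-first selection. The `Cpu` struct, the three-level distance (same LLC, same node, anywhere), and the assumption that LLC ids are globally unique are all simplifications for illustration.

```rust
/// Illustrative CPU description; LLC ids are assumed to be globally unique.
#[derive(Clone, Copy, Debug)]
pub struct Cpu {
    pub id: u32,
    pub llc: u32,
    pub node: u32,
    pub preemptible: bool,
}

/// 0 = same LLC, 1 = same NUMA node, 2 = anywhere else.
fn distance(cur_llc: u32, cur_node: u32, cpu: &Cpu) -> u8 {
    if cpu.llc == cur_llc {
        0
    } else if cpu.node == cur_node {
        1
    } else {
        2
    }
}

/// Return the id of the topologically nearest preemptible CPU, if any.
/// Ties are broken in favor of the CPU that appears first in `cpus`.
pub fn pick_preempt_target(cpus: &[Cpu], cur_llc: u32, cur_node: u32) -> Option<u32> {
    cpus.iter()
        .filter(|c| c.preemptible)
        .min_by_key(|c| distance(cur_llc, cur_node, c))
        .map(|c| c.id)
}
```
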
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Previously, we searched for a victim across all CPUs, including remote
or incompatible CPUs. Now we limit the victim search to the task's
compute domain.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Add core growth algos for Big/Little core support. The algos allow
layers to grow by preferring either big or little cores first.
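
One way to picture the two orderings is a stable sort on core type. The `CoreType` and `CoreGrowth` names and the `growth_order()` helper are hypothetical and only serve this sketch.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CoreType {
    Big,
    Little,
}

/// Hypothetical names for the two growth preferences.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CoreGrowth {
    BigLittle,
    LittleBig,
}

/// Return core ids in the order a layer would grow into them.
pub fn growth_order(cores: &[(usize, CoreType)], algo: CoreGrowth) -> Vec<usize> {
    let first = match algo {
        CoreGrowth::BigLittle => CoreType::Big,
        CoreGrowth::LittleBig => CoreType::Little,
    };
    let mut order = cores.to_vec();
    // Stable sort: `false` sorts before `true`, so cores of the preferred
    // type come first and the original order is kept within each group.
    order.sort_by_key(|&(_, ty)| ty != first);
    order.into_iter().map(|(id, _)| id).collect()
}
```
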
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
The usage of cast_mask() within bpfland_enqueue aims to cast the type of
"p->cpus_ptr" from "struct bpf_cpumask *" to "const struct cpumask *".
However, the type of "p->cpus_ptr" is already "const cpumask_t *" aka
"const struct cpumask *", so no conversion is needed.
Passing a value of type "struct cpumask *" as a "struct bpf_cpumask *"
also leads to a compile error.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Refactor match_layer() to prevent the compile error caused by using the
variable "nr_match_ors" before it is initialized.
Move the check of "nr_match_ors" to after it is assigned the value of
"layer->nr_match_ors" to make sure it is initialized.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Pass enqueue flags to user-space: flags will be passed via
QueuedTask.flags and can be forwarded back to BPF via
DispatchedTask.flags.
These flags can also be passed to BpfScheduler.select_cpu() to apply a
more refined CPU selection policy.
Moreover, avoid prioritizing the user-space scheduler too much and
dispatch it only if there are no other tasks that need to be dispatched
in ops.dispatch().
This improves CPU utilization and enhances the fairness, robustness, and
resilience of schedulers based on scx_rustland_core, particularly under
stress test conditions.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
On WAKE_SYNC, attempt to migrate the wakee to the same CPU as the waker
if the waker is not exiting, the wakee can use the waker's CPU, the
waker's L3 domain is not saturated, and there are no other tasks queued
to the local DSQ of the waker's CPU.
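
The four conditions above can be written as a single predicate. This is a user-space sketch with made-up field names; in the scheduler these values would come from the task structs, cpumasks, and DSQ state.

```rust
/// Inputs to the decision; all field names are illustrative.
pub struct WakeSyncCtx {
    pub waker_exiting: bool,
    pub wakee_allowed_on_waker_cpu: bool,
    pub waker_l3_saturated: bool,
    pub waker_local_dsq_nr_queued: u32,
}

/// True if a synchronous wakeup should place the wakee on the waker's CPU.
pub fn should_migrate_to_waker_cpu(sync_wakeup: bool, ctx: &WakeSyncCtx) -> bool {
    sync_wakeup
        && !ctx.waker_exiting
        && ctx.wakee_allowed_on_waker_cpu
        && !ctx.waker_l3_saturated
        && ctx.waker_local_dsq_nr_queued == 0
}
```
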
This is the same logic used in scx_rusty.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Using the turbo-boosted CPUs as the preferred scheduling domain seems to
be beneficial only in a very few corner cases, for example on battery-powered
devices with an aggressive cpufreq governor that constantly tries to
scale down the frequency (and even in this case it's probably better to
not force the tasks to run on the fast CPUs, to save power).
In practice, the preferred domain seems to introduce more overhead than
benefit overall, so let's get rid of it.
This can be improved in the future by adding multiple user-configurable
scheduling domains.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Many kernel threads perform latency-critical tasks (e.g., net, gpu). In
particular, the AMD GPU driver runs for the most part in kernel space using
kworkers. Hence, treat kernel threads as if they were woken-up tasks.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Initialize the node cpumask, which was previously uninitialized, causing
metric calculations to be wrong when attempting to look up CPUs in the
node cpumask.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Use `cargo fmt` with a specific nightly branch in the CI to enforce formatting. Globally format these files while the diff is still small so we can stay on top of it.
Test plan:
- CI lint check passes.