Add an enum for the layer growth algo to the bpf layer config. This will
be useful for implementing topology aware layer growth algorithms.
When selecting an idle CPU, the current logic tries to keep tasks
local to the LLC/NUMA node. However, for certain growth algorithms (e.g.
RoundRobin) this is suboptimal. Adding the layer growth algorithm to the
config will allow different CPU selection logic in the idle and
preemption paths.
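
A minimal sketch of how a growth algorithm enum could steer idle CPU selection. All names here (`LayerGrowthAlgo`, `IdleSelectPath`, `idle_select_path`) and the variants other than RoundRobin are hypothetical, and the mapping is an assumption for illustration, not the scheduler's actual policy:

```rust
/// Hypothetical mirror of a layer growth algorithm enum.
/// Variants other than RoundRobin are assumptions for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LayerGrowthAlgo {
    Sticky,
    Linear,
    RoundRobin,
}

/// Hypothetical idle-CPU selection strategies.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum IdleSelectPath {
    /// Prefer idle CPUs in the task's LLC, then in its NUMA node.
    LocalFirst,
    /// Take any idle CPU in the layer, ignoring locality.
    AnyInLayer,
}

/// Choose a selection path from the layer's growth algorithm.
/// Assumption: RoundRobin spreads CPUs across the topology, so a
/// locality-first search is not a good fit for it.
pub fn idle_select_path(algo: LayerGrowthAlgo) -> IdleSelectPath {
    match algo {
        LayerGrowthAlgo::RoundRobin => IdleSelectPath::AnyInLayer,
        LayerGrowthAlgo::Sticky | LayerGrowthAlgo::Linear => IdleSelectPath::LocalFirst,
    }
}
```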
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
* make ci nicer
Replace build scheds and merged with caching build, and rename
caching build to build-and-test.
This should make the CI reports on PRs be nice and specific
(i.e. at a glance, know what passes and what fails).
It also keeps PR CI jobs up to date (as folks edit things) and
has them all use one config/24.04 etc.
* prevent untar permission errors from causing cache misses
When preempting, restrict preemption to the current layer's cpumask. This
may reduce the amount of preemption, but should improve cache locality
for preempted tasks.
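
As a rough illustration of the restriction, the candidate set can be thought of as the intersection of the layer's cpumask and the CPUs the task may run on. The sketch below models cpumasks as `u64` bitmaps (bit i = CPU i), which is an assumption that only covers up to 64 CPUs; `preempt_candidates` is a hypothetical name:

```rust
/// Return the CPUs that may be considered for preemption: those present in
/// both the layer's cpumask and the task's allowed mask.
/// Masks are modeled as u64 bitmaps (bit i = CPU i), an illustrative
/// simplification rather than the BPF cpumask type.
pub fn preempt_candidates(layer_mask: u64, allowed_mask: u64) -> Vec<u32> {
    let mask = layer_mask & allowed_mask;
    (0u32..64).filter(|&cpu| mask & (1u64 << cpu) != 0).collect()
}
```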
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
I think I see PRs being harder to write because all parts of a CI job
are cancelled when one fails.
I think I am also starting to see that we have enough largely disjoint
moving pieces that there will often be one that is failing stress
tests at any time.
Make CI run all stress tests always to address this.
Remove the cast_mask() definitions duplicated across different schedulers
and add a single definition to common.bpf.h so every scheduler can use it
when needed.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
If a waker is more latency critical than a wakee, the wakee inherits the
waker's latency criticality. This allows the wakee to take into account
the context of the task that woke it up. For now, we limit such
inheritance to one hop and one schedule.
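
A minimal sketch of the one-hop, one-schedule rule. The `TaskCtx` struct, its field names, and the hook names are hypothetical and do not mirror the scheduler's BPF code:

```rust
/// Hypothetical per-task context.
#[derive(Debug, Default)]
pub struct TaskCtx {
    /// The task's own latency criticality.
    pub lat_cri: u32,
    /// Value inherited from the most recent waker; 0 means "none".
    pub lat_cri_waker: u32,
}

impl TaskCtx {
    /// On wakeup, inherit only if the waker is more latency critical.
    /// Reading the waker's *own* value (not its effective value) keeps the
    /// inheritance to a single hop.
    pub fn on_wakeup(&mut self, waker: &TaskCtx) {
        self.lat_cri_waker = if waker.lat_cri > self.lat_cri {
            waker.lat_cri
        } else {
            0
        };
    }

    /// Latency criticality used for the next scheduling decision.
    pub fn effective_lat_cri(&self) -> u32 {
        self.lat_cri.max(self.lat_cri_waker)
    }

    /// After the task has been scheduled once, drop the inherited value so
    /// the boost lasts for one schedule only.
    pub fn on_scheduled(&mut self) {
        self.lat_cri_waker = 0;
    }
}
```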
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Use the cast_mask helper to clean up some of the bpf cpumask conversion
code for preemption.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add topology aware preemption that begins in the local LLC and attempts
to preempt from cpus nearest in the topology.
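
One way to picture "nearest in the topology" is to rank candidate CPUs by distance from the local CPU. The sketch below is an assumption-laden model: `CpuLoc`, `distance`, and `preempt_order` are hypothetical, LLC ids are assumed to be globally unique, and only three distance levels are considered:

```rust
/// Hypothetical location of a CPU in the topology.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CpuLoc {
    pub cpu: u32,
    pub llc: u32,
    pub node: u32,
}

/// 0 = same LLC, 1 = same NUMA node, 2 = remote node.
/// Assumes LLC ids are globally unique across nodes.
fn distance(from: &CpuLoc, to: &CpuLoc) -> u8 {
    if from.llc == to.llc {
        0
    } else if from.node == to.node {
        1
    } else {
        2
    }
}

/// Order preemption candidates from nearest to farthest. The sort is
/// stable, so candidates at the same distance keep their input order.
pub fn preempt_order(local: &CpuLoc, candidates: &[CpuLoc]) -> Vec<u32> {
    let mut sorted: Vec<&CpuLoc> = candidates.iter().collect();
    sorted.sort_by_key(|c| distance(local, c));
    sorted.into_iter().map(|c| c.cpu).collect()
}
```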
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Previously, we searched all CPUs for a victim, including remote or
incompatible CPUs. Now we limit the victim search to the task's
compute domain.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Add core growth algos for Big/Little core support. The algos allow
layers to grow by preferring either big or little cores first.
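
A minimal sketch of such a preference, assuming a layer grows by walking an ordered list of core ids. `CoreKind`, `GrowthPref`, and `growth_order` are hypothetical names used only for illustration:

```rust
/// Hypothetical core classification.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CoreKind {
    Big,
    Little,
}

/// Hypothetical growth preference: which kind of core is taken first.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GrowthPref {
    BigLittle,
    LittleBig,
}

/// Return core ids in the order a layer would grow into them.
pub fn growth_order(cores: &[(usize, CoreKind)], pref: GrowthPref) -> Vec<usize> {
    let first = match pref {
        GrowthPref::BigLittle => CoreKind::Big,
        GrowthPref::LittleBig => CoreKind::Little,
    };
    let mut order = cores.to_vec();
    // `false < true`, so cores of the preferred kind sort first. The sort is
    // stable, which keeps the original id order within each group.
    order.sort_by_key(|&(_, kind)| kind != first);
    order.into_iter().map(|(id, _)| id).collect()
}
```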
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add extra ordering macros for Core/CPU structs for ease of use with
Rust standard library features. This issue was hit when trying to sort
cores based on the CoreType. See this similar issue for details:
https://github.com/rust-lang/rust/issues/113550
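
For illustration, deriving the ordering traits lets standard library calls such as `sort()` work directly on the structs. The `Core` below is a simplified, hypothetical stand-in, not the real topology type, and its field layout is an assumption:

```rust
/// Derived ordering follows declaration order: Little < Big.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum CoreType {
    Little,
    Big,
}

/// Simplified stand-in for a core. Derived ordering compares fields in
/// declaration order: first `core_type`, then `id`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Core {
    pub core_type: CoreType,
    pub id: usize,
}

/// With `Ord` derived, `sort()` needs no custom comparator.
pub fn sort_cores(mut cores: Vec<Core>) -> Vec<Core> {
    cores.sort();
    cores
}
```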
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
cast_mask() is used within bpfland_enqueue to cast "p->cpus_ptr" from
"struct bpf_cpumask *" to "const struct cpumask *". However, the type of
"p->cpus_ptr" is already "const cpumask_t *", aka
"const struct cpumask *", so no conversion is needed.
Passing a value of type "struct cpumask *" where a "struct bpf_cpumask *"
is expected also leads to a compile error.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Assign the return value of "saturating_sub()" to an "_" variable
to silence the compilation warnings.
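
For context, `saturating_sub()` is marked `#[must_use]` because it returns the result without modifying its operand, so ignoring the result triggers an `unused_must_use` warning. A small sketch (the function name `demo` is hypothetical) showing both the discard pattern and the kept result:

```rust
/// Returns the untouched input and the saturated difference.
pub fn demo(total: u64, used: u64) -> (u64, u64) {
    // Binding to `_` tells the compiler the result is intentionally
    // discarded, which silences the warning. `total` is not changed.
    let _ = total.saturating_sub(used);

    // Keeping the result: the subtraction clamps at 0 instead of wrapping.
    let remaining = total.saturating_sub(used);
    (total, remaining)
}
```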
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Refactor match_layer() to prevent the compile error caused by using the
variable "nr_match_ors" before it is initialized.
Move the check of "nr_match_ors" after it is assigned from
"layer->nr_match_ors" to make sure it is always initialized first.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Dispatching kthreads via user-space can still lead to deadlocks in
certain cases (for example we can still trigger stalls by running the
fork stressor via stress-ng).
To prevent such stalls, simply dispatch kthreads directly from BPF for
now.
In the future we may consider providing an API to restrict the
selection of tasks that are directly dispatched (for example, passing a
mask of PF_* flags to "whitelist" the tasks that are allowed to bypass
the user-space scheduler).
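
A sketch of what such a mask check might look like. No such API exists yet, so `dispatch_directly` and `allow_mask` are purely hypothetical; only the `PF_KTHREAD` value is taken from the kernel's include/linux/sched.h:

```rust
/// PF_KTHREAD as defined in the Linux kernel (include/linux/sched.h).
pub const PF_KTHREAD: u32 = 0x0020_0000;

/// Hypothetical check: a task bypasses the user-space scheduler when any of
/// its PF_* flags is present in the allow mask.
pub fn dispatch_directly(task_flags: u32, allow_mask: u32) -> bool {
    task_flags & allow_mask != 0
}
```

With `allow_mask` set to `PF_KTHREAD`, this reproduces the behavior described above: kthreads are dispatched directly and everything else goes through user space.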
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Updating nr_queued non-atomically when a queued task is consumed can
lead to underflows. We don't really care about being 100% accurate here,
since nr_queued should be considered more of a statistic than an
accurate value.
Therefore, just accept the fact that nr_queued can be inaccurate and
handle potential underflows.
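
The idea can be sketched as a counter whose decrement clamps at zero instead of wrapping. This is a simplified single-threaded model with hypothetical names (`QueueStats`, `task_queued`, `task_consumed`), not the BPF implementation:

```rust
/// Hypothetical statistics holder; `nr_queued` is treated as a best-effort
/// statistic, not an exact count.
pub struct QueueStats {
    pub nr_queued: u64,
}

impl QueueStats {
    pub fn task_queued(&mut self) {
        self.nr_queued = self.nr_queued.saturating_add(1);
    }

    /// Decrement without underflowing: if a racy update already left the
    /// counter at 0, it stays at 0 rather than wrapping to u64::MAX.
    pub fn task_consumed(&mut self) {
        self.nr_queued = self.nr_queued.saturating_sub(1);
    }
}
```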
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>