When selecting an idle CPU, honor the idle_smt option set on the layer.
This may improve cache locality in some cases by placing tasks on CPUs
that are closer in the cache hierarchy.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
On some older kernels, scx_layered fails BPF verification. Prevent certain
helpers from being inlined so that the program passes the verifier.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
The loops in topology aware mode were recently refactored to place the per-LLC
loops inside the per-layer loops. However, the layer specific checks were left
in the inner loops, slowing this down unnecessarily.
Pull the layer specific checks from the inner loop into the outer loop.
Also change these functions to `__weak` to ensure they don't get inlined;
they're expected to be verified as global functions.
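For illustration only, a minimal userspace sketch of the hoisting described
above. The `struct layer` fields, the function name and the visit counter are
hypothetical stand-ins, not the actual scx_layered code. The attribute is
spelled out because `__weak` in libbpf's bpf_helpers.h expands to
`__attribute__((weak))`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layer state; the real struct has many more fields. */
struct layer {
	bool open;
	bool preempt;
};

/*
 * Weak (and noinline) so the compiler keeps this as a separate function
 * instead of folding it into its caller.
 */
__attribute__((weak, noinline))
int count_dsq_visits(const struct layer *layers, size_t nr_layers,
		     size_t nr_llcs)
{
	int visits = 0;

	for (size_t i = 0; i < nr_layers; i++) {
		const struct layer *layer = &layers[i];

		/*
		 * Layer-only check, evaluated once per layer instead of once
		 * per (layer, LLC) pair. By De Morgan:
		 *   !(open && !preempt) == (!open || preempt)
		 */
		if (!layer->open || layer->preempt)
			continue;

		/* Stand-in for consuming the (layer, LLC) DSQ. */
		for (size_t j = 0; j < nr_llcs; j++)
			visits++;
	}

	return visits;
}
```

The inner loop no longer repeats a check whose result cannot change between
LLCs, which is the point of the change.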
Note to reviewers: this looks good to me, but I'd appreciate if you reviewed
the De Morgan applications in detail.
Test plan:
- `cargo build --release && sudo target/release/scx_layered --run-example` on a
machine with multiple LLCs. It's possible to stall it quite easily with
stress-ng, but I believe this is also the case on main.
With the recent rework of scx_bpfland, the default options for the
different profiles in scx_loader are no longer valid.
Update them with some appropriate options.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Add cost accounting for fallback DSQs so that their costs are tracked
and so that dispatch from fallback DSQs can be done in a standardized
way.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
The verifier error seems to stem from using the wrong vmlinux.h.
Also, PR #889 seems to completely fix the problem.
So, drop the workaround.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Previously, cur_logical_clk was updated with WRITE_ONCE(), which does
not guarantee atomicity of the update when concurrent writes happen,
and they can. So update it using CAS (compare-and-swap) instead.
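As a rough userspace sketch of the CAS pattern, using C11 atomics rather
than the BPF helpers: the function name is hypothetical, and the sketch
assumes the clock should only ever move forward, which this message does not
state.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the scheduler's global logical clock. */
static _Atomic uint64_t cur_logical_clk;

/*
 * Advance the clock to new_clk unless another writer has already moved
 * it at least that far. Returns the clock value after the attempt.
 */
static uint64_t advance_logical_clk(uint64_t new_clk)
{
	uint64_t old = atomic_load(&cur_logical_clk);

	while (old < new_clk) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&cur_logical_clk, &old,
						 new_clk))
			return new_clk;
	}

	/* Someone else already advanced the clock past new_clk. */
	return old;
}
```

With a plain store, two concurrent writers could each read the same old
value and the smaller update could land last. The retry loop avoids that
because the swap only succeeds if the clock still holds the value that was
read.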
Signed-off-by: Changwoo Min <changwoo@igalia.com>