"struct task_struct *p" isn't used within the function
"task_load_adj()". Delete the function parameter for cleaner code.
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
Use scx_utils::NR_CPU_IDS to iterate over all CPUs and separately count
the number of online CPUs, so that CPU hotplug is handled correctly.
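A minimal sketch of the idea, assuming NR_CPU_IDS is the crate's
lazy-static CPU-id count (hence the deref) and using a hypothetical
`is_cpu_online` callback:

```rust
use scx_utils::NR_CPU_IDS;

// Sketch only: `is_cpu_online` is a hypothetical callback standing in
// for however the scheduler actually queries per-CPU state.
fn count_online_cpus(is_cpu_online: impl Fn(usize) -> bool) -> usize {
    let mut nr_online = 0;
    // Walk every possible CPU id, not just the CPUs online at startup,
    // so CPUs hot-plugged later are still accounted for.
    for cpu in 0..*NR_CPU_IDS {
        if is_cpu_online(cpu) {
            nr_online += 1;
        }
    }
    nr_online
}
```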
Signed-off-by: Changwoo Min <changwoo@igalia.com>
`#stat_doc` extends a struct's documentation from the stat `desc`
property. Add this attribute macro to the remaining Stats structs.
Signed-off-by: Ming Yang <minos.future@gmail.com>
task_avg_nvcsw() was incorrectly returning a bool instead of a u64,
truncating the average voluntary context switch rate to 0/1 and
limiting the impact of the lowlatency boost.
Fix it by returning the proper type (u64).
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
When a task is the last one running on a CPU and still wants to
continue, allow it to run and replenish its time slice only if the used
CPU is part of a fully idle SMT core.
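A standalone sketch of the decision (not the actual BPF code), with the
task and SMT state passed in as plain values:

```rust
// Sketch only: the real check lives in the scheduler's BPF code, which
// queries the idle state of the SMT siblings directly.
fn keep_running(is_last_task: bool, smt_core_fully_idle: bool,
                slice_ns: &mut u64, slice_max_ns: u64) -> bool {
    // Let the task keep its CPU and refill its time slice only when no
    // other task wants that CPU *and* the core's sibling hardware
    // threads are idle, so a busy SMT core is not monopolized.
    if is_last_task && smt_core_fully_idle {
        *slice_ns = slice_max_ns;
        return true;
    }
    false
}
```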
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
During ttwu, the kernel may decide to skip ->select_task_rq() (e.g.,
when only one CPU is allowed or migration is disabled). This causes
ops.enqueue() to be called directly, without giving the scheduler a
chance to call ops.select_cpu().
Therefore, introduce a new flag (select_cpu_done) in the local task
context to determine if ops.select_cpu() was bypassed and, in that case,
attempt to find an idle CPU directly from ops.enqueue().
In the future this information will be supplied by the kernel through a
special enqueue flag (SCX_ENQ_CPU_SELECTED) [1]. However, the custom
flag in the local task context allows us to reliably determine the same
information, even on older kernels where this flag is not available.
[1] https://lore.kernel.org/lkml/20240928003840.GA2717@maniforge/T
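A Rust model of the flag handling (the real logic lives in the BPF
callbacks; `pick_idle_cpu` is a hypothetical stand-in):

```rust
// Per-task local context, mirroring the BPF-side task storage.
#[derive(Default)]
struct TaskCtx {
    select_cpu_done: bool,
}

// ops.select_cpu() path: record that idle-CPU selection already ran.
fn select_cpu(tctx: &mut TaskCtx,
              pick_idle_cpu: impl Fn() -> Option<u32>) -> Option<u32> {
    tctx.select_cpu_done = true;
    pick_idle_cpu()
}

// ops.enqueue() path: if the kernel skipped ->select_task_rq(), the
// flag was never set, so try to find an idle CPU from here instead.
fn enqueue(tctx: &mut TaskCtx,
           pick_idle_cpu: impl Fn() -> Option<u32>) -> Option<u32> {
    let target = if !tctx.select_cpu_done {
        pick_idle_cpu()
    } else {
        None
    };
    // Clear the flag so the next wakeup starts from a clean state.
    tctx.select_cpu_done = false;
    target
}
```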
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Fix a bug in cache initialization where the first node would repeatedly
get all CPUs added to its mask. Refactor some consts to be clearer.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
With the latest kernel, the per-CPU DSQ stall while executing
sched_setaffinity() doesn't seem to happen anymore.
Therefore, get rid of the temporary workaround introduced by commit
86db45f ("scx_rustland_core: prevent deadlock with per-CPU DSQs and CPU
affinity") and restore the old behavior, which offers a fairer
scheduling policy.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
When finding a victim candidate for preemption, a randomly chosen
candidate could be outside the valid CPU range (e.g., because the CPU
went offline). In this case, try another CPU at random.
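A sketch of the retry loop, with `rand_cpu` and `is_valid_cpu` as
hypothetical stand-ins for the scheduler's own helpers:

```rust
fn pick_victim_cpu(max_tries: u32,
                   rand_cpu: impl Fn() -> usize,
                   is_valid_cpu: impl Fn(usize) -> bool) -> Option<usize> {
    for _ in 0..max_tries {
        let cpu = rand_cpu();
        // A random pick can land on an offline or otherwise invalid
        // CPU; instead of bailing out, just roll the dice again.
        if is_valid_cpu(cpu) {
            return Some(cpu);
        }
    }
    None
}
```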
Signed-off-by: Changwoo Min <changwoo@igalia.com>
The doc of scx_layered's `Opt` is out of sync.
Implement the attribute macro #stat_doc to generate the doc from the
`desc` property.
Apply #stat_doc to `LayerStats` and `SysStats` in scx_layered.
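Illustrative usage only; the exact crate paths, attribute ordering and
field contents are assumptions:

```rust
use scx_stats_derive::{stat_doc, Stats};
use serde::{Deserialize, Serialize};

// `#[stat_doc]` expands each field's `desc` string into a doc comment,
// so the generated documentation can no longer drift from the stat
// descriptions.
#[stat_doc]
#[derive(Clone, Debug, Default, Serialize, Deserialize, Stats)]
pub struct SysStats {
    #[stat(desc = "sequence ID of this report")]
    pub seq: u64,
    #[stat(desc = "total CPU utilization (100% means one full CPU)")]
    pub util: f64,
}
```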
Signed-off-by: Ming Yang <minos.future@gmail.com>
We used the average performance criticality of tasks as a threshold to
determine the proper core type (big or little). However, if the big
cores' compute capacity is not half of the total compute capacity, such
an average-based determination becomes suboptimal. If too few tasks are
classified as performance-critical and requested to run on big cores,
the big cores end up being wasted, stealing arbitrary
non-performance-critical tasks. That could result in performance
instability.
Hence, determine the threshold more accurately by considering (active)
big cores' compute capacity and the (approximated) distribution of
performance criticality of tasks.
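A rough sketch of the idea (not the actual scx_lavd code): pick the
threshold so that the share of tasks classified as performance-critical
roughly matches the big cores' share of the total compute capacity,
using a histogram as the approximated distribution:

```rust
// Sketch only: `perf_cri_hist[i]` counts recently observed tasks whose
// performance criticality fell into bucket `i` (higher = more critical).
fn calc_thr_perf_cri(perf_cri_hist: &[u64],
                     big_capacity: u64, total_capacity: u64) -> usize {
    let nr_samples: u64 = perf_cri_hist.iter().sum();
    // Number of samples the (active) big cores can be expected to
    // absorb, proportional to their share of total compute capacity.
    let target = nr_samples * big_capacity / total_capacity.max(1);

    // Walk from the most critical bucket downwards until the buckets
    // above the threshold cover the big cores' share; that bucket index
    // becomes the new threshold.
    let mut covered = 0u64;
    for (bucket, &count) in perf_cri_hist.iter().enumerate().rev() {
        covered += count;
        if covered >= target {
            return bucket;
        }
    }
    0
}
```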
Signed-off-by: Changwoo Min <changwoo@igalia.com>
In preparation for improving the performance-criticality logic, first
rename "avg_perf_cri" to "thr_perf_cri", since the average is no longer
used as the threshold.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Enable the `caching-build` workflow on `merge_group` events. This is required to enable merge queues, which should make it possible to merge many PRs while still getting full CI testing. Currently `lint` is the only required check, but more will be added in the future.
* Fix a couple of misc errors in build scripts.
* Tweak scripts/kconfigs to make bpftrace work.
* Update how CI caching works to make builds faster (6 minute turnaround
  time).
* Update CI config to generate per-scheduler debug archives w/ guest
  dmesg/scheduler stdout, guest stdout, bpftrace script output, and
  veristat output.
* Update build scripts to accept the following:
  * VNG RW -- write to the host filesystem (better caching, logging).
* For stress tests in particular (via ini config):
  * QEMU Opts -- to facilitate reproducing bugs (e.g. high core count).
  * bpftrace scripts -- specify bpftrace scripts to run during stress
    tests.