Commit Graph

961 Commits

Author SHA1 Message Date
Ming Yang
35d5c082d5 scx_layered: Break up layer_core_order function
`layer_core_order` provided multiple core growth implementation

Break it up into smaller function. Also, attach the method to
LayerGrowthAlgo. And `LayerCoreOrderGenerator` is added to make future
growth algo extension easy.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-04 09:18:01 -07:00
Ming Yang
29308d4705 scx_layered: Move layer core growth logic to separate module
Move layer core growth logic to separate module for further refactoring.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-04 08:51:11 -07:00
Changwoo Min
7c5c83a3a2 scx_lavd: split main.bpf.c into multiple files
As the main.bpf.c file grows, it gets hard to maintain.
So, split it into multiple logical files. There is no
functional change.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-05 00:25:40 +09:00
Ming Yang
658a75df73 scx_layered: Add per layer time slices to stats
Quoting issue #720 description from @hodgesds:

> In `scx_layered` the time slice can be [configured per layer](https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/main.rs#L493).
> This should be added to the
> [`LayerStats`](https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/stats.rs#L51)
> for each layer. During stats
> [refresh](https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/main.rs#L852)
> read the time slice duration (from the bpf skel) to the layer and add it
> to the stats. Finally, update the
> [format](https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/stats.rs#L218)
> method for `LayerStats` to print the per layer time slices.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-10-03 20:56:03 -07:00
Fredrik Lönnegren
4b290a1757 scx_rusty: fix single dom short-circuit
Remove a short-circuit in cpu_to_dom_id that will return domain id 0 for
any input.

This fixes a crash of scx_rusty when running with a single domain and
any CPU is offline.

Signed-off-by: Fredrik Lönnegren <fredrik@frelon.se>
2024-10-03 20:34:18 +02:00
Changwoo Min
b1070449b2
Merge pull request #714 from multics69/lavd-hotplug
scx_lavd: support CPU hotplug correctly
2024-10-03 07:35:25 +09:00
Tejun Heo
7402895f4a version: v1.0.5 2024-10-02 08:34:57 -10:00
Daniel Hodges
054352f172
Merge pull request #716 from vax-r/lavd_typo
scx_lavd: Fix typo
2024-10-02 13:21:15 +00:00
I Hsin Cheng
3055716382 scx_rusty: Delete unused function variable
"struct task_struct *p" isn't used within the function
"task_load_adj()". Delete the function parameter for cleaner code.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-10-02 17:49:13 +08:00
I Hsin Cheng
7fbef2aa0b scx_lavd: Fix typo
Fix "alreay" to "already".

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-10-02 17:39:10 +08:00
Changwoo Min
770a59f69d scx_lavd: support CPU hotplug correctly
Use scx_utils::NR_CPU_IDS to iterate whole CPUs and separately count the
number of online CPUs to support CPU hotplug correctly.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-02 14:19:18 +09:00
Changwoo Min
fb7bc0a850 scx_lavd: fix incorrect preemtability test
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-02 13:24:43 +09:00
Ming Yang
445743487a Add #stat_doc attribute macro to Stats struct
`#stat_doc` extends the document from stat desc property.

Add this attribute macro to the remaining Stats structs.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-09-30 22:12:11 -07:00
Daniel Hodges
c897511c62
scx_layered: Fix compiler warnings
Cleanup various compiler warnings.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-30 20:59:24 -04:00
Tejun Heo
04648bc511
Merge pull request #703 from minosfuture/main
scx_stats: Implement macro #stat_doc to autogen doc from stat desc
2024-09-30 17:58:56 +00:00
Andrea Righi
e966455af2 scx_bpfland: fix task_avg_nvcsw() return type
task_avg_nvcsw() was incorrectly returning a bool instead of u64,
limiting the impact of the lowlatency boost.

Fix it by returning the proper type (u64).

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-30 14:36:32 +02:00
Andrea Righi
6e24fcc7f0 scx_bpfland: keep tasks running on full-idle SMT cores
When a task is the last one running on a CPU and still wants to
continue, allow it to run and replenish its time only if the used CPU is
part a fully idle SMT core.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-30 14:36:32 +02:00
Andrea Righi
c20a19c946 scx_bpfland: always give tasks a chance to run on an idle CPU
During ttwu, the kernel may decide to skip ->select_task_rq() (e.g.,
when only one CPU is allowed or migration is disabled). This causes to
call ops.enqueue() directly without having a chance to call
ops.select_cpu().

Therefore, introduce a new flag (select_cpu_done) in the local task
context to determine if ops.select_cpu() was bypassed and, in that case,
attempt to find an idle CPU directly from ops.enqueue().

In the future this information will be supplied by the kernel through a
special enqueue flag (SCX_ENQ_CPU_SELECTED) [1]. However, the custom
flag in the local task context ensures to reliably determine the same
information, even on older kernels where this flag is not available.

[1] https://lore.kernel.org/lkml/20240928003840.GA2717@maniforge/T

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-30 14:36:19 +02:00
Daniel Hodges
bb560088de
scx_layered: Fix cache initialization cpumask
Fix a bug in cache initialization where the first node would repeated
get all CPUs added to the mask. Refactor some consts to be more clear.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-29 22:10:08 -04:00
Changwoo Min
ade6931bfc scx_lavd: fix incorrect neighbor_bit initialization
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-29 02:27:40 +09:00
Changwoo Min
6d116208c8 scx_lavd: do not perform the victim selection for an invalid cpu
When finding a victim candidate for preemption, a randomly chosen
candidate could be out of valid CPU range due to CPU offline, etc. In
this case, try another CPU randomly.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-29 02:22:18 +09:00
Ming Yang
28bfd2986a scx_stats: Implement #stat_doc to autogen doc from stat desc
The doc of scx_layered `Opt` is out of sync.

Implement attribute macro #stat_doc to generate doc from the `desc`
property.

Apply #stat_doc to `LayerStats` and `SysStats in scx_layered.

Signed-off-by : Ming Yang <minos.future@gmail.com>
2024-09-28 09:32:48 -07:00
Changwoo Min
e8ebc09ced
Merge pull request #702 from multics69/lavd-dyn-pc-thr
scx_lavd: more accurately determine the performance criticality threshold
2024-09-28 04:35:45 +00:00
Changwoo Min
cd7846f4d2 scx_lavd: more accurately determine the performance criticality threshold
We used the average performance criticality of tasks as a threshold to
determine the proper core type (big or little). However, if the big
core's compute capacity is not half of the total compute capacity, such
an average-based determination becomes suboptimal. If fewer tasks are
classified as performance-critical tasks and requested to run on big
cores, the big cores would be wasted by stealing arbitrary
non-performance-critical tasks. That could result in performance
instability.

Hence, determine the threshold more accurately by considering (active)
big cores' compute capacity and the (approximated) distribution of
performance criticality of tasks.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-27 16:56:30 +09:00
Changwoo Min
f07023e42b scx_lavd: rename avg_perf_cri to thr_perf_cri
As a preparation to improve the performance criticality logic, we first
rename "avg_perf_cri" to "thr_perf_cri" since average is no longer the
threshold.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-27 13:12:20 +09:00
Daniel Hodges
d1b425d1fa scx_layered: Fix idle core selection
Fix idle core selection to correctly use pick_idle_cpu_from.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-26 07:46:26 -07:00
Daniel Hodges
f9b39244cc
Merge pull request #687 from hodgesds/layered-growth-enum-refactor
scx_layered: Add layer growth algo to layer bpf config
2024-09-25 15:10:00 -04:00
Daniel Hodges
bce840d9e5 scx_layered: Add layer growth algo to layer bpf config
Add an enum for the layer growth algo to the bpf layer config. This will
be useful for implementing topology aware layer growth algorithms.
When selecting an idle CPU the current logic tries to keep tasks
local to LLC/NUMA node. However, for certain growth algorithms (ex:
RoundRobin) this is suboptimal. Adding the layer growth algorithm
will allow for different paths for CPU selection in the idle/preemption
paths.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-25 12:00:24 -07:00
likewhatevs
99d1179866
enable ide's etc. to work on bpf.c files (#668)
* enable ide's etc. to work on the bpf.c files
this makes it so that clangd and ide tools which use clangd
can work on the bpf.c code.

nothing should actually be changed outside of that ide/editor
environment, all the changes are ifdef'ed on LSP which is set
in the added .clangd file.

* move intf include out of both sides of ifdef toggle
2024-09-24 16:55:02 -04:00
Daniel Hodges
2805bb77a5
Merge pull request #683 from hodgesds/layered-idle-topo
scx_layered: Make layered idle CPU selection topology aware
2024-09-24 16:37:23 -04:00
Daniel Hodges
679dd5920c scx_layered: Fix comment
Fix comment to be more accurate.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-24 12:31:03 -07:00
Daniel Hodges
87c6e276d9 scx_layered: Restrict idle selection to layer cpus
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-24 09:40:38 -07:00
Daniel Hodges
e68fccd26c scx_layered: Update comments on layer preemption
Update comments on layer preemption to be more descriptive.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-24 08:24:04 -07:00
Daniel Hodges
6b966cda0c scx_layered: Restrict preemption to layer cpumask
When preempting restrict preemption to the current layer cpumask. This
may reduce the amount of preemption, but cause better cache locality
of preempted tasks.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-24 06:23:01 -07:00
I Hsin Cheng
61cb3f7fc5 scx_common_bpf: Append cast_mask()
Remove cast_mask() function distributed throughout different schedulers
and add it in common.bpf.h so every scheduler can reference it once they
need to.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-24 16:01:19 +08:00
Changwoo Min
b1bc4033b4
Merge pull request #673 from multics69/lavd-prop-lat-cri
scx_lavd: propagate waker's latency criticality to its wakee
2024-09-24 07:34:07 +09:00
Daniel Hodges
29fb647c93 scx_layered: Refactor idle core selection
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-23 12:01:42 -07:00
Daniel Hodges
380fd1f3b3 scx_layered: Make idle select topology aware
Make idle CPU selection topology aware.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-23 10:10:43 -07:00
Daniel Hodges
35477970bd scx_layered: Cleanup dump format
Cleanup the dump format for topology aware dumps in scx_layered.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-23 10:02:49 -07:00
Daniel Hodges
8b14e48994
Merge pull request #671 from hodgesds/layered-last-waker
scx_layered: Add waker stats per layer
2024-09-23 10:58:54 -04:00
Changwoo Min
71fa92cf1c scx_lavd: propagate waker's latency criticality to its wakee
If a waker is more latency critical than a wakee, inherit a waker's
latency criticality for the wakee. This allows the wakee to consider the
context of who wakes me up. For now, we limit such inheritance to one
hop and one schedule.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-23 12:56:16 +09:00
Changwoo Min
ad8536b4a4
Merge pull request #670 from multics69/lavd-opt-preemption
scx_lavd: find a victim cpu for preemption within task's compute domain
2024-09-23 10:22:08 +09:00
Daniel Hodges
91d32663bd
scx_layered: Refactor waker tracking to only use last waker
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 18:05:54 -04:00
Daniel Hodges
1a2f82b91c
Merge pull request #666 from hodgesds/layered-local-llc
scx_layered: Add topology aware preemption
2024-09-22 17:36:32 -04:00
Daniel Hodges
326f3b7988
Merge pull request #667 from hodgesds/layered-pcore-grow
scx_layered: Add Big/Little core growth algos
2024-09-22 16:59:42 -04:00
Daniel Hodges
1ac9712d2e
scx_layered: Refactor preemption into a separate function
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 16:54:11 -04:00
Daniel Hodges
bc34bd867b
scx_layered: Add option to enable XNUMA preemption
Disable XNUMA preemption by default and add an option to enable it.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 16:52:57 -04:00
Daniel Hodges
e105d9f8b1
scx_layered: Use cast_mask helper
Use the cast_mask helper to clean up some of the bpf cpumask conversion
code for preemption.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 16:52:57 -04:00
Daniel Hodges
5d9d32b65c
scx_layered: Add stats for XLLC/XNUMA preemptions
Add stats for XLLC/XNUMA preemptions.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 16:52:57 -04:00
Daniel Hodges
c15ecbb3a4
scx_layered: Add topology aware preemption
Add topology aware preemption that begins in the local LLC and attempts
to preempt from cpus nearest in the topology.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 16:52:56 -04:00
Daniel Hodges
6fb2f0b2b4
scx_layered: Clean up waker code
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 06:43:10 -04:00
Daniel Hodges
c55b2c6e69
scx_layered: Add waker stats per layered
Update the task context to keep a mask of wakers and add stats for wakes
across layers.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-22 06:43:03 -04:00
Daniel Hodges
140a101874
Merge pull request #449 from hodgesds/layered-dsq-fixes
scx_layered: Add a hi fallback dsq per llc
2024-09-22 06:39:46 -04:00
Changwoo Min
7321a89724 scx_lavd: find a victim cpu for preemption within task's compute domain
Previously, we found a victim from the entire CPUs, which include remote
or non-compatible CPUs. Now we limit our search for victim finding
within a task's compute domain.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-22 12:47:18 +09:00
Changwoo Min
a13082c2b8
Merge pull request #669 from multics69/lavd-opt-select-cpu
scx_lavd: consider waker's CPU when ops.select_cpu()
2024-09-22 09:16:06 +09:00
Andrea Righi
897977bbc1
Merge pull request #663 from vax-r/bpfland_fix
scx_bpfland: Remove the usage of cast_mask in bpfland_enqueue
2024-09-21 22:15:11 +02:00
Changwoo Min
8d8d8f9f61 scx_lavd: consider waker's CPU when ops.select_cpu()
In case of sync wake-up, consider waker's CPU also to improve cache
locality.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-22 01:57:49 +09:00
Daniel Hodges
4aa841de0a
scx_layered: Rename HI_FALLBACK_DSQ to HI_FALLBACK_DSQ_BASE
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-20 17:28:38 -04:00
Daniel Hodges
a3d1344293
scx_layered: Add core growth algo for core type
Add core growth algos for Big/Little core support. The algos allow
layers to grow layers by preferring either big or little cores first.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-20 11:50:15 -04:00
I Hsin Cheng
7799b94f07 scx_layered: Add helper function to access cpumask within bpf_cpumask
Before passing "nodec->cpumas" and "cachec->cpumask" into
"bpf_cpumask_test_cpu()", type conversion should be done first.
Implement "cast_mask()" to convert "struct bpf_cpumask *" into "const
struct cpumask *".

Reference from
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/progs/cpumask_common.h#n63

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-20 20:52:03 +08:00
I Hsin Cheng
5596d5e3fe scx_bpfland: Remove the usage of cast_mask in bpfland_enqueue
The usage of cast_mask() within bpfland_enqueue aims to cast the type of
"p->cpus_ptr" from "struct bpf_cpumask *" to "const struct cpumask *".
However, the type of "p->cpus_ptr" is already "const cpumask_t *" aka
"const struct cpumask *", so no conversion is needed.

Passing a value of type "struct cpumask *" into "struct bpf_cpumask *"
also leads to compiling error.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-20 20:45:09 +08:00
Daniel Hodges
8532ba3f1e
scx_layered: Fix hi fallback dsq consumption
Fix hi fallback dsq consumption to only consume from the cache local hi
fallback dsq.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-20 04:18:05 -04:00
I Hsin Cheng
e4bb99efc5 scx_layered: Refactor match_layer()
Refactor match_layer() to prevent the compiling error caused by
uninitialization of the variable "nr_match_ors" before usage.

Move the checking of "nr_match_ors" after it access the value within
"layer->nr_match_ors" to make sure it's initiailized successfully.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-19 22:20:03 +08:00
Andrea Righi
3f8db5783b
Merge pull request #658 from sched-ext/rustland-core-improve-cpu-selection
scx_rustland_core: improve idle CPU selection API and logic
2024-09-17 22:38:15 +02:00
Andrea Righi
e6b624a97c scx_rustland_core: improve idle CPU selection API and logic
Pass enqueue flags to user-space: flags will be passed via
QueuedTask.flags and can be forwarded back to BPF via
DispatchedTask.flags.

These flags can be also passed to BpfScheduler.select_cpu() to apply a
more refined CPU selection policy.

Moreover, avoid to prioritize the user-space scheduler too much and
dispatch it only if there are no other tasks that needs to be dispatched
in ops.dispatch().

This improves CPU utilization and enhances the fairness, robustness, and
resilience of schedulers based on scx_rustland_core, particularly under
stress test conditions.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-16 22:12:38 +02:00
Daniel Hodges
4f98de333d
Merge pull request #652 from JakeHillion/layer-growth-rr
scx_layered: add round robin growth strategy
2024-09-16 17:34:48 +02:00
Andrea Righi
00eebaf905 scx_bpfland: refine task wakeup logic
On WAKE_SYNC attempt to migrate the wakee on the same CPU as the waker
if the waker is not exiting, the wakee can use the waker's CPU, the
waker's L3 domain is not saturated and there are not other tasks queued
to the local DSQ of the waker's CPU.

This is the same logic used in scx_rusty.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-15 14:50:14 +02:00
Andrea Righi
079a53c689 scx_bpfland: get rid of preferred domain
Using the turbo boosted CPUs as preferred scheduling seems to be
beneficial only a very few corner cases, for example on battery-powered
devices with an aggressive cpufreq governor that constantly tries to
scale down the frequency (and even in this case it's probably better to
not force the tasks to run on the fast CPUs, to save power).

In practive the preferred domain seems to introduce more overhead than
benefits overall, so let's get rid of it.

This can be improved in the future adding multiple user-configurable
scheduling domains.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-15 14:50:14 +02:00
Changwoo Min
95e2f4dabe scx_lavd: boost the latency critility of kernel threads
Many kernel threads performs latency critical tasks (e.g., net, gpu). In
particular, AMD GPU driver runs the most part in the kernel space using
kworker. Hence, treat kernel threads as if a woken up task.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-14 00:41:02 +09:00
Changwoo Min
4b4f42fce1 scx_lavd: add a short circuit for the case of no turbo core
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-13 16:02:07 +09:00
Jake Hillion
3848d87895 scx_layered: add round robin growth strategy 2024-09-12 23:27:21 +01:00
Daniel Hodges
632fcfe4ae
Merge pull request #648 from hodgesds/layered-llc-stats
scx_layered: Add stats for XNUMA/XLLC migrations
2024-09-12 13:23:23 -04:00
Daniel Hodges
dde6e0c7f9 scx_utils: Add node/llc id to core topology
Add ids for node/llc in the Core topology struct.
2024-09-12 10:05:02 -07:00
Daniel Hodges
aee19dd9a1 scx_layered: Add topology aware core growth selection
Add topology aware core growth selection.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-12 06:48:51 -07:00
Daniel Hodges
14a19dc3ca scx_layered: Add random layer growth algo
Add a random layer growth algo.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-12 05:35:54 -07:00
Daniel Hodges
ae57f8d1f9 scx_rusty: Initialize node cpumask
Initialize the node cpumask, which was previously uninitialized causing
metric calculations to be wrong when attempting to lookup CPUs in the
node cpumask.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-11 13:14:44 -07:00
Jake Hillion
8ca45cfa37
lint: enable cargo fmt (#643)
Use `cargo fmt` with a specific nightly branch in the CI to enforce formatting. Globally format these files while the diff is still small so we can stay on top of it.

Test plan:
- CI lint check passes.
2024-09-11 10:03:20 +01:00
Daniel Hodges
43ec8bfe82 scx_layered: Add stats for XNUMA/XLLC migrations
Add stats for XNUMA/XLLC migrations. An example of the output is shown:
```
  hodgesd  : util/frac=    5.4/  0.1 load/frac=    301.0/  0.3 tasks=   476
             tot=   3168 local=97.82 wake/exp/reenq= 2.18/ 0.00/ 0.00
             keep/max/busy= 0.03/ 0.00/ 0.03 kick= 0.00 yield/ign= 0.09/    0
             open_idle= 0.00 mig= 6.82 xnuma_mig= 6.82 xllc_mig= 4.86 affn_viol= 0.00
             preempt/first/idle/fail= 0.00/ 0.00/ 0.00/ 0.00 min_exec= 0.00/   0.00ms
             cpus=  2 [  2,  4] 00000000 00000010 00001000
  normal   : util/frac=   28.7/  0.7 load/frac= 101704.7/ 95.8 tasks=  2450
             tot=   4660 local=99.06 wake/exp/reenq= 0.88/ 0.06/ 0.00
             keep/max/busy= 1.03/ 0.00/ 0.00 kick= 0.06 yield/ign= 0.04/  400
             open_idle=15.73 mig=23.45 xnuma_mig=23.45 xllc_mig= 3.07 affn_viol= 0.00
             preempt/first/idle/fail= 0.00/ 0.00/ 0.00/ 0.88 min_exec= 0.00/   0.00ms
             cpus=  2 [  2,  2] 00000001 00000100 00000000
             excl_coll=12.55 excl_preempt= 0.00
  random   : util/frac=    0.0/  0.0 load/frac=      0.0/  0.0 tasks=     0
             tot=      0 local= 0.00 wake/exp/reenq= 0.00/ 0.00/ 0.00
             keep/max/busy= 0.00/ 0.00/ 0.00 kick= 0.00 yield/ign= 0.00/    0
             open_idle= 0.00 mig= 0.00 xnuma_mig= 0.00 xllc_mig= 0.00 affn_viol= 0.00
             preempt/first/idle/fail= 0.00/ 0.00/ 0.00/ 0.00 min_exec= 0.00/   0.00ms
             cpus=  0 [  0,  0] 00000000 00000000 00000000
             excl_coll= 0.00 excl_preempt= 0.00
  stress-ng: util/frac= 4189.1/ 99.2 load/frac=   4200.0/  4.0 tasks=    43
             tot=     62 local= 0.00 wake/exp/reenq= 0.00/100.0/ 0.00
             keep/max/busy=2433.9/177.4/ 0.00 kick=100.0 yield/ign= 3.23/    0
             open_idle= 0.00 mig=54.84 xnuma_mig=54.84 xllc_mig=35.48 affn_viol= 0.00
             preempt/first/idle/fail= 0.00/ 0.00/ 0.00/ 0.00 min_exec= 0.00/   0.00ms
             cpus=  4 [  4,  4] 00000300 00030000 00000000
             excl_coll= 0.00 excl_preempt= 0.00
```

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-10 19:53:28 -07:00
Tejun Heo
8f0cc89ee8
Merge pull request #645 from frelon/rusty-init-dom
scx_rusty: init domains when calculating averages
2024-09-10 12:25:51 -10:00
Andrea Righi
e6e3579a92
Merge pull request #634 from anh0516/main
scx_bpfland: Documentation consistency fix
2024-09-10 23:25:55 +02:00
Fredrik Lönnegren
f155966b77 scx_rusty: init domains when calculating averages
The domains are added to the aggregator when load is added (and
duty_cycle is not 0.0f64).

This commit makes sure that all domains are added to the aggregator even
when the calculated duty_cycle is 0.

Signed-off-by: Fredrik Lönnegren <fredrik@frelon.se>
2024-09-10 21:51:41 +02:00
likewhatevs
85863d0e1c
Merge pull request #644 from hodgesds/layered-topo-order
scx_layered: Pass layer spec for core growth algo
2024-09-10 14:49:37 -04:00
Daniel Hodges
5fdd257862 scx_layered: Pass layer spec for core growth algo
Pass in the layer spec when determining the layer core growth algo. This
should make it easier to implement layer growth algos that are spec
specific.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-10 10:27:08 -07:00
Samuel Nair
c6af1aa1c8 scx_layered: Fix typo in stats 2024-09-10 08:44:57 -07:00
likewhatevs
c4c3659b6d
Merge pull request #638 from likewhatevs/remove-rlimit-dep
remove dependency on rlimit.rs
2024-09-10 03:14:12 -04:00
Andrea Righi
655ed5b4c6 scx_bpfland: use sum_exec_runtime to evaluate task's used time slice
Using p->scx.slice to evaluate the consumed time slice can be a bit
imprecise, because the sched_ext core implements yielding by setting
p->scx.slice to 0.

When the task's vruntime is evaluated this is considered as the task has
exhausted its entire allocated time slice, even though it voluntarily
released the CPU before the slice fully expired.

To avoid this inaccuracy and prevent penalizing tasks that voluntarily
release the CPU, always evaluate the used time slice based on the
difference in the task's total execution time (p->se.sum_exec_runtime).

This method provides a more precise calculation of vruntime and results
in a fairer task's deadline evaluation.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-10 08:03:35 +02:00
patso
c1df85914b
remove dependency on rlimit.rs
the rlimit crate is the only dependency crate
with a build.rs. build.rs files complicate portability.
this removes the need for rlimit.rs
2024-09-10 01:16:53 -04:00
Tejun Heo
56bb963136 build: Use a single top-level rust workspace
Rust build was using two separate workspaces - rust/ and scheds/rust.
There's no reason to separate them and it makes doc generation tricky. Use
single top level workspace so that we can drive all rust building from
cargo.
2024-09-08 14:23:48 -10:00
patso
120211d731
split build and test jobs
split build and test jobs to reduce ci turnaround time
and make it clear what is failing when something fails.

also add virtiofsd to deps to make test compilation faster
(most test time is compliation) and remove all force 9ps.
2024-09-08 02:54:24 -04:00
Changwoo Min
17e0e08e6e
Merge pull request #621 from multics69/lavd-greedy-fix
scx_lavd: improve greedy ratio calculation and more
2024-09-07 10:52:00 +09:00
Tejun Heo
6f8917ceca
Merge pull request #624 from JakeHillion/cleanup-layer_growth_algo
scx_layered: clean up Layer::new layer_growth_algo
2024-09-06 15:10:41 -10:00
Avraham Hollander
f71cc646a3 scx_bpfland: Fix in README.md for the same text as a comment in the
source
2024-09-06 19:12:33 -04:00
Jake Hillion
2c008b2afa scx_layered: clean up Layer::new layer_growth_algo 2024-09-06 18:25:50 +01:00
Changwoo Min
36df970a8f scx_lavd: add debug print for turbo cores
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 19:23:17 +09:00
Changwoo Min
351a1c6656 scx_lavd: enable autopilot mode by default
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 19:23:12 +09:00
Andrea Righi
8231f8586a scx_rlfifo: better documentation and code readability
Simplify scx_rlfifo code, add detailed documentation of the
scx_rustland_core API and get rid of the additional task queue, since it
just makes the code bigger, slower and it doesn't really provide any
benefit (considering that we are dispatching the tasks in FIFO order
anyway).

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-06 11:25:24 +02:00
Andrea Righi
ed879bae28 scx_rustland_core: expose enq_flags to user-space
Pass the enqueue flags to the user-space scheduler through the
QueuedTask struct.

These flags allow the user-space scheduler to make more informed
scheduling decisions.

Also bump up scx_rustland_core minor version to reflect the new API (we
are not breaking the old API, so we don't need to bump the major version
in this case).

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-06 11:25:24 +02:00
Changwoo Min
ebe9375b6a scx_lavd: pretty printing of status
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 16:27:20 +09:00
Changwoo Min
461cb9a3a0 scx_lavd: fix calculation of greedy_ratio
The service time (taskc->svc_time) should be the sum of total CPU time
consumed not jut a delta.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 16:22:40 +09:00
Tejun Heo
46fc2e1a49 version: v1.0.4 2024-09-05 18:12:45 -10:00