Commit Graph

273 Commits

Author SHA1 Message Date
Changwoo Min
7c5c83a3a2 scx_lavd: split main.bpf.c into multiple files
As the main.bpf.c file grows, it gets hard to maintain.
So, split it into multiple logical files. There is no
functional change.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-05 00:25:40 +09:00
Changwoo Min
b1070449b2
Merge pull request #714 from multics69/lavd-hotplug
scx_lavd: support CPU hotplug correctly
2024-10-03 07:35:25 +09:00
Tejun Heo
7402895f4a version: v1.0.5 2024-10-02 08:34:57 -10:00
I Hsin Cheng
7fbef2aa0b scx_lavd: Fix typo
Fix "alreay" to "already".

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-10-02 17:39:10 +08:00
Changwoo Min
770a59f69d scx_lavd: support CPU hotplug correctly
Use scx_utils::NR_CPU_IDS to iterate whole CPUs and separately count the
number of online CPUs to support CPU hotplug correctly.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-02 14:19:18 +09:00
Changwoo Min
fb7bc0a850 scx_lavd: fix incorrect preemtability test
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-10-02 13:24:43 +09:00
Ming Yang
445743487a Add #stat_doc attribute macro to Stats struct
`#stat_doc` extends the document from stat desc property.

Add this attribute macro to the remaining Stats structs.

Signed-off-by: Ming Yang <minos.future@gmail.com>
2024-09-30 22:12:11 -07:00
Changwoo Min
ade6931bfc scx_lavd: fix incorrect neighbor_bit initialization
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-29 02:27:40 +09:00
Changwoo Min
6d116208c8 scx_lavd: do not perform the victim selection for an invalid cpu
When finding a victim candidate for preemption, a randomly chosen
candidate could be out of valid CPU range due to CPU offline, etc. In
this case, try another CPU randomly.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-29 02:22:18 +09:00
Changwoo Min
cd7846f4d2 scx_lavd: more accurately determine the performance criticality threshold
We used the average performance criticality of tasks as a threshold to
determine the proper core type (big or little). However, if the big
core's compute capacity is not half of the total compute capacity, such
an average-based determination becomes suboptimal. If fewer tasks are
classified as performance-critical tasks and requested to run on big
cores, the big cores would be wasted by stealing arbitrary
non-performance-critical tasks. That could result in performance
instability.

Hence, determine the threshold more accurately by considering (active)
big cores' compute capacity and the (approximated) distribution of
performance criticality of tasks.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-27 16:56:30 +09:00
Changwoo Min
f07023e42b scx_lavd: rename avg_perf_cri to thr_perf_cri
As a preparation to improve the performance criticality logic, we first
rename "avg_perf_cri" to "thr_perf_cri" since average is no longer the
threshold.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-27 13:12:20 +09:00
I Hsin Cheng
61cb3f7fc5 scx_common_bpf: Append cast_mask()
Remove cast_mask() function distributed throughout different schedulers
and add it in common.bpf.h so every scheduler can reference it once they
need to.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-09-24 16:01:19 +08:00
Changwoo Min
71fa92cf1c scx_lavd: propagate waker's latency criticality to its wakee
If a waker is more latency critical than a wakee, inherit a waker's
latency criticality for the wakee. This allows the wakee to consider the
context of who wakes me up. For now, we limit such inheritance to one
hop and one schedule.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-23 12:56:16 +09:00
Changwoo Min
7321a89724 scx_lavd: find a victim cpu for preemption within task's compute domain
Previously, we found a victim from the entire CPUs, which include remote
or non-compatible CPUs. Now we limit our search for victim finding
within a task's compute domain.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-22 12:47:18 +09:00
Changwoo Min
8d8d8f9f61 scx_lavd: consider waker's CPU when ops.select_cpu()
In case of sync wake-up, consider waker's CPU also to improve cache
locality.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-22 01:57:49 +09:00
Changwoo Min
95e2f4dabe scx_lavd: boost the latency critility of kernel threads
Many kernel threads performs latency critical tasks (e.g., net, gpu). In
particular, AMD GPU driver runs the most part in the kernel space using
kworker. Hence, treat kernel threads as if a woken up task.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-14 00:41:02 +09:00
Changwoo Min
4b4f42fce1 scx_lavd: add a short circuit for the case of no turbo core
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-13 16:02:07 +09:00
Jake Hillion
8ca45cfa37
lint: enable cargo fmt (#643)
Use `cargo fmt` with a specific nightly branch in the CI to enforce formatting. Globally format these files while the diff is still small so we can stay on top of it.

Test plan:
- CI lint check passes.
2024-09-11 10:03:20 +01:00
patso
c1df85914b
remove dependency on rlimit.rs
the rlimit crate is the only dependency crate
with a build.rs. build.rs files complicate portability.
this removes the need for rlimit.rs
2024-09-10 01:16:53 -04:00
Changwoo Min
17e0e08e6e
Merge pull request #621 from multics69/lavd-greedy-fix
scx_lavd: improve greedy ratio calculation and more
2024-09-07 10:52:00 +09:00
Changwoo Min
36df970a8f scx_lavd: add debug print for turbo cores
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 19:23:17 +09:00
Changwoo Min
351a1c6656 scx_lavd: enable autopilot mode by default
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 19:23:12 +09:00
Changwoo Min
ebe9375b6a scx_lavd: pretty printing of status
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 16:27:20 +09:00
Changwoo Min
461cb9a3a0 scx_lavd: fix calculation of greedy_ratio
The service time (taskc->svc_time) should be the sum of total CPU time
consumed not jut a delta.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 16:22:40 +09:00
Tejun Heo
46fc2e1a49 version: v1.0.4 2024-09-05 18:12:45 -10:00
Changwoo Min
d9274bd8e6 scx_lavd: drop time slice boost for big cores
Unexpectedly, little cores, which have relative short time slices, have
more chance to schedule performance-critical tasks. Hence it is better
to keep the time slice same regardless the core types.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 09:32:38 +09:00
Changwoo Min
fdecba227c scx_lavd: print more info with --monitor
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 09:32:31 +09:00
Changwoo Min
f490a55d54 scx_lavd: accmulate more system-wide statistics
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
e5d27d0553 scx_lavd: print basic system status when --monior is given
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
6b717a3f3d scx_lavd: add --help-stats option
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
ca1c86eb9c scx_lavd: improve pick_idle_cpu() for pinned tasks
When a pinned task cannot run on either active or overflow sets, we try
to stay on the previous CPU which is still okay to run on.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Tejun Heo
f010eda5c0 meson: Remove scheds/rust/*/meson.build
These aren't used since 43950c65 ("build: Use workspace to group rust
sub-projects"). Drop them.
2024-09-04 06:40:17 -10:00
Changwoo Min
0108b83050 scx_lavd: make the old verifier happy (bpf_cpumask_set_cpu)
An old BPF verifier does not allow calling bpf_cpumask_set_cpu() in the
BPF syscall context, so we defer actual bpf_cpumask_set_cpu() to the
timer handler, update_sys_stat(), to workaround the problem.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-02 18:00:12 +09:00
Changwoo Min
3bc2fd4977 scx_lavd: update README
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-02 18:00:12 +09:00
Changwoo Min
afbebaeed6 scx_lavd: check a core type of previous cpu at pick_idle_cpu()
If a task is performance-critical, pick_idle_cpu() checks if the
previous core is a big core or not. If not, don't try to run on previous
core since a performance-critical task is better to run on a big core.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-01 17:28:16 +09:00
Changwoo Min
5ca4501139 scx_lavd: dynamically decide autopilot's low watermark
A single threshold for a low watermark does not work well across systems
with various numbers of cores and core types. Instead of using a single
low watermark value, we dynamically decide the low watermark: 1) until
one little core is fully utilized or 2) until two big cores are fully
utilized. This works better across systems.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-01 12:46:57 +09:00
Changwoo Min
4a7b806dd2 scx_lavd: when no_freq_scaling, always set to the max freq
When the no_freq_scaling changes during runtime in the autopilot mode,
the last target freq set would not be 1024. So the performance mode
enabled by the autopilot mode would not run in the best profile. Hence,
we set the target freq to 1024 always when no_freq_scaling is set.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 18:22:33 +09:00
Changwoo Min
9091dd983b scx_lavd: add "--autopilot" mode
Add "--autopilot" option and mode. In the autopilot mode, the scheduler
dynamically changes its power mode according to system's load (cpu
utilization). When the cpu utilization is low enough (say <=5%), it
switches to the powersave mode since there is nothing to process fast so
powersaving is the primary goal. When the utilization is moderate (say
>5%, <=30%), it runs in balanced mode. When the utilization is high
enough (say >30%), it runs in performance mode.

Note that it only changes scheduler's power mode but it does not change
system's energy profile.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
5ecaa9ebe2 scx_lavd: improve the accuracy of cpu utilization calculation
When a cpu is idle for a whole interval, its idle time does not
correctlyh adds up so the utilization of such cpu tends to be higher
than the actual utilization. Now it is fixedk, so cpu utilization
becomes more accurate.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
2f8cc0d60f scx_lavd: rename the "--auto" opetion to "--autopower" to be clear
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
815f1263b2 scx_lavd: reinitialize active cpumask when power mode changes
When the power mode changes back to performance mode, we should
active/overflow cpumask to its initial state -- all big cores are in
active cpumask and all little cores are in overflow cpumask. Otherwise,
the active/overflow cpumasks will be used in the perfformance mode.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
afb8c78a09 scx_lavd: print power mode change in the auto mode
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
a89a56dba4 scx_lavd: add a fastpath in ops.select_cpu() for a sharply pinned task
If a task can be run only on a single cpu, we don't need to go through
all the steps in ops.select_cpu(). Instread, we simply check if a task
is still pinned on the prev_cpu and go.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-31 01:14:33 +09:00
Changwoo Min
3e2e78a9ec
Merge pull request #584 from multics69/lavd-turbo2
scx_lavd: automatically determine power mode and more
2024-08-30 08:56:16 +09:00
Changwoo Min
bb08919203 scx_lavd: determine power mode automatically with --auto option
It checkes the EPP (energy performance preference) peirodically and sets
the power profile of the scheduler during runtiime as a user changes its
EPP profile (from her desktop UI).

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-29 19:15:23 +09:00
Avraham Hollander
2a3cbeb760 scx_lavd: Add same power mode clarification to --no-prefer-turbo-core 2024-08-27 23:06:31 -04:00
Changwoo Min
5588126cff scx_lavd: minior optimization for consume_task()
When iterating neighbors, the existing code unnecessarily iterates all
the neighbors to the maximum even if there is no neighors. So the fix
escapes early when there is no neighbors.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-28 10:26:50 +09:00
Changwoo Min
95272ae910 scx_lavd: proper handling of ctrl-c in a monitoring mode
Ctrl-c wasn't properly handled in the monitoring mode
(`--monitor-sched-samples`), so the scheduler could not be terminated by
pressing ctrl-c. The missing ctrl-c handling is added to the monitor
thread.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-28 10:05:34 +09:00
Changwoo Min
9c4428fd8b scx_lavd: remove unused rust functions
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-28 10:02:11 +09:00
Daniel Hodges
41cebb807a
Merge pull request #569 from anh0516/main
scx_layered: Clean up in-code documentation; add commas for consistency
2024-08-27 09:47:29 -04:00