Commit Graph

1636 Commits

Author SHA1 Message Date
patso
eaa636dfcc
set vng cpus to 8 for rust tests
set vng cpus to 8 for rust tests (for stability in testing),
update relevant doc test.
2024-09-06 10:07:37 -04:00
patso
082bccb557
fix/enable rust tests, make build faster
This commit fixes rust tests and configures ci to
run them on commit. It also sets up CI to run those
in a timely manner by caching dependencies and splitting jobs.
2024-09-06 06:18:11 -04:00
Tejun Heo
fb35fdb6f2
Merge pull request #620 from sched-ext/htejun/release
version: v1.0.4
2024-09-05 18:13:13 -10:00
Tejun Heo
46fc2e1a49 version: v1.0.4 2024-09-05 18:12:45 -10:00
Tejun Heo
cd555741d0 rust: Synchronize dependency versions 2024-09-05 17:10:02 -10:00
Changwoo Min
20e3d998fe
Merge pull request #613 from sirlucjan/readme-lavd-monitor
scx_stats: Add proper logs for LAVD
2024-09-06 09:34:18 +09:00
Changwoo Min
e3243c5d51
Merge pull request #612 from multics69/lavd-monitor
scx_lavd: add --monitor flag and two micro-optimizations
2024-09-06 09:33:55 +09:00
Changwoo Min
d9274bd8e6 scx_lavd: drop time slice boost for big cores
Unexpectedly, little cores, which have relatively short time slices, have
a better chance of scheduling performance-critical tasks. Hence it is better
to keep the time slice the same regardless of the core type.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 09:32:38 +09:00
Changwoo Min
fdecba227c scx_lavd: print more info with --monitor
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-06 09:32:31 +09:00
Daniel Hodges
0fa369b914
Merge pull request #619 from hodgesds/stats-fixes
scx_layered: Fix stats typo
2024-09-05 15:44:15 -04:00
Andrea Righi
ee97632d9f
Merge pull request #618 from sched-ext/bpfland-wake-sync
scx_bpfland: optimize producer/consumer workloads
2024-09-05 21:04:53 +02:00
Daniel Hodges
25e1642bbc
scx_layered: Fix stats typo
Small typo fix

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-05 14:12:28 -04:00
Andrea Righi
41856aa527
Merge pull request #617 from sirlucjan/bpfland-new-flags
scx-scheds: update bpflands suggested flags
2024-09-05 19:21:17 +02:00
Andrea Righi
918cfc613d scx_bpfland: optimize producer/consumer workloads
When selecting an idle CPU for a task that has been woken up, prioritize
reusing the same CPU if the waker and wakee share the same L3 cache.

Otherwise, attempt to migrate the wakee to the waker's CPU, provided it
is allowed by the wakee's scheduling domain.

This seems to consistently improve FPS performance when the system is
not operating over its full capacity.

Example:
 $ __GL_SYNC_TO_VBLANK=0 vblank_mode=0 glxgears -geometry 800x600

 - before: ~18305.77 FPS
 - after:  ~19060.62 FPS

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-05 19:02:09 +02:00
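
The wake-up placement described in 918cfc613d can be sketched roughly as follows. This is an illustrative Python model, not the scheduler's BPF code: the names `WokenTask`, `pick_cpu_on_wakeup`, `l3_of` and `idle_cpus` are hypothetical, and the check that the previous CPU is idle is an assumption that the commit message does not spell out.

```python
from dataclasses import dataclass


@dataclass
class WokenTask:
    prev_cpu: int        # CPU the wakee last ran on
    allowed_cpus: set    # CPUs the wakee is allowed to run on


def pick_cpu_on_wakeup(wakee, waker_cpu, l3_of, idle_cpus):
    """Return a CPU for the wakee, or None to fall back to the regular search."""
    prev = wakee.prev_cpu
    # Waker and wakee share an L3 cache: prefer reusing the wakee's CPU.
    # (Assumption: only when that CPU is currently idle.)
    if l3_of[prev] == l3_of[waker_cpu] and prev in idle_cpus:
        return prev
    # Otherwise try to migrate the wakee to the waker's CPU, if allowed.
    if waker_cpu in wakee.allowed_cpus:
        return waker_cpu
    return None
```

For example, with CPUs 0-1 on one L3 and CPUs 2-3 on another, a wakee that last ran on CPU 1 stays there when woken from CPU 0, and moves to CPU 2 when woken from CPU 2.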
Andrea Righi
28050dcd7d
Merge pull request #615 from sched-ext/bpfland-auto
scx_bpfland: enable "auto" mode by default
2024-09-05 19:01:50 +02:00
Daniel Hodges
e6ed9b05ba
Merge pull request #614 from hodgesds/layered-stats-fix
scx_layered: Fix stats formatting
2024-09-05 12:54:56 -04:00
Tejun Heo
08619302e8
Merge pull request #607 from sirlucjan/openrc-cleanup
openrc: drop separate logs
2024-09-05 06:54:13 -10:00
Piotr Gorski
51f8c35841
openrc: drop separate logs
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
2024-09-05 18:19:01 +02:00
Piotr Gorski
97864d0b3d
scx_stats: Add proper logs for LAVD
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
2024-09-05 18:18:34 +02:00
Andrea Righi
844c00fd26 scx_bpfland: enable "auto" mode by default
Rename "turbo domain" to "preferred domain", which is conceptually more
generic, and introduce the new option `--preferred-domain CPUMASK`, which
allows users to define the preferred domain by specifying a cpumask as a
hex number. By default ("auto") the scheduler will always try to detect
and use the fastest CPUs in the system.

Moreover, adjust the cpufreq logic to use "auto" both with the
"balance_power" and "balance_performance" EPP profiles.

Then, enable "auto" mode by default: the scheduler will try to
automatically determine the optimal primary domain, preferred domain and
cpufreq level, based on the selected scheduler and energy profiles.

Tested-by: Piotr Gorski <piotr.gorski@cachyos.org>
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-05 16:11:12 +02:00
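
As a small illustration of what "a cpumask as a hex number" means in 844c00fd26, the sketch below expands a hex mask into the list of CPU ids it selects. The function name `cpus_from_hex_mask` is hypothetical, and whether the real option accepts a `0x` prefix is not stated in the commit; the sketch simply tolerates both forms.

```python
def cpus_from_hex_mask(mask):
    """Expand a hex cpumask string into a sorted list of CPU ids."""
    value = int(mask, 16)  # int() with base 16 accepts both "f0" and "0xf0"
    # Bit N set in the mask means CPU N belongs to the domain.
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]
```

Under this reading, a mask of `0xf0` would select CPUs 4 through 7.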
Piotr Gorski
cde5a39ae0
scx-scheds: update bpflands suggested flags
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
2024-09-05 15:06:21 +02:00
Andrea Righi
239b5194a4
Merge pull request #605 from sched-ext/rustland-core-handle-tctx-error
scx_rustland_core: avoid critical failures due to missing task context
2024-09-05 13:08:56 +02:00
Daniel Hodges
76ad880475
scx_layered: Fix stats formatting
Lower the formatting precision of stats to improve readability. The
existing formatting is hard to read:

tot=   1538 local=31.27 open_idle= 2.73 affn_viol=23.80 proc=4ms
busy=  1.1 util=   16.6 load=     32.7 fallback_cpu=  6
excl_coll=0.06501950585175553 excl_preempt=0.26007802340702213 excl_idle=0.16384915474642392 excl_wakeup=0.25097529258777634

With this fix, the stats formatting is far more readable:

tot=    441 local=33.56 open_idle= 0.00 affn_viol=20.63 proc=3ms
busy=  0.4 util=    6.3 load=     33.6 fallback_cpu=  6
excl_coll=0.454 excl_preempt=0.000 excl_idle=0.132 excl_wakeup=0.200

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-05 06:44:54 -04:00
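
The effect of the precision change in 76ad880475 can be reproduced with a short sketch. scx_layered itself is written in Rust, so this Python version only mirrors the idea; the helper name `format_excl_stats` and the default of three decimals (taken from the "after" output above) are assumptions.

```python
def format_excl_stats(stats, precision=3):
    """Render name=value pairs with a fixed number of decimal places."""
    return " ".join(
        f"{name}={value:.{precision}f}" for name, value in stats.items()
    )


before = {
    "excl_coll": 0.06501950585175553,
    "excl_preempt": 0.26007802340702213,
}
print(format_excl_stats(before))  # excl_coll=0.065 excl_preempt=0.260
```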
Andrea Righi
53617042b3 scx_rustland_core: avoid critical failures due to missing task context
Prevent triggering a critical error when a local context for a task
can't be found.

Instead, handle the error gracefully (reporting a warning in debugfs) to
enhance the robustness of the schedulers based on scx_rustland_core.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-05 10:58:44 +02:00
Changwoo Min
f490a55d54 scx_lavd: accumulate more system-wide statistics
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
e5d27d0553 scx_lavd: print basic system status when --monitor is given
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
6b717a3f3d scx_lavd: add --help-stats option
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Changwoo Min
ca1c86eb9c scx_lavd: improve pick_idle_cpu() for pinned tasks
When a pinned task cannot run on either active or overflow sets, we try
to stay on the previous CPU which is still okay to run on.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-09-05 16:03:14 +09:00
Andrea Righi
afc7b5404b
Merge pull request #600 from sched-ext/bpfland-cpufreq
scx_bpfland: improve cpufreq awareness
2024-09-05 07:32:10 +02:00
Tejun Heo
161790f32b
Merge pull request #606 from sched-ext/htejun/misc
meson: Remove unused meson.build and add targeted builds for libs
2024-09-04 09:57:29 -10:00
Tejun Heo
708aaaafb9 meson: Update rust/meson.build
Targeted build is now available for libraries too.
2024-09-04 06:49:55 -10:00
Tejun Heo
f010eda5c0 meson: Remove scheds/rust/*/meson.build
These aren't used since 43950c65 ("build: Use workspace to group rust
sub-projects"). Drop them.
2024-09-04 06:40:17 -10:00
Tejun Heo
4513dfbe4b
Merge pull request #565 from CachyOS/feature/scx-loader
scx_loader: Add scheduler loader via system DBUS interface
2024-09-04 06:34:59 -10:00
Andrea Righi
fe81ccb5bc
Merge pull request #604 from sched-ext/ci-parallel-build
ci: allow parallel builds with meson
2024-09-04 10:28:48 +02:00
Andrea Righi
b9d4ddebc9 ci: allow parallel builds with meson
Now that we have a sane build environment with meson we don't need to
force sequential builds anymore (--jobs=1) and we can just run regular
builds without having to worry about excessive parallelization.

See commit 43950c6 ("build: Use workspace to group rust sub-projects").

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-04 08:57:24 +02:00
Andrea Righi
b34358b0d3
Merge pull request #603 from sched-ext/rustland-core-fix-kthread-stall
scx_rustland_core: fix pcpu kthread stall
2024-09-04 08:29:01 +02:00
Andrea Righi
9ff337a406
Merge pull request #602 from sched-ext/ci-bump-vng-memory
ci: bump up virtme-ng memory size from 1GB to 2GB
2024-09-04 08:28:52 +02:00
Andrea Righi
c3cab45f6a scx_rustland_core: bump up version to 2.0.1
Bump up scx_rustland_core version to include this critical fix that
prevents scheduler stalls:

 94a3594 ("scx_rustland_core: always dispatch per-cpu kthreads directly")

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-04 08:00:25 +02:00
Andrea Righi
94a359434f scx_rustland_core: always dispatch per-cpu kthreads directly
Do not send per-CPU kthreads to the user-space scheduler, but always
dispatch them directly from BPF.

In specific environments, sending critical per-CPU kthreads to the
user-space scheduler can lead to potential stalls. This occurs because
the user-space scheduler might be blocked by an action that these
per-CPU kthreads need to perform, but they cannot complete their action
if the scheduler needs to schedule them, hence the deadlock.

To prevent this deadlock, always dispatch the per-CPU kthreads directly
from the BPF component, ensuring that the user-space scheduler does not
get blocked by these events.

Fixes: c0a2cfb ("scx_rustland_core: always schedule per-CPU kthreads to user-space")
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-04 07:56:58 +02:00
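
The routing rule from 94a359434f can be modelled in a few lines. This is a simplified sketch, not the BPF implementation: `QueuedTask`, `enqueue_target` and the returned labels are hypothetical, and identifying a per-CPU kthread as "a kernel thread allowed on exactly one CPU" is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedTask:
    is_kthread: bool       # True for kernel threads
    nr_cpus_allowed: int   # number of CPUs in the task's affinity mask


def enqueue_target(task):
    """Decide where a task is dispatched from."""
    # Per-CPU kthreads never go through the user-space scheduler, so the
    # scheduler can never end up waiting on a task it has to schedule itself.
    if task.is_kthread and task.nr_cpus_allowed == 1:
        return "bpf_direct"
    return "user_space"
```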
Andrea Righi
ac0cfa32de ci: bump up virtme-ng memory size from 1GB to 2GB
Recently, we have triggered some OOM conditions during stress tests,
particularly with the user-space schedulers. To avoid this issue and
prevent false positives, increase the memory size of the virtme-ng
instance from the default 1GB to 2GB.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 22:46:38 +02:00
Andrea Righi
918f1db4bd scx_bpfland: dynamically adjust cpufreq level in auto mode
In auto mode, rather than keeping the previous fixed cpuperf factor,
dynamically calculate it based on CPU utilization and apply it before a
task runs within its allocated time slot.

Interactive tasks consistently receive the maximum scaling factor to
ensure optimal performance.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 21:36:48 +02:00
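
One way to picture the dynamic scaling in 918f1db4bd is the sketch below. The commit does not give the formula, so the linear mapping from utilization to performance level is purely an assumption, as are the names `cpuperf_target` and `CPUPERF_ONE` (chosen here as 1024 to represent "full performance").

```python
CPUPERF_ONE = 1024  # assumed value representing the maximum performance level


def cpuperf_target(cpu_util, interactive):
    """Return a performance level in [0, CPUPERF_ONE] for the next time slot."""
    # Interactive tasks always get the maximum scaling factor.
    if interactive:
        return CPUPERF_ONE
    # Assumed linear scaling: clamp utilization to [0, 1] and scale it.
    util = min(max(cpu_util, 0.0), 1.0)
    return int(util * CPUPERF_ONE)
```

With this model a half-utilized CPU running a non-interactive task would request level 512, while any interactive task would request 1024.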
Daniel Hodges
9c5717577f
Merge pull request #601 from hodgesds/namespace-helpers
scx_helpers: Add pid namespace helpers
2024-09-03 14:38:26 -04:00
Daniel Hodges
8f4e9e5e3b scx_helpers: Add pid namespace helpers
Add pid namespace helpers for translating namespace pids.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-09-03 11:21:32 -07:00
Andrea Righi
fe6ac15015 scx_bpfland: improve turbo domain CPU selection
Always consider the turbo domain when running in "auto" mode.

Additionally, when the turbo domain is used, split the CPU idle
selection logic into two stages:
 1) in ops.select_cpu(), provide the task with a second opportunity to
    remain within the same LLC
 2) in ops.enqueue(), perform another check for an idle CPU, allowing
    the task to move to a different LLC if an idle CPU within the same
    LLC is not available.

This allows tasks to stick more on turbo-boosted CPUs and CPUs within
the same LLC.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 09:59:29 +02:00
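
The two-stage idle CPU selection in fe6ac15015 can be sketched as two functions, one per callback. This is a toy model under stated assumptions: the function names, the `llc_of` map and the "lowest CPU id wins" tie-break are hypothetical, and the turbo domain itself is left out to keep the example short.

```python
def select_cpu_stage(prev_cpu, llc_of, idle_cpus):
    """Stage 1 (ops.select_cpu()): only accept an idle CPU in the same LLC."""
    same_llc = [c for c in sorted(idle_cpus) if llc_of[c] == llc_of[prev_cpu]]
    return same_llc[0] if same_llc else None


def enqueue_stage(prev_cpu, llc_of, idle_cpus):
    """Stage 2 (ops.enqueue()): retry, now allowing a move to another LLC."""
    cpu = select_cpu_stage(prev_cpu, llc_of, idle_cpus)
    if cpu is not None:
        return cpu
    # No idle CPU in the same LLC: any idle CPU will do.
    return min(idle_cpus) if idle_cpus else None
```

Keeping the first stage strict is what lets tasks stick to their LLC; only the second stage is allowed to trade cache locality for an idle CPU.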
Andrea Righi
70b93ed641 scx_bpfland: skip idle CPU selection for tasks with changing affinity
When tasks are changing CPU affinity it is pointless to try to find an
optimal idle CPU. In this case just skip the idle CPU selection step
and let the task be dispatched to a global DSQ if needed.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 09:59:29 +02:00
Andrea Righi
802d104b46 scx_bpfland: add basic cpufreq support
Add hints for the cpufreq governor based on the selected scheduler's
performance profile and the current energy performance preference (EPP).

With this change applied the scheduler works as follows:

scheduler profile (--primary-domain option):
  - default:
    - use all cores
    - cpufreq: use default scaling factor
  - powersave:
    - use E-cores
    - cpufreq: use min scaling factor
  - performance:
    - use P-cores
    - cpufreq: use max scaling factor
  - auto:
    - EPP: power, powersave
      - use E-cores
      - cpufreq: use min scaling factor
    - EPP: balance_power (typically battery-powered systems)
      - use E-cores
      - cpufreq: use default scaling factor
    - EPP: balance_performance, performance
      - use P-cores
      - cpufreq: use max scaling factor

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-03 09:59:29 +02:00
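
The profile table in 802d104b46 maps naturally onto a lookup structure. The sketch below restates that table as data; the names `PROFILES`, `AUTO_BY_EPP` and `resolve_profile` are hypothetical, and the string labels for core types and scaling factors are only placeholders for whatever the scheduler uses internally.

```python
# (cores to use, cpufreq scaling factor) for each fixed scheduler profile
PROFILES = {
    "default": ("all", "default"),
    "powersave": ("E-cores", "min"),
    "performance": ("P-cores", "max"),
}

# In "auto" mode the choice depends on the current EPP value
AUTO_BY_EPP = {
    "power": ("E-cores", "min"),
    "powersave": ("E-cores", "min"),
    "balance_power": ("E-cores", "default"),
    "balance_performance": ("P-cores", "max"),
    "performance": ("P-cores", "max"),
}


def resolve_profile(profile, epp=None):
    """Return (cores, scaling factor) for a --primary-domain profile."""
    if profile == "auto":
        return AUTO_BY_EPP[epp]
    return PROFILES[profile]
```

For instance, `resolve_profile("auto", "balance_power")` yields E-cores with the default scaling factor, matching the battery-powered case in the table.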
Andrea Righi
7dede2d4d7
Merge pull request #599 from sched-ext/scx-rustland-interactive
scx_rustland: aggressively prioritize interactive tasks
2024-09-02 17:43:33 +02:00
Andrea Righi
d0fb29a0f7 scx_rustland: aggressively prioritize interactive tasks
scx_rustland was originally designed as a PoC to showcase the benefits
of implementing specialized schedulers via sched_ext, focusing on a very
specific use case: prioritize game responsiveness regardless of what
runs in the background.

Its original design was subsequently modified to better serve as a
general-purpose scheduler, balancing the prioritization of interactive
tasks with CPU-intensive ones to prevent over-prioritization.

With scx_bpfland serving as a more "general-purpose" scheduler, it makes
sense to revisit scx_rustland's original goal and make it much more
aggressive at prioritizing interactive tasks, which are identified by
their average number of context switches.

This change makes scx_rustland a really good PoC again to showcase the
benefits of having specialized schedulers, by focusing only on a very
specific use case: provide a high and stable frames-per-second (fps)
while a kernel build is running in the background.

= Results =

 - Test: Run a WebGL application [1] while building the kernel (make -j32)
 - Hardware: 8-core 11th Gen Intel(R) Core(TM) i7-1195G7 @ 2.90GHz

  +----------------------+--------+--------+
  |      Scheduler       | avg fps|  stdev |
  +----------------------+--------+--------+
  |               EEVDF  |   28   |  4.00  |
  | scx_rustland-before  |   43   |  1.25  |
  |  scx_rustland-after  |   60   |  0.25  |
  +----------------------+--------+--------+

[1] https://webglsamples.org/aquarium/aquarium.html

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-09-02 15:53:35 +02:00
Andrea Righi
4fd1ae031b
Merge pull request #573 from sirlucjan/drop-journald
scx_stats: Drop sched-ext namespace
2024-09-02 14:30:28 +02:00
Changwoo Min
172fe1efc6
Merge pull request #597 from multics69/lavd-turbo-tuning2
scx_lavd: misc updates (verifier, README, monitor option name, and micro-optimization)
2024-09-02 18:00:26 +09:00