scx/scheds
Andrea Righi f9a994412d scx_bpfland: introduce primary scheduling domain
Allow specifying a primary scheduling domain via the new command line
option `--primary-domain CPUMASK`, where CPUMASK is a hex number of
arbitrary length representing the CPUs assigned to the domain.
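
To make the mask encoding concrete, here is a minimal Python sketch of how a
hex CPUMASK can be read. It assumes the conventional layout in which bit N of
the mask stands for CPU N. The helper name `mask_to_cpus` is hypothetical and
not part of scx_bpfland; the scheduler's own parser may differ in details.

```python
from typing import List


def mask_to_cpus(cpumask: str) -> List[int]:
    """Return the CPU ids selected by a hex mask such as '0xff000'.

    Assumption: bit N of the mask corresponds to CPU N.
    """
    # Python integers are unbounded, so a mask of any length can be parsed.
    value = int(cpumask, 16)
    cpus = []
    cpu = 0
    while value:
        if value & 1:
            cpus.append(cpu)
        value >>= 1
        cpu += 1
    return cpus
```

For example, `mask_to_cpus("0x3")` selects CPUs 0 and 1.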

If this option is not specified, the scheduler will use all the available
CPUs in the system as the primary domain (no behavior change).

Otherwise, if a primary scheduling domain is defined, the scheduler will
try to dispatch tasks only to the CPUs assigned to the primary domain,
until these CPUs are saturated, at which point tasks may overflow to
other available CPUs.
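
The preference order described above can be modeled with a small Python
sketch. This is a simplified illustration, not the scheduler's BPF code: the
function name `pick_cpu` is hypothetical, and "saturated" is modeled here as
"no idle CPU left in the primary domain", which is an assumption.

```python
from typing import Optional, Set


def pick_cpu(idle_cpus: Set[int], primary: Set[int]) -> Optional[int]:
    """Prefer an idle CPU in the primary domain; overflow only if none is idle."""
    in_domain = idle_cpus & primary
    if in_domain:
        # Any idle primary CPU would do; min() keeps this toy model deterministic.
        return min(in_domain)
    if idle_cpus:
        # Primary domain has no idle CPU (assumed meaning of "saturated"):
        # overflow to another available CPU.
        return min(idle_cpus)
    # No idle CPU anywhere in the system.
    return None


primary = set(range(12))  # e.g. CPUs 0..11
```

With this model, `pick_cpu({3, 15}, primary)` stays inside the domain, while
`pick_cpu({15, 18}, primary)` overflows to a CPU outside it.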

This feature can be used to prioritize certain cores over others and it
can be really effective in systems with heterogeneous cores (e.g.,
hybrid systems with P-cores and E-cores).

== Example (hybrid architecture) ==

Hardware:
 - Dell Precision 5480 with 13th Gen Intel(R) Core(TM) i7-13800H
   - 6 P-cores 0..5  with 2 CPUs each (CPU from  0..11)
   - 8 E-cores 6..13 with 1 CPU  each (CPU from 12..19)

== Test ==

WebGL application (https://webglsamples.org/aquarium/aquarium.html):
this generates a steady workload in the system without
over-saturating the CPUs.

Use different scheduler configurations:

 - EEVDF (default)
 - scx_bpfland using P-cores only (--primary-domain 0x00fff)
 - scx_bpfland using E-cores only (--primary-domain 0xff000)
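
The two masks follow directly from the CPU numbering listed in the hardware
description (P-core CPUs 0..11, E-core CPUs 12..19). A short Python sketch,
assuming bit N of the mask stands for CPU N, shows how they can be derived;
`cpus_to_mask` is a hypothetical helper, not a tool shipped with the scheduler.

```python
def cpus_to_mask(cpus) -> str:
    """Build a hex CPU mask from an iterable of CPU ids (bit N = CPU N)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return hex(mask)


p_cores = cpus_to_mask(range(0, 12))   # CPUs 0..11
e_cores = cpus_to_mask(range(12, 20))  # CPUs 12..19
print(p_cores, e_cores)
```

Leading zeros do not change the value, so `0xfff` and `0x00fff` denote the
same mask.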

Measure performance (fps) and power consumption (W).

== Result ==

                  +-----+-----+------+-------+--------+
                  | min | max | avg  |       |        |
                  | fps | fps | fps  | stdev | power  |
+-----------------+-----+-----+------+-------+--------+
| EEVDF           | 28  | 34  | 31.0 |  1.73 |  3.5W  |
| bpfland-p-cores | 33  | 34  | 33.5 |  0.29 |  3.5W  |
| bpfland-e-cores | 25  | 26  | 25.5 |  0.29 |  2.2W  |
+-----------------+-----+-----+------+-------+--------+

Using a primary scheduling domain of only P-cores with scx_bpfland
achieves a more stable and predictable level of performance,
with an average of 33.5 fps and an error of ±0.5 fps.

In contrast, using EEVDF results in an average frame rate of 31.0 fps
with an error of ±3.0 fps, indicating slightly less consistency, due to
the fact that tasks are evenly distributed across all the cores in the
system (both slow and fast cores).

On the other hand, using a scheduling domain of only E-cores with
scx_bpfland results in a lower average frame rate (25.5 fps), while
still maintaining stable performance (error of ±0.5 fps). Power
consumption is also reduced, averaging 2.2W compared to 3.5W with
either of the other configurations.
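
The ± figures quoted above match half of the min-to-max range in the table,
and the reported averages match the midpoint of min and max. The Python sketch
below reproduces that arithmetic under this assumption (the commit message
does not state how the error was computed). The fps-per-watt column is a
derived figure added here for illustration only; it is not a measurement from
the original test.

```python
# Values copied from the result table above.
results = {
    "EEVDF":           {"min": 28, "max": 34, "power_w": 3.5},
    "bpfland-p-cores": {"min": 33, "max": 34, "power_w": 3.5},
    "bpfland-e-cores": {"min": 25, "max": 26, "power_w": 2.2},
}


def summarize(r):
    """Return (midpoint fps, half-range error, fps per watt) for one row."""
    mid = (r["min"] + r["max"]) / 2
    half_range = (r["max"] - r["min"]) / 2
    return mid, half_range, mid / r["power_w"]


for name, row in results.items():
    mid, err, eff = summarize(row)
    print(f"{name:16s} {mid:.1f} fps  +/- {err:.1f}  {eff:.2f} fps/W")
```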

== Conclusion ==

In summary, with this change users have the flexibility to prioritize
scheduling on performance cores for better performance and consistency,
or prioritize energy efficient cores for reduced power consumption, on
hybrid architectures.

Moreover, this feature can also be used to minimize the number of cores
used by the scheduler, until they reach full capacity. This capability
can be useful for reducing power consumption even in homogeneous systems
or for conducting scheduling experiments with smaller sets of cores,
provided the system is not overcommitted.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-14 16:17:54 +02:00
c Update to vmlinux-v6.10-rc2-g1edab907b57d.h 2024-07-12 11:13:34 -10:00
include Update to vmlinux-v6.10-rc2-g1edab907b57d.h 2024-07-12 11:13:34 -10:00
rust scx_bpfland: introduce primary scheduling domain 2024-08-14 16:17:54 +02:00
meson.build Restructure scheds folder names 2023-12-17 13:14:31 -08:00
README.md Restructure scheds folder names 2023-12-17 13:14:31 -08:00
sync-to-kernel.sh sync-to-kernel.sh: Sync scx_central and scx_flatcg 2024-02-23 14:21:03 -10:00

SCHED_EXT SCHEDULERS

Introduction

This directory contains the repo's schedulers.

Some of these schedulers are simply examples of different types of schedulers that can be built using sched_ext. They can be loaded and used to schedule on your system, but their primary purpose is to illustrate how various features of sched_ext can be used.

Other schedulers are actually performant, production-ready schedulers. That is, for the correct workload and with the correct tuning, they may be deployed in a production environment with acceptable or possibly even improved performance. Some of the examples could be improved to become production schedulers.

Please see the following README files for details on each of the various types of schedulers:

  • rust describes all of the schedulers with Rust user space components. All of these schedulers are production ready.
  • c describes all of the schedulers with C user space components. All of these schedulers are production ready.

Note on syncing

Note that there is a sync-to-kernel.sh script in this directory. This is used to sync any changes to the specific schedulers with the Linux kernel tree. If you've made any changes to a scheduler, please use the script to synchronize with the sched_ext Linux kernel tree:

$ ./sync-to-kernel.sh /path/to/kernel/tree