f9a994412d

Allow specifying a primary scheduling domain via the new command line option `--primary-domain CPUMASK`, where CPUMASK is a hex number of arbitrary length representing the CPUs assigned to the domain.

If this option is not specified, the scheduler uses all the available CPUs in the system as the primary domain (no behavior change).

Otherwise, if a primary scheduling domain is defined, the scheduler tries to dispatch tasks only to the CPUs assigned to the primary domain until these CPUs are saturated, at which point tasks may overflow to other available CPUs.

This feature can be used to prioritize certain cores over others, and it can be particularly effective on systems with heterogeneous cores (e.g., hybrid systems with P-cores and E-cores).

== Example (hybrid architecture) ==

Hardware:
- Dell Precision 5480 with 13th Gen Intel(R) Core(TM) i7-13800H
- 6 P-cores 0..5 with 2 CPUs each (CPUs 0..11)
- 8 E-cores 6..13 with 1 CPU each (CPUs 12..19)

== Test ==

WebGL application (https://webglsamples.org/aquarium/aquarium.html): this generates a steady workload in the system without over-saturating the CPUs.

Use different scheduler configurations:
- EEVDF (default)
- scx_bpfland using P-cores only (--primary-domain 0x00fff)
- scx_bpfland using E-cores only (--primary-domain 0xff000)

Measure performance (fps) and power consumption (W).

== Result ==

+-----------------+-----+-----+------+-------+--------+
|                 | min | max | avg  |       |        |
|                 | fps | fps | fps  | stdev | power  |
+-----------------+-----+-----+------+-------+--------+
| EEVDF           |  28 |  34 | 31.0 |  1.73 |  3.5W  |
| bpfland-p-cores |  33 |  34 | 33.5 |  0.29 |  3.5W  |
| bpfland-e-cores |  25 |  26 | 25.5 |  0.29 |  2.2W  |
+-----------------+-----+-----+------+-------+--------+

Using a primary scheduling domain of only P-cores with scx_bpfland gives a more stable and predictable level of performance, with an average of 33.5 fps and an error of ±0.5 fps.

In contrast, EEVDF results in an average frame rate of 31.0 fps with an error of ±3.0 fps. It is less consistent because tasks are evenly distributed across all the cores in the system (both slow and fast cores).

Using a scheduling domain of only E-cores with scx_bpfland results in a lower average frame rate (25.5 fps), while performance remains stable (error of ±0.5 fps). Power consumption is also reduced, averaging 2.2W compared to 3.5W with either of the other configurations.

== Conclusion ==

In summary, with this change users have the flexibility, on hybrid architectures, to prioritize scheduling on performance cores for better performance and consistency, or on energy-efficient cores for reduced power consumption.

Moreover, this feature can also be used to minimize the number of cores used by the scheduler until they reach full capacity. This can be useful for reducing power consumption even on homogeneous systems, or for running scheduling experiments with smaller sets of cores, provided the system is not overcommitted.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
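The CPUMASK values in the commit message can be derived from the CPU numbering of the example machine. The following is a minimal sketch of that arithmetic, assuming the conventional cpumask layout in which bit N of the mask corresponds to CPU N. The helper names `cpus_to_mask` and `mask_to_cpus` are hypothetical and are not part of scx_bpfland.

```python
def cpus_to_mask(cpus):
    """Build an integer bitmask with bit N set for every CPU N in `cpus`."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"invalid CPU id: {cpu}")
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask_str):
    """Return the sorted list of CPU ids encoded in a hex mask string."""
    mask = int(mask_str, 16)  # accepts an optional "0x" prefix and leading zeros
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


# CPU layout taken from the example hardware: P-cores are CPUs 0..11,
# E-cores are CPUs 12..19.
p_cores = cpus_to_mask(range(0, 12))
e_cores = cpus_to_mask(range(12, 20))

print(hex(p_cores))  # 0xfff
print(hex(e_cores))  # 0xff000
```

Under this assumption, the masks match the values used in the test, `0x00fff` for P-cores and `0xff000` for E-cores. Leading zeros do not change the value of the mask.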
SCHED_EXT SCHEDULERS
Introduction
This directory contains the repo's schedulers.
Some of these schedulers are simply examples of different types of schedulers that can be built using sched_ext. They can be loaded and used to schedule on your system, but their primary purpose is to illustrate how various features of sched_ext can be used.
Other schedulers are actually performant, production-ready schedulers. That is, for the correct workload and with the correct tuning, they may be deployed in a production environment with acceptable or possibly even improved performance. Some of the examples could be improved to become production schedulers.
Please see the following README files for details on each of the various types of schedulers:
- rust describes all of the schedulers with Rust user-space components. All of these schedulers are production ready.
- c describes all of the schedulers with C user-space components. All of these schedulers are production ready.
Note on syncing
There is a sync-to-kernel.sh script in this directory. It is used to sync changes to specific schedulers with the Linux kernel tree. If you've made any changes to a scheduler, please use the script to synchronize with the sched_ext Linux kernel tree:
$ ./sync-to-kernel.sh /path/to/kernel/tree