scx-upstream

mirror of https://github.com/sched-ext/scx.git synced 2024-11-25 04:00:24 +00:00

Author	SHA1	Message	Date
Andrea Righi	50684e4569	scx_bpfland: introduce Intel Turbo Boost awareness Make `--primary-domain auto` aware of turbo boosted CPUs and prioritize them over the primary scheduling domain when the energy model `balance_power` is used (typically when running on battery power with the "balanced" profile). With this change the scheduling hierarchy becomes the following: 1) CPUs in the turbo scheduling domain 2) CPUs in the primary scheduling domain 3) full-idle SMT CPUs 4) CPUs in the same L2 cache 5) CPUs in the same L3 cache 6) CPUs in the task's allowed domain And the idle selection logic is modified as follows: - In the turbo scheduling domain: - pick same full-idle SMT CPU - pick any other full-idle SMT CPU sharing the same L2 cache - pick any other full-idle SMT CPU sharing the same L3 cache - pick any other full-idle SMT CPU - pick same idle CPU - pick any other idle CPU sharing the same L2 cache - pick any other idle CPU sharing the same L3 cache - pick any other idle SMT CPU - In the primary scheduling domain: - pick same full-idle SMT CPU - pick any other full-idle SMT CPU sharing the same L2 cache - pick any other full-idle SMT CPU sharing the same L3 cache - pick any other full-idle SMT CPU - pick same idle CPU - pick any other idle CPU sharing the same L2 cache - pick any other idle CPU sharing the same L3 cache - pick any other idle SMT CPU - In the entire task domain: - pick any other idle CPU Keep in mind that the turbo domain will be evaluated only when the scheduler is started with `--primary-domain auto` and only when the `balance_power` energy profile is used. The turbo domain is always made using the subset of CPUs in the system with the highest max frequency. If such a subset can't be determined (for example if all the CPUs in the primary domain have the same frequency), the turbo domain will be ignored.
Prioritizing turbo boosted CPUs can help to improve performance by forcing the governor to scale up their frequency, without increasing power consumption too much, because tasks will preferably be confined to a reduced number of cores. This change seems to improve performance, without increasing power consumption much, on Intel laptops while using the `balance_power` energy profile. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-23 19:49:08 +02:00
Andrea Righi	d958dd4482	scx_bpfland: introduce dynamic energy profile Introduce the new option `--primary-domain auto`. With this option the scheduler dynamically adjusts the primary scheduling domain at run-time, based on the current energy profile reported in /sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference. When the `power` energy profile is selected, the primary scheduling domain will prioritize E-cores. Alternatively, when the `performance` profile is selected, it will prioritize P-cores. For all the other energy profiles, all the CPUs in the system will be used. Note that this option is only relevant on hybrid architectures with P-cores and E-cores. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-23 19:49:01 +02:00
Andrea Righi	bb7248ce61	scx_utils::cpumask: introduce is_empty() and is_full() Introduce new methods to CpuMask to check if no bits are set or if all bits are set. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-23 19:48:53 +02:00
Andrea Righi	115bfc8184	Merge pull request #536 from sched-ext/bpfland-lowlatency-mode scx_bpfland: introduce --lowlatency option	2024-08-23 19:48:08 +02:00
Tejun Heo	e635e7eac8	Merge pull request #537 from sirlucjan/new-default scx-scheds: set scx_bpfland as default scheduler	2024-08-23 05:55:59 -10:00
Piotr Gorski	3f0fcc319c	scx-scheds: set scx_bpfland as default scheduler Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>	2024-08-23 15:47:24 +02:00
Andrea Righi	6a2285398d	scx_bpfland: introduce --lowlatency option Introduce the new `--lowlatency` option, which enables switching between the default pure vruntime-based scheduling (more optimized for server workloads) and a deadline-based scheduling (better suited for low-latency workloads). When the low-latency mode is activated, a task's deadline is calculated as its vruntime, adjusted by a bonus proportional to the task's average number of voluntary context switches (the more voluntary context switches, the shorter the deadline). This feature enhances the prioritization of interactive tasks even more, proportionally to their average voluntary context switches, also within the two main global queues (priority / shared), and it helps to keep interactive workloads responsive, even in the presence of heavy non-interactive background work. Low-latency mode helps prevent audio crackling even in the presence of a large number of short-lived tasks with pseudo-interactive behavior (e.g., hackbench) and it enables achieving approximately a +33% average frames-per-second (FPS) in the typical "gaming while building the kernel" benchmark. However, it can also amplify the de-prioritization of CPU-intensive tasks, making this option more suitable for specific low-latency scenarios. Therefore the low-latency mode is disabled by default and it can only be enabled via the `--lowlatency` option. Tested-by: Piotr Gorski (piotrgorski@cachyos.org) Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-22 13:26:19 +02:00
Andrea Righi	dca4af151e	Merge pull request #535 from sched-ext/bpfland-time-slice-control scx_bpfland: better time slice control	2024-08-22 10:32:16 +02:00
Andrea Righi	b0a8e4a91e	scx_bpfland: better time slice control Explicitly replenish the task's time slice from ops.dispatch() if the task still wants to run and no other task is selected. In this way the sched_ext core won't automatically re-schedule the task on the same CPU, implicitly assigning a time slice of SCX_SLICE_DFL. Moreover, instead of determining the task time slice in ops.enqueue(), refresh the time slice immediately before the task is started on its assigned CPU in ops.running(). This allows using a more precise time slice, adjusted based on the actual number of tasks that are currently waiting to be scheduled. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-22 09:23:37 +02:00
Tejun Heo	8c35a1976f	Merge pull request #534 from sched-ext/htejun/misc scx_layered: Drop SCX_OPS_ENQ_LAST	2024-08-21 19:57:54 -10:00
Tejun Heo	d6ac5fbd9c	scx_layered: Drop SCX_OPS_ENQ_LAST The meaning of SCX_OPS_ENQ_LAST will change with future kernel updates and enqueueing on local DSQ will no longer be sufficient to avoid stalls. No reason to do it anyway. Just drop it.	2024-08-21 13:13:59 -10:00
Tejun Heo	52d97c041d	Merge pull request #533 from sched-ext/htejun/release Version: Cargo.lock	2024-08-21 06:46:02 -10:00
Tejun Heo	f726f0b73b	Version: Cargo.lock	2024-08-21 06:45:19 -10:00
Tejun Heo	98fd3ec3b0	Merge pull request #532 from sched-ext/htejun/release Version: v1.0.3	2024-08-21 06:43:36 -10:00
Tejun Heo	4d1f0639d8	Version: v1.0.3	2024-08-21 06:42:11 -10:00
Tejun Heo	2e63c0be60	Merge pull request #531 from sched-ext/bpfland-fix-unused-var scx_bpfland: drop unused variable	2024-08-21 06:40:16 -10:00
Tejun Heo	6a2faf2e17	Merge pull request #530 from sched-ext/htejun/misc scx_utils::topology: Use lazy_static instead of LazyLock	2024-08-21 06:04:14 -10:00
Andrea Righi	fedfee0bd6	scx_bpfland: drop unused variable With the global scx_utils::NR_CPU_IDS we don't need Topology anymore in init_primary_domain(), so drop the variable to fix the following build warning: warning: unused variable: `topo` --> src/main.rs:385:9 \| 385 \| topo: &Topology, \| ^^^^ help: if this is intentional, prefix it with an underscore: `_topo` \| = note: `#[warn(unused_variables)]` on by default Fixes: `1da249f` ("scx_utils::topology: Always use NR_CPU_IDS and NR_CPUS_POSSIBLE") Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 17:46:12 +02:00
Andrea Righi	9f7a11bba6	Merge pull request #528 from sched-ext/bpfland-turbo-boost scx_bpfland: properly classify Intel Turbo Boost CPUs	2024-08-21 17:40:25 +02:00
Tejun Heo	081b999361	scx_utils::topology: Use lazy_static instead of LazyLock LazyLock is stable but has become so only very recently and can trigger build errors on not-too-old stable rustc's which are still in wide use. Let's use lazy_static instead for now. Signed-off-by: Tejun Heo <tj@kernel.org>	2024-08-21 05:34:39 -10:00
Daniel Hodges	f2a6661a85	Merge pull request #524 from hodgesds/layered-core-fixes scx_layered: Fix core selection	2024-08-21 08:13:33 -04:00
Tejun Heo	9c62019c81	Merge pull request #527 from sched-ext/htejun/scx_utils scx_utils::cpumask,topology: Misc updates	2024-08-20 22:25:25 -10:00
Andrea Righi	695e3b25b0	scx_bpfland: classify CPUs depending on their base frequency Use the base frequency, instead of maximum frequency, to classify fast and slow CPUs. This ensures accurate distinction between Intel Turbo Boost CPUs and genuinely faster CPUs when auto-detecting the primary scheduling domain. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 10:16:41 +02:00
Andrea Righi	bbe388e3bc	scx_utils: topology: add base_freq() method to Cpu With Intel Turbo Boost enabled, some CPUs might show a higher maximum frequency than others, even if they are not actually faster cores. This can potentially confuse some auto-detection logic for distinguishing between fast and slow cores in certain schedulers. The base CPU frequency reported in /sys/devices/system/cpu/cpuN/cpufreq/base_frequency represents a more reliable indicator for identifying truly fast and slow cores. To address this, provide a new base_freq() method in the struct Cpu, which will return the base operational frequency of a CPU when Turbo Boost is present. If Turbo Boost is not available, base_freq() will return the maximum frequency, functioning the same as max_freq(). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 10:07:57 +02:00
Andrea Righi	e0fb99835d	Merge pull request #525 from sched-ext/bpfland-disable-interactive scx_bpfland: allow to completely disable interactive classification	2024-08-21 10:02:43 +02:00
Tejun Heo	1da249f063	scx_utils::topology: Always use NR_CPU_IDS and NR_CPUS_POSSIBLE Always use the LazyLock versions and drop the counterparts from Topology.	2024-08-20 21:57:56 -10:00
Tejun Heo	1ae4655b3c	scx_utils::cpumask: Default to displaying in hex There isn't much to gain by displaying cpumasks in binary. Drop the separate Display implementation and just default to 'x' formatting.	2024-08-20 21:50:23 -10:00
Tejun Heo	092f5422d6	Merge pull request #518 from sched-ext/htejun/misc scx_layered: Add `--run-example` and enable CI testing	2024-08-20 21:42:45 -10:00
Tejun Heo	3ca2f0b6f9	scx_utils/cpumask: Use nr_cpu_ids instead of num_possible_cpus - Add static NR_CPU_IDS and NR_CPUS_POSSIBLE to topology. - Fix comment for Topology::nr_cpu_ids(). Was missing a negation. - cpumasks should be sized by nr_cpu_ids, not num_possible_cpus, and the number can't change while the system is running. Drop cpumask.nr_cpus and use *NR_CPU_IDS everywhere.	2024-08-20 21:25:40 -10:00
Tejun Heo	0cc59a5243	scx_utils: cargo fmt	2024-08-20 21:25:40 -10:00
Tejun Heo	91213de713	Merge branch 'main' into htejun/rusty	2024-08-20 21:13:12 -10:00
Tejun Heo	2d449f3288	Merge pull request #523 from Kawanaao/openrc-logrotate openrc: Add logrotate support for openrc systems	2024-08-20 21:10:51 -10:00
Tejun Heo	f7c193e528	scx_utils, scx_rusty: Minor updates to version handling - Update scx_utils/build.rs so that 12 char SHA1 is generated instead of full one. - Add --version to scx_rusty. Use custom one as we don't want to use the default cargo version one.	2024-08-20 21:03:05 -10:00
Tejun Heo	8f786be08f	scx_rusty: cargo fmt	2024-08-20 21:03:05 -10:00
Tejun Heo	4440567949	scx_rusty: Update Cargo.lock	2024-08-20 21:03:05 -10:00
Andrea Righi	c85315d527	scx_bpfland: allow to completely disable interactive classification Tasks enqueued with SCX_ENQ_WAKEUP are immediately classified as interactive. However, if interactive tasks classification is disabled (via `-c 0`), we should avoid promoting them as interactive. This is particularly important because, with the nvcsw logic disabled, tasks can remain classified as interactive indefinitely and they will never be demoted to regular tasks. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 08:45:13 +02:00
Andrea Righi	014dc7b3c3	Merge pull request #522 from sched-ext/bpfland-cpumask scx_bpfland: use scx_utils::Cpumask	2024-08-21 08:37:43 +02:00
Andrea Righi	a9f5aaa536	scx_bpfland: replace custom CpuMask with scx_utils::Cpumask Rely on scx_utils::Cpumask instead of re-implementing a custom struct to parse and manage CPU masks. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 07:21:52 +02:00
Andrea Righi	235f19fdf1	cpumask: implement hex string formatter Allow formatting a Cpumask as a hex string by implementing the LowerHex / UpperHex formatter traits. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-21 07:21:22 +02:00
Daniel Hodges	4d1c932619	scx_layered: Fix core selection Fix a bug introduced in #510 where it assumed core ids are incremental. This refactors the core ordering for layers to be much simpler and provides some space for layer core isolation at low utilization. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-08-20 19:26:53 -07:00
Kawanaao	f35717e970	Create scx.logrotate	2024-08-20 18:02:15 +03:00
Kawanaao	3485adb47f	Add support for openrc logrotate	2024-08-20 17:47:16 +03:00
Andrea Righi	33b6ada98e	Merge pull request #509 from sched-ext/bpfland-topology scx_bpfland: topology awareness	2024-08-20 14:37:23 +02:00
Daniel Hodges	9f2d548b8f	Merge pull request #520 from hodgesds/merge-fixes ci: Fix cache directory	2024-08-20 07:33:22 -04:00
Andrea Righi	467d4b5ea4	scx_bpfland: get topology information from scx_utils::Topology Rely on scx_utils::Topology to get CPU and cache information, instead of re-implementing custom methods. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-20 10:16:02 +02:00
Andrea Righi	0b2dc6b9fc	scx_utils: Add L2 / L3 cache id to CPU Add the L2 / L3 cache id to the Cpu struct, to quickly determine the cache nodes associated to each CPU. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-20 10:16:02 +02:00
Tejun Heo	c0fcc9bdeb	meson-scripts/test_sched: Enable scx_layered testing scx_layered now can be run with a single command when `--run-example` is specified. Update test_sched script to support per-sched arguments and enable it for scx_layered.	2024-08-19 20:50:10 -10:00
Tejun Heo	c0418250f4	scx_layered: Add --run-example option So that scx_layered can be run in CI environment in a single command.	2024-08-19 20:50:10 -10:00
Daniel Hodges	e121dd3dd5	ci: Fix cache directory Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-08-19 20:07:50 -07:00
Daniel Hodges	03944694a9	Merge pull request #519 from hodgesds/veristat-merge-fix ci: fix merge veristat cache generation	2024-08-19 21:58:00 -04:00
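
The turbo domain rule from commit 50684e4569 (the subset of CPUs with the highest max frequency, ignored when no such subset can be determined) can be sketched roughly as below. This is only an illustration under stated assumptions: `CpuInfo` and `turbo_domain` are hypothetical names, not the types or functions used by scx_bpfland or scx_utils.

```rust
/// Hypothetical per-CPU record: CPU id and maximum frequency in kHz.
/// (Illustrative only; not the scx_utils `Cpu` struct.)
#[derive(Debug, Clone, Copy)]
struct CpuInfo {
    id: usize,
    max_freq_khz: u64,
}

/// Return the ids of the CPUs that have the highest max frequency.
/// Returns `None` when the list is empty or when every CPU reports the
/// same max frequency, in which case the turbo domain would be ignored.
fn turbo_domain(cpus: &[CpuInfo]) -> Option<Vec<usize>> {
    let highest = cpus.iter().map(|c| c.max_freq_khz).max()?;
    let lowest = cpus.iter().map(|c| c.max_freq_khz).min()?;
    if highest == lowest {
        // No distinct "fastest" subset can be determined.
        return None;
    }
    Some(
        cpus.iter()
            .filter(|c| c.max_freq_khz == highest)
            .map(|c| c.id)
            .collect(),
    )
}

fn main() {
    // Made-up frequencies for illustration.
    let cpus = [
        CpuInfo { id: 0, max_freq_khz: 4_700_000 },
        CpuInfo { id: 1, max_freq_khz: 4_700_000 },
        CpuInfo { id: 2, max_freq_khz: 4_400_000 },
        CpuInfo { id: 3, max_freq_khz: 4_400_000 },
    ];
    println!("turbo domain: {:?}", turbo_domain(&cpus));
}
```

Returning `Option` keeps the "turbo domain ignored" case explicit for the caller instead of encoding it as an empty list.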

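The low-latency deadline described in commit 6a2285398d (vruntime adjusted by a bonus proportional to the average number of voluntary context switches) might look roughly like the following. The function name and the `bonus_per_nvcsw_ns` scaling factor are assumptions made for this sketch; the commit message does not state the actual formula or constants.

```rust
/// Sketch of a deadline computation: the more voluntary context switches
/// a task performs on average, the earlier (shorter) its deadline.
/// `bonus_per_nvcsw_ns` is a hypothetical scaling factor, not a value
/// taken from scx_bpfland.
fn task_deadline(vruntime_ns: u64, avg_nvcsw: u64, bonus_per_nvcsw_ns: u64) -> u64 {
    // Bonus grows linearly with the average voluntary context switches.
    let bonus = avg_nvcsw.saturating_mul(bonus_per_nvcsw_ns);
    // Saturate at zero so a large bonus cannot underflow the deadline.
    vruntime_ns.saturating_sub(bonus)
}

fn main() {
    // Two tasks with the same vruntime: the one that sleeps voluntarily
    // more often gets the earlier deadline.
    let interactive = task_deadline(1_000_000, 10, 1_000);
    let cpu_bound = task_deadline(1_000_000, 0, 1_000);
    println!("interactive={} cpu_bound={}", interactive, cpu_bound);
}
```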

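The profile-to-domain mapping described in commits d958dd4482 and 50684e4569 can be summarized with a small sketch. The enum and function names are hypothetical, and the input is assumed to be the string read from the `energy_performance_preference` sysfs file named in the commit message.

```rust
/// Which CPUs the primary scheduling domain prioritizes
/// (hypothetical enum for illustration).
#[derive(Debug, PartialEq)]
enum PrimaryDomain {
    ECores,
    PCores,
    AllCpus,
}

/// Map an energy profile string to the primary domain, following the
/// commit message: `power` prefers E-cores, `performance` prefers
/// P-cores, and every other profile uses all CPUs.
fn primary_domain_for(profile: &str) -> PrimaryDomain {
    // sysfs values typically end with a newline, so trim first.
    match profile.trim() {
        "power" => PrimaryDomain::ECores,
        "performance" => PrimaryDomain::PCores,
        _ => PrimaryDomain::AllCpus,
    }
}

/// The turbo domain is only evaluated under the `balance_power` profile.
fn turbo_domain_enabled(profile: &str) -> bool {
    profile.trim() == "balance_power"
}

fn main() {
    for profile in ["power\n", "performance\n", "balance_power\n"] {
        println!(
            "{:?}: domain={:?} turbo={}",
            profile.trim(),
            primary_domain_for(profile),
            turbo_domain_enabled(profile)
        );
    }
}
```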
1408 Commits