scx-upstream

mirror of https://github.com/sched-ext/scx.git synced 2024-11-29 06:00:23 +00:00

Author	SHA1	Message	Date
Andrea Righi	0f018c5fff	Merge pull request #484 from vax-r/rustland_unused scx: Remove unused variables, imports and functions	2024-08-14 19:03:26 +02:00
Andrea Righi	f9a994412d	scx_bpfland: introduce primary scheduling domain

Allow to specify a primary scheduling domain via the new command line option `--primary-domain CPUMASK`, where CPUMASK can be a hex number of arbitrary length, representing the CPUs assigned to the domain.

If this option is not specified the scheduler will use all the available CPUs in the system as primary domain (no behavior change).

Otherwise, if a primary scheduling domain is defined, the scheduler will try to dispatch tasks only to the CPUs assigned to the primary domain, until these CPUs are saturated, at which point tasks may overflow to other available CPUs.

This feature can be used to prioritize certain cores over others and it can be really effective in systems with heterogeneous cores (e.g., hybrid systems with P-cores and E-cores).

== Example (hybrid architecture) ==

Hardware:
 - Dell Precision 5480 with 13th Gen Intel(R) Core(TM) i7-13800H
 - 6 P-cores 0..5 with 2 CPUs each (CPU from 0..11)
 - 8 E-cores 6..13 with 1 CPU each (CPU from 12..19)

== Test ==

WebGL application (https://webglsamples.org/aquarium/aquarium.html): this allows to generate a steady workload in the system without over-saturating the CPUs.

Use different scheduler configurations:
 - EEVDF (default)
 - scx_bpfland using P-cores only (--primary-domain 0x00fff)
 - scx_bpfland using E-cores only (--primary-domain 0xff000)

Measure performance (fps) and power consumption (W).

== Result ==

                  +-----+-----+------+-------+--------+
                  | min | max | avg  |       |        |
                  | fps | fps | fps  | stdev | power  |
+-----------------+-----+-----+------+-------+--------+
| EEVDF           |  28 |  34 | 31.0 |  1.73 |   3.5W |
| bpfland-p-cores |  33 |  34 | 33.5 |  0.29 |   3.5W |
| bpfland-e-cores |  25 |  26 | 25.5 |  0.29 |   2.2W |
+-----------------+-----+-----+------+-------+--------+

Using a primary scheduling domain of only P-cores with scx_bpfland allows to achieve a more stable and predictable level of performance, with an average of 33.5 fps and an error of ±0.5 fps.

In contrast, using EEVDF results in an average frame rate of 31.0 fps with an error of ±3.0 fps, indicating slightly less consistency, due to the fact that tasks are evenly distributed across all the cores in the system (both slow and fast cores).

On the other hand, using a scheduling domain solely of E-cores with scx_bpfland results in a lower average frame rate (25.5 fps), though it maintains a stable performance (error of ±0.5 fps), but the power consumption is also reduced, averaging 2.2W, compared to 3.5W with either of the other configurations.

== Conclusion ==

In summary, with this change users have the flexibility to prioritize scheduling on performance cores for better performance and consistency, or prioritize energy efficient cores for reduced power consumption, on hybrid architectures.

Moreover, this feature can also be used to minimize the number of cores used by the scheduler, until they reach full capacity. This capability can be useful for reducing power consumption even in homogeneous systems or for conducting scheduling experiments with smaller sets of cores, provided the system is not overcommitted.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-14 16:17:54 +02:00
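
To make the `--primary-domain CPUMASK` examples above concrete, here is a minimal sketch of how a hex mask maps to CPU ids (bit N set means CPU N is in the domain). The helper name `cpus_in_mask` is hypothetical and this is not the parser used by scx_bpfland; it only illustrates why `0x00fff` selects CPUs 0..11 and `0xff000` selects CPUs 12..19 on the machine described in the commit.

```rust
/// Hypothetical helper: expand a hex CPU mask of arbitrary length into
/// the sorted list of CPU ids whose bits are set (bit N <-> CPU N).
fn cpus_in_mask(mask: &str) -> Result<Vec<usize>, String> {
    // Accept the mask with or without a leading "0x".
    let hex = mask.strip_prefix("0x").unwrap_or(mask);
    if hex.is_empty() {
        return Err("empty CPU mask".to_string());
    }

    let mut cpus = Vec::new();
    // Walk the hex digits from the least significant (rightmost) one,
    // so digit i covers CPUs 4*i .. 4*i+3.
    for (i, ch) in hex.chars().rev().enumerate() {
        let digit = ch
            .to_digit(16)
            .ok_or_else(|| format!("invalid hex digit: {ch}"))?;
        for bit in 0..4usize {
            if digit & (1u32 << bit) != 0 {
                cpus.push(i * 4 + bit);
            }
        }
    }
    Ok(cpus)
}

fn main() {
    // Masks taken from the commit message above.
    println!("P-cores: {:?}", cpus_in_mask("0x00fff").unwrap());
    println!("E-cores: {:?}", cpus_in_mask("0xff000").unwrap());
}
```

Parsing digit by digit, rather than into a fixed-width integer, keeps the sketch usable for masks of arbitrary length, as the commit message describes.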
Andrea Righi	a6e977c70b	scx_bpfland: make output more compact Abbreviate the statistics reported to stdout and remove the slice_ms metric: this metric can be easily derived from slice_ns, slice_ns_min and nr_wait, which is already reported to stdout. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-14 16:17:54 +02:00
Andrea Righi	8656effa50	scx_bpfland: update copyright info Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-14 16:17:54 +02:00
Changwoo Min	3c6d86b342	scx_lavd: upgrade nix package from 0.28.0 to 0.29.0 Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-14 22:31:05 +09:00
Changwoo Min	444f0b86a5	Merge pull request #489 from multics69/lavd-amp-v4 lavd: make LAVD core-type (AMP) aware	2024-08-14 14:24:09 +09:00
Tejun Heo	4612764b82	Merge pull request #486 from vax-r/Fix_rusty_logic scx_rusty: Fix logical error when filtering tasks	2024-08-13 09:39:12 -10:00
Daniel Hodges	646cefd46d	Merge pull request #477 from hodgesds/layered-global-match scx_rusty: Make layer matching a global function	2024-08-12 09:14:58 -04:00
Daniel Hodges	be5213e129	scx_rusty: Make layer matching a global function Layer matching currently takes a large number of bpf instructions. Moving layer matching to a global function will reduce the overall instruction count and allow for other layer matching methods such as glob. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-08-12 05:44:34 -07:00
Changwoo Min	b7b8c8de90	scx_lavd: fix build errors Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 14:10:40 +09:00
Changwoo Min	182b0bd249	scx_lavd: make the verifier in 6.8 kernel happy Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:04:04 +09:00
Changwoo Min	4ecf3fc94e	scx_lavd: build cpdom map from rust Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:03:18 +09:00
Changwoo Min	1f1a3dc4f1	scx_lavd: sort cores in descending order of max freq Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	c213a3e44f	scx_lavd: make core compaction core type aware Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	c35b6b27ff	scx_lavd: consider task pinning for core-type-aware ops.enqueue() Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	25bf98d2a0	scx_lavd: make ops.select_cpu() core type aware Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	fa87e1c593	scx_lavd: make ops.dispatch() core type aware Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	c1cf11f7b1	scx_lavd: make ops.enqueue() core type aware Put a performance-critical task to a performance critical queue and a regular task to a regular queue. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	03a8c10ece	scx_lavd: add cpdom_ctx to abstract compute domain and its DSQ Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	623b05a282	scx_lavd: revise perf_cri factor to reflect wakeup, runtime, and run_freq Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	15871fd032	scx_lavd: turn off pinned core less aggressively Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	9dc7f94cb6	scx_lavd: unify the deadline calculation and ineligibility calculation The unified version is not only simpler but also works better. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:40 +09:00
Changwoo Min	4705520d40	scx_lavd: remove unnecessary options which have never been used Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-12 13:01:34 +09:00
I Hsin Cheng	15b40de408	scx_rusty: Fix logical error when filtering tasks The task filtering logic was moved from find_first_candidate() into a vector filter operation in commit `1c3b563`. However, the condition was not negated in the move: .filter() keeps the tasks we want, whereas .skip_while() was throwing the unwanted tasks out. The logic should therefore be reversed so that kworker and migrated tasks are not taken into consideration. Signed-off-by: I Hsin Cheng <richard120310@gmail.com>	2024-08-10 22:56:20 +08:00
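
The predicate inversion described in the commit above can be illustrated with a small sketch. The `Task` struct and its fields are hypothetical stand-ins, not the types used in scx_rusty; the point is only that a condition written for `.skip_while()` names the tasks to discard, while `.filter()` needs the negated condition to keep the same tasks.

```rust
/// Hypothetical task record; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Task {
    pid: i32,
    is_kworker: bool,
    migrated: bool,
}

/// skip_while() drops leading items while the predicate holds, so the
/// predicate describes the tasks we do NOT want.
fn first_candidate(tasks: &[Task]) -> Option<&Task> {
    tasks
        .iter()
        .skip_while(|t| t.is_kworker || t.migrated)
        .next()
}

/// filter() keeps items for which the predicate holds, so the same
/// condition must be negated to keep only the wanted tasks.
fn candidates(tasks: &[Task]) -> Vec<&Task> {
    tasks
        .iter()
        .filter(|t| !(t.is_kworker || t.migrated))
        .collect()
}

fn main() {
    let tasks = vec![
        Task { pid: 1, is_kworker: true, migrated: false },
        Task { pid: 2, is_kworker: false, migrated: true },
        Task { pid: 3, is_kworker: false, migrated: false },
    ];
    println!("first candidate: {:?}", first_candidate(&tasks));
    println!("all candidates:  {:?}", candidates(&tasks));
}
```

Reusing the un-negated condition with `.filter()` would select exactly the kworker and migrated tasks, which is the kind of bug the commit describes.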
I Hsin Cheng	4e40ba3b11	scx_rustland: Removed unused imports and variables The member "topo_map" in Scheduler is never used and thus should be removed, the related imports are removed as well. Signed-off-by: I Hsin Cheng <richard120310@gmail.com>	2024-08-09 20:35:12 +08:00
I Hsin Cheng	b7e03b7a76	scx_bpfland: Remove unused variable Remove unused variable "vtime" in task_vtime(). Signed-off-by: I Hsin Cheng <richard120310@gmail.com>	2024-08-09 20:28:42 +08:00
Tejun Heo	45f7fd13b7	versions: Synchronize crate dependency versions	2024-08-08 14:45:46 -10:00
Tejun Heo	63c4a0191f	Merge branch 'main' into topic/inlined-skeleton-members	2024-08-08 14:23:37 -10:00
Tejun Heo	cd6a4d72c7	Bump versions for 1.0.2 release	2024-08-08 14:10:16 -10:00
Tejun Heo	7c3ffe96e1	Unify crate dependency versions Different sub-projects are using different versions for the same crates. Synchronize them to the latest.	2024-08-08 13:26:47 -10:00
Andrea Righi	9d808ae206	Merge pull request #468 from sched-ext/rustland-refactoring scx_rustland refactoring	2024-08-07 11:38:21 +02:00
Andrea Righi	51cfb69199	scx_rustland_core: re-introduce partial mode Re-add the partial mode option that was dropped during the refactoring. The partial option applies the scheduler only to the tasks which have their scheduling policy set to SCHED_EXT via sched_setscheduler(). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:41:06 +02:00
Andrea Righi	e1f2b3822e	scx_rustland_core: drop CPU ownership API The API for determining which PID is running on a specific CPU is racy and is unnecessary since this information can be obtained from user space. Additionally, it's not reliable for identifying idle CPUs. Therefore, it's better to remove this API and, in the future, provide a cpumask alternative that can export the idle state of the CPUs to user space. As a consequence also change scx_rustland to dispatch one task at a time, instead of dispatching tasks in batches of idle cores (that are usually not accurate due to the racy nature of the CPU ownership interface). Dispatching one task at a time even makes the scheduler more performant, due to the vruntime scheduling being applied to more tasks sitting in the scheduler's queue. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:41:06 +02:00
Andrea Righi	9a0e7755df	scx_rustland_core: export counter of online CPUs Introduce a helper to get the amount of online CPUs tracked by the BPF part. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:10:53 +02:00
Andrea Righi	d9c9f78e3e	scx_rustland: re-align vruntime and time slice evaluation to scx_bpfland Drop the slice boost logic and apply a vruntime and task time slice evaluation approach similar to scx_bpfland (but implement this in the user-space component instead of the BPF part). Additionally, introduce a slice_us_min parameter to define the minimum time slice that can be assigned to a task, also similar to scx_bpfland. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:10:53 +02:00
Andrea Righi	38a725ea34	scx_rlfifo: update copyright info Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:10:53 +02:00
Andrea Righi	c963d5eb05	scx_rustland: update copyright info Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:10:53 +02:00
Andrea Righi	b87541a26e	scx_rustland_core: refactor idle CPU selection logic Use the same idle selection logic used in scx_bpfland also in scx_rustland_core. Also drop fifo_mode and always use the BPF idle selection logic by default as long as the system is not saturated, unless full_user is specified. This approach allows user-space schedulers aiming for maximum performance to leverage the BPF idle selection logic (bypassing user-space), while those seeking full control can enable full_user to bypass the BPF CPU idle selection logic and choose the target CPU for each task from user-space. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-07 08:10:53 +02:00
Andrea Righi	d8985306f4	scx_rustland: user-space interactive task classifier We don't need to send the number of voluntary context switches (nvcsw) from BPF to user-space, as this information is already accessible in user-space via procfs. Sending this data would only create unnecessary overhead for schedulers that don't require it, and those that do can easily retrieve it through procfs. Therefore, drop this metric from scx_rustland_core and change scx_rustland to implement an interactive task classifier fully in the user-space part of the scheduler. Also drop some options that do not provide any significant benefit (also in preparation for a bigger refactoring to define a better API for the user-space framework). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-08-06 17:56:58 +02:00
Daniel Hodges	d5efcd3245	scx_layered: Fix cred declaration The use of the cred struct should be const. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-08-06 05:22:12 -07:00
Tejun Heo	b226865b96	scx_lavd: Make FlatTopology::new() a bit prettier - Use .enumerate() consistently while building the cpu_fids vector. - Use .then_with() to chain .cmp() when sorting cpu_fids. Both reduce visual clutter.	2024-08-04 11:16:19 -10:00
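
As an illustration of the two idioms named in the commit above, the following sketch builds a vector with `.enumerate()` and sorts it with chained `.cmp()` / `.then_with()` calls. The `CpuFid` struct, its fields, and the sort keys are assumptions made for this example; they are not taken from `FlatTopology::new()`.

```rust
/// Hypothetical per-CPU record; the real cpu_fids layout may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CpuFid {
    cpu_id: usize,
    node_id: usize,
    max_freq: u64,
}

/// Build the records with enumerate() (the index doubles as the CPU id)
/// and sort by node, then by descending max frequency, then by CPU id.
fn build_sorted(raw: &[(usize, u64)]) -> Vec<CpuFid> {
    let mut fids: Vec<CpuFid> = raw
        .iter()
        .enumerate()
        .map(|(cpu_id, &(node_id, max_freq))| CpuFid { cpu_id, node_id, max_freq })
        .collect();

    // then_with() only evaluates the next comparison on a tie, which
    // avoids nested match/if blocks on Ordering.
    fids.sort_by(|a, b| {
        a.node_id
            .cmp(&b.node_id)
            .then_with(|| b.max_freq.cmp(&a.max_freq))
            .then_with(|| a.cpu_id.cmp(&b.cpu_id))
    });
    fids
}

fn main() {
    // (node_id, max_freq) per CPU; values are made up for the example.
    let raw = [(1, 3000), (0, 2000), (0, 4000), (1, 3000)];
    for fid in build_sorted(&raw) {
        println!("{:?}", fid);
    }
}
```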
Changwoo Min	130ea97fbf	Merge pull request #464 from multics69/lavd-amp-v3 scx_lavd: improve the calculation of ineligibility duration	2024-08-03 09:57:41 +09:00
Andrea Righi	3ad2875240	Merge pull request #463 from sched-ext/bpfland-update-dsq-vtime scx_bpfland: always re-align task's vruntime to the global vruntime	2024-08-02 22:13:12 +02:00
Daniel Hodges	1f922b9a73	scx_layered: Add support for disabling topology awareness Add a parameter to disable topology awareness. This is useful when comparing the performance of topology aware scheduling with the previous scheduling strategy. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-08-02 08:07:19 -07:00
Changwoo Min	f3fd6e9cb3	scx_lavd: drop 2-level-scheduling With optimizations of calculating ineligibility duration, now the scheduler works well under heavy load without 2-level scheduling, so we drop it for simplicity. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-02 21:46:07 +09:00
Changwoo Min	c38e749c36	scx_lavd: improve the equation for calculating ineligibility duration This commit includes a few changes: - treat a newly forked task more conservatively - defer the execution of more tasks for a longer time using ineligibility duration - consider whether a task has been woken up when calculating ineligibility duration	2024-08-02 21:08:29 +09:00
Andrea Righi	bee0d699ef	scx_bpfland: always re-align task's vruntime to the global vruntime Immediately re-align p->scx.dsq_vtime to the global vruntime (+/- slice lag) as soon as we are evaluating the task's vruntime. This allows the task to rapidly chase the minimum global vruntime, ensuring that tasks with a predominantly sleeping behavior pattern are not over-prioritized. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-08-02 13:11:25 +02:00
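
One way to read the re-alignment described in the commit above is as a clamp: a task's vruntime is not allowed to fall more than a fixed lag behind the global vruntime, so a task that slept for a long time cannot accumulate unbounded priority. The sketch below shows only that lower-bound reading; the function and parameter names are assumptions, and the actual BPF code may handle wrap-around and the upper bound differently.

```rust
/// Hypothetical sketch: bound how far a task's vruntime may lag behind
/// the global vruntime. All values are in the same time unit.
fn realign_vtime(task_vtime: u64, vtime_now: u64, slice_lag: u64) -> u64 {
    // Earliest vruntime a task is allowed to keep; saturating_sub avoids
    // underflow when the global vruntime is still smaller than the lag.
    let min_vtime = vtime_now.saturating_sub(slice_lag);
    task_vtime.max(min_vtime)
}

fn main() {
    // A long sleeper is pulled forward to (global vruntime - lag).
    println!("{}", realign_vtime(100, 1_000, 200));
    // A task already within the lag window is left untouched.
    println!("{}", realign_vtime(900, 1_000, 200));
}
```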
Changwoo Min	5e194330f0	scx_lavd: consider task's wakeup and vruntime (starvation) more aggressively Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-08-02 12:25:29 +09:00
Daniel Hodges	de7b5fe190	scx_layered: Fix dispatch fallback CPU selection When the previous CPU for a task is not known do not fall back to dispatching to CPU 0, use the current CPU. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-07-31 12:35:22 -07:00
Changwoo Min	fc0ffeb45b	scx_lavd: print the overall status of a scheduled task - L or R: Latency-critical, Regular - H or I: performance-Hungry, performance-Insensitive - B or T: Big, liTtle - E or G: Eligible, Greedy - P or N: Preemption, Not Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-31 19:00:35 +09:00
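
The status legend in the commit above amounts to one letter per boolean property. A minimal sketch of such an encoding follows; the `TaskStatus` struct, its field names, and the letter order are assumptions for illustration and are not taken from scx_lavd.

```rust
/// Hypothetical set of per-task flags matching the legend above.
struct TaskStatus {
    latency_critical: bool,
    perf_hungry: bool,
    on_big_core: bool,
    eligible: bool,
    preempting: bool,
}

/// Render the flags as a five-letter string, one letter per property.
fn status_string(s: &TaskStatus) -> String {
    // (flag, letter when set, letter when clear)
    [
        (s.latency_critical, 'L', 'R'),
        (s.perf_hungry, 'H', 'I'),
        (s.on_big_core, 'B', 'T'),
        (s.eligible, 'E', 'G'),
        (s.preempting, 'P', 'N'),
    ]
    .iter()
    .map(|&(flag, yes, no)| if flag { yes } else { no })
    .collect()
}

fn main() {
    let s = TaskStatus {
        latency_critical: true,
        perf_hungry: false,
        on_big_core: true,
        eligible: true,
        preempting: false,
    };
    println!("{}", status_string(&s));
}
```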

711 Commits