Commit Graph

799 Commits

Author SHA1 Message Date
Tejun Heo
9e3b4e6db0 scx_stats: A bit of cleanup and renames 2024-08-23 09:09:02 -10:00
Tejun Heo
b6ccb87bec
Merge pull request #539 from sched-ext/htejun/scx_rusty
scx_rusty: Convert to scx_stats
2024-08-23 08:42:47 -10:00
Daniel Hodges
7d45059fa9
Merge pull request #538 from hodgesds/layered-pid
scx_layered: Add pid/ppid matches
2024-08-23 14:08:40 -04:00
Tejun Heo
8c8912ccea Merge branch 'main' into htejun/scx_rusty 2024-08-23 07:50:23 -10:00
Tejun Heo
44a0f1b124 scx_utils: Factor out monitor_stats() from scx_rusty and scx_layered 2024-08-23 06:46:19 -10:00
Tejun Heo
ae3024e938 scx_layered: Add --stats and make --monitor behavior consistent with scx_rusty 2024-08-23 05:52:52 -10:00
Tejun Heo
0f04a93dd1 scx_rusty: Add stat descriptions and make minor adjustments 2024-08-23 05:46:13 -10:00
Tejun Heo
36865234f8 scx_rusty: Add scx_stats annotations necessary for openmetrics translation 2024-08-23 04:59:08 -10:00
Tejun Heo
2f3f473cd3 scx_rusty: Improve timestamp reporting 2024-08-23 04:31:27 -10:00
Daniel Hodges
11b978a892 scx_layered: Add pid/ppid matches
Add matches for pid/ppid.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-23 07:20:05 -07:00
Tejun Heo
76934f3aab scx_rusty: Convert to scx_stats
This allows scx_rusty to avoid generating excessive logs for statistics
while still supporting detailed monitoring on demand.
2024-08-22 19:44:12 -10:00
Tejun Heo
16c07a5cd9 scx_rusty: Don't reset bpf_stats, remember prev states and calculate delta
This will ease the transition to scx_stats.
2024-08-22 13:02:23 -10:00
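
The underlying pattern is a snapshot-delta read: remember the previous
counter values and report differences, rather than zeroing the shared BPF
counters. A minimal C sketch, where struct bpf_stats and its fields are
illustrative assumptions rather than the actual scx_rusty definitions:

  /* Hypothetical counters read from BPF; field names are illustrative. */
  struct bpf_stats {
          unsigned long long dispatched;
          unsigned long long migrated;
  };

  static struct bpf_stats prev;   /* remembered across reporting intervals */

  /* Report the delta since the previous read without resetting the source. */
  static void stats_delta(const struct bpf_stats *cur, struct bpf_stats *delta)
  {
          delta->dispatched = cur->dispatched - prev.dispatched;
          delta->migrated = cur->migrated - prev.migrated;
          prev = *cur;    /* remember current state for the next interval */
  }
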
Tejun Heo
13fa48a871 scx_rusty: Separate out stats generation and formatting
to prepare for scx_stats conversion.
2024-08-22 10:03:10 -10:00
Tejun Heo
b4564520e5 scx_rusty: Simplify Stats structs and take id out of the structs
to prepare for scx_stats conversion. While at it, make some cosmetic
changes.
2024-08-22 08:45:33 -10:00
Andrea Righi
6a2285398d scx_bpfland: introduce --lowlatency option
Introduce the new `--lowlatency` option, which enables switching between
the default pure vruntime-based scheduling (more optimized for server
workloads) and deadline-based scheduling (better suited for
low-latency workloads).

When the low-latency mode is activated, a task's deadline is calculated
as its vruntime, adjusted by a bonus proportional to the task's average
number of voluntary context switches (the more voluntary context
switches, the shorter the deadline).

This feature further enhances the prioritization of interactive tasks,
proportionally to their average voluntary context switches, also within
the two main global queues (priority / shared), and it helps keep
interactive workloads responsive even in the presence of heavy
non-interactive background work.

Low-latency mode prevents audio crackling even in the presence of a
large number of short-lived tasks with pseudo-interactive behavior
(e.g., hackbench), and it achieves approximately +33% average
frames-per-second (FPS) in the typical "gaming while building the
kernel" benchmark.

However, it can also amplify the de-prioritization of CPU-intensive
tasks, making this option better suited to specific low-latency
scenarios. Therefore, the low-latency mode is disabled by default and
can only be enabled via the `--lowlatency` option.

Tested-by: Piotr Gorski (piotrgorski@cachyos.org)
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-22 13:26:19 +02:00
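
In other words, the deadline computation subtracts a capped bonus,
proportional to the task's average voluntary context switch rate, from its
vruntime. A minimal C sketch of that arithmetic; the constants, the cap,
and the task_ctx layout are assumptions, not the actual scx_bpfland code:

  #include <stdbool.h>

  #define NSEC_PER_MSEC           1000000ULL
  #define MAX_LATENCY_BONUS       (64ULL * NSEC_PER_MSEC)  /* assumed cap */

  struct task_ctx {
          unsigned long long vruntime;    /* task's virtual runtime */
          unsigned long long avg_nvcsw;   /* avg voluntary ctx switches/sec */
  };

  static unsigned long long task_deadline(const struct task_ctx *tctx,
                                          bool lowlatency)
  {
          unsigned long long bonus;

          if (!lowlatency)
                  return tctx->vruntime;  /* pure vruntime-based scheduling */

          /*
           * The more often a task voluntarily yields the CPU, the larger
           * its bonus and the earlier its deadline.
           */
          bonus = tctx->avg_nvcsw * NSEC_PER_MSEC;
          if (bonus > MAX_LATENCY_BONUS)
                  bonus = MAX_LATENCY_BONUS;

          /* Clamp to avoid underflowing the unsigned vruntime. */
          return tctx->vruntime > bonus ? tctx->vruntime - bonus : 0;
  }
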
Tejun Heo
4834dec684 scx_rusty: Move stats structs to stats.rs and rename for consistency 2024-08-21 22:04:38 -10:00
Andrea Righi
b0a8e4a91e scx_bpfland: better time slice control
Explicitly replenish the task's time slice from ops.dispatch() if the
task still wants to run and no other task is selected. This way the
sched_ext core won't automatically re-schedule the task on the same CPU
with an implicitly assigned time slice of SCX_SLICE_DFL.

Moreover, instead of determining the task time slice in ops.enqueue(),
refresh the time slice immediately before the task is started on its
assigned CPU in ops.running().

This makes it possible to use a more precise time slice, adjusted based
on the actual number of tasks that are currently waiting to be scheduled.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-22 09:23:37 +02:00
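
A simplified BPF-side sketch of the two callbacks described above.
SCX_TASK_QUEUED, p->scx.slice, and scx_bpf_dsq_nr_queued() are sched_ext
kernel interfaces; SHARED_DSQ, slice_ns, and consume_task() are
illustrative assumptions rather than the actual scx_bpfland definitions:

  const volatile u64 slice_ns = 5000000;  /* 5ms max slice, illustrative */

  static bool consume_task(s32 cpu);      /* illustrative: run a queued task */

  static u64 task_slice(struct task_struct *p)
  {
          /* Scale the slice by the number of tasks waiting to run. */
          u64 nr_waiting = scx_bpf_dsq_nr_queued(SHARED_DSQ);

          return slice_ns / (nr_waiting + 1);
  }

  void BPF_STRUCT_OPS(sched_dispatch, s32 cpu, struct task_struct *prev)
  {
          if (consume_task(cpu))
                  return;

          /*
           * Nothing else to run: if prev still wants to run, replenish its
           * slice explicitly so the core doesn't re-schedule it with an
           * implicit SCX_SLICE_DFL.
           */
          if (prev && (prev->scx.flags & SCX_TASK_QUEUED))
                  prev->scx.slice = task_slice(prev);
  }

  void BPF_STRUCT_OPS(sched_running, struct task_struct *p)
  {
          /* Refresh the slice right before the task starts on its CPU. */
          p->scx.slice = task_slice(p);
  }
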
Tejun Heo
d6ac5fbd9c scx_layered: Drop SCX_OPS_ENQ_LAST
The meaning of SCX_OPS_ENQ_LAST will change with future kernel updates and
enqueueing on the local DSQ will no longer be sufficient to avoid stalls. No
reason to do it anyway. Just drop it.
2024-08-21 13:13:59 -10:00
Tejun Heo
f726f0b73b Version: Cargo.lock 2024-08-21 06:45:19 -10:00
Tejun Heo
4d1f0639d8 Version: v1.0.3 2024-08-21 06:42:11 -10:00
Andrea Righi
fedfee0bd6 scx_bpfland: drop unused variable
With the global scx_utils::NR_CPU_IDS we don't need Topology anymore in
init_primary_domain(), so drop the variable to fix the following build
warning:

warning: unused variable: `topo`
   --> src/main.rs:385:9
    |
385 |         topo: &Topology,
    |         ^^^^ help: if this is intentional, prefix it with an underscore: `_topo`
    |
    = note: `#[warn(unused_variables)]` on by default

Fixes: 1da249f ("scx_utils::topology: Always use NR_CPU_IDS and NR_CPUS_POSSIBLE")
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-21 17:46:12 +02:00
Andrea Righi
9f7a11bba6
Merge pull request #528 from sched-ext/bpfland-turbo-boost
scx_bpfland: properly classify Intel Turbo Boost CPUs
2024-08-21 17:40:25 +02:00
Daniel Hodges
f2a6661a85
Merge pull request #524 from hodgesds/layered-core-fixes
scx_layered: Fix core selection
2024-08-21 08:13:33 -04:00
Tejun Heo
9c62019c81
Merge pull request #527 from sched-ext/htejun/scx_utils
scx_utils::cpumask,topology: Misc updates
2024-08-20 22:25:25 -10:00
Andrea Righi
695e3b25b0 scx_bpfland: classify CPUs depending on their base frequency
Use the base frequency, instead of the maximum frequency, to classify fast
and slow CPUs. This ensures accurate distinction between Intel Turbo
Boost CPUs and genuinely faster CPUs when auto-detecting the primary
scheduling domain.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-21 10:16:41 +02:00
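
The attributes involved are the standard cpufreq sysfs ones. A hedged C
sketch of reading them, where the fallback policy and helper names are
assumptions (the scheduler itself obtains this information through
scx_utils::Topology):

  #include <stdio.h>

  /* Read a cpufreq sysfs attribute in kHz; returns -1 if unavailable. */
  static long cpu_khz(int cpu, const char *attr)
  {
          char path[128];
          long khz = -1;
          FILE *f;

          snprintf(path, sizeof(path),
                   "/sys/devices/system/cpu/cpu%d/cpufreq/%s", cpu, attr);
          f = fopen(path, "r");
          if (!f)
                  return -1;
          if (fscanf(f, "%ld", &khz) != 1)
                  khz = -1;
          fclose(f);
          return khz;
  }

  /* Prefer base_frequency (not exposed by all drivers) over the max. */
  static long cpu_base_khz(int cpu)
  {
          long khz = cpu_khz(cpu, "base_frequency");

          return khz > 0 ? khz : cpu_khz(cpu, "cpuinfo_max_freq");
  }
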
Andrea Righi
e0fb99835d
Merge pull request #525 from sched-ext/bpfland-disable-interactive
scx_bpfland: allow completely disabling interactive classification
2024-08-21 10:02:43 +02:00
Tejun Heo
5cf4212330 Revert "rusty: Integrate stats with the metrics framework"
This reverts commit 83373b1f4e in preparation
for converting to scx_stats.
2024-08-20 21:59:25 -10:00
Tejun Heo
516a7590db scx_rusty: Revert log_recorder conversion
scx_rusty will be converted to scx_stats in a similar fashion to
scx_layered. Undo the log_recorder conversion in preparation.
2024-08-20 21:59:20 -10:00
Tejun Heo
1da249f063 scx_utils::topology: Always use NR_CPU_IDS and NR_CPUS_POSSIBLE
Always use the LazyLock versions and drop the counterparts from Topology.
2024-08-20 21:57:56 -10:00
Tejun Heo
092f5422d6
Merge pull request #518 from sched-ext/htejun/misc
scx_layered: Add `--run-example` and enable CI testing
2024-08-20 21:42:45 -10:00
Tejun Heo
f7c193e528 scx_utils, scx_rusty: Minor updates to version handling
- Update scx_utils/build.rs so that a 12-character SHA1 is generated
  instead of the full one.

- Add --version to scx_rusty. Use a custom one as we don't want to use
  the default cargo version.
2024-08-20 21:03:05 -10:00
Tejun Heo
8f786be08f scx_rusty: cargo fmt 2024-08-20 21:03:05 -10:00
Tejun Heo
4440567949 scx_rusty: Update Cargo.lock 2024-08-20 21:03:05 -10:00
Andrea Righi
c85315d527 scx_bpfland: allow completely disabling interactive classification
Tasks enqueued with SCX_ENQ_WAKEUP are immediately classified as
interactive. However, if interactive task classification is disabled
(via `-c 0`), we should avoid promoting them to interactive.

This is particularly important because, with the nvcsw logic disabled,
tasks can remain classified as interactive indefinitely and they will
never be demoted to regular tasks.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-21 08:45:13 +02:00
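
A BPF-side sketch of the resulting guard. SCX_ENQ_WAKEUP comes from the
sched_ext headers; nvcsw_max_thresh (zero when `-c 0` is passed), the task
context layout, and the helper name are illustrative assumptions:

  const volatile u64 nvcsw_max_thresh;   /* set from userspace at load time */

  static bool is_task_interactive(const struct task_ctx *tctx, u64 enq_flags)
  {
          /* Classification disabled (-c 0): never promote to interactive. */
          if (!nvcsw_max_thresh)
                  return false;

          /* Wakeups are promoted immediately, others by their nvcsw average. */
          return (enq_flags & SCX_ENQ_WAKEUP) ||
                 tctx->avg_nvcsw >= nvcsw_max_thresh;
  }
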
Andrea Righi
a9f5aaa536 scx_bpfland: replace custom CpuMask with scx_utils::Cpumask
Rely on scx_utils::Cpumask instead of re-implementing a custom struct to
parse and manage CPU masks.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-21 07:21:52 +02:00
Daniel Hodges
4d1c932619 scx_layered: Fix core selection
Fix a bug introduced in #510, which assumed core IDs are sequential.
This refactors the core ordering for layers to be much simpler and
leaves some room for layer core isolation at low utilization.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-20 19:26:53 -07:00
Andrea Righi
33b6ada98e
Merge pull request #509 from sched-ext/bpfland-topology
scx_bpfland: topology awareness
2024-08-20 14:37:23 +02:00
Andrea Righi
467d4b5ea4 scx_bpfland: get topology information from scx_utils::Topology
Rely on scx_utils::Topology to get CPU and cache information, instead of
re-implementing custom methods.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-20 10:16:02 +02:00
Tejun Heo
c0418250f4 scx_layered: Add --run-example option
So that scx_layered can be run in a CI environment with a single command.
2024-08-19 20:50:10 -10:00
Changwoo Min
41bc6f0967
Merge pull request #511 from multics69/lavd-perf-profile
scx_lavd: add power profile options: --performance, --balanced, --powersave
2024-08-20 09:02:37 +09:00
Changwoo Min
1d61dd4c1d
Merge pull request #508 from multics69/lavd-numa-fix
scx_lavd: fix a potential watchdog timeout error at multi-NUMA/CCX platforms
2024-08-20 09:02:23 +09:00
Changwoo Min
2c4c2a0ccf
Merge pull request #507 from multics69/lavd-pretty-rust
scx_lavd: revise FlatTopology prettier
2024-08-20 09:01:26 +09:00
Daniel Hodges
05a2721f8e
Merge pull request #510 from hodgesds/layered-core-topo-selection
scx_layered: Use topology for core selection
2024-08-19 20:01:16 -04:00
Tejun Heo
d01b49bd0e scx_layered: Fix verification failure
4fccc06905 ("scx_layered: Fix uninitialized variable") causes the
following verification failure. Fix it by moving the assignments below
the range check.

  Validating match_layer() func#1...
  283: R1=scalar() R2=scalar() R3=mem_or_null(id=49,sz=1) R10=fp0
  ; int match_layer(u32 layer_id, pid_t pid, const char *cgrp_path) @ main.bpf.c:1029
  283: (7b) *(u64 *)(r10 -24) = r3      ; R3=mem_or_null(id=49,sz=1) R10=fp0 fp-24_w=mem_or_null(id=49,sz=1)
  284: (bc) w7 = w1                     ; R1=scalar() R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  ; struct layer *layer = &layers[layer_id]; @ main.bpf.c:1033
  285: (bc) w1 = w7                     ; R1_w=scalar(id=50,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R7_w=scalar(id=50,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  286: (27) r1 *= 1061192               ; R1_w=scalar(smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
  287: (18) r8 = 0xffffc90002a26000     ; R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080)
  289: (0f) r8 += r1                    ; R1_w=scalar(smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
  ; u32 nr_match_ors = layer->nr_match_ors; @ main.bpf.c:1034
  290: (bf) r1 = r8                     ; R1_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
  291: (07) r1 += 1060992               ; R1_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,off=0x103080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
  292: (61) r1 = *(u32 *)(r1 +0)
  R1 unbounded memory access, make sure to bounds check any such access
  processed 1099 insns (limit 1000000) max_states_per_insn 2 total_states 72 peak_states 72 mark_read 9
  -- END PROG LOAD LOG --
2024-08-19 13:18:20 -10:00
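
The fix follows the standard pattern for this class of verifier failures:
bound the index before any pointer arithmetic into the map-backed array,
so the verifier can prove the access is in range. A sketch with names
mirroring the log above; MAX_LAYERS and the elided matching loop are
illustrative:

  int match_layer(u32 layer_id, pid_t pid, const char *cgrp_path)
  {
          struct layer *layer;
          u32 nr_match_ors;

          /* Range check first, so the verifier can bound layer_id... */
          if (layer_id >= MAX_LAYERS)
                  return -EINVAL;

          /* ...before it is used to index into the global layers array. */
          layer = &layers[layer_id];
          nr_match_ors = layer->nr_match_ors;

          /* ...matching loop elided... */
          return 0;
  }
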
Daniel Hodges
b3793e0069 scx_layered: Use topology for core selection
Currently the core selection logic in scx_layered uses the first
available core in the bitmask. This is suboptimal when the scheduler is
configured with specific NUMA/LLC restrictions. The ideal core selection
logic should try to find the least used cores within the preferred
scheduling domain and allocate new CPUs from shared cores within that
domain.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 15:51:35 -07:00
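
A hedged sketch of the "least used core within the preferred domain" idea:
scan the candidate cores and pick the one with the fewest CPUs already
allocated. Names and data layout are assumptions, not the actual
scx_layered code:

  /* Pick the allowed core with the fewest already-allocated CPUs. */
  static int pick_least_used_core(const unsigned long long *allowed_mask,
                                  const int *nr_allocated, int nr_cores)
  {
          int best = -1;

          for (int core = 0; core < nr_cores; core++) {
                  /* Skip cores outside the preferred NUMA/LLC domain. */
                  if (!(allowed_mask[core / 64] & (1ULL << (core % 64))))
                          continue;
                  if (best < 0 || nr_allocated[core] < nr_allocated[best])
                          best = core;
          }
          return best;
  }
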
Tejun Heo
3498a2b899
Merge pull request #514 from sched-ext/htejun/scx_stats
scx_stats, scx_layered: Implement independent stats client sessions
2024-08-19 11:24:53 -10:00
Tejun Heo
f6bc52d31e scx_layered: Make --monitor behavior more useful
- If --monitor is specified with layer specs, the scheduler also starts
  stats monitoring on a thread.

- Standalone monitoring mode no longer exits when the scheduler isn't running.
2024-08-19 10:55:02 -10:00
Tejun Heo
d03e48eb75 scx_layered: Implement per-stats-client nr_layer_cpus_ranges tracking
With this, each client sees correct nr_layer_cpus_ranges values without
interfering with the others.
2024-08-19 09:12:51 -10:00
Tejun Heo
448aacfd60 scx_layered: Initialize Stats.prev_layer_cycles properly on new()
So that a new stats session doesn't start with an inflated utilization number.
2024-08-19 08:40:40 -10:00
Tejun Heo
25d7e6f787 scx_layered: Implement on-demand statistics generation
Instead of keeping one copy of sched_stats, each stats server session
now carries its own, so that stats can be generated independently by
each client at any interval. CPU allocation min/max tracking is broken
for now.
2024-08-19 08:27:36 -10:00