JakeHillion/scx

mirror of https://github.com/JakeHillion/scx.git synced 2024-11-25 19:10:23 +00:00

Author	SHA1	Message	Date
Dan Schatzberg	a17f16e4b9	scx_mitosis: handle enqueue() on !wakeup If we're not on the wakeup path, we may see enqueue() invoked without select_cpu() which will require an idle cpu lookup. In order to fix this, we refactor the idle_cpu lookup in select_cpu so it can be invoked from enqueue(). Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-10-14 10:13:07 -07:00
Daniel Hodges	912d6e01c1	scx_layered: Add LLC integration test Add an integration test for testing that the `llcs` field on the layer config works properly. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-14 07:27:29 -07:00
Daniel Hodges	ed18e43612	Merge pull request #795 from hodgesds/bpftrace-tests scx_layered: Add topology integration test	2024-10-14 12:54:54 +00:00
Daniel Hodges	e456c83536	scx_layered: Add topology integration test Add a bpftrace script that does a topology aware test. The test script runs a bpftrace script that asserts that stress-ng processes are scheduled on NUMA node 0 only. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-13 20:23:11 -07:00
Changwoo Min	c1f4051a14	scx_lavd: fix int overflow in calculating avg_lat_cri u32 is not big enough to hold the sum of lat_cri in a period, so sum_lat_cri (u32) overflowed, resulting in incorrect avg_lat_cri. Change the type from u32 to u64, avoiding the integer overflow. Note that {sum/avg}_lat_cri is only for debugging so it is irrelevant in making scheduling decisions. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-13 00:58:36 +09:00
Changwoo Min	6c9bbe66dc	scx_lavd: remove unnecessary downscaling in deadline calculation The downscaling is not necessary in calculating a task's virtual deadline because the virtual deadline represents only relative order in task scheduling. Hence downscaling incurs only inaccuracy caused by truncation. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-13 00:41:23 +09:00
Changwoo Min	6ddc3f0a2b	scx_lavd: do not inspect scx_lavd process itself Printing the task status of scx_lavd is not useful, so filter it out. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-12 17:21:08 +09:00
Andrea Righi	197dee93f4	scx_bpfland: get rid of per-CPU DSQs Using per-CPU DSQs seems to introduce more issues than benefits (potential stalls, etc.). Therefore, let's get rid of the per-CPU DSQs and use SCX_DSQ_LOCAL for tasks directly dispatched to specific CPUs. This change seems to also improve performance on 6.12 and it makes the scheduler a lot more stable and consistent. The issues will be investigated separately, providing a separate stress test scheduler, designed to stress test per-CPU DSQs. Tested-by: Piotr Gorski <piotrgorski@cachyos.org> Tested-by: Eric Naim <dnaim@cachyos.org> Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-12 08:15:51 +02:00
Andrea Righi	198f22656c	scx_bpfland: clarify error code returned by pick_idle_cpu() Return more meaningful error codes from pick_idle_cpu(). No functional change, just improved code readability. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-12 08:08:48 +02:00
Andrea Righi	ceb4f1755f	scx_bpfland: always refill task timeslice in ops.dispatch() When a task exhausts its timeslice and no other tasks are ready to run, we automatically refill its timeslice, but only if the current CPU is a fully idle SMT core. If we don’t handle the refill, the sched_ext core will default to refilling using SCX_SLICE_DFL, which may not be optimal. To ensure better control over the task’s timeslice, always refill it when no other tasks are available to run. Fixes: `6e24fcc` ("scx_bpfland: keep tasks running on full-idle SMT cores") Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-12 08:08:48 +02:00
Andrea Righi	54d704ceda	scx_bpfland: pick a random idle CPU when prev_cpu is not valid Pick any random idle CPU when the previous CPU isn't valid anymore according to the task's cpumask. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-12 08:08:48 +02:00
Changwoo Min	836cf9faa4	Merge pull request #779 from multics69/lavd-futex-v2 scx_lavd: mitigate the lock holder preemption problem	2024-10-12 02:42:33 +00:00
Daniel Hodges	a08a76ccd6	scx_layered: Cleanup non topology path More cleanup in the non topology path to remove copy/pasta declarations. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-11 10:18:34 -07:00
Jake Hillion	eb59085e61	Merge pull request #781 from JakeHillion/pr781 layered: move configuration into library component	2024-10-11 16:39:23 +00:00
Jake Hillion	52c279a469	layered: make default value for disable_topology dynamic Disable topology currently defaults to `false` (topology enabled...). Change this so that topology is enabled by default on hardware that may benefit from it (multiple NUMA nodes or LLCs) and disabled on hardware that does not benefit from it. This is a slightly noisy change as we have to move ownership of the newly mutable layer specs into the `Scheduler` object (previously they were a borrow). We don't have a `Topology` object to make the default decision from until `Scheduler::init`, and I think this is because of the possibility of hot plugs. We therefore have to clone the `Vec<LayerSpec>` each time as it is potentially mutable. Test plan: - CI. Updated to be explicit about topology in both cases. Single NUMA multi-LLC machine: ``` $ scx_layered --run-example ... 13:34:01 [INFO] Topology awareness not specified, selecting enabled based on hardware ... $ scx_layered --run-example --disable-topology=true ... 13:33:41 [INFO] Disabling topology awareness ... $ scx_layered --run-example -t ... 13:33:15 [INFO] Disabling topology awareness ... $ scx_layered --run-example --disable-topology=false # none of the above messages present ``` Single NUMA single LLC machine: ``` $ scx_layered --run-example 15:33:10 [INFO] Topology awareness not specified, selecting disabled based on hardware ```	2024-10-11 17:09:07 +01:00
Jake Hillion	143a55cda1	layered: move configuration into library component Move the LayerConfig and its children from `main.rs` into `lib.rs`. This allows other tooling, such as config managers or test executors, to modify layered configs programmatically. The end goal is to move everything in `layered` except for the argument parsing into a `run_layered` function, but I haven't done it in this diff because it's a larger change. This is a common pattern in Rust projects to do as little as possible in `main.rs` for extensibility. The only change here, other than publicity and where things are located, is the signature of `CpuPool::alloc_cpus`. It previously relied on `&Layer`, and this changes it to the two elements of `Layer` it uses. This allows `Layer` to stay confined to `main.rs` (for now) to prevent scope creep in this PR. This may be inconvenient in the short term for WIPs and anyone doing non-Cargo builds (cough me), but having things split into more files should make rebases/merges easier in the long run. Test plan: - `cargo build --release` - CI.	2024-10-11 15:55:29 +01:00
Changwoo Min	648c95be9e	scx_lavd: fix incorrect task comparison for preemption Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-11 21:53:24 +09:00
likewhatevs	b88f567e25	Merge pull request #782 from likewhatevs/lsp-nice-util layered -- make lsp work nice on util include file	2024-10-11 12:30:19 +00:00
Pat Somaru	2b309dbbb4	make lsp work nice on util include	2024-10-11 08:06:29 -04:00
Pat Somaru	7627e1cc42	scx_layered: fix lsp etc on util.bpf.c	2024-10-11 08:02:23 -04:00
Changwoo Min	5b4b255cbb	scx_lavd: do not preempt while holding a lock When a task holds a lock, it should neither yield its time slice nor be preempted. In this way, we can mitigate harmful preemption of lock holders and reduce the total preemption counts. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-11 18:49:09 +09:00
Changwoo Min	bd17589a6e	scx_lavd: boost latency criticality when a task holds a lock When a lock holder exhausts its time slice, it will be re-enqueued to a DSQ waiting for scheduling while holding a lock. In this case, prioritize its latency criticality proportionally, so a lock holder would not be stuck in a DSQ for a long time, improving system-wide progress. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-11 18:48:56 +09:00
Changwoo Min	77b8e65571	scx_lavd: tracing all blocking locks and futexes Trace the acquisition and release of blocking locks for the kernel and futexes for user-space. This is necessary to boost a lock holder task in terms of latency and time slice. We do not boost shared lock holders (e.g., read lock in rw_semaphore) since the kernel already prioritizes the readers over writers. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-11 17:03:48 +09:00
Ryan Wilson	8c8250b1e2	[layered] Implement reverse weight DSQ algorithm	2024-10-10 12:53:25 -07:00
Daniel Hodges	9f60053312	Merge pull request #775 from hodgesds/layered-idle-cleanup scx_layered: Cleanup topology preempt path	2024-10-10 18:34:08 +00:00
Daniel Hodges	fb4dcf91eb	scx_layered: Change default DSQ iter algo Change the default DSQ iter algo from round robin to linear. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-10 11:10:27 -07:00
Daniel Hodges	b22e83d4d5	scx_layered: Cleanup topology preempt path Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-10 09:56:42 -07:00
Andrea Righi	d62989e462	scx_bpfland: fix cpumask initialization error In the WAKE_SYNC path if L3 cache awareness is disabled (--disable-l3) we may hit the following error: Error: EXIT: scx_bpf_error (CPU L3 cpumask not initialized) Fix this by setting the L3 cpumask to the whole primary domain if L3 cache awareness is disabled. Tested-by: Eric Naim <dnaim@cachyos.org> Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-10 09:30:54 +02:00
Daniel Hodges	fe00e2c7be	scx_layered: Refactor topo preemption Refactor topology preemption logic so the non topology aware code is contained in a separate function. This should make maintaining the non topology aware code path far easier. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-09 21:24:07 -04:00
Daniel Hodges	451c68b44e	scx_layered: Cleanup debug messages Cleanup debug messages to use a common prefix when the scheduler is initialized. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-09 19:06:28 -04:00
Daniel Hodges	81a5250d49	scx_layered: Fix verifier errors Fix verifier errors when using different DSQ iteration algorithms and cleanup some code. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-09 14:36:12 -07:00
Dan Schatzberg	12cf482487	Merge pull request #767 from dschatzberg/mitosis-build mitosis: Fix build	2024-10-09 19:32:35 +00:00
Dan Schatzberg	c794c389da	mitosis: apply autoformatting Apply clang-format autoformatting on the c code and cargo fmt on the rust code. Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-10-09 10:56:27 -07:00
Jake Hillion	483a565d7f	Merge pull request #759 from JakeHillion/pr759 layered: attempt to work steal from own llc before others	2024-10-09 17:42:23 +00:00
Daniel Hodges	678c205572	Merge pull request #766 from hodgesds/layered-load-fixes scx_layered: Rename load_adj statistic	2024-10-09 17:12:24 +00:00
Jake Hillion	d9dc46b5d2	layered: attempt to work steal from own llc before others	2024-10-09 17:39:06 +01:00
Dan Schatzberg	347147b10d	mitosis: fix build Minimal changes to make sure scx_mitosis can build with the latest scx changes. Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-10-09 08:30:15 -07:00
Daniel Hodges	30258cff1b	scx_layered: Update docs for layer_preempt_weight_disable Update docs for layer_preempt_weight_disable and layer_growth_weight_disable. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-09 06:37:54 -07:00
Daniel Hodges	edc673460d	scx_layered: Rename load_adj statistic Rename the `load_adj` statistic to `load_frac_adj`, which is a more accurate representation of what the statistic is calculating. The statistic is a fractional representation of the load of a layer adjusted for infeasible weights. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-09 06:23:37 -07:00
Jake Hillion	c23efb1ed3	Merge pull request #749 from JakeHillion/pr749 layered: split dispatch into no_topo version	2024-10-09 13:15:12 +00:00
Jake Hillion	19d09c3cc1	layered: split dispatch into no_topo version Refactor layered_dispatch into two functions: layered_dispatch_no_topo and layered_dispatch. layered_dispatch will delegate to layered_dispatch_no_topo in the disable_topology case. Although this code doesn't run when loaded by BPF due to the global constant bool blocking it, it makes the functions really hard to parse as a human. As they diverge more and more it makes sense to split them into separate manageable functions. This is basically a mechanical change. I duplicated the existing function, replaced all `disable_topology` with true in `no_topo` and false in the existing function, then removed all branches which can't be hit. Test plan: - Runs on my dev box (6.9.0 fbkernel) with `scx_layered --run-example -n`. - As above with `-t`. - CI.	2024-10-09 13:33:06 +01:00
Daniel Hodges	2b5829e275	Merge pull request #763 from ryantimwilson/rusty-default-weights-fix [rusty] Fix load stats when host is under-utilized	2024-10-09 12:14:51 +00:00
likewhatevs	29bb3110ec	Merge pull request #765 from likewhatevs/update-dispatch scx_layered: enable configuring layer iteration when no topo	2024-10-09 06:22:40 +00:00
Pat Somaru	8e2f195af1	enable configuring layer iteration when no topo enable configuring layer iteration order in dispatch when topology is disabled. replace some member_vptr's in that iteration with regular accesses	2024-10-09 01:53:19 -04:00
Andrea Righi	e3e381dc8e	Merge pull request #755 from sched-ext/bpfland-prevent-kthread-stall scx_bpfland: prevent per-CPU DSQ stall with per-CPU kthreads	2024-10-09 05:28:59 +00:00
Ryan Wilson	fbdb6664ec	[rusty] Fix load stats when host is under-utilized	2024-10-08 21:08:07 -07:00
Pat Somaru	c90144d761	Revert "Merge pull request #746 from likewhatevs/layered-delay" This reverts commit `2077b9a799`, reversing changes made to `eb73005d07`.	2024-10-08 22:01:05 -04:00
Daniel Hodges	e6773d43b1	scx_layered: Make stress-ng non exclusive in example Test CI hosts are VMs currently and making stress-ng exclusive may starve the host. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-08 10:49:51 -07:00
Daniel Hodges	66f967c06d	Merge pull request #756 from hodgesds/layered-example-stress scx_layered: Add stress-ng example layer	2024-10-08 15:31:44 +00:00
likewhatevs	e1f6c792fe	Merge pull request #757 from JakeHillion/pr757 layered: cleanup warnings in bpf compilation	2024-10-08 15:29:12 +00:00

1129 Commits