`--disable-topology` currently defaults to `false` (topology enabled). Change
this so that topology awareness is enabled by default on hardware that may
benefit from it (multiple NUMA nodes or LLCs) and disabled on hardware that
does not benefit from it.
This is a slightly noisy change as we have to move ownership of the newly
mutable layer specs into the `Scheduler` object (previously they were a
borrow). We don't have a `Topology` object to make the default decision from
until `Scheduler::init`, which I think is because of the possibility of
hotplug. We therefore have to clone the `Vec<LayerSpec>` each time, as it is
potentially mutable.
Test plan:
- CI. Updated to be explicit about topology in both cases.
Single NUMA multi-LLC machine:
```
$ scx_layered --run-example
...
13:34:01 [INFO] Topology awareness not specified, selecting enabled based on
hardware
...
$ scx_layered --run-example --disable-topology=true
...
13:33:41 [INFO] Disabling topology awareness
...
$ scx_layered --run-example -t
...
13:33:15 [INFO] Disabling topology awareness
...
$ scx_layered --run-example --disable-topology=false
# none of the above messages present
```
Single NUMA single LLC machine:
```
$ scx_layered --run-example
15:33:10 [INFO] Topology awareness not specified, selecting disabled based on
hardware
```
In the WAKE_SYNC path, if L3 cache awareness is disabled (--disable-l3)
we may hit the following error:
Error: EXIT: scx_bpf_error (CPU L3 cpumask not initialized)
Fix this by setting the L3 cpumask to the whole primary domain if L3
cache awareness is disabled.
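A minimal sketch of the shape of the fix, assuming a per-CPU context that
holds a `bpf_cpumask` for the L3 domain (the field and helper names here are
illustrative, not the actual scx_bpfland code):
```c
/*
 * Illustrative only: when L3 awareness is disabled, reuse the whole
 * primary domain as the per-CPU L3 cpumask, so the WAKE_SYNC path never
 * touches an uninitialized mask.
 */
static void init_l3_cpumask(struct cpu_ctx *cctx,
			    const struct cpumask *l3_mask,
			    const struct cpumask *primary_mask,
			    bool l3_enabled)
{
	struct bpf_cpumask *mask = cctx->l3_cpumask;

	if (!mask)
		return;

	/* With --disable-l3, treat the whole primary domain as one L3. */
	bpf_cpumask_copy(mask, l3_enabled ? l3_mask : primary_mask);
}
```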
Tested-by: Eric Naim <dnaim@cachyos.org>
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Refactor the topology preemption logic so that the non-topology-aware code
is contained in a separate function. This should make maintaining the
non-topology-aware code path far easier.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Rename the `load_adj` statistic to `load_frac_adj`, which is a more
accurate representation of what the statistic is calculating. The
statistic is a fractional representation of the load of a layer adjusted
for infeasible weights.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Refactor layered_dispatch into two functions: layered_dispatch_no_topo and
layered_dispatch. layered_dispatch will delegate to layered_dispatch_no_topo in
the disable_topology case.
Although the dead code doesn't run when loaded by BPF, because the global
constant bool gates it out, it makes the functions really hard to parse as a
human. As they diverge more and more it makes sense to split them into
separate, manageable functions.
This is basically a mechanical change. I duplicated the existing function,
replaced all `disable_topology` with true in `no_topo` and false in the
existing function, then removed all branches which can't be hit.
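The resulting shape is roughly the following (a sketch only; the real
functions carry the full dispatch logic):
```c
/*
 * Sketch: the topology-unaware logic lives in its own function and the
 * main dispatch callback delegates to it when topology is disabled.
 */
static void layered_dispatch_no_topo(s32 cpu, struct task_struct *prev)
{
	/* ... flat, single-LLC dispatch path ... */
}

void BPF_STRUCT_OPS(layered_dispatch, s32 cpu, struct task_struct *prev)
{
	if (disable_topology) {
		layered_dispatch_no_topo(cpu, prev);
		return;
	}

	/* ... topology-aware dispatch path ... */
}
```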
Test plan:
- Runs on my dev box (6.9.0 fbkernel) with `scx_layered --run-example -n`.
- As above with `-t`.
- CI.
Clang is correctly warning that we use various uninitialised variables. Clean
these up so real errors are easier to read.
The largest change here is to the non-topological layered_dispatch. The
matching_dsq logic seems to be incorrect: it checks whether an uninitialised
variable is 0, sets it if so, then only uses the variable if the value is 0.
I have changed this to default to -1, and to use the value only once it is no
longer -1.
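The pattern of the fix, in the abstract (`matching_dsq` is from the
description above; the iteration and the predicate are purely illustrative):
```c
/*
 * Sketch: default the DSQ id to -1 instead of relying on an
 * uninitialised value, and only consume from it once something has
 * actually set it.
 */
s64 matching_dsq = -1;
u32 idx;

bpf_for(idx, 0, nr_layers) {
	if (!dsq_matches(idx))	/* hypothetical predicate */
		continue;
	matching_dsq = idx;
	break;
}

if (matching_dsq >= 0)
	scx_bpf_consume(matching_dsq);
```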
Since per-CPU kthreads may show an inconsistent prev_cpu and/or cpumask,
dispatch them directly to the local DSQ and allow them to preempt the
currently running task.
This helps prevent per-CPU kthread stalls and also helps prioritize them, as
they are usually important for system performance and responsiveness.
Moreover, change the behavior of --local-kthreads to prioritize all
kthreads when this option is used.
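The core of the idea looks roughly like the following (a sketch assuming the
`scx_bpf_dispatch()` kfunc naming and a `local_kthreads` BPF-side variable
mirroring the --local-kthreads option; the real enqueue path carries more
logic):
```c
/*
 * Sketch: per-CPU kthreads (and all kthreads when --local-kthreads is
 * set) go straight to the local DSQ and may preempt the running task.
 */
void BPF_STRUCT_OPS(bpfland_enqueue, struct task_struct *p, u64 enq_flags)
{
	if ((p->flags & PF_KTHREAD) &&
	    (local_kthreads || p->nr_cpus_allowed == 1)) {
		scx_bpf_dispatch(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL,
				 enq_flags | SCX_ENQ_PREEMPT);
		return;
	}

	/* ... regular enqueue path ... */
}
```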
This addresses issue #728.
NOTE: ideally we may want to fix this in the kernel by making sure to
always expose a consistent prev_cpu and cpumask also for kthreads, but
at the moment this change helps prevent some annoying stalls and,
performance-wise, it doesn't seem to introduce any regression. In fact,
the usual gaming/fps benchmarks show even a slight improvement in
responsiveness with this change applied.
Thanks to YUBY from the CachyOS community for all the extremely valuable
help with the intensive stress tests.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Add doc comment to `CpuPool` as a quick reference for each member.
Most importantly, differentiate "cpu" and "core", as logical core and
physical core, respectively.
Signed-off-by: Ming Yang <minos.future@gmail.com>
When hotplugging CPUs in rapid succession, scx_rusty would crash with:
```
scx_bpf_error (Failed to lookup dom[4294967295]
```
The root cause is if the scheduler is restarted fast enough, a task
on a previously hotplugged CPU may not have moved off that CPU yet.
Thus, the CPU -> domain map would contain an invalid domain (u32::max)
and we would fail to lookup the domain correctly in rusty_select_cpu
for prev_cpu.
To fix this, if the CPU is offline, we do not try to allocate within the
same NUMA node (assuming hotplug is a rare operation) beyond the domestic
domain. Instead we use greedy allocation: first idle, then busy, then any
CPU.
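A sketch of the fallback in rusty_select_cpu (the real code works at the
domain level, and the `online_cpumask` name is an assumption):
```c
/*
 * Sketch: if prev_cpu was hotplugged out, its domain id may be invalid
 * (u32::MAX), so skip the NUMA-local search and fall back to a greedy
 * pick: an idle CPU first, then any allowed CPU.
 */
if (!bpf_cpumask_test_cpu(prev_cpu, online_cpumask)) {
	s32 cpu = scx_bpf_pick_idle_cpu(p->cpus_ptr, 0);

	if (cpu < 0)
		cpu = scx_bpf_pick_any_cpu(p->cpus_ptr, 0);
	return cpu;
}
```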
Update the idle topology selection order. The current logic is:
core architecture (big/little) -> LLC -> NUMA -> Machine
It's probably better to try to keep cache lines clean and do:
LLC -> core architecture (big/little) -> NUMA -> Machine
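As a sketch (the helpers and masks below are illustrative, not the actual
scx_layered growth code), the new order tries the shared-LLC mask first and
widens out from there:
```c
/* Sketch: widen the idle search from LLC to core type, node, machine. */
static s32 pick_idle_ordered(const struct cpumask *llc_mask,
			     const struct cpumask *big_mask,
			     const struct cpumask *node_mask,
			     const struct cpumask *machine_mask)
{
	s32 cpu;

	if ((cpu = scx_bpf_pick_idle_cpu(llc_mask, 0)) >= 0)
		return cpu;
	if ((cpu = scx_bpf_pick_idle_cpu(big_mask, 0)) >= 0)
		return cpu;
	if ((cpu = scx_bpf_pick_idle_cpu(node_mask, 0)) >= 0)
		return cpu;
	return scx_bpf_pick_idle_cpu(machine_mask, 0);
}
```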
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Improve the performance of the non-topology-aware paths by skipping some map
lookups and unnecessary initializations.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add support for layer configuration for idle CPU selection. This allows
layers to choose whether or not to restrict idle CPU selection to SMT
idle CPUs.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
In the non-topology-aware code the idle SMT mask is used for finding idle
CPUs. Update topology-aware idle selection to also use the idle SMT mask. In
certain benchmarks this can improve performance.
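For reference, the idle SMT mask can be consulted as in the sketch below
(not the actual scx_layered selection code): a CPU only counts as a
preferred pick when its whole core is idle.
```c
/* Sketch: report whether all SMT siblings of @cpu are currently idle. */
static bool core_fully_idle(s32 cpu)
{
	const struct cpumask *idle_smtmask = scx_bpf_get_idle_smtmask();
	bool idle = bpf_cpumask_test_cpu(cpu, idle_smtmask);

	scx_bpf_put_idle_cpumask(idle_smtmask);
	return idle;
}
```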
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Add big cpumask to scx_layered and prefer selecting big idle cores when
using the BigLittle growth algo.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
In lowlatency mode (option --lowlatency) tasks are ordered using a
deadline that is evaluated as the vruntime minus a certain "bonus",
determined as a function of the max time slice and the average amount of
voluntary context switches, to amplify the priority boost of the tasks
that are voluntarily releasing the CPU (which are typically
interactive).
However, this method can be extremely unfair in some cases: tasks with
short bursts of voluntary context switches may receive a huge priority
boost, making the rest of the system almost unresponsive (see massive
hackbench stress tests for example).
To prevent this, rework the task's deadline logic to use the vruntime and
a "deadline component" that is a function of the average used time
slice, scaled using a dynamic task priority (evaluated from the static
task priority and its average amount of voluntary context switches).
This logic seems to prevent excessive prioritization of tasks performing
short intensive bursts of voluntary context switches.
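In rough terms (the field names and the scaling below are illustrative, not
the exact scx_bpfland implementation), the new ordering behaves like:
```c
/*
 * Sketch: the deadline grows with the task's average used time slice
 * and shrinks as its dynamic priority (static weight boosted by the
 * average number of voluntary context switches) increases. No unbounded
 * nvcsw-based bonus is subtracted from the vruntime anymore.
 */
static u64 task_deadline(struct task_struct *p, struct task_ctx *tctx)
{
	/* Dynamic priority: static weight plus the voluntary switch rate. */
	u64 lat_weight = p->scx.weight + tctx->avg_nvcsw;

	/* Deadline component: average used slice scaled by dynamic priority. */
	u64 deadline_comp = tctx->avg_runtime * 100 / lat_weight;

	return tctx->vruntime + deadline_comp;
}
```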
This also makes lowlatency mode in scx_bpfland (somewhat) more similar to
the deadline logic used by scx_rusty.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>