JakeHillion/scx

mirror of https://github.com/JakeHillion/scx.git synced 2024-11-25 02:50:24 +00:00

Author	SHA1	Message	Date
Jake Hillion	fb442630d0	tests: add integration testing framework	2024-10-25 11:25:12 +01:00
Jake Hillion	6216a4b3b1	Merge pull request #826 from JakeHillion/pr826 layered: bpf: add layer kind to layer	2024-10-21 10:47:39 +00:00
Jake Hillion	55c9636f78	layered: bpf: add layer kind to layer Currently we have an approximation of LayerKind in the BPF code with `open` on the layer, but it is difficult/impossible to tell the difference between an Open and a Grouped layer. Add a `kind` field to the BPF `layer` and plumb through an enum from the Rust side.	2024-10-21 11:32:17 +01:00
Andrea Righi	fb3f1d0b43	Merge pull request #821 from sched-ext/rustland-min-vtime-budget scx_rustland: Adjust task's vruntime budget based on latency weight	2024-10-20 07:44:35 +00:00
Changwoo Min	bf1b014d63	Merge pull request #818 from multics69/lavd-tuning scx_lavd: add missing reset_lock_futex_boost()	2024-10-20 01:41:54 +00:00
Daniel Hodges	e72e5ce0f4	Merge pull request #744 from minosfuture/main scx_layered: Fix crash on aarch64 due to unavailable cache id file	2024-10-19 22:33:53 +00:00
Andrea Righi	30a2a2013c	scx_rustland: Adjust task's vruntime budget based on latency weight Adjust the amount of vruntime budget an idle task can accumulate as a function of its latency weight, which is derived from the average number of voluntary context switches. This ensures that latency-sensitive tasks naturally receive an additional priority boost and we can avoid scaling down the vruntime to determine the task's deadline, making the scheduler more fair. It also makes the scheduler more robust: rustland can now survive intensive stress tests, such as `stress-ng --cpu-sched 64` or hackbench. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-19 19:32:14 +02:00
Daniel Hodges	8f3b75acb9	Merge pull request #820 from hodgesds/rusty-cleanup scx_rusty: Cleanup cpumask casting	2024-10-19 16:12:11 +00:00
Daniel Hodges	b1b76ee72a	scx_rusty: Cleanup cpumask casting Use the cast_mask helper function to clean up scx_rusty. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-19 12:01:36 -04:00
Changwoo Min	2fd395bbbf	scx_lavd: remove unnecessary load tracking The algorithm has been evolved to decide the time slice without tracking the system-wide load. So remove the obsolete load tracking code. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-19 15:39:24 +09:00
Changwoo Min	8d63024be7	scx_lavd: add missing reset_lock_futex_boost() reset_lock_futex_boost() should be called on every context switch of a task. Otherwise, in the worst case, a task and that CPU could block the preemption. To avoid such a situation, add missing reset_lock_futex_boost() calls. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-10-19 15:39:18 +09:00
Ming Yang	f3f4726c09	scx_layered: Read CPU topology for building CpuPool Building CpuPool from cache-cpu topology did not apply on arm, because `/sys/devices/system/cpu/cpu{}/cache/index{}/id` file is unavailable. Read CPU topology instead. Signed-off-by: Ming Yang <minos.future@gmail.com>	2024-10-17 23:41:08 -07:00
Andrea Righi	f37bc0db7f	Merge pull request #813 from sched-ext/bpfland-lowlatency-rework scx_bpfland: rework lowlatency mode	2024-10-17 19:56:00 +00:00
Andrea Righi	48bbcd24dd	scx_bpfland: tune default settings Adjust some default settings after the rework done with commit 112a5d4 ("scx_bpfland: rework lowlatency mode to adjust tasks priority"). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-17 21:46:51 +02:00
Andrea Righi	4d68133f3b	scx_bpfland: rework lowlatency mode to adjust tasks priority Rework lowlatency mode as follows: - introduce task dynamic priority: task weight multiplied by the average number of voluntary context switches - use dynamic priority to determine task's vruntime (instead of the static task's weight) - task's minimum vruntime is evaluated as a function of the dynamic priority (tasks with a higher dynamic priority can have a smaller vruntime compared to tasks with a lower dynamic priority) The dynamic priority maintains good system responsiveness even without classifying tasks as "interactive" or "regular"; therefore, in lowlatency mode only the shared DSQ will be used (priority DSQ is disabled). Using a separate priority queue to dispatch "interactive" tasks makes the scheduler less fair, allowing latency-sensitive tasks to be prioritized even when there is a high number of tasks in the system (e.g., `stress-ng -c 1024` or similar scenarios), where relying solely on dynamic priority may not be sufficient. On the other hand, disabling the classification of "interactive" tasks results in a fairer scheduler and more predictable performance, making it better suited for soft real-time applications (e.g., audio and multimedia). Therefore, the --lowlatency option is retained to allow users to choose between more predictable performance (by disabling the interactive task classification) or a more responsive system (default). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-17 21:46:51 +02:00
Andrea Righi	d336892c71	Merge pull request #816 from sched-ext/rustland-core-update-doc scx_rustland_core: update documentation about the new API	2024-10-17 19:18:16 +00:00
likewhatevs	9a65fea75e	Merge pull request #817 from likewhatevs/fix-ci remove apt fast from ci setup	2024-10-17 17:42:20 +00:00
Pat Somaru	d944a39a7f	remove apt fast from ci setup remove apt fast from ci setup to reduce non-core dependencies	2024-10-17 13:08:03 -04:00
Andrea Righi	a155ff2ada	scx_rustland_core: update documentation about the new API Update the documentation adding the new task statistics provided by scx_rustland_core. Fixes: `be681c7` ("scx_rustland_core: pass nvcsw, slice and dsq_vtime to user-space") Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-17 19:07:51 +02:00
Jake Hillion	f1b1830512	Merge pull request #814 from JakeHillion/pr814 layered: add RandomTopo layer growth algorithm	2024-10-17 17:05:53 +00:00
Jake Hillion	abc202c972	Merge pull request #815 from JakeHillion/pr815 layered: make disable_topology arg require equals	2024-10-17 17:05:51 +00:00
Jake Hillion	1415b4a454	layered: make disable_topology arg require equals The recent changes to `disable_topology` making the arg an `Option<bool>` instead of a `bool` caused an issue with it incorrectly attaching arguments. Make the argument `require_equals` to fix this case. This is a behaviour change for anybody previously relying on `-t true`, `-t false`, `--disable-topology true`, or `--disable-topology false`. The equals syntax worked before and continues to work after, as demonstrated in the CI. Test plan: Before: ```sh $ sudo target/release/scx_layered -t f:/tmp/test.json error: invalid value 'f:/tmp/test.json' for '--disable-topology [<DISABLE_TOPOLOGY>]' [possible values: true, false] For more information, try '--help'. ``` After: ```sh $ sudo target/release/scx_layered -t f:/tmp/test.json 14:44:00 [INFO] CPUs: online/possible=176/176 nr_cores=88 14:44:00 [INFO] Disabling topology awareness ... ^CEXIT: Scheduler unregistered from user space ```	2024-10-17 15:46:30 +01:00
Jake Hillion	a0fe303b61	layered: add RandomTopo layer growth algorithm Add an additional layer growth algorithm, named 'RandomTopo'. It follows these rules: - Randomise NUMA nodes. List each core in each NUMA node before a core from another NUMA node. - Randomise LLCs within each NUMA node. List each core in each LLC before a core in a different LLC. - Randomise the core order within each LLC. This attempts to provide a relatively evenly distributed set of cores while considering topology. Unlike `Topo`, it does not require you to specify the ordering and instead generates it from the hardware, making desyncs between the config and the hardware less likely. Currently `RandomTopo` considers topology even with `--disable-topology=true`. I can see the arguments for this going both ways. On one hand requesting disable topology suggests you want no consideration of machine topology, and `RandomTopo` should decay to `Random` (which it does on single node/LLC machines anyway). On the other hand, the config explicitly specifies `RandomTopo` and should consider the topology. If anyone feels strongly I can change this to respect `disable_topology`. Test plan: ```sh $ sudo target/release/scx_layered -v f:/tmp/test.json ... 
14:31:19 [DEBUG] layer: batch algo: RandomTopo core order: [47, 44, 43, 42, 40, 45, 46, 41, 38, 37, 36, 39, 34, 32, 35, 33, 54, 49, 50, 52, 51, 48, 55, 53, 68, 64, 66, 67, 70, 69, 71, 65, 9, 10, 12, 15, 14, 11, 8, 13, 59, 60, 57, 63, 62, 56, 58, 61, 2, 3, 5, 4, 0, 6, 7, 1, 86, 83, 85, 87, 84, 81, 80, 82, 20, 22, 19, 23, 21, 18, 17, 16, 30, 25, 26, 31, 28, 27, 29, 24, 78, 73, 74, 79, 75, 77, 76, 72] 14:31:19 [DEBUG] layer: immediate algo: RandomTopo core order: [45, 40, 46, 42, 47, 43, 41, 44, 80, 82, 83, 84, 85, 86, 81, 87, 13, 10, 9, 15, 14, 12, 11, 8, 36, 38, 39, 32, 34, 35, 33, 37, 7, 3, 1, 0, 2, 5, 4, 6, 53, 52, 54, 48, 50, 49, 55, 51, 76, 77, 79, 78, 73, 74, 72, 75, 71, 66, 64, 67, 70, 69, 65, 68, 24, 26, 31, 25, 28, 30, 27, 29, 58, 56, 59, 61, 57, 62, 60, 63, 16, 19, 17, 23, 22, 20, 18, 21] ... ``` This is a machine with 1 NUMA/11 LLCs with 8 cores per LLC and you can see the results are grouped by LLC but random within.	2024-10-17 15:36:00 +01:00
Daniel Hodges	b01ff79080	Merge pull request #805 from hodgesds/layered-refresh-cleanup scx_layered: Refactor refresh cpumasks	2024-10-16 19:06:15 +00:00
Andrea Righi	2ea47af4bc	Merge pull request #804 from sched-ext/rustland-fixes scx_rustland fixes and improvements	2024-10-16 18:26:03 +00:00
Tejun Heo	58093eace5	Merge pull request #809 from sched-ext/htejun/revert-arch-vmlinux_h Revert #793	2024-10-16 16:52:02 +00:00
Tejun Heo	84d8abf913	Revert "Use per-arch vmlinux.h" This reverts commit `a23f3566e3`.	2024-10-16 06:42:28 -10:00
Tejun Heo	bd79059f1a	Revert "Add vmlinux.h for multiple arch" This reverts commit `7067092555`.	2024-10-16 06:42:18 -10:00
Dan Schatzberg	730052a0c4	Merge pull request #803 from dschatzberg/mitosis_fallback_dsq scx_mitosis: Handle pinned tasks	2024-10-16 13:26:23 +00:00
Andrea Righi	763da6ab55	scx_rlfifo: operate in a more work-conserving way Make scx_rlfifo even simpler and keep dispatching tasks even if the CPUs are all busy. This allows to better stress test the scx_rustland_core backend, by using both the per-CPU DSQs and the global shared DSQ. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:06:00 +02:00
Andrea Righi	b07de1d7d5	scx_rustland: clarify EDF scheduling scx_rustland is now effectively a deadline-based scheduler and not a pure vruntime-based scheduler. Clarify this in the source code. No functional change. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:06:00 +02:00
Andrea Righi	c4b6408e92	scx_rustland: smooth vruntime update Update vruntime adding the used virtual time slice of each task as soon as they are scheduled. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:06:00 +02:00
Andrea Righi	0b2de2c10c	scx_rustland: use built-in nvcsw metrics Use the nvcsw metric from the scx_rustland_core backend, instead of retrieving this metric in user-space via procfs. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:06:00 +02:00
Andrea Righi	97629178e2	scx_rustland_core: bump up version to 2.2.2 Bump up the minor version to reflect the new backward-compatible functionality added. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:06:00 +02:00
Andrea Righi	704fe95f51	scx_rustland_core: get rid of the SCX_ENQ_WAKEUP logic With user-space scheduling we don't usually dispatch a task immediately after selecting an idle CPU, so there's not much benefit in trying to optimize the WAKE_SYNC scenario (when a task is waking up another task and releasing the CPU) when picking an idle CPU. Therefore, get rid of the WAKE_SYNC logic in select_cpu() and rely on the user-space logic (that has access to the WAKE_SYNC information) to handle this particular case. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:05:58 +02:00
Andrea Righi	67ec1af5cf	scx_rustland_core: kick an idle CPU after global dispatch Do not kick a CPU from rs_select_cpu() (called by the user-space scheduler), since we may not immediately dispatch the task. Instead, always try to wake up the task's assigned CPU after dispatching to a global DSQ, ensuring it can be consumed immediately. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 14:05:33 +02:00
Andrea Righi	0a05f1f193	scx_rustland_core: keep CPUs alive with pending tasks Prevent CPUs from going idle when the user-space scheduler has some pending activities to complete. Keeping the CPU alive allows to consume tasks from the user-space scheduler more efficiently, preventing bubbles in the scheduling pipeline. To achieve this, trigger a CPU kick from ops.update_idle() and set a flag in the CPU context to prevent it from going idle. Then keep kicking the CPU from ops.dispatch() until the flag is cleared, which occurs when no more tasks are pending or when the CPU exits idle as a task starts running on it. This allows to fix the performance regression introduced by the put_prev_task_scx() behavior change in Linux 6.12 (see #788). Link: https://lore.kernel.org/lkml/20241015111539.12136-1-andrea.righi@linux.dev/ Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-16 10:43:43 +02:00
Daniel Hodges	907746745e	scx_layered: Refactor refresh cpumasks Refactor the logic for refresh cpumasks to be easy to read and verify. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-15 17:58:10 -07:00
Andrea Righi	abfb4c53f5	scx_rustland_core: restart scheduler on hotplug events User-space schedulers may still hit some stalls during CPU hotplugging events. There is no reason to overcomplicate the code by trying to handle hotplug events within the scx_rustland_core framework; we can simply handle a scheduler restart performed by the scx core. This makes CPU hotplugging more reliable with scx_rustland_core-based schedulers. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-15 23:11:43 +02:00
Andrea Righi	4432e64d85	scx_rustland_core: allow user-space scheduler to run indefinitely Assign an infinite time slice to the user-space scheduler itself, so that it can completely drain all the pending tasks and voluntarily release the CPU when it's done. This allows to achieve more consistent performance and we can also remove the speculative user-space scheduler wakeup from ops.stopping(). Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-15 23:11:43 +02:00
Andrea Righi	be681c731a	scx_rustland_core: pass nvcsw, slice and dsq_vtime to user-space Provide additional task metrics to user-space schedulers via QueuedTask: - nvcsw: total amount of voluntary context switches - slice: task time slice "budget" (from p->scx.slice) - dsq_vtime: current task vtime (from p->scx.dsq_vtime) In this way user-space schedulers can quickly access these metrics to implement better scheduling policy. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-15 23:11:43 +02:00
Andrea Righi	1bbae64dc7	scx_rustland_core: update CPU idle selection logic Re-align idle selection logic with some of the latest improvements done in scx_bpfland. Signed-off-by: Andrea Righi <andrea.righi@linux.dev>	2024-10-15 23:11:42 +02:00
Tejun Heo	4841df8138	Merge pull request #793 from minosfuture/vmlinux_per_arch Use per-arch vmlinux.h	2024-10-15 19:52:42 +00:00
Dan Schatzberg	96ebe6b84a	scx_mitosis: Handle pinned tasks Pinned tasks should just be routed to a fallback DSQ. kthreads are given a higher priority than non-kthreads so use two fallback DSQs. Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-10-15 09:09:01 -07:00
Dan Schatzberg	902f41adf0	Merge pull request #799 from dschatzberg/mitosis_dispatch_no_wakeup scx_mitosis: handle enqueue() on !wakeup	2024-10-15 13:46:07 +00:00
Daniel Hodges	e017692697	Merge pull request #801 from hodgesds/layered-iter-fix scx_layered: Remove layer iteration	2024-10-14 23:39:48 +00:00
Daniel Hodges	71d63010af	scx_layered: Refactor layer iteration Remove DSQ iter algos. Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>	2024-10-14 13:13:53 -07:00
Dan Schatzberg	a17f16e4b9	scx_mitosis: handle enqueue() on !wakeup If we're not on the wakeup path, we may see enqueue() invoked without select_cpu() which will require an idle cpu lookup. In order to fix this, we refactor the idle_cpu lookup in select_cpu so it can be invoked from enqueue(). Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-10-14 10:13:07 -07:00
Daniel Hodges	7bfbc71012	Merge pull request #798 from sched-ext/hodgesds-perfetto-docs Update developer guide with Perfetto info	2024-10-14 15:09:15 +00:00
Daniel Hodges	43615107f9	Merge pull request #797 from hodgesds/layered-llc-integration scx_layered: Add LLC integration test	2024-10-14 15:06:25 +00:00
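
The three `RandomTopo` rules listed in commit a0fe303b61 (randomise NUMA nodes, then LLCs within each node, then cores within each LLC) can be sketched roughly as follows. This is a minimal illustration, not the scx_layered implementation: the `Node`/`Llc` nested-vector topology model, the `random_topo_order` function, its `seed` parameter, and the small `XorShift64` generator are all hypothetical names introduced here so the example needs no external crates.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
/// Hypothetical helper; real code would more likely use the `rand` crate.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Fisher-Yates shuffle in place.
    fn shuffle<T>(&mut self, items: &mut [T]) {
        for i in (1..items.len()).rev() {
            let j = (self.next() % (i as u64 + 1)) as usize;
            items.swap(i, j);
        }
    }
}

/// Hypothetical topology model: NUMA nodes -> LLCs -> core IDs.
type Llc = Vec<usize>;
type Node = Vec<Llc>;

/// Produce a core order following the three RandomTopo rules.
fn random_topo_order(mut nodes: Vec<Node>, seed: u64) -> Vec<usize> {
    // The xorshift state must be non-zero, so clamp the seed.
    let mut rng = XorShift64(seed.max(1));

    // Rule 1: randomise NUMA nodes; all cores of a node stay together.
    rng.shuffle(&mut nodes);

    let mut order = Vec::new();
    for mut node in nodes {
        // Rule 2: randomise LLCs within the node; all cores of an LLC stay together.
        rng.shuffle(&mut node);
        for mut llc in node {
            // Rule 3: randomise the core order within the LLC.
            rng.shuffle(&mut llc);
            order.extend(llc);
        }
    }
    order
}
```

Calling `random_topo_order` with, say, two nodes of two four-core LLCs each yields a permutation of all sixteen core IDs in which every run of four belongs to one LLC and every run of eight to one node, which matches the "grouped by LLC but random within" shape visible in the debug output quoted in that commit.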


2013 Commits