JakeHillion/scx

mirror of https://github.com/JakeHillion/scx.git synced 2024-12-01 21:37:12 +00:00

Author	SHA1	Message	Date
Changwoo Min	3924ebaa4d	scx_lavd: properly synchronize taskc->vdeadline_log_clk Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-20 01:41:29 +09:00
Changwoo Min	02ad43d116	scx_lavd: directly use p->scx.weight instead of load_ideal Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-20 00:25:11 +09:00
Changwoo Min	c955caefd8	scx_lavd: drop sys_load_factor In theory, sys_load_factor should not be necessary since we do not stretch the time space anymore. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-20 00:10:29 +09:00
Changwoo Min	67a6deb983	scx_lavd: use lat_cri instead of lat_prio universally Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-19 23:56:51 +09:00
Changwoo Min	6f10d6907c	scx_lavd: drop sched_prio_to_slice_weight[] table Use p->scx.weight instead. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-19 22:39:01 +09:00
Changwoo Min	034303f00f	scx_lavd: consider starvation factor in determining latency criticality Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-19 22:17:50 +09:00
Changwoo Min	99e0d21c3c	scx_lavd: drop the runtime factor in calculating latency criticality That is okay since the runtime is considered in calculating a virtual deadline. A shorter runtime will result in a tighter deadline linearly. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-19 17:28:40 +09:00
Changwoo Min	b90599e967	scx_lavd: do not inherit parent's properties If it inherits the parent's properties, a newly forked task tends to be over-prioritized. That is, many parent processes, such as `make`, are a bit more latency-critical than average. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-19 15:29:13 +09:00
Changwoo Min	78d96a6fb6	scx_lavd: advance clock in inverse proportion to the system load Advancing the clock slower when overloaded gives more opportunities for latency-critical tasks to cut in the run queue. Controlling the clock better reflects the actual load than the prior approach of stretching the time-space when overloaded. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-18 15:53:38 +09:00
Changwoo Min	9bc20f9160	scx_lavd: maintain ineligible runnable tasks separately We now maintain two run queues—an eligible run queue (DSQ) and an ineligible run queue (rbtree)—sorted by the task's virtual deadline. When the eligible run queue is empty, or the ineligible run queue has not been consumed for too long (e.g., 15 msec), a task in the ineligible run queue is moved to the eligible run queue for execution. With these two queues, we have a better admission control. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-17 23:46:11 +09:00
Changwoo Min	55e19ea5df	scx_lavd: do not prioritize a wake-up task in ops.select_cpu() This is a prep for adding an ineligible DSQ. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-17 11:16:02 +09:00
Changwoo Min	c84b73e971	scx_lavd: rename LAVD_GLOBAL_DSQ to LAVD_ELIGIBLE_DSQ This is a prep to add a global ineligible dsq. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-17 10:34:34 +09:00
Changwoo Min	971bb2e024	scx_lavd: pretty formatting for ineligible duration Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-16 23:54:15 +09:00
Changwoo Min	adfbf3934c	scx_lavd: tuning the max ineligible duration Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-16 23:52:23 +09:00
Changwoo Min	eff444516f	scx_lavd: directly measure service time for eligibility enforcement Estimating the service time from run time and frequency is not incorrect. However, it reacts slowly to sudden changes since it relies on the moving average. Hence, we directly measure the service time to enforce fairness. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-16 23:48:26 +09:00
Tejun Heo	51334b5c4d	Bump versions for 1.0.1 release	2024-07-15 13:21:52 -10:00
Andrea Righi	8e7a526356	scx_bpfland: use nr_cpu_ids for consistency We always use nr_cpu_ids to represent the maximum CPU id returned by scx_bpf_nr_cpu_ids(). Replace cpu_max with nr_cpu_ids to be more consistent with the rest of the code. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-15 08:44:35 +02:00
Andrea Righi	33d06f653b	scx_bpfland: get rid of the MAX_CPUS hard-coded limit We can rely on scx_bpf_nr_cpu_ids() to create all the possible per-CPU DSQs, eliminating the need for the hard-coded limit MAX_CPUS. In this way scx_bpfland can support the same amount of CPUs that the kernel can handle. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-15 00:17:30 +02:00
Andrea Righi	b80ef7d8eb	scx_bpfland: optimize offline CPU handling Instead of constantly checking the need to drain tasks from the DSQs of the offline CPUs, provide an atomic flag to notify when there are tasks to be drained from the offline CPUs. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-15 00:17:23 +02:00
Andrea Righi	0530706710	scx_bpfland: properly initialize the nvcsw metrics Initialize the number of voluntary context switches metrics in the local task storage. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-15 00:16:10 +02:00
Andrea Righi	bf4ad23599	scx_bpfland: refine interactive tasks flood safeguard Refine the safeguard mechanism to avoid generating too many interactive tasks in the system, which could nullify the effect of the interactive/regular task classification. The safeguard mechanism operates by pausing the promotion of new tasks to interactive status during the task wake-up process, whenever the number of interactive tasks in the priority queue exceeds a specific limit (set to 4x the number of online CPUs). Halting the promotion of additional interactive tasks allows the scheduler to prioritize those already classified as interactive, thereby preventing potential "bursts" of excessive interactive tasks in the system. This refines the mitigation already provided by commit `640bd562` ("scx_bpfland: prevent tasks from abusing interactive priority boost"). Fixes: `640bd562` ("scx_bpfland: prevent tasks from abusing interactive priority boost") Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-15 00:11:34 +02:00
Andrea Righi	eb1cf0e670	scx_bpfland: improve task time slice evaluation Always assign the maximum time slice if there are idle CPUs in the system. Otherwise, double the task's unused time slice to reward tasks that use less CPU time and at the same time refill the time slice of the tasks every time they're dispatched. Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-14 23:24:24 +02:00
Tejun Heo	3ae76acd12	Merge pull request #424 from sched-ext/sync-upstream-kernel-and-bump-to-1.0 Sync to upstream kernel and bump to 1.0	2024-07-14 07:00:38 -10:00
Changwoo Min	5b2112dd81	Merge pull request #421 from multics69/lavd-metrics scx_lavd: improve time slice and waker freq calculation	2024-07-14 18:49:36 +09:00
Tejun Heo	761ec142ce	Bump most versions to 1.0.0 sched_ext is about to be merged upstream. There are some compatibility breaking changes and we're making the current sched_ext/for-6.11 1edab907b57d ("sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues") the baseline. Tag everything except scx_mitosis as 1.0.0. As scx_mitosis is still in early development and is currently temporarily disabled, only the patchlevel is bumped.	2024-07-12 11:34:14 -10:00
Tejun Heo	54c487731a	Update to vmlinux-v6.10-rc2-g1edab907b57d.h Sync to vmlinux.h from sched_ext/for-6.11 1edab907b57d ("sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues"). This most likely will be the commit which will be merged during the upcoming kernel v6.11 merge window. Unfortunately, this is a compatibility breaking change. As the size of bpf_iter_scx_dsq is reduced, schedulers that use the iterator - scx_lavd and scx_layered - won't be able to run on older kernels. Likewise, older binaries from before this commit won't be able to run on newer kernels.	2024-07-12 11:13:34 -10:00
Tejun Heo	f261d0f037	Sync from kernel - 1edab907b57d Sync from sched_ext/for-6.11 1edab907b57d ("sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues") git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git for-6.11 - cgroup support hasn't landed in the upstream kernel yet. This most likely will happen in a few weeks. For the time being, disable scx_flatcg, scx_pair and scx_mitosis. - Compat macro for DSQ task iterator dropped. This is now a part of the baseline. - scx_bpf_consume() isn't upstream yet. BPF interfacing side is still being discussed. Dropped example usage from tools/sched_ext. None of the practical schedulers use it, so this should be fine for now. - scx_bpf_cpu_rq() added. - AUTOATTACH workaround for newer libbpf versions added.	2024-07-12 11:08:41 -10:00
Changwoo Min	512bd143a5	scx_lavd: count only related tasks in calculating waker_freq A task can become runnable in any task's context, not only its waker's. Thus, we should not count wake-ups that happen in an unrelated task's context. With this commit, the scheduler can (much more) accurately detect waker-wakee relationships. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-12 22:51:09 +09:00
Changwoo Min	95733f63ab	scx_lavd: calculate time slice as a function of run queue length The prior approach using the sum of weights gives too much penalty to nice tasks with large nice values. With this commit, the time slice is determined by the number of runnable tasks regardless of nice priority. Note that the fairness will still be enforced based on tasks' nice priorities (weights). Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-12 22:45:22 +09:00
Changwoo Min	00fdc1d949	Merge pull request #417 from multics69/lavd-vdeadline scx_lavd: improve virtual deadline and current clock handling	2024-07-12 14:05:44 +09:00
Changwoo Min	d4bc92bea7	scx_lavd: print lat_cri to output Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-12 13:23:56 +09:00
Changwoo Min	4c5c564523	scx_lavd: initialize current logical clock to zero To make it easy to distinguish, let's initialize the current logical clock to zero (not the current physical time). Also, avoid the deadline calculation being zero by adding +1 here and there. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-12 10:15:54 +09:00
Andrea Righi	640bd562ff	scx_bpfland: prevent tasks from abusing interactive priority boost The priority boost for interactive tasks can be exploited to render the system nearly unresponsive by creating numerous tasks that constantly switch between wait/wakeup states. For example, stress tests like `hackbench -l 10000` can significantly degrade system responsiveness. To mitigate this, limit the number of interactive tasks added to the priority queue to 4x the number of online CPUs. This simple approach appears to be quite effective at identifying potential spam of "fake" interactive tasks, while still prioritizing legitimate interactive tasks. Additionally, periodically refresh the interactive status of the tasks based on their most recent average of voluntary context switches, preventing the interactive status from being too "sticky". Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 16:13:55 +02:00
Andrea Righi	1babb2b92d	scx_bpfland: prevent per-CPU kthreads starving other tasks Avoid dispatching per-CPU kthreads directly, since this may cause interactivity problems or unfairness, for example if there are too many softirqs being scheduled (e.g., in the presence of high RX network traffic or when running certain stress tests, like hackbench). Moreover, in order to help with testing and benchmarks, introduce the --local-kthread option, which restores the old behavior when enabled. Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 16:13:48 +02:00
Andrea Righi	c3ebdd338f	scx_bpfland: prevent slice delta overflow When updating the task vruntime, ensure the time slice delta is always a positive value. Failing to do so may cause the global vruntime to increase excessively due to overflows. Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 15:58:01 +02:00
Andrea Righi	f59aa52fe7	scx_bpfland: expose the number of online CPUs Periodically report the number of online CPUs to stdout. The online CPUs are initially evaluated by looking at the online cpumask, then the value is updated in the .cpu_offline() / .cpu_online() callbacks. Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 15:58:01 +02:00
Andrea Righi	3a47b484af	scx_bpfland: report interactive tasks to stdout Keep track of the CPUs that are running interactive tasks and report their number to stdout. Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 15:58:01 +02:00
Andrea Righi	1a1a16b9e9	scx_bpfland: fix typo in slice_ns definition The correct default value of slice_ns is 5ms, not 5s. This change doesn't really make any difference in practice, since these values are changed by the Rust part when the scheduler is started, but it's good to keep this aligned to the proper values for consistency. Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com>	2024-07-11 15:58:01 +02:00
Changwoo Min	bdbfeb9fd1	scx_lavd: use logical current clock for virtual deadlines This commit changes the use of a physical clock to a virtual, logical clock in calculating deadlines. - The virtual current clock advances upon a task's running to its virtual deadline. - When enqueuing a task, its virtual deadline from the virtual current clock is calculated. With the above two changes, this guarantees that there is no such task whose virtual deadline is smaller than the virtual current clock. This means any enqueuing task can compete with any other already enqueued tasks. This allows a latency-critical task to be immediately scheduled if needed. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 22:41:56 +09:00
Changwoo Min	408ea7892c	scx_lavd: induce sched_prio_to_latency_weight from slice weight So sched_prio_to_latency_weight is removed. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:37:21 +09:00
Changwoo Min	bd964acff6	scx_lavd: deprioritize a newly forked task in latency Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:36:32 +09:00
Changwoo Min	48debe416e	scx_lavd: tuning the deadline equation under high load Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:35:54 +09:00
Changwoo Min	c72e063680	scx_lavd: do not use lat_prio_to_greedy_thresholds With other optimizations, lat_prio_to_greedy_thresholds is not effective any more. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:35:01 +09:00
Changwoo Min	9ed488798e	scx_lavd: use task's runtime to determine its deadline It has the effect of further preferring shorter jobs. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:34:25 +09:00
Changwoo Min	e081b2a294	scx_lavd: rename LAVD_MAX_CAS_RETRY to LAVD_MAX_RETRY Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-07-11 21:33:56 +09:00
Andrea Righi	995577762a	scx_bpfland: refill task time slice Every time we need to dispatch a task, re-evaluate its time slice as: (unused_time_slice + min_time_slice) / 2. This refills the time slice for tasks that haven't used much of their previously assigned time, improving fairness. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-07-06 14:07:24 +02:00
Andrea Righi	6a64182ef2	scx_bpfland: always classify interactive tasks Make sure to always classify interactive tasks, even when the system is not fully utilized. This ensures that if the system suddenly becomes overloaded, we already know which tasks need to be dispatched to the priority DSQ. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-07-06 14:07:24 +02:00
Andrea Righi	8dd528abfd	scx_bpfland: pass enqueue flags when dispatching kthreads Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-07-06 14:07:10 +02:00
Andrea Righi	fc0d1bd003	Merge pull request #415 from sched-ext/bpfland-output scx_bpfland: additional stats and output improvements	2024-07-05 19:50:07 +02:00
Tejun Heo	af5e89e73c	Merge pull request #412 from vax-r/flatcg_delta_fetch scx_flatcg: Make good use of __sync_fetch_and_sub()	2024-07-05 07:39:22 -10:00
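
The slice refill described in commit 995577762a can be illustrated with a small Python sketch. This is only arithmetic on the formula quoted in the commit message, not the scheduler's BPF code; the function name `refill_slice` and the nanosecond units are assumptions made for illustration.

```python
def refill_slice(unused_slice_ns: int, min_slice_ns: int) -> int:
    """Return the mean of the unused slice and the minimum slice.

    Mirrors the expression from the commit message:
    (unused_time_slice + min_time_slice) / 2
    Integer division is used here because slices are whole nanoseconds
    (an assumption of this sketch).
    """
    return (unused_slice_ns + min_slice_ns) // 2


# A task that left 4 ms of its slice unused, with a 1 ms minimum slice.
print(refill_slice(4_000_000, 1_000_000))
# A task that consumed its whole slice gets half of the minimum slice.
print(refill_slice(0, 1_000_000))
```

Under this formula, a task that left more of its previous slice unused is assigned a larger slice the next time it is dispatched.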


601 Commits
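
The two-queue admission control described in commit 9bc20f9160 can be sketched roughly as follows. This is a hedged Python model, not the scx_lavd implementation: the class `TwoQueues`, its method names, the use of heaps in place of the DSQ and rbtree, and the way the "not consumed for too long" timer is tracked are all assumptions. Only the 15 msec figure and the two promotion conditions come from the commit message.

```python
import heapq
from dataclasses import dataclass, field

# Example threshold taken from the commit message ("e.g., 15 msec").
STARVATION_LIMIT_NS = 15_000_000


@dataclass
class TwoQueues:
    # Both queues are min-heaps of (virtual_deadline, task_name).
    eligible: list = field(default_factory=list)
    ineligible: list = field(default_factory=list)
    # Last time a task was taken from the ineligible queue (assumed bookkeeping).
    last_ineligible_pick_ns: int = 0

    def enqueue(self, task: str, vdeadline: int, is_eligible: bool) -> None:
        queue = self.eligible if is_eligible else self.ineligible
        heapq.heappush(queue, (vdeadline, task))

    def pick_next(self, now_ns: int):
        # Promote one ineligible task when the eligible queue is empty or
        # the ineligible queue has not been consumed for too long.
        starved = now_ns - self.last_ineligible_pick_ns >= STARVATION_LIMIT_NS
        if self.ineligible and (not self.eligible or starved):
            heapq.heappush(self.eligible, heapq.heappop(self.ineligible))
            self.last_ineligible_pick_ns = now_ns
        if not self.eligible:
            return None
        # Run the eligible task with the earliest virtual deadline.
        return heapq.heappop(self.eligible)[1]
```

Note that in this sketch a promoted task still competes by virtual deadline inside the eligible queue, so it is not necessarily the next task to run.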