JakeHillion/scx

mirror of https://github.com/JakeHillion/scx.git synced 2024-11-26 19:30:24 +00:00

Author	SHA1	Message	Date
Tejun Heo	d7677e3e5c	scx/common.bpf.h: Rename bpf_log2[l]() to u32/64_log2() The bpf_ prefix is used for BPF API. Rename bpf_log2() to u32_log2() and bpf_log2l() to u64_log2(). While at it, relocate them below compiler directive helpers.	2024-06-14 15:22:39 -10:00
Changwoo Min	94a39f419f	scx_lavd: add the design of core compaction The core compaction seems to work great in various hardware. Now it is time to document its design. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-14 11:53:52 +09:00
Changwoo Min	747bf2a7d7	scx_lavd: add the design of CPU frequency scaling Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-13 01:42:19 +09:00
Changwoo Min	2e74b86b4a	scx_lavd: logging cpu performance target Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-13 00:44:04 +09:00
Changwoo Min	e6348a11e9	scx_lavd: improve frequency scaling logic The old logic for CPU frequency scaling is that the task's CPU performance target (i.e., target CPU frequency) is checked every tick interval and updated immediately. Indeed, it samples and updates a performance target every tick interval. Ultimately, it fluctuates CPU frequency every tick interval, resulting in less steady performance. Now, we take a different strategy. The key idea is to increase the frequency as soon as possible when a task starts running for quick adaptation to load spikes. However, if necessary, it decreases gradually every tick interval to avoid frequency fluctuations. In my testing, it shows more stable performance in many workloads (games, compilation). Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-12 23:40:40 +09:00
Changwoo Min	753f333c09	scx_lavd: refactoring do_update_sys_stat() Originally, do_update_sys_stat() simply calculated the system-wide CPU utilization. Over time, it has evolved to collect all kinds of system-wide, periodic statistics for decision-making, so it has become bulky. Now, it is time to refactor it for readability. This commit does not contain functional changes other than refactoring. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-12 21:15:25 +09:00
Changwoo Min	9d129f0afa	scx_lavd: rename LAVD_CPU_UTIL_INTERVAL_NS to LAVD_SYS_STAT_INTERVAL_NS The periodic CPU utilization routine does a lot of other work now. So we rename LAVD_CPU_UTIL_INTERVAL_NS to LAVD_SYS_STAT_INTERVAL_NS. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-12 20:06:17 +09:00
Changwoo Min	7046b47b9c	scx_lavd: properly calculate task's runtime after suspend/resume When a device is suspended and resumed, the suspended duration is added to a task's runtime if the task was running on the CPU. After the resume, the task's runtime is incorrectly long and the scheduler starts to recognize the system as being under heavy load. To avoid such a problem, the suspended duration is measured and subtracted from the task's runtime. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-12 15:58:41 +09:00
Tejun Heo	30f27d99d9	Merge pull request #340 from sched-ext/htejun/layered-updates scx_layered: Improve yield, preemption and other behaviors	2024-06-10 11:27:44 -10:00
Tejun Heo	92317aa2f9	Use __always_inline uniformly Instead of using __attribute__((always_inline)) use the __always_inline macro provided by BPF.	2024-06-10 11:23:26 -10:00
Changwoo Min	472ab945b8	scx_lavd: core compaction for low power consumption (#338) When system-wide CPU utilization is low, it is very likely all the CPUs are running with very low utilization. That means all CPUs run with low clock frequency thanks to dynamic frequency scaling and very frequently enter and exit C-states. That results in low performance (i.e., low clock frequency) and high power consumption (i.e., frequent P-/C-state transition). The idea of core compaction is using fewer CPUs when system-wide CPU utilization is low. The chosen cores (called "active cores") will run in higher utilization and higher clock frequency, and the rest of the cores (called "idle cores") will be in a C-state for a much longer duration. Thus, the core compaction can achieve higher performance with lower power consumption. One potential problem of core compaction is latency spikes when all the active cores are overloaded. A few techniques are incorporated to solve this problem. 1) Limit the active CPU core's utilization below a certain limit (say 50%). 2) Do not use the core compaction when the system-wide utilization is moderate (say 50%). 3) Do not enforce the core compaction for kernel and pinned user-space tasks since they are manually optimized for performance. In my experiments, under a wide range of system-wide CPU utilization (5% to 80%), the core compaction reduces power consumption by 7-30% without sacrificing average and 99p tail latency. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-06-08 09:25:27 +09:00
Tejun Heo	e556dd375d	scx: Unify loading and running boilerplate across rust schedulers Make restart handling with user_exit_info simpler and use the load and report macros consistently across the rust schedulers. This makes all schedulers automatically handle auto restarts from CPU hotplug events. Note that this is necessary even for scx_lavd, which has CPU hotplug operations, as CPU hotplug operations which took place between skel open and scheduler init can still trigger restart.	2024-06-03 12:25:41 -10:00
Tejun Heo	a2d5310cb6	Bump versions for a release	2024-06-03 08:35:21 -10:00
I Hsin Cheng	0921fde1f1	scx_lavd: Adding READ_ONCE()/WRITE_ONCE() macros In order to prevent the compiler from merging or refetching load/store operations or reordering them in unwanted ways, we take the implementation of READ_ONCE()/WRITE_ONCE() from kernel sources under "/include/asm-generic/rwonce.h". Use WRITE_ONCE() in function flip_sys_cpu_util() to ensure the compiler doesn't perform unnecessary optimization, so the compiler won't make incorrect assumptions when performing the bit-flipping operation. Signed-off-by: I Hsin Cheng <richard120310@gmail.com>	2024-06-01 11:07:52 +08:00
Changwoo Min	4c0f996ddc	Revert "scx_lavd: Enforce memory barrier in flip_sys_cpu_util"	2024-05-27 12:19:21 +09:00
I Hsin Cheng	f839106a57	scx_lavd: Enforce memory barrier in flip_sys_cpu_util Use the GNU built-in __sync_fetch_and_xor() to perform the XOR operation on global variable "__sys_cpu_util_idx" to ensure the operation's visibility. The built-in function "__sync_fetch_and_xor()" provides both an atomic operation and a full memory barrier, which is needed by every operation (especially store operation) on global variables. Signed-off-by: I Hsin Cheng <richard120310@gmail.com>	2024-05-26 15:27:10 +08:00
David Vernet	17c0c10b4e	Merge pull request #294 from sched-ext/fix_warnings Fix warnings	2024-05-18 10:47:54 -05:00
Changwoo Min	4cba06dc33	scx_lavd: fix inconsistent indentation in main.bpf.c Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-18 22:22:16 +09:00
David Vernet	a1c60ce589	lavd: Remove unused variables from scx_lavd Fix unused variable warnings. Signed-off-by: David Vernet <void@manifault.com>	2024-05-18 07:51:20 -05:00
Tejun Heo	ab25992416	Add missing skel.attach() calls C SCX_OPS_ATTACH() and rust scx_ops_attach() macros were not calling .attach() and were only attaching the struct_ops. This meant that all non-struct_ops BPF programs contained in the skels were never attached which breaks e.g. scx_layered. Let's fix it by adding .attach() invocation to the attach macros.	2024-05-17 14:33:04 -10:00
I Hsin Cheng	6cce01c66b	Avoid redundant subtraction in rsigmoid_u64 Originally, the implementation of function rsigmoid_u64 performed a subtraction even when the value of "v" equals the value of "max", in which case the result is certainly zero. We can avoid this redundant subtraction by changing the condition from ">" to ">=", since when the values of "v" and "max" are equal we can return 0 without any subtraction.	2024-05-16 11:58:39 +08:00
vax-r	f293995b59	Fix typo Fix the usage of "scheduler" in the comment of main.bpf.c; it should be a verb, which is "schedule".	2024-05-15 23:02:35 +08:00
Changwoo Min	08e7e23cbe	scx_lavd: print out the current limitation of scx_lavd for users Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-15 12:04:09 +09:00
Changwoo Min	a4560c7f7f	scx_lavd: add comments describing the idea of preemption Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-15 12:04:03 +09:00
Changwoo Min	446de3ef3c	scx_lavd: minor style changes Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-10 11:07:32 +09:00
Changwoo Min	7fcc6e4576	scx_lavd: support yield-based preemption If there is a higher priority task when running ops.tick(), ops.select_cpu(), and ops.enqueue() callbacks, the currently running task yields its CPU by shrinking its time slice to zero so that a higher priority task can run on the current CPU. As low-cost, fine-grained preemption becomes available, default parameters are adjusted as follows: - Raise the bar for remote CPU preemption to avoid IPIs. - Increase the maximum time slice. - Gradually enforce the fair use of CPU time (i.e., ineligible duration) Lastly, using CAS, we ensure that a remote CPU is preempted by only one CPU. This removes unnecessary remote preemptions (and IPIs). Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-10 00:54:41 +09:00
Changwoo Min	01e5a46371	Merge pull request #263 from multics69/scx_lavd-power01 scx_lavd: support CPU frequency scaling	2024-05-05 10:16:00 +09:00
Changwoo Min	a24e1d7adf	scx_lavd: more comments about CPU frequency scaling Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-04 10:41:13 +09:00
David Vernet	9bb8e9a548	common: Pull bpf_log2l() into helper function header scx_lavd implemented 32 and 64 bit versions of a base-2 logarithm function. This is now also used in rusty. To avoid code duplication, let's pull it into a shared header. Note that there is technically a functional change here as we remove the always inline compiler directive. We instead assume that the compiler will know best whether or not to inline the function. Signed-off-by: David Vernet <void@manifault.com>	2024-05-03 14:50:24 -05:00
Changwoo Min	6892898469	scx_lavd: support CPU frequency scaling To know the required CPU performance (e.g., frequency) demand, we keep track of 1) utilization of each CPU and 2) _performance criticality_ of each task. The performance criticality of a task denotes how critical it is to CPU performance (frequency). Like the notion of latency criticality, we use three factors: the task's average runtime, wake-up frequency, and waken-up frequency. The longer a task's runtime and the higher its two frequencies, the more performance-critical the task is, because it would be a bottleneck in the middle of the task chain. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-05-04 00:30:25 +09:00
Tejun Heo	e5e88b7e18	Bump versions to prepare for a release	2024-04-29 09:07:27 -10:00
Tejun Heo	3e7ef35649	Merge pull request #250 from multics69/lavd-issue-234 scx_lavd: replenish time slice at ops.running() only when necessary	2024-04-29 09:01:04 -10:00
Tejun Heo	5b7b7d5193	Merge pull request #247 from multics69/lavd-issue-244 scx_lavd: always inline submit_task_ctx to make the verifier happy	2024-04-29 07:53:38 -10:00
Changwoo Min	5f63e0ca30	scx_lavd: replenish time slice at ops.running() only when necessary The current code replenishes the task's time slice whenever the task becomes ops.running(). However, there is a case where such behavior can starve the other tasks, causing the watchdog timeout error. One (if not all) such case is when a task is preempted while running by the higher scheduler class (e.g., RT, DL). In such a case, the task will transition through a cycle of ops.running() -> ops.stopping() -> ops.running() -> etc. Whenever it resumes running, it will be placed at the head of local DSQ and ops.running() will renew its time slice. Hence, in the worst case, the task can run forever since its time slice is never exhausted. The fix is assigning the time slice only once by checking if the time slice is calculated before. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-04-29 12:13:31 +09:00
Andrea Righi	cabde30736	scx_utils: bump up version to 0.8.0 Bump up scx-utils version to provide the new scx_utils::TopologyMap. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-04-28 21:01:16 +02:00
Andrea Righi	905960f752	scx_lavd: use c_char consistently In Rust c_char can be aliased to i8 or u8, depending on the particular target architecture. For example, trying to build scx_lavd on ppc64 triggers the following error: error[E0308]: mismatched types --> src/main.rs:200:38 | 200 | let c_tx_cm: *const c_char = (&tx.comm as *const [i8; 17]) as *const i8; | ------------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `*const u8`, found `*const i8` | | | expected due to this | = note: expected raw pointer `*const u8` found raw pointer `*const i8` To fix this, consistently use c_char instead of assuming it corresponds to i8. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-04-27 17:21:19 +02:00
Changwoo Min	f470b1aa13	scx_lavd: always inline submit_task_ctx to make the verifier happy In _some_ kernel versions, loading scx_lavd fails with an error of "bpf_rcu_read_unlock is missing". The usage of bpf_rcu_read_lock/unlock() in proc_dump_all_tasks() is correct but the bpf verifier still thinks bpf_rcu_read_unlock() is missing. The most plausible reason so far is that the problematic kernel does not have commit 6fceea0fa59f ("bpf: Transfer RCU lock state between subprog calls"), failing inter-procedural analysis between proc_dump_all_tasks() and submit_task_ctx(). Thus, we force inline submit_task_ctx() (no inter-procedural analysis by the verifier is necessary) for the time being. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-04-28 00:11:38 +09:00
Changwoo Min	d0d0a18b10	scx_lavd: fix copyright information Correct the copyright and author information Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-04-26 16:36:58 +09:00
takase1121	5d20f89a87	scheds-rust: build rust schedulers in sequence	2024-04-23 08:06:27 +08:00
David Vernet	45589cd0f7	lavd: Fix a few typos Noticed a few typos. Let's fix em up Signed-off-by: David Vernet <void@manifault.com>	2024-04-17 08:17:52 -05:00
Changwoo Min	f53c29759e	scx_lavd: support preemption (in some scenarios) (#224 ) * scx-lavd: preemption of a lower-priority task using kick cpu When a task is enqueued to the global queue, the scheduler checks if there is a lower priority task than the enqueued task. If so, it kicks out the lower-priority task, hoping the newly enqueued task or another higher-priority task runs on the kicked CPU. Kicking another CPU is expensive as an IPI is involved, so the scheduler judiciously kicks the CPU when its benefit (i.e., priority gap) is clear enough. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-04-09 14:25:53 +09:00
Tejun Heo	ba52cc131b	scx_lavd: Add .gitignore	2024-04-04 07:15:37 -10:00
Tejun Heo	a60737a6bf	Merge pull request #207 from sched-ext/api-updates scx: Apply API updates from sched_ext	2024-04-02 14:26:42 -10:00
Tejun Heo	b925bdf94d	Cargo.toml: Update libbpf-rs/cargo dependencies to 0.23 and drop patch.crates-io sections New versions of libbpf-rs and libbpf-cargo are now available with all the needed features. Update the dependencies and drop the patch sections.	2024-04-02 11:19:39 -10:00
Tejun Heo	6f81409df4	Bump versions - scx_utils bumped from 0.6.0 to 0.7.0. - Repo and rust schedulers get a PATCH level bump.	2024-04-02 10:58:50 -10:00
Tejun Heo	dfa978d166	scx_lavd: Apply API updates	2024-04-02 10:08:02 -10:00
Tejun Heo	59bbd800c1	compat: Implement scx_utils::compat and fix up scx_layered Implement scx_utils::compat to match C's scx/compat.h and update scx_layered. Other rust scheds are still broken.	2024-04-02 07:08:56 -10:00
Changwoo Min	3a3bd2a750	scx_lavd: increase the upper bound of ineligible duration Change the upper bound of ineligible duration (LAVD_ELIGIBLE_TIME_MAX). The updated (2x increased) upper bound reflects the distribution of tasks' eligible_delta_ns better. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-03-30 22:59:06 +09:00
Changwoo Min	8efaf0c4c2	scx_lavd: improve the accuracy of task's run_freq Change the calculation of the run_frequence using the wait_period from the last time the task yielded CPU to this time when the task is running. The old implementation measures the time interval between the last stopping and the current running and increases run_freq without reason. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-03-30 22:55:17 +09:00
Changwoo Min	fe3efb8ce2	scx_lavd: rename last_{start/stop/wait/wake}_clk for consistency Change the last_{start/stop/wait/wake}_clk in task_ctx to last_{running/stopping/quiescent/runnable}_clk, matching with state transition names. In addition, add comments and reorder fields in task_ctx for readability. Signed-off-by: Changwoo Min <changwoo@igalia.com>	2024-03-30 10:13:20 +09:00

83 Commits