scx-upstream

mirror of https://github.com/sched-ext/scx.git synced 2024-12-12 03:34:22 +00:00

Author	SHA1	Message	Date
Tejun Heo	25d7e6f787	scx_layered: Implement on-demand statistics generation Instead of keeping one copy of sched_stats, each stats server session carries their own so that stats can be generated independently by each client at any interval. CPU allocation min/max tracking is broken for now.	2024-08-19 08:27:36 -10:00
Tejun Heo	0b9c8b5cbd	scx_stats: Update versions to 0.2.0 to republish	2024-08-15 12:29:27 -10:00
Tejun Heo	751a38e34e	scx_layered: Refactor stats printing code	2024-08-15 08:53:19 -10:00
Tejun Heo	8aae9a5de2	scx_stats: s/scx_stat/scx_stats/ Use plural form which is more widespread and also used in scheduler implementations. No functional changes.	2024-08-15 05:31:34 -10:00
Tejun Heo	6e466d18df	scx_layered: Initial switch to scx_stat - This makes the scheduler side simpler and allows on-demand monitoring. - OpenMetrics support is dropped for now. Will add a generic tool for it. - This is a naive conversion. Will be further refined. scx_layered no longer prints statistics by default. To watch statistics, run `scx_layered --monitor` while the scheduler is running.	2024-08-14 13:48:41 -10:00
Tejun Heo	63c4a0191f	Merge branch 'main' into topic/inlined-skeleton-members	2024-08-08 14:23:37 -10:00
Tejun Heo	cd6a4d72c7	Bump versions for 1.0.2 release	2024-08-08 14:10:16 -10:00
Tejun Heo	7c3ffe96e1	Unify crate dependency versions Different sub-projects are using different versions for the same crates. Synchronize them to the latest.	2024-08-08 13:26:47 -10:00
Daniel Müller	565aec3662	rust: Update libbpf-rs & libbpf-cargo to 0.24 Update libbpf-rs & libbpf-cargo to 0.24. Among other things, generated skeletons now contain directly accessible map and program objects, no longer necessitating the use of accessor methods. As a result, the risk for mutability conflicts is reduced greatly. Signed-off-by: Daniel Müller <deso@posteo.net>	2024-07-16 11:48:52 -07:00
Tejun Heo	51334b5c4d	Bump versions for 1.0.1 release	2024-07-15 13:21:52 -10:00
Tejun Heo	761ec142ce	Bump most versions to 1.0.0 sched_ext is about to be merged upstream. There are some compatibility breaking changes and we're making the current sched_ext/for-6.11 1edab907b57d ("sched_ext/scx_qmap: Pick idle CPU for direct dispatch on !wakeup enqueues") the baseline. Tag everything except scx_mitosis as 1.0.0. As scx_mitosis is still in early development and is currently temporarily disabled, only the patchlevel is bumped.	2024-07-12 11:34:14 -10:00
Tejun Heo	a2d5310cb6	Bump versions for a release	2024-06-03 08:35:21 -10:00
Tejun Heo	e5e88b7e18	Bump versions to prepare for a release	2024-04-29 09:07:27 -10:00
Andrea Righi	cabde30736	scx_utils: bump up version to 0.8.0 Bump up scx-utils version to provide the new scx_utils::TopologyMap. Signed-off-by: Andrea Righi <andrea.righi@canonical.com>	2024-04-28 21:01:16 +02:00
Tejun Heo	b925bdf94d	Cargo.toml: Update libbpf-rs/cargo dependencies to 0.23 and drop patch.crates-io sections New versions of libbpf-rs and libbpf-cargo are now available with all the needed features. Update the dependencies and drop the patch sections.	2024-04-02 11:19:39 -10:00
Tejun Heo	6f81409df4	Bump versions - scx_utils bumped from 0.6.0 to 0.7.0. - Repo and rust schedulers get a PATCH level bump.	2024-04-02 10:58:50 -10:00
Tejun Heo	59bbd800c1	compat: Implement scx_utils::compat and fix up scx_layered Implement scx_utils::compat to match C's scx/compat.h and update scx_layered. Other rust scheds are still broken.	2024-04-02 07:08:56 -10:00
Jordan Rome	1b3a9a1e72	[scx_layered] downgrade prometheus-client This library at version 22 is not available in fedora: https://src.fedoraproject.org/rpms/rust-prometheus-client Rather than bothering the maintainer, let's just downgrade here.	2024-01-31 04:36:01 -08:00
Tejun Heo	988b7d13c1	Bump versions scx_exit_info change doesn't require code to be updated but breaks binary compatbility. Bump versions and cut a new release.	2024-01-25 09:01:23 -10:00
Dan Schatzberg	7f9548eb34	scx_layered: Add support for OpenMetrics format Currently scx_layered outputs statistics periodically as info! logs. The format of this is largely unstructured and mostly suitable for running scx_layered interactively (e.g. observing its behavior on the command line or via logs after the fact). In order to run scx_layered at larger scale, it's desireable to have statistics output in some format that is amenable to being ingested into monitoring databases (e.g. Prometheseus). This allows collection of stats across many machines. This commit adds a command line flag (-o) that outputs statistics to stdout in OpenMetrics format instead of the normal log mechanism. OpenMetrics has a public format specification (https://github.com/OpenObservability/OpenMetrics) and is in use by many projects. The library for producing OpenMetrics metrics is lightweight but does induce some changes. Primarily, metrics need to be pre-registered (see OpenMetricsStats::new()). Without -o, the output looks as before, for example: ``` 19:39:54 [INFO] CPUs: online/possible=52/52 nr_cores=26 19:39:54 [INFO] Layered Scheduler Attached 19:39:56 [INFO] tot= 9912 local=76.71 open_idle= 0.00 affn_viol= 2.63 tctx_err=0 proc=21ms 19:39:56 [INFO] busy= 1.3 util= 65.2 load= 263.4 fallback_cpu= 1 19:39:56 [INFO] batch : util/frac= 49.7/ 76.3 load/frac= 252.0: 95.7 tasks= 458 19:39:56 [INFO] tot= 2842 local=45.04 open_idle= 0.00 preempt= 0.00 affn_viol= 0.00 19:39:56 [INFO] cpus= 2 [ 0, 2] 04000001 00000000 19:39:56 [INFO] immediate: util/frac= 0.0/ 0.0 load/frac= 0.0: 0.0 tasks= 0 19:39:56 [INFO] tot= 0 local= 0.00 open_idle= 0.00 preempt= 0.00 affn_viol= 0.00 19:39:56 [INFO] cpus= 50 [ 0, 50] fbfffffe 000fffff 19:39:56 [INFO] normal : util/frac= 15.4/ 23.7 load/frac= 11.4: 4.3 tasks= 556 19:39:56 [INFO] tot= 7070 local=89.43 open_idle= 0.00 preempt= 0.00 affn_viol= 3.69 19:39:56 [INFO] cpus= 50 [ 0, 50] fbfffffe 000fffff 19:39:58 [INFO] tot= 7091 local=84.91 open_idle= 0.00 affn_viol= 2.64 tctx_err=0 proc=21ms 19:39:58 [INFO] busy= 0.6 util= 31.2 load= 107.1 fallback_cpu= 1 19:39:58 [INFO] batch : util/frac= 18.3/ 58.5 load/frac= 93.9: 87.7 tasks= 589 19:39:58 [INFO] tot= 2011 local=60.67 open_idle= 0.00 preempt= 0.00 affn_viol= 0.00 19:39:58 [INFO] cpus= 2 [ 2, 2] 04000001 00000000 19:39:58 [INFO] immediate: util/frac= 0.0/ 0.0 load/frac= 0.0: 0.0 tasks= 0 19:39:58 [INFO] tot= 0 local= 0.00 open_idle= 0.00 preempt= 0.00 affn_viol= 0.00 19:39:58 [INFO] cpus= 50 [ 50, 50] fbfffffe 000fffff 19:39:58 [INFO] normal : util/frac= 13.0/ 41.5 load/frac= 13.2: 12.3 tasks= 650 19:39:58 [INFO] tot= 5080 local=94.51 open_idle= 0.00 preempt= 0.00 affn_viol= 3.68 19:39:58 [INFO] cpus= 50 [ 50, 50] fbfffffe 000fffff ^C19:39:59 [INFO] EXIT: BPF scheduler unregistered ``` With -o passed, the output is in OpenMetrics format: ``` 19:40:08 [INFO] CPUs: online/possible=52/52 nr_cores=26 19:40:08 [INFO] Layered Scheduler Attached # HELP total Total scheduling events in the period. # TYPE total gauge total 8489 # HELP local % that got scheduled directly into an idle CPU. # TYPE local gauge local 86.45305689716104 # HELP open_idle % of open layer tasks scheduled into occupied idle CPUs. # TYPE open_idle gauge open_idle 0.0 # HELP affn_viol % which violated configured policies due to CPU affinity restrictions. # TYPE affn_viol gauge affn_viol 2.332430203793144 # HELP tctx_err Failures to free task contexts. # TYPE tctx_err gauge tctx_err 0 # HELP proc_ms CPU time this binary has consumed during the period. # TYPE proc_ms gauge proc_ms 20 # HELP busy CPU busy % (100% means all CPUs were fully occupied). # TYPE busy gauge busy 0.5294061026085283 # HELP util CPU utilization % (100% means one CPU was fully occupied). # TYPE util gauge util 27.37195512782239 # HELP load Sum of weight * duty_cycle for all tasks. # TYPE load gauge load 81.55024768702126 # HELP layer_util CPU utilization of the layer (100% means one CPU was fully occupied). # TYPE layer_util gauge layer_util{layer_name="immediate"} 0.0 layer_util{layer_name="normal"} 19.340849995024997 layer_util{layer_name="batch"} 8.031105132797393 # HELP layer_util_frac Fraction of total CPU utilization consumed by the layer. # TYPE layer_util_frac gauge layer_util_frac{layer_name="batch"} 29.34063385422595 layer_util_frac{layer_name="immediate"} 0.0 layer_util_frac{layer_name="normal"} 70.65936614577405 # HELP layer_load Sum of weight * duty_cycle for tasks in the layer. # TYPE layer_load gauge layer_load{layer_name="immediate"} 0.0 layer_load{layer_name="normal"} 11.14363313258934 layer_load{layer_name="batch"} 70.40661455443191 # HELP layer_load_frac Fraction of total load consumed by the layer. # TYPE layer_load_frac gauge layer_load_frac{layer_name="normal"} 13.664744680306903 layer_load_frac{layer_name="immediate"} 0.0 layer_load_frac{layer_name="batch"} 86.33525531969309 # HELP layer_tasks Number of tasks in the layer. # TYPE layer_tasks gauge layer_tasks{layer_name="immediate"} 0 layer_tasks{layer_name="normal"} 490 layer_tasks{layer_name="batch"} 343 # HELP layer_total Number of scheduling events in the layer. # TYPE layer_total gauge layer_total{layer_name="normal"} 6711 layer_total{layer_name="batch"} 1778 layer_total{layer_name="immediate"} 0 # HELP layer_local % of scheduling events directly into an idle CPU. # TYPE layer_local gauge layer_local{layer_name="batch"} 69.79752530933632 layer_local{layer_name="immediate"} 0.0 layer_local{layer_name="normal"} 90.86574281031143 # HELP layer_open_idle % of scheduling events into idle CPUs occupied by other layers. # TYPE layer_open_idle gauge layer_open_idle{layer_name="immediate"} 0.0 layer_open_idle{layer_name="batch"} 0.0 layer_open_idle{layer_name="normal"} 0.0 # HELP layer_preempt % of scheduling events that preempted other tasks. # # TYPE layer_preempt gauge layer_preempt{layer_name="normal"} 0.0 layer_preempt{layer_name="batch"} 0.0 layer_preempt{layer_name="immediate"} 0.0 # HELP layer_affn_viol % of scheduling events that violated configured policies due to CPU affinity restrictions. # TYPE layer_affn_viol gauge layer_affn_viol{layer_name="normal"} 2.950379973178364 layer_affn_viol{layer_name="batch"} 0.0 layer_affn_viol{layer_name="immediate"} 0.0 # HELP layer_cur_nr_cpus Current # of CPUs assigned to the layer. # TYPE layer_cur_nr_cpus gauge layer_cur_nr_cpus{layer_name="normal"} 50 layer_cur_nr_cpus{layer_name="batch"} 2 layer_cur_nr_cpus{layer_name="immediate"} 50 # HELP layer_min_nr_cpus Minimum # of CPUs assigned to the layer. # TYPE layer_min_nr_cpus gauge layer_min_nr_cpus{layer_name="normal"} 0 layer_min_nr_cpus{layer_name="batch"} 0 layer_min_nr_cpus{layer_name="immediate"} 0 # HELP layer_max_nr_cpus Maximum # of CPUs assigned to the layer. # TYPE layer_max_nr_cpus gauge layer_max_nr_cpus{layer_name="immediate"} 50 layer_max_nr_cpus{layer_name="normal"} 50 layer_max_nr_cpus{layer_name="batch"} 2 # EOF ^C19:40:11 [INFO] EXIT: BPF scheduler unregistered ``` Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>	2024-01-25 09:59:49 -08:00
Jordan Rome	9f9a97a97f	Update descriptions in cargo toml files	2024-01-19 18:19:46 -08:00
Tejun Heo	942b0269b8	Bump versions After updates to reflect the updated init and direct dispatch API, the schedulers aren't compatible with older kernels. Bump versions and publish releases.	2024-01-08 18:49:54 -10:00
Jordan Rome	661ea57c5c	bump scx_rusty and scx_layered These were supposed to be bumped in this commit: `fed1dae9da`	2024-01-04 13:57:29 -08:00
Jordan Rome	e9a9d32ab6	Restructure scheds folder names - combine c and kernel-examples as it's confusing to have both - rename 'rust-user' and 'c-user' to just 'rust' and 'c', which is simpler - update and fix sync-to-kernel.sh	2023-12-17 13:14:31 -08:00

24 Commits