Daniel Hodges
05a2721f8e
Merge pull request #510 from hodgesds/layered-core-topo-selection
...
scx_layered: Use topology for core selection
2024-08-19 20:01:16 -04:00
Tejun Heo
695a33cdcc
Merge pull request #517 from sched-ext/htejun/fix
...
scx_layered: Fix verification failure
2024-08-19 13:44:38 -10:00
Tejun Heo
d01b49bd0e
scx_layered: Fix verification failure
...
4fccc06905
("scx_layered: Fix uninitialized variable") causes the
following verification failure. Fix it by moving assignments below range
checking.
Validating match_layer() func#1...
283: R1=scalar() R2=scalar() R3=mem_or_null(id=49,sz=1) R10=fp0
; int match_layer(u32 layer_id, pid_t pid, const char *cgrp_path) @ main.bpf.c:1029
283: (7b) *(u64 *)(r10 -24) = r3 ; R3=mem_or_null(id=49,sz=1) R10=fp0 fp-24_w=mem_or_null(id=49,sz=1)
284: (bc) w7 = w1 ; R1=scalar() R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
; struct layer *layer = &layers[layer_id]; @ main.bpf.c:1033
285: (bc) w1 = w7 ; R1_w=scalar(id=50,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R7_w=scalar(id=50,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
286: (27) r1 *= 1061192 ; R1_w=scalar(smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
287: (18) r8 = 0xffffc90002a26000 ; R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080)
289: (0f) r8 += r1 ; R1_w=scalar(smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
; u32 nr_match_ors = layer->nr_match_ors; @ main.bpf.c:1034
290: (bf) r1 = r8 ; R1_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8)) R8_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
291: (07) r1 += 1060992 ; R1_w=map_value(map=bpf_bpf.bss,ks=4,vs=16979080,off=0x103080,smin=0,smax=umax=0x103147ffefceb8,smax32=0x7ffffff8,umax32=0xfffffff8,var_off=(0x0; 0x1ffffffffffff8))
292: (61) r1 = *(u32 *)(r1 +0)
R1 unbounded memory access, make sure to bounds check any such access
processed 1099 insns (limit 1000000) max_states_per_insn 2 total_states 72 peak_states 72 mark_read 9
-- END PROG LOAD LOG --
2024-08-19 13:18:20 -10:00
Tejun Heo
c0b4deb9ec
Merge pull request #516 from sched-ext/htejun/scx_stats
...
scx_stats/scripts/scxstats_to_openmetrics: Retry connection
2024-08-19 13:02:22 -10:00
Tejun Heo
4e859d067e
scx_stats/scripts/scxstats_to_openmetrics: Retry connection
...
It now retries until told to exit. This is a bit easier to use and matches
`scx_layered --monitor`.
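
The retry-until-told-to-exit behavior could look roughly like the sketch below. It is not the script's actual code: `connect_with_retry`, the `stop` event, the retry interval and the injectable `connect` callable are all names assumed here for illustration.

```python
import socket
import threading


def connect_with_retry(path, stop, interval=1.0, connect=None):
    """Keep trying to connect until it succeeds or `stop` is set.

    `connect` is a hypothetical hook so the loop can be exercised without
    a real server; by default it opens a UNIX stream socket at `path`.
    """
    if connect is None:
        def connect(p):
            sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                sock.connect(p)
            except OSError:
                sock.close()
                raise
            return sock

    while not stop.is_set():
        try:
            return connect(path)
        except OSError:
            # Server not there yet (or gone): wait, but wake early on stop.
            stop.wait(interval)
    return None
```

A caller would set `stop` from a signal handler or another thread to end the loop; a `None` return then means "told to exit" rather than "connected".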
2024-08-19 12:52:57 -10:00
Daniel Hodges
b3793e0069
scx_layered: Use topology for core selection
...
Currently the core selection logic in scx_layered uses the first
available core in the bitmask. This is suboptimal when the scheduler is
configured with specific NUMA/LLC restrictions. The ideal core selection
logic should try to find the least used cores within the preferred
scheduling domain and allocate new cpus from shared cores within that
domain.
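
As a rough illustration of the difference between "first available" and topology-aware selection, here is a minimal Python sketch. The `Core` record, its `in_use` usage count and the fallback to any domain are assumptions made for this example; the real logic lives in the Rust scheduler and may weigh cores differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    llc_id: int   # scheduling domain the core belongs to
    in_use: int   # hypothetical usage count (e.g. CPUs already allocated)


def first_available(cores):
    """Old behavior as described: take the lowest-numbered available core."""
    return min(cores, key=lambda c: c.core_id, default=None)


def pick_core(cores, preferred_llc):
    """Prefer the least used core inside the preferred domain."""
    candidates = [c for c in cores if c.llc_id == preferred_llc]
    if not candidates:
        # Assumed fallback: nothing available in the domain, consider all.
        candidates = list(cores)
    if not candidates:
        return None
    # Ties are broken by core id so the choice is deterministic.
    return min(candidates, key=lambda c: (c.in_use, c.core_id))
```

With cores `(0, llc 0, 3 in use)`, `(1, llc 0, 1 in use)`, `(2, llc 1, 0 in use)`, `first_available` returns core 0, while `pick_core(..., preferred_llc=0)` returns core 1.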
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 15:51:35 -07:00
Tejun Heo
3498a2b899
Merge pull request #514 from sched-ext/htejun/scx_stats
...
scx_stats, scx_layered: Implement independent stats client sessions
2024-08-19 11:24:53 -10:00
Tejun Heo
4198807841
Merge pull request #506 from vax-r/uninit_var
...
scx_layered: Fix uninitialized variable
2024-08-19 11:13:23 -10:00
Tejun Heo
f6bc52d31e
scx_layered: Make --monitor behavior more useful
...
- If --monitor is specified with layer specs, the scheduler also starts
stats monitoring on a thread.
- Standalone monitoring mode no longer exits when the scheduler isn't there.
2024-08-19 10:55:02 -10:00
Tejun Heo
cb9a2f5c32
Merge pull request #512 from hodgesds/doc-improvements
...
docs: Update developer guide
2024-08-19 09:33:38 -10:00
Tejun Heo
ab6cf29a2d
Merge pull request #513 from hodgesds/ci-fixes
...
ci: Fix veristat pull request workflow
2024-08-19 09:33:09 -10:00
Tejun Heo
d03e48eb75
scx_layered: Implement per-stats-client nr_layer_cpus_ranges tracking
...
With this, every client sees the correct nr_layer_cpus_ranges without
interfering with each other.
2024-08-19 09:12:51 -10:00
Daniel Hodges
7c27f8067d
ci: Fix veristat pull request workflow
...
See the [failure](https://github.com/sched-ext/scx/actions/runs/10389671253), which needs to have an action defined.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 12:08:59 -07:00
Tejun Heo
448aacfd60
scx_layered: Initialize Stats.prev_layer_cycles properly on new()
...
So that a new stats session doesn't start with an inflated utilization number.
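
The effect is easiest to see with a cumulative counter: if the "previous" value starts at zero, the first delta covers the counter's entire history. The sketch below is a hypothetical Python model of that idea, not the scheduler's Rust code; `UtilTracker`, `read_cycles` and the rate formula are assumptions.

```python
class UtilTracker:
    """Derives a rate from a monotonically increasing counter."""

    def __init__(self, read_cycles, now):
        self.read_cycles = read_cycles
        # Seed from the current counter value. Starting from 0 instead
        # would make the first sample include everything accumulated
        # before this session existed.
        self.prev_cycles = read_cycles()
        self.prev_time = now

    def sample(self, now):
        cur = self.read_cycles()
        elapsed = now - self.prev_time
        rate = (cur - self.prev_cycles) / elapsed if elapsed > 0 else 0.0
        self.prev_cycles, self.prev_time = cur, now
        return rate
```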
2024-08-19 08:40:40 -10:00
Tejun Heo
6cba8d786a
scx_stats: server: open_ops must be kept throughout a client session
...
open_ops tracks which ops have been opened by the client session; however,
it was being created anew on each handle_request() call, so every request
opened the ops again. Fix it by moving it to the caller.
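
The shape of the fix can be sketched in Python as follows. This is only a model of the lifetime issue, with hypothetical names (`Session`, `open_fn`); the actual server is written in Rust.

```python
class Session:
    """One client connection; per-session state outlives single requests."""

    def __init__(self, open_fn):
        self.open_fn = open_fn
        # Created once per session. Building this dict inside
        # handle_request() would forget earlier opens and reopen every time.
        self.open_ops = {}

    def handle_request(self, name):
        if name not in self.open_ops:
            self.open_ops[name] = self.open_fn(name)
        return self.open_ops[name]
```

Two requests for the same stats name on one session then call `open_fn` only once, while a second `Session` gets its own independent state.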
2024-08-19 08:38:13 -10:00
Daniel Hodges
0048f8dd38
docs: Update developer guide
...
Add some info on `perf` to the developer guide and link from the main
readme.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
2024-08-19 11:34:15 -07:00
Tejun Heo
25d7e6f787
scx_layered: Implement on-demand statistics generation
...
Instead of keeping one copy of sched_stats, each stats server session
carries its own so that stats can be generated independently by each
client at any interval. CPU allocation min/max tracking is broken for now.
2024-08-19 08:27:36 -10:00
Daniel Hodges
0fdb8405dd
Merge pull request #494 from hodgesds/veristat-merge
...
meson: Add github action to run veristat
2024-08-19 14:03:34 -04:00
Tejun Heo
17a460c179
scx_stats: ScxStatsOps fields must be public
2024-08-19 07:51:05 -10:00
Tejun Heo
27c530e17e
scx_stats: Add missing trait exports
2024-08-19 07:16:43 -10:00
Tejun Heo
1e89184ba7
scx_stats: server: s/Tx/Req/ and s/Rx/Res/ for clarity
2024-08-19 07:11:26 -10:00
Tejun Heo
4d88c9aec7
scx_stats: Add channel arguments to open and close ops too
2024-08-19 06:56:14 -10:00
Tejun Heo
0cf5ca605d
scx_layered: Move processing_dur accounting into Stats and protect it with Arc<Mutex<>>
2024-08-19 06:25:23 -10:00
Tejun Heo
a77fe372d6
scx_stats: Make server shutdown when connection is dropped and add communication channel
...
This will make implementing connection sessions easier where each stats
client connection maintains a set of states.
2024-08-19 06:23:16 -10:00
I Hsin Cheng
4fccc06905
scx_layered: Fix uninitialized variable
...
Fix the uninitialized variable "layer" in the function match_layer, which
caused compilation to fail. "layer" is supposed to be the same
as "&layers[layer_id]".
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-08-17 23:32:53 +08:00
Tejun Heo
689d380db1
scx_stats: Implement ScxStatsServer::add_stats_ops()
...
This allows a stats reader to maintain persistent state per connection.
2024-08-16 13:24:17 -10:00
Tejun Heo
96050f8bdd
Merge pull request #505 from sched-ext/htejun/scx_stats
...
scx_stats: Add support for no-value user attributes and a bunch of ot…
2024-08-16 09:12:20 -10:00
Tejun Heo
da2e014d15
scx_stats: Add more documentation on user attributes in README
2024-08-16 09:11:23 -10:00
Tejun Heo
3a688cfde7
scx_stats: Add support for no-value user attributes and a bunch of other changes
...
- Allow no-value user attributes which are automatically assigned "true"
when specified.
- Make "top" attribute string "true" instead of bool true for consistency.
Testing for existence is always enough for value-less attributes.
- Don't drop leading "_" from user attribute names when storing in dicts.
Dropping makes things more confusing.
- Add "_om_skip" to scx_layered fields which don't jibe well with OM.
scxstats_to_openmetrics.py is updated accordingly and no longer generates
warnings on those fields.
- Examples and README updated accordingly.
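
The attribute rules in the list above can be illustrated with a small Python sketch. It parses a simplified, assumed attribute syntax (`name = "value"` or bare `name`, comma separated) and is not the actual scx_stats macro code; `parse_user_attrs` is a hypothetical helper.

```python
def parse_user_attrs(spec):
    """Parse a simplified attribute list into a dict of strings.

    - A bare name gets the string "true", so testing for existence
      is enough for value-less attributes.
    - The leading "_" of user attribute names is kept as-is.
    Note: the naive split below does not handle commas inside quotes.
    """
    attrs = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        attrs[name.strip()] = value.strip().strip('"') if sep else "true"
    return attrs
```

For example, `parse_user_attrs('_om_prefix="l_", _om_skip')` yields a dict where `"_om_skip"` maps to the string `"true"`, so a consumer such as an OpenMetrics exporter only needs `"_om_skip" in attrs` to decide whether to skip a field.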
2024-08-16 07:52:02 -10:00
Tejun Heo
ee77090a2b
Merge pull request #501 from CachyOS/install/fedora
...
INSTALL: Update Fedora installation docs
2024-08-16 07:14:01 -10:00
Tejun Heo
e2b8525fbe
Merge pull request #504 from vax-r/fix_typo
...
scx_rusty: Fix typo
2024-08-16 07:13:23 -10:00
I Hsin Cheng
5d85937842
scx_rusty: Fix typo
...
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-08-16 22:03:59 +08:00
Tejun Heo
7c924e001b
Merge pull request #503 from sched-ext/htejun/cargo-update
...
scheds/rust: Include Cargo.lock in the repo
2024-08-15 23:10:03 -10:00
Tejun Heo
c16b48d7b2
scheds/rust: Include Cargo.lock in the repo
...
Binary packages are expected to include Cargo.lock in the repo so that the
produced binaries match across different builds.
2024-08-15 23:08:35 -10:00
Tejun Heo
22167aeb14
Merge pull request #502 from sched-ext/htejun/scx_stats
...
scx_stats: Refine scx_stats and implement scxstats_to_openmetrics.py
2024-08-15 22:55:11 -10:00
Tejun Heo
870a262713
scx_stats: Add scripts/scxstats_to_openmetrics.py
...
This is a generic tool to pipe from scx_stats to OpenMetrics. This is a
bare-bones implementation and the current output may not match what scx_layered
was outputting before. Will be updated later.
2024-08-15 22:51:22 -10:00
Eric Naim
a0bfd011bd
INSTALL: Fix Chaotic Nyx hyperlink
...
Signed-off-by: Eric Naim <dnaim@proton.me>
2024-08-16 14:53:14 +07:00
Eric Naim
91fcbb17bd
INSTALL: Update Fedora installation docs
...
The guide that is currently available for Fedora sched-ext is outdated. To remedy this,
I have updated the guide to use CachyOS's kernel, which is also available on Fedora.
The scx schedulers that are available in Fedora's repositories are also outdated and don't work
with the current patchset. I have also updated the scheduler installation to use our package
in the CachyOS Addons COPR.
Signed-off-by: Eric Naim <dnaim@proton.me>
2024-08-16 14:53:08 +07:00
Tejun Heo
570ca56c57
scx_layered: s/_om_field_prefix/_om_prefix/
2024-08-15 21:29:58 -10:00
Tejun Heo
af01dd19ec
Merge pull request #500 from sched-ext/htejun/scx_stats
...
scx_stats, scx_layered: Add `om_prefix` attribute and fix s/stat/stats/ stragglers
2024-08-15 21:27:38 -10:00
Tejun Heo
ea453e51d3
scx_stats: Rename "all" attribute to "top" and clean up examples a bit
2024-08-15 21:24:55 -10:00
Tejun Heo
a910fa451a
scx_layered: Add _om attributes to LayerStats for OpenMetrics piping
2024-08-15 19:11:49 -10:00
Tejun Heo
6a5d6f7c27
scx_stats: Replace field_prefix attribute with '_' prefixed user attributes
2024-08-15 19:09:59 -10:00
Tejun Heo
f7c5a598bc
scx_stats: Store ScxStatsMeta in BTreeMap instead of Vec
...
This makes the metadata easier to use.
2024-08-15 18:32:53 -10:00
Tejun Heo
834ce62b95
scx_stats: Fields ScxStatsMeta should be a BTreeMap not vec
...
Also simplify trait bound assertion.
2024-08-15 18:21:19 -10:00
Tejun Heo
f345e83c05
scx_stats: Replace om_prefix field attribute with field_prefix struct attribute
...
And strongly distinguish between field and struct attributes while parsing.
2024-08-15 18:05:24 -10:00
Tejun Heo
a9922deaa2
scx_stats: Add "all" attribute and rename metadata type strings
2024-08-15 14:50:00 -10:00
Tejun Heo
ebc1a89c34
scx_stats: s/stat/stats/ stragglers
2024-08-15 14:00:00 -10:00
Tejun Heo
bafd67b568
scx_stats: Fix parsing for multiple stat attributes
...
The code was assuming single attribute per #[stat()] block. Update it so
that there can be multiple comma separated attributes in a single block.
2024-08-15 13:46:20 -10:00
Tejun Heo
8f361af077
scx_layered: Shorten stat field descriptions
2024-08-15 13:25:48 -10:00