- Allow no-value user attributes which are automatically assigned "true"
when specified.
- Make "top" attribute string "true" instead of bool true for consistency.
Testing for existence is always enough for value-less attributes.
- Don't drop leading "_" from user attribute names when storing in dicts.
Dropping makes things more confusing.
- Add "_om_skip" to scx_layered fields which don't jive well with OM.
scxstats_to_openmetrics.py is updated accordignly and no longer generates
warnings on those fields.
- Examples and README updated accordingly.
This is a generic tool to pipe from scx_stats to OpenMetrics. This is a
barebone implmentation and the current output may not match what scx_layered
was outputting before. Will be updated later.
scx_stat is a statistics reporting library with the following goals:
- Minimal boilerplate code. Statistics are defined as a regular rust struct
which can contain integers, floats, strings, structs, and array or
BTreeMap of them. The struct and each field can be annotated using the
"stat" attribute. The "Stat" derive macro generates the necessary
metadata.
- On-demand reporting. ScxStatServer implements a UNIX domain socket server
which communicates using a simple json protocol. Stat structs can be added
to the server with closures to output them. There can be any number of
readers. ScxStatServer can be queried for the metadata of all stat structs
that it may report so that a generic tool can be written to pipe the
reported statistics to other frameworks (e.g. openmetrics).
- The interface protocol is pretty simple and it isn't difficult to interact
directly in json. ScxStatClient is provided to further simplify client
implementation.
- See rust/scx_stat/examples/{server,client}.rs for usage.
Re-add the partial mode option that was dropped during the refactoring.
The partial option allows to apply the scheduler only to the tasks which
have their scheduling policy set to SCHED_EXT via sched_setscheduler().
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
The API for determining which PID is running on a specific CPU is racy
and is unnecessary since this information can be obtained from user
space.
Additionally, it's not reliable for identifying idle CPUs. Therefore,
it's better to remove this API and, in the future, provide a cpumask
alternative that can export the idle state of the CPUs to user space.
As a consequence also change scx_rustland to dispatch one task a time,
instead of dispatching tasks in batches of idle cores (that are usually
not accurate due to the racy nature of the CPU ownership interaface).
Dispatching one task at a time even makes the scheduler more performant,
due to the vruntime scheduling being applied to more tasks sitting in
the scheduler's queue.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Drop the slice boost logic and apply a vruntime and task time slice
evaluation approach similar to scx_bpfland (but implement this in the
user-space component instead of the BPF part).
Additionally, introduce a slice_us_min parameter to define the minimum
time slice that can be assigned to a task, also similar to scx_bpfland.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Use the same idle selection logic used in scx_bpfland also in
scx_rustland_core.
Also drop fifo_mode and always use the BPF idle selection logic by
default as long as the system is not saturated, unless full_user is
specified.
This approach allows user-space schedulers aiming for maximum
performance to leverage the BPF idle selection logic (bypassing
user-space), while those seeking full control can enable full_user to
bypass the BPF CPU idle selection logic and choose the target CPU for
each task from user-space.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Fix args to bpf_builder, existing warnings display for
"-Wno-compare-distinct-pointer-types'".
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
We don't need to send the number of voluntary context switches (nvcsw)
from BPF to user-space, as this information is already accessible in
user-space via procfs. Sending this data would only create unnecessary
overhead for schedulers that don't require it, and those that do can
easily retrieve it through procfs.
Therefore, drop this metric from scx_rustland_core and change
scx_rustland implementing an interactive task classifier fully in the
user-space part of the scheduler.
Also drop some options that are not provide any significant benefit
(also in preparation of a bigger refactoring to define a better API for
the user-space framework).
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Explicitly import timespec to fix the following potential build error:
error[E0422]: cannot find struct, variant or union type `timespec` in this scope
--> src/bpf.rs:365:35
|
365 | sched_ss_repl_period: timespec {
| ^^^^^^^^ not found in this scope
|
help: consider importing this struct
|
6 + use libc::timespec;
|
This fixes issue #469.
Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
Add a cpus method various subfields in the topology struct to easily get
the map of CPUs for nodes/LLCs.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Update libbpf-rs & libbpf-cargo to 0.24. Among other things, generated
skeletons now contain directly accessible map and program objects, no
longer necessitating the use of accessor methods. As a result, the risk
for mutability conflicts is reduced greatly.
Signed-off-by: Daniel Müller <deso@posteo.net>
Add the LLC id to the Cpu struct, which will make it easier for
referencing the LLC when doing operations on a Cpu.
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
sched_ext is about to be merged upstream. There are some compatibility
breaking changes and we're making the current sched_ext/for-6.11
1edab907b57d ("sched_ext/scx_qmap: Pick idle CPU for direct dispatch on
!wakeup enqueues") the baseline.
Tag everything except scx_mitosis as 1.0.0. As scx_mitosis is still in early
development and is currently temporarily disabled, only the patchlevel is
bumped.
This change adds the ability to customize the log recorder format for
each metric type. There is a default format that is used if no custom
`MetricFormatter` is provided. This is the same format that was used
before this change.
The `MetricFormatter` should be implemented by the user to customize the
format of the log recorder. The `LogRecorderBuilder` now takes a
`MetricFormatter` as an optional parameter.
Following changes will allow additional customization of the log
recorder format, such as how many metrics are logged per line.
Signed-off-by: Jose Fernandez <josef@netflix.com>