Commit Graph

1386 Commits

Author SHA1 Message Date
Tejun Heo
25d7e6f787 scx_layered: Implement on-demand statistics generation
Instead of keeping one copy of sched_stats, each stats server session
carries its own so that stats can be generated independently by each
client at any interval. CPU allocation min/max tracking is broken for now.
2024-08-19 08:27:36 -10:00
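
Note on 25d7e6f787: the per-session idea can be illustrated with a minimal Python sketch. It is not the scx_layered code; the `Session` class, its `report()` method and the `dispatched` counter are hypothetical names chosen for illustration. Each session keeps its own previous snapshot, so a client polling often and a client polling rarely both get deltas that cover exactly their own interval.

```python
class Session:
    """Hypothetical per-client session that owns its previous snapshot."""

    def __init__(self, counters):
        # Copy the counters so later updates to the shared dict do not alias.
        self.prev = dict(counters)

    def report(self, counters):
        """Return per-counter deltas since this session's last report."""
        delta = {k: v - self.prev.get(k, 0) for k, v in counters.items()}
        self.prev = dict(counters)
        return delta


# Shared, monotonically increasing counters (illustrative values).
counters = {"dispatched": 100}
fast = Session(counters)  # client that polls frequently
slow = Session(counters)  # client that polls rarely

counters["dispatched"] = 130
fast_first = fast.report(counters)

counters["dispatched"] = 150
fast_second = fast.report(counters)
slow_first = slow.report(counters)
```

With a single shared snapshot, the slow client's report would only cover the time since the fast client last polled; with one snapshot per session, the two reports stay independent.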
Andrea Righi
f8a2445869 scx_bpfland: introduce performance/powersave primary domain
The primary scheduling domain represents a group of CPUs in the system
where the scheduler will initially attempt to assign tasks. Tasks will
only be dispatched to CPUs within this primary domain until they are
fully utilized, after which tasks may overflow to other available CPUs.

The primary scheduling domain can be defined using the option
`--primary-domain CPUMASK` (by default all the CPUs in the system are
used as primary domain).

This change introduces two new special values for the CPUMASK argument:
 - `performance`: automatically detect the fastest CPUs in the system
   and use them as primary scheduling domain,
 - `powersave`: automatically detect the slowest CPUs in the system and
   use them as primary scheduling domain.

The current logic only supports creating two groups: fast and slow CPUs.

The fast CPU group is created by excluding CPUs with the lowest
frequency from the overall set, which means that within the fast CPU
group, CPUs may have different maximum frequencies.

When using the `performance` mode the fast CPUs will be used as primary
domain, whereas in `powersave` mode, the slow CPUs will be used instead.

This option is particularly useful in hybrid architectures (with P-cores
and E-cores), as it allows the use of bpfland to prioritize task
scheduling on either P-cores or E-cores, depending on the desired
performance profile.

Example:

 - Dell Precision 5480
   - CPU: 13th Gen Intel(R) Core(TM) i7-13800H
     - P-cores:  0-11 / max freq: 5.2GHz
     - E-cores: 12-19 / max freq: 4.0GHz

 $ scx_bpfland --primary-domain performance

  0[|||||||||                24.5%]  10[||||||||                  22.8%]
  1[||||||                   14.9%]  11[|||||||||||||             36.9%]
  2[||||||                   16.2%]  12[                           0.0%]
  3[|||||||||                25.3%]  13[                           0.0%]
  4[|||||||||||              33.3%]  14[                           0.0%]
  5[||||                      9.9%]  15[                           0.0%]
  6[|||||||||||              31.5%]  16[                           0.0%]
  7[|||||||                  17.4%]  17[                           0.0%]
  8[||||||||                 23.4%]  18[                           0.0%]
  9[|||||||||                26.1%]  19[                           0.0%]

  Avg power consumption: 3.29W

 $ scx_bpfland --primary-domain powersave

  0[|                         2.5%]  10[                           0.0%]
  1[                          0.0%]  11[                           0.0%]
  2[                          0.0%]  12[||||                       8.0%]
  3[                          0.0%]  13[|||||||||||||||||||||     64.2%]
  4[                          0.0%]  14[||||||||||                29.6%]
  5[                          0.0%]  15[|||||||||||||||||         52.5%]
  6[                          0.0%]  16[|||||||||                 24.7%]
  7[                          0.0%]  17[||||||||||                30.4%]
  8[                          0.0%]  18[|||||||                   22.4%]
  9[                          0.0%]  19[|||||                     12.4%]

  Avg power consumption: 2.17W

(Info collected from htop and turbostat)

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-19 20:19:21 +02:00
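
Note on f8a2445869: the fast/slow grouping described in the message (the fast group is everything except the CPUs with the lowest maximum frequency) can be sketched as follows. This is an illustrative Python sketch, not the scx_bpfland implementation; `split_by_max_freq` and its input dict are assumptions, and the frequencies are the ones quoted for the Dell Precision 5480 example.

```python
def split_by_max_freq(max_freq_khz):
    """Split CPUs into (fast, slow) lists given {cpu_id: max_freq_khz}.

    Slow CPUs are those at the lowest maximum frequency; every other CPU
    is considered fast, so fast CPUs may differ in maximum frequency.
    """
    lowest = min(max_freq_khz.values())
    slow = sorted(c for c, f in max_freq_khz.items() if f == lowest)
    fast = sorted(c for c, f in max_freq_khz.items() if f > lowest)
    return fast, slow


# Values from the example above: CPUs 0-11 at 5.2GHz, CPUs 12-19 at 4.0GHz.
freqs = {cpu: 5_200_000 for cpu in range(12)}
freqs.update({cpu: 4_000_000 for cpu in range(12, 20)})

fast, slow = split_by_max_freq(freqs)
```

On a system where every CPU has the same maximum frequency this sketch returns an empty fast group; how the scheduler handles that case is not stated in the commit message.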
Andrea Righi
174993f9d2 scx_bpfland: introduce cache awareness
While the system is not saturated the scheduler will use the following
strategy to select the next CPU for a task:
  - pick the same CPU if it's a full-idle SMT core
  - pick any full-idle SMT core in the primary scheduling group that
    shares the same L2 cache
  - pick any full-idle SMT core in the primary scheduling group that
    shares the same L3 cache
  - pick the same CPU (ignoring SMT)
  - pick any idle CPU in the primary scheduling group that shares the
    same L2 cache
  - pick any idle CPU in the primary scheduling group that shares the
    same L3 cache
  - pick any idle CPU in the system

While the system is completely saturated (no idle CPUs available), tasks
will be dispatched on the first CPU that becomes available.

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
2024-08-19 20:19:21 +02:00
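
Note on 174993f9d2: the selection order listed in the message can be read as a chain of predicates tried in sequence. The Python sketch below models that order only; it is not the BPF code, and the `Cpu` record, `pick_cpu` function and the toy topology are hypothetical. Returning `None` stands for the saturated case, where the task waits for the first CPU that becomes available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    cpu: int   # CPU id
    core: int  # SMT siblings share the same core id
    l2: int    # id of the L2 cache domain
    l3: int    # id of the L3 cache domain


def pick_cpu(prev, cpus, idle, primary):
    """Return the CPU chosen for a task that last ran on `prev`, or None."""
    p = next(c for c in cpus if c.cpu == prev)

    def full_idle(c):
        # A core is full-idle when all of its SMT siblings are idle.
        return all(s.cpu in idle for s in cpus if s.core == c.core)

    # Predicates in the same order as the list in the commit message.
    steps = [
        lambda c: c.cpu == prev and full_idle(c),
        lambda c: c.cpu in primary and c.l2 == p.l2 and full_idle(c),
        lambda c: c.cpu in primary and c.l3 == p.l3 and full_idle(c),
        lambda c: c.cpu == prev and c.cpu in idle,
        lambda c: c.cpu in primary and c.l2 == p.l2 and c.cpu in idle,
        lambda c: c.cpu in primary and c.l3 == p.l3 and c.cpu in idle,
        lambda c: c.cpu in idle,
    ]
    for pred in steps:
        found = next((c.cpu for c in cpus if pred(c)), None)
        if found is not None:
            return found
    return None  # saturated: no idle CPU anywhere


# Toy topology: 8 CPUs, 2 SMT threads per core, one L2 per pair of cores,
# a single L3 shared by all CPUs.
CPUS = [Cpu(cpu=i, core=i // 2, l2=i // 4, l3=0) for i in range(8)]
ALL = set(range(8))
```

For example, with CPU 0 busy and everything else idle, `pick_cpu(0, CPUS, ALL - {0}, ALL)` skips core 0 (its sibling thread is idle but the core is not full-idle) and lands on CPU 2, the first full-idle core behind the same L2.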
Daniel Hodges
0fdb8405dd
Merge pull request #494 from hodgesds/veristat-merge
meson: Add github action to run veristat
2024-08-19 14:03:34 -04:00
Tejun Heo
17a460c179 scx_stats: ScxStatsOps fields must be public 2024-08-19 07:51:05 -10:00
Tejun Heo
27c530e17e scx_stats: Add missing trait exports 2024-08-19 07:16:43 -10:00
Tejun Heo
1e89184ba7 scx_stats: server: s/Tx/Req/ and s/Rx/Res/ for clarity 2024-08-19 07:11:26 -10:00
Tejun Heo
4d88c9aec7 scx_stats: Add channel arguments to open and close ops too 2024-08-19 06:56:14 -10:00
Tejun Heo
0cf5ca605d scx_layered: Move processing_dur accounting into Stats and protect it with Arc<Mutex<>> 2024-08-19 06:25:23 -10:00
Tejun Heo
a77fe372d6 scx_stats: Make server shutdown when connection is dropped and add communication channel
This will make implementing connection sessions easier where each stats
client connection maintains a set of states.
2024-08-19 06:23:16 -10:00
Changwoo Min
832f194845 scx_lavd: add power profile options: --performance, --powersave, --balanced
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-19 19:03:51 +09:00
Changwoo Min
c4c157f91c scx_lavd: add "--prefer-little-core" option
This option prefers little (efficiency) cores over big (performance) cores
during core compaction to reduce power consumption.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-19 18:23:35 +09:00
Changwoo Min
73b873827d scx_lavd: merge put_cpdom_rq() to ops.enqueue()
Clean up and reorganize the code around ops.enqueue().

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-19 14:22:03 +09:00
Changwoo Min
9475ace336 scx_lavd: always enqueue to a DSQ in task's compute domain
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-19 14:07:56 +09:00
Changwoo Min
0656c3232e scx_lavd: revise FlatTopology prettier
The changes include 1) chopping down a big function into smaller ones
for readability and maintainability and 2) using the interior mutability
pattern (Cell and RefCell) to avoid unnecessary clone() calls.  There
are no functional changes.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-08-19 11:03:52 +09:00
I Hsin Cheng
4fccc06905 scx_layered: Fix uninitialized variable
Fix the uninitialized variable "layer" in the function match_layer which
caused the compiling process to fail. "layer" is supposed to be the same
as "&layers[layer_id]".

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-08-17 23:32:53 +08:00
Tejun Heo
689d380db1 scx_stats: Implement ScxStatsServer::add_stats_ops()
This allows a stats reader to maintain persistent state per connection.
2024-08-16 13:24:17 -10:00
Tejun Heo
96050f8bdd
Merge pull request #505 from sched-ext/htejun/scx_stats
scx_stats: Add support for no-value user attributes and a bunch of ot…
2024-08-16 09:12:20 -10:00
Tejun Heo
da2e014d15 scx_stats: Add more documentation on user attributes in README 2024-08-16 09:11:23 -10:00
Tejun Heo
3a688cfde7 scx_stats: Add support for no-value user attributes and a bunch of other changes
- Allow no-value user attributes which are automatically assigned "true"
  when specified.

- Make "top" attribute string "true" instead of bool true for consistency.
  Testing for existence is always enough for value-less attributes.

- Don't drop leading "_" from user attribute names when storing in dicts.
  Dropping makes things more confusing.

- Add "_om_skip" to scx_layered fields which don't jive well with OM.
  scxstats_to_openmetrics.py is updated accordingly and no longer generates
  warnings on those fields.

- Examples and README updated accordingly.
2024-08-16 07:52:02 -10:00
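
Note on 3a688cfde7: the attribute rules in the message (value-less attributes get the string "true", and the leading "_" is kept in the stored name) can be illustrated with a small Python sketch. The `parse_attrs` helper and the `key = "value"` input syntax are assumptions made for illustration, not the scx_stats parser; `desc` is a hypothetical attribute name, while `_om_skip` is the one named in the message. The sketch does not handle commas inside quoted values.

```python
def parse_attrs(text):
    """Parse 'a = "x", _flag' into a dict of attribute name -> string value.

    A name without '= value' is stored with the string "true", and names
    keep any leading underscore.
    """
    attrs = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        attrs[name.strip()] = value.strip().strip('"') if sep else "true"
    return attrs


attrs = parse_attrs('desc = "CPU util", _om_skip')
# Testing for existence is enough for value-less attributes.
skip_field = "_om_skip" in attrs
```

Storing the string "true" rather than a boolean keeps every attribute value the same type, which matches the consistency argument made for the "top" attribute.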
Tejun Heo
ee77090a2b
Merge pull request #501 from CachyOS/install/fedora
INSTALL: Update Fedora installation docs
2024-08-16 07:14:01 -10:00
Tejun Heo
e2b8525fbe
Merge pull request #504 from vax-r/fix_typo
scx_rusty: Fix typo
2024-08-16 07:13:23 -10:00
I Hsin Cheng
5d85937842 scx_rusty: Fix typo
Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-08-16 22:03:59 +08:00
Tejun Heo
7c924e001b
Merge pull request #503 from sched-ext/htejun/cargo-update
scheds/rust: Include Cargo.lock in the repo
2024-08-15 23:10:03 -10:00
Tejun Heo
c16b48d7b2 scheds/rust: Include Cargo.lock in the repo
Binary packages are expected to include Cargo.lock in the repo so that the
produced binaries match across different builds.
2024-08-15 23:08:35 -10:00
Tejun Heo
22167aeb14
Merge pull request #502 from sched-ext/htejun/scx_stats
scx_stats: Refine scx_stats and implement scxstats_to_openmetrics.py
2024-08-15 22:55:11 -10:00
Tejun Heo
870a262713 scx_stats: Add scripts/scxstats_to_openmetrics.py
This is a generic tool to pipe from scx_stats to OpenMetrics. This is a
bare-bones implementation and the current output may not match what scx_layered
was outputting before. Will be updated later.
2024-08-15 22:51:22 -10:00
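
Note on 870a262713: as a rough picture of what piping stats to OpenMetrics involves, the Python sketch below renders a flat dict of numeric fields in the OpenMetrics text format. It is not scripts/scxstats_to_openmetrics.py; the `to_openmetrics` function, the `scx_` prefix and the input dict are assumptions for illustration, and every field is exposed as a gauge for simplicity.

```python
def to_openmetrics(stats, prefix="scx_"):
    """Render numeric fields of `stats` as OpenMetrics text exposition."""
    lines = []
    for name, value in sorted(stats.items()):
        # Skip non-numeric fields; bool is excluded explicitly because it
        # is a subclass of int in Python.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        metric = prefix + name
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {value}")
    # The OpenMetrics text format ends with an explicit EOF marker.
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


output = to_openmetrics({"util": 12.5, "name": "layer0"})
```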
Eric Naim
a0bfd011bd
INSTALL: Fix Chaotic Nyx hyperlink
Signed-off-by: Eric Naim <dnaim@proton.me>
2024-08-16 14:53:14 +07:00
Eric Naim
91fcbb17bd
INSTALL: Update Fedora installation docs
The guide that is currently available for Fedora sched-ext is outdated. To remedy this,
I have opted to update the guide to use CachyOS's kernel that is also available on Fedora.
The scx schedulers that are available on Fedora's repositories are also outdated and don't work
with the current patchset. I have also updated the scheduler installation to use our package
in the CachyOS Addons COPR.

Signed-off-by: Eric Naim <dnaim@proton.me>
2024-08-16 14:53:08 +07:00
Tejun Heo
570ca56c57 scx_layered: s/_om_field_prefix/_om_prefix/ 2024-08-15 21:29:58 -10:00
Tejun Heo
af01dd19ec
Merge pull request #500 from sched-ext/htejun/scx_stats
scx_stats, scx_layered: Add `om_prefix` attribute and fix s/stat/stats/ stragglers
2024-08-15 21:27:38 -10:00
Tejun Heo
ea453e51d3 scx_stats: Rename "all" attribute to "top" and clean up examples a bit 2024-08-15 21:24:55 -10:00
Tejun Heo
a910fa451a scx_layered: Add _om attributes to LayerStats for OpenMetrics piping 2024-08-15 19:11:49 -10:00
Tejun Heo
6a5d6f7c27 scx_stats: Replace field_prefix attribute with '_' prefixed user attributes 2024-08-15 19:09:59 -10:00
Tejun Heo
f7c5a598bc scx_stats: Store ScxStatsMeta in BTreeMap instead of Vec
This makes the metadata easier to use.
2024-08-15 18:32:53 -10:00
Tejun Heo
834ce62b95 scx_stats: Fields ScxStatsMeta should be a BTreeMap not vec
Also simplify trait bound assertion.
2024-08-15 18:21:19 -10:00
Tejun Heo
f345e83c05 scx_stats: Replace om_prefix field attribute with field_prefix struct attribute
And strongly distinguish between field and struct attributes while parsing.
2024-08-15 18:05:24 -10:00
Tejun Heo
a9922deaa2 scx_stats: Add "all" attribute and rename metadata type strings 2024-08-15 14:50:00 -10:00
Tejun Heo
ebc1a89c34 scx_stats: s/stat/stats/ stragglers 2024-08-15 14:00:00 -10:00
Tejun Heo
bafd67b568 scx_stats: Fix parsing for multiple stat attributes
The code was assuming single attribute per #[stat()] block. Update it so
that there can be multiple comma separated attributes in a single block.
2024-08-15 13:46:20 -10:00
Tejun Heo
8f361af077 scx_layered: Shorten stat field descriptions 2024-08-15 13:25:48 -10:00
Tejun Heo
db0ddbe249 scx_utils: Add support for om_prefix stat attribute
This will be used to build generic bridge to openmetrics.
2024-08-15 13:08:44 -10:00
Tejun Heo
1912e05f0b
Merge pull request #499 from sched-ext/htejun/scx_stats
scx_stats: Misc changes to sync dep versions and publish on crates.io
2024-08-15 12:32:44 -10:00
Tejun Heo
ba2e0be899 scx_stats: Add .gitignore 2024-08-15 12:31:04 -10:00
Tejun Heo
0b9c8b5cbd scx_stats: Update versions to 0.2.0 to republish 2024-08-15 12:29:27 -10:00
Tejun Heo
d6ef2c2a1f scx_stats: Synchronize crate dependency versions 2024-08-15 12:28:29 -10:00
Tejun Heo
d037fcd223 scx_stats: Drop version from Cargo.toml::dev-dependencies
Otherwise, the cyclic dependency prevents publishing.
2024-08-15 12:26:17 -10:00
Tejun Heo
ec51c567fe scx_stats: Adding missing license tag in Cargo.toml's 2024-08-15 12:23:46 -10:00
Tejun Heo
cef143210a
Merge pull request #498 from sched-ext/htejun/scx_stats
scx_stats: Add package metadata and some documentation
2024-08-15 12:20:00 -10:00
Tejun Heo
47c774d011 scx_stats: Add README and cleanup examples 2024-08-15 12:17:30 -10:00