Commit Graph

1061 Commits

Author SHA1 Message Date
Changwoo Min
4c5c564523 scx_lavd: initialize current logical clock to zero
To make logical time easy to distinguish from physical time, initialize
the current logical clock to zero (not the current physical time). Also,
keep the deadline calculation from yielding zero by adding +1 where
needed.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-12 10:15:54 +09:00
Changwoo Min
bdbfeb9fd1 scx_lavd: use logical current clock for virtual deadlines
This commit changes the use of a physical clock to a virtual, logical
clock in calculating deadlines.

- The virtual current clock advances to a task's virtual deadline when
  that task runs.

- When a task is enqueued, its virtual deadline is calculated from the
  virtual current clock.

With these two changes, no task can have a virtual deadline smaller
than the virtual current clock. Any newly enqueued task can therefore
compete with the tasks that are already enqueued, which allows a
latency-critical task to be scheduled immediately if needed.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 22:41:56 +09:00
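
Note (illustrative sketch, not part of the commit): the two rules in
bdbfeb9fd1 can be modeled in a few lines of plain C. The types and the
names vclock, vtask, vclock_enqueue and vclock_run are hypothetical and
are not the scx_lavd data structures. The sketch assumes the clock
starts at zero and adds +1 to every deadline, as the follow-up commit
4c5c564523 describes, and it ignores 64-bit overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the scx_lavd structures. */
struct vclock {
	uint64_t now;		/* virtual current clock, starts at zero */
};

struct vtask {
	uint64_t vdeadline;	/* virtual deadline assigned at enqueue */
};

/*
 * Enqueue: place the deadline at an offset from the virtual current
 * clock. The +1 keeps the deadline nonzero even when the clock and the
 * offset are both zero.
 */
static void vclock_enqueue(const struct vclock *clk, struct vtask *t,
			   uint64_t offset)
{
	t->vdeadline = clk->now + offset + 1;
	/* A freshly enqueued task is never behind the clock. */
	assert(t->vdeadline > clk->now);
}

/*
 * Run: advance the clock to the deadline of the task that starts
 * running. Never moving the clock backward is an assumption of this
 * sketch.
 */
static void vclock_run(struct vclock *clk, const struct vtask *t)
{
	if (t->vdeadline > clk->now)
		clk->now = t->vdeadline;
}
```

Because a deadline is always derived from the current value of the
clock, a task enqueued with a small offset lands just ahead of the
clock and can compete with everything that is already queued.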
Changwoo Min
408ea7892c scx_lavd: induce sched_prio_to_latency_weight from slice weight
So sched_prio_to_latency_weight is removed.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:37:21 +09:00
Changwoo Min
bd964acff6 scx_lavd: deprioritize a newly forked task in latency
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:36:32 +09:00
Changwoo Min
48debe416e scx_lavd: tuning the deadline equation under high load
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:35:54 +09:00
Changwoo Min
c72e063680 scx_lavd: do not use lat_prio_to_greedy_thresholds
With other optimizations, lat_prio_to_greedy_thresholds is no longer
effective.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:35:01 +09:00
Changwoo Min
9ed488798e scx_lavd: use task's runtime to determine its deadline
This has the effect of further preferring shorter jobs.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:34:25 +09:00
Changwoo Min
e081b2a294 scx_lavd: rename LAVD_MAX_CAS_RETRY to LAVD_MAX_RETRY
Signed-off-by: Changwoo Min <changwoo@igalia.com>
2024-07-11 21:33:56 +09:00
Andrea Righi
3df7a13117
Merge pull request #416 from sched-ext/bpfland-small-improvements
bpfland: small improvements
2024-07-08 23:11:47 +02:00
Andrea Righi
995577762a scx_bpfland: refill task time slice
Every time we need to dispatch a task, re-evaluate its time slice as:

 (unused_time_slice + min_time_slice) / 2

This refills the time slice for tasks that haven't used much of their
previously assigned time, improving fairness.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-06 14:07:24 +02:00
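
Note (illustrative sketch, not part of the commit): the refill rule in
995577762a is a plain average of two values. The helper name
refill_slice_ns and its parameter names are hypothetical, and the units
are assumed to be nanoseconds.

```c
#include <stdint.h>

/*
 * Hypothetical helper: average the unused part of the previous time
 * slice with the minimum time slice, as in the formula
 * (unused_time_slice + min_time_slice) / 2.
 */
static uint64_t refill_slice_ns(uint64_t unused_slice_ns,
				uint64_t min_slice_ns)
{
	return (unused_slice_ns + min_slice_ns) / 2;
}
```

For example, with an assumed minimum slice of 1ms, a task that left 5ms
unused is refilled to 3ms, while a task that used its whole slice gets
0.5ms.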
Andrea Righi
6a64182ef2 scx_bpfland: always classify interactive tasks
Make sure to always classify interactive tasks, even when the system is
not fully utilized. This ensures that if the system suddenly becomes
overloaded, we already know which tasks need to be dispatched to the
priority DSQ.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-06 14:07:24 +02:00
Andrea Righi
8dd528abfd scx_bpfland: pass enqueue flags when dispatching kthreads
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-06 14:07:10 +02:00
Andrea Righi
fc0d1bd003
Merge pull request #415 from sched-ext/bpfland-output
scx_bpfland: additional stats and output improvements
2024-07-05 19:50:07 +02:00
Tejun Heo
af5e89e73c
Merge pull request #412 from vax-r/flatcg_delta_fetch
scx_flatcg: Make good use of __sync_fetch_and_sub()
2024-07-05 07:39:22 -10:00
Tejun Heo
7f8e4edb53
Merge pull request #397 from jfernandez/log-recorder-customize
sched_utils: Add log recorder format customization
2024-07-05 07:37:39 -10:00
Tejun Heo
14d0a0ef64
Merge pull request #411 from vax-r/Fix_typo
scx_flatcg: Fix_typo
2024-07-05 07:35:31 -10:00
Andrea Righi
2bc8f800e7 scx_bpfland: report build id version
Use the version string provided by scx_utils:build_id.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-05 09:29:29 +02:00
Andrea Righi
bdb31e98e2 scx_bpfland: show statistics in a more human-readable format
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-05 09:29:29 +02:00
Andrea Righi
f9d7844b2e scx_bpfland: split direct dispatches and kthread dispatches
Show separate statistics for direct dispatches and kthread direct
dispatches.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-05 09:27:59 +02:00
Andrea Righi
86d2f50230
Merge pull request #410 from sched-ext/bpfland-smooth-perf
scx_bpfland: enhance performance consistency and predictability
2024-07-04 21:37:07 +02:00
Andrea Righi
d98516fe75
Merge pull request #413 from vax-r/Remove_unused_variable
scx_rustland_core: Remove unused variable
2024-07-04 19:09:27 +02:00
I Hsin Cheng
1595da78dc scx_rustland_core: Remove unused variable
Remove unused variable "tctx" in rustland_select_cpu.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-07-05 01:04:49 +08:00
I Hsin Cheng
aae826b1b3 scx_flatcg: Make good use of __sync_fetch_and_sub()
Fetch the value of "delta" directly from the return value of
__sync_fetch_and_sub, as it returns the original value of
cgc->cvtime_delta.

An additional instruction to fetch cgc->cvtime_delta would be redundant
here.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-07-05 01:03:20 +08:00
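
Note (illustrative sketch, not part of the commit): the GCC/Clang
builtin __sync_fetch_and_sub() returns the value the target held before
the subtraction, which is the property aae826b1b3 relies on. The
wrapper name fetch_then_sub is hypothetical and this is not the
scx_flatcg code.

```c
#include <stdint.h>

/*
 * Atomically subtract `amount` from `*counter` and return the value
 * that `*counter` held before the subtraction.
 */
static uint64_t fetch_then_sub(uint64_t *counter, uint64_t amount)
{
	return __sync_fetch_and_sub(counter, amount);
}
```

Since the previous value comes back as the return value, the caller
does not need a second load of the counter to learn what it was.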
I Hsin Cheng
3e52761487 scx_flatcg: Fix_typo
Fix "oppotunistic" to "opportunistic".

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
2024-07-04 22:04:40 +08:00
Andrea Righi
cfe2ed063d scx_bpfland: time-based starvation prevention
Tasks are consumed from various DSQs in the following order:

  per-CPU DSQs => priority DSQ => shared DSQ

Tasks in the shared DSQ may be starved by those in the priority DSQ,
which in turn may be starved by tasks dispatched to any per-CPU DSQ.

To mitigate this, record the timestamp of the last task scheduling event
for both the priority DSQ and the shared DSQ.

If the starvation threshold is exceeded without consuming a task, the
scheduler will be forced to consume a task from the corresponding DSQ.

The starvation threshold can be adjusted using the --starvation-thresh
command line parameter (default is 5ms).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-04 10:52:39 +02:00
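
Note (illustrative sketch, not part of the commit): the consume order
and the time-based override in cfe2ed063d can be modeled in plain C.
All names (dsq_id, dsq_state, dsq_starved, pick_dsq) are hypothetical
and this is not the scx_bpfland BPF code. Checking the priority DSQ for
starvation before the shared DSQ is an assumption of the sketch, since
the commit message does not state the order of the two checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three DSQ classes, in consume order. */
enum dsq_id { DSQ_PER_CPU, DSQ_PRIORITY, DSQ_SHARED, DSQ_COUNT,
	      DSQ_NONE = DSQ_COUNT };

struct dsq_state {
	bool has_tasks[DSQ_COUNT];		/* queue is non-empty */
	uint64_t last_run_ns[DSQ_COUNT];	/* last consume timestamp */
};

/* A queue is starved if it has work and was not consumed for too long. */
static bool dsq_starved(const struct dsq_state *s, enum dsq_id id,
			uint64_t now_ns, uint64_t thresh_ns)
{
	return s->has_tasks[id] && now_ns - s->last_run_ns[id] > thresh_ns;
}

/* Pick the DSQ to consume from and record the consume timestamp. */
static enum dsq_id pick_dsq(struct dsq_state *s, uint64_t now_ns,
			    uint64_t thresh_ns)
{
	enum dsq_id pick = DSQ_NONE;
	int i;

	if (dsq_starved(s, DSQ_PRIORITY, now_ns, thresh_ns)) {
		pick = DSQ_PRIORITY;	/* forced: threshold exceeded */
	} else if (dsq_starved(s, DSQ_SHARED, now_ns, thresh_ns)) {
		pick = DSQ_SHARED;	/* forced: threshold exceeded */
	} else {
		/* Normal order: per-CPU => priority => shared. */
		for (i = DSQ_PER_CPU; i < DSQ_COUNT; i++) {
			if (s->has_tasks[i]) {
				pick = (enum dsq_id)i;
				break;
			}
		}
	}

	if (pick != DSQ_NONE)
		s->last_run_ns[pick] = now_ns;
	return pick;
}
```

With the default threshold of 5ms, a per-CPU DSQ that always has work
can delay the other two queues by at most roughly that long before one
of them is forced to the front.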
Andrea Righi
9e0db4ae17 scx_bpfland: remove unnecessary RCU read protection
There is no need to RCU protect the cpumask for the offline CPUs: it is
created once when the scheduler is initialized and it's never
deallocated.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-04 10:24:43 +02:00
Andrea Righi
cef6ca93cf scx_bpfland: adjust default time slice to 5ms
Reduce the default time slice to 5ms for faster reaction and better
system responsiveness when the system is overcommitted.

This also helps to provide a more predictable level of performance.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-04 10:24:43 +02:00
Andrea Righi
7d15e3171c scx_bpfland: ensure task time slice never exceeds the slice_ns limit
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-04 10:24:43 +02:00
Andrea Righi
e8a4d350ad scx_bpfland: unify dispatching kthreads with direct CPU dispatches
Always use direct CPU dispatch for kthreads. There is no need to treat
kthreads in a special way; simply reuse direct CPU dispatch to
prioritize them.

Moreover, change direct CPU dispatches to use scx_bpf_dispatch_vtime(),
since we may dispatch multiple tasks to the same per-CPU DSQ now.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-03 09:38:43 +02:00
Andrea Righi
d2231b0aed scx_bpfland: drop built-in idle CPU selection logic
Small refactoring of the idle CPU selection logic:
 - optimize idle CPU selection for tasks that can run on a single CPU
 - drop the built-in idle selection policy and completely rely on the
   custom one

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-03 08:54:17 +02:00
Andrea Righi
a72c9058a3
Merge pull request #409 from sched-ext/bpfland-fix-idle-cpumask
scx_bpfland: use the right cpumask to find any idle CPU
2024-07-01 21:37:35 +02:00
Andrea Righi
7c355f50b2 scx_bpfland: use the right cpumask to find any idle CPU
We are incorrectly using the SMT idle cpumask to find any idle CPU. Fix
this by using the generic idle cpumask.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-07-01 20:36:24 +02:00
Andrea Righi
c458f345b4
Merge pull request #408 from sched-ext/bpfland-cpu-hotplug
scx_bpfland: support CPU hotplugging
2024-07-01 19:41:00 +02:00
Dan Schatzberg
32ac4b2cff
Merge pull request #389 from dschatzberg/mitosis
mitosis: Update synchronization
2024-07-01 09:44:26 -04:00
Andrea Righi
ff7a518d28 scx_bpfland: support CPU hotplugging
Implement CPU hotplugging in scx_bpfland without restarting the
scheduler.

The idle selection logic has been updated to consider online CPUs.
Additionally, a cpumask for offline CPUs has been introduced. Tasks
that have been dispatched to the DSQs associated with offline CPUs are
consumed by the other CPUs that are still online.

Moreover, the dependency on the Topology crate is temporarily dropped
and instead, /sys/devices/system/cpu/smt/active is used to determine if
SMT should be taken into account during idle selection. The Topology
crate will be re-introduced later, when scx_bpfland gains more
topology-aware capabilities.

This fixes #406.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-30 23:04:13 +02:00
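
Note (illustrative sketch, not part of the commit): the sysfs file
/sys/devices/system/cpu/smt/active mentioned in ff7a518d28 contains "1"
when SMT is active and "0" otherwise. A minimal reader could look like
the following. The function name smt_active_from is hypothetical, the
path is passed in so the function can be exercised against any file,
and treating an unreadable file as "SMT not active" is an assumption of
the sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if the file at `path` starts with '1'. A missing or
 * unreadable file is treated as "SMT not active".
 */
static bool smt_active_from(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return false;
	c = fgetc(f);
	fclose(f);
	return c == '1';
}
```

A caller would typically pass "/sys/devices/system/cpu/smt/active" once
at startup and keep the result for the idle CPU selection logic.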
Andrea Righi
f965ceb572
Merge pull request #407 from sched-ext/rusty-fix-stats-init
scx_rusty: fix stats map initialization
2024-06-30 20:03:48 +02:00
Andrea Righi
d76551bbd3 scx_rusty: fix stats map initialization
The stats map in scx_rusty is a BPF_MAP_TYPE_PERCPU_ARRAY, with its size
determined by num_possible_cpus(). Initializing it with nr_cpu_ids() can
result in errors such as:

 Error: Failed to zero stat

 Caused by:
     number of values 6 != number of cpus 8

Fix by using num_possible_cpus() to initialize it.

Fixes: 263e02f6 ("rusty: Use nr_cpu_ids instead of nr_cpus_possible")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-30 17:37:14 +02:00
Andrea Righi
14a33b6275
Merge pull request #404 from sirlucjan/config-update3
scheds: Add scx_bpfland scheduler to /etc/default/scx
2024-06-28 22:08:56 +02:00
Piotr Gorski
ee7c0cbea6
scheds: Add scx_bpfland scheduler to /etc/default/scx
Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
2024-06-28 22:01:24 +02:00
Andrea Righi
338dd0dafb
Merge pull request #403 from sched-ext/bpfland-meson
scx_bpfland: properly integrate with meson build
2024-06-28 21:58:17 +02:00
Andrea Righi
74175f5a49 scx_bpfland: properly integrate with meson build
Properly honor the meson build `serialize` option.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-28 21:37:00 +02:00
Andrea Righi
f98c35fd07
Merge pull request #388 from sched-ext/bpfland
scheds: introduce scx_bpfland
2024-06-28 21:27:43 +02:00
Andrea Righi
183b1b2cfc
Merge pull request #399 from sched-ext/meson-serialize
meson: introduce serialize build option
2024-06-28 20:13:16 +02:00
Andrea Righi
b7977f1ce3
Merge pull request #402 from sched-ext/meson-check-git-commit
meson: check if commit exists in remote git repos
2024-06-28 19:03:02 +02:00
Andrea Righi
273728fd2b meson: check if commit exists in remote git repos
When fetching external git repositories (libbpf and bpftool) we don't
check if the target commit exists.

This can lead to issues such as #400, because we may silently use HEAD
instead of the specified commit.

Prevent this by returning an error when the target SHA1 cannot be found.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-28 15:16:56 +02:00
Andrea Righi
38a05d49f8
Merge pull request #401 from CachyOS/feat/cargo-rel-with-plain
meson: run cargo build in release mode when using plain buildtype
2024-06-28 14:49:12 +02:00
Jose Fernandez
f64cdd1a51
sched_utils: Add log recorder format customization
This change adds the ability to customize the log recorder format for
each metric type. There is a default format that is used if no custom
`MetricFormatter` is provided. This is the same format that was used
before this change.

The `MetricFormatter` should be implemented by the user to customize the
format of the log recorder. The `LogRecorderBuilder` now takes a
`MetricFormatter` as an optional parameter.

Subsequent changes will allow additional customization of the log
recorder format, such as how many metrics are logged per line.

Signed-off-by: Jose Fernandez <josef@netflix.com>
2024-06-28 06:21:03 -06:00
Vladislav Nepogodin
22f13e2284
meson: run cargo build in release mode when using plain buildtype
2024-06-28 16:10:16 +04:00
Andrea Righi
657fb6a4aa
Merge pull request #400 from sched-ext/fix-bpftool
meson: restore previous libbpf version and update bpftool
2024-06-28 13:21:46 +02:00
Andrea Righi
39a06c86f1 meson: restore previous libbpf version and update bpftool
The upstream bpftool git repo (https://github.com/libbpf/bpftool.git) is
periodically force-pushed and the specific commit that we needed is no
longer available.

Instead of failing, we are actually fetching the latest bpftool (HEAD),
which introduced some breakage initially fixed by commit e59c48a6
("Update libbpf commit hash").

However, updating libbpf seems to introduce a run-time problem and all
the schedulers are failing to start:

 libbpf: failed to find skeleton map ''
 libbpf: failed to populate skeleton maps for 'bpf_bpf': -3

So, revert libbpf to the previous version and update the commit for
bpftool to a version that still allows generating a compatible BPF
skel.

Fixes: e59c48a6 ("Update libbpf commit hash")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-06-28 12:43:37 +02:00