scx-upstream/scheds
Andrea Righi 9708a80130 scx_userland: use a custom memory allocator to prevent page faults
To prevent potential deadlock conditions under heavy loads, any
scheduler that delegates scheduling decisions to user-space should avoid
triggering page faults.

To address this issue, replace the default Rust allocator with a custom
one (RustLandAllocator), designed to operate on a pre-allocated buffer.

This, coupled with the memory locking (via mlockall), prevents page
faults from happening during the execution of the user-space scheduler,
avoiding the deadlock condition.

This memory allocator is completely transparent to the user-space
scheduler code and it is applied automatically when the bpf module is
imported.

In the future we may decide to move this allocator to a more generic
place (scx_utils crate), so that other user-space Rust schedulers can
also use it.

This initial implementation of the RustLandAllocator is very simple: a
basic block-based allocator that uses an array to track the status of
each memory block (allocated or free).
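
As a rough illustration of the scheme described above (a pre-allocated buffer split into fixed-size blocks, with an array recording whether each block is allocated or free), here is a minimal sketch. It is not the RustLandAllocator code: the name `BlockArena`, the constants `BLOCK_SIZE` and `NUM_BLOCKS`, and the offset-based API are assumptions made for this example, and it is not wired into Rust's global allocator interface.

```rust
/// Hypothetical sizes, chosen only for illustration.
pub const BLOCK_SIZE: usize = 64;
pub const NUM_BLOCKS: usize = 16;

/// A fixed arena split into equal blocks, with one status flag per block.
pub struct BlockArena {
    /// Pre-allocated backing memory; it never grows at run time.
    buffer: [u8; BLOCK_SIZE * NUM_BLOCKS],
    /// Status array: true = allocated, false = free.
    used: [bool; NUM_BLOCKS],
}

impl BlockArena {
    pub const fn new() -> Self {
        BlockArena {
            buffer: [0; BLOCK_SIZE * NUM_BLOCKS],
            used: [false; NUM_BLOCKS],
        }
    }

    /// Reserve the first run of contiguous free blocks that can hold
    /// `size` bytes. Returns the byte offset of the run, or None if the
    /// request is empty or no run is large enough.
    pub fn alloc(&mut self, size: usize) -> Option<usize> {
        if size == 0 {
            return None;
        }
        // Round the request up to a whole number of blocks.
        let needed = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
        if needed > NUM_BLOCKS {
            return None;
        }
        let mut run = 0;
        for i in 0..NUM_BLOCKS {
            // Count consecutive free blocks; a used block resets the run.
            run = if self.used[i] { 0 } else { run + 1 };
            if run == needed {
                let start = i + 1 - needed;
                for flag in &mut self.used[start..=i] {
                    *flag = true;
                }
                return Some(start * BLOCK_SIZE);
            }
        }
        None
    }

    /// Mark the blocks covering `size` bytes at `offset` as free again.
    /// Panics if the range lies outside the arena.
    pub fn free(&mut self, offset: usize, size: usize) {
        let start = offset / BLOCK_SIZE;
        let needed = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
        for flag in &mut self.used[start..start + needed] {
            *flag = false;
        }
    }

    /// Borrow the bytes of a previously allocated region.
    pub fn bytes_mut(&mut self, offset: usize, size: usize) -> &mut [u8] {
        &mut self.buffer[offset..offset + size]
    }
}
```

The linear scan over the status array keeps the bookkeeping trivial, which is a reasonable trade-off when requests are mostly small and uniformly sized. A real allocator would also have to honor alignment and implement the `GlobalAlloc` trait, both of which this sketch leaves out.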

This allocator can be improved in the future, but right now, despite its
simplicity, it shows a reasonable speed and efficiency in meeting memory
requests from the user-space scheduler, having to deal mostly with small
and uniformly sized allocations.

With this change in place, scx_rustland survived more than 10 hours on a
heavily stressed system (with stress-ng and kernel builds running in a
loop):

 $ ps -o pid,rss,etime,cmd -p `pidof scx_rustland`
     PID   RSS     ELAPSED CMD
   34966 75840    10:00:44 ./build/scheds/rust/scx_rustland/debug/scx_rustland

Without this change, it is possible to trigger the sched-ext watchdog
timeout in less than 5 minutes under the same system load conditions.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
2024-01-14 22:07:37 +01:00
c                  scx_flatcg: Fix fallout from direct dispatch API update                  2024-01-10 10:57:50 -10:00
include            scx: Build fix after kernel update                                       2024-01-08 14:48:24 -10:00
rust               scx_userland: use a custom memory allocator to prevent page faults       2024-01-14 22:07:37 +01:00
meson.build        Restructure scheds folder names                                          2023-12-17 13:14:31 -08:00
README.md          Restructure scheds folder names                                          2023-12-17 13:14:31 -08:00
sync-to-kernel.sh  Restructure scheds folder names                                          2023-12-17 13:14:31 -08:00

SCHED_EXT SCHEDULERS

Introduction

This directory contains the repo's schedulers.

Some of these schedulers are simply examples of different types of schedulers that can be built using sched_ext. They can be loaded and used to schedule on your system, but their primary purpose is to illustrate how various features of sched_ext can be used.

Other schedulers are performant and production-ready. That is, for the right workload and with the right tuning, they may be deployed in a production environment with acceptable or possibly even improved performance. Some of the examples could be improved to become production schedulers.

Please see the following README files for details on each of the various types of schedulers:

  • rust describes all of the schedulers with Rust user-space components. All of these schedulers are production ready.
  • c describes all of the schedulers with C user-space components. All of these schedulers are production ready.

Note on syncing

This directory contains a sync-to-kernel.sh script, which is used to sync changes to the schedulers with the Linux kernel tree. If you've made any changes to a scheduler, please use the script to synchronize them with the sched_ext Linux kernel tree:

$ ./sync-to-kernel.sh /path/to/kernel/tree