Crate scx_rlfifo


§Round-Robin Linux kernel scheduler that runs in user-space

§Overview

This is a fully functional Round-Robin scheduler for the Linux kernel that runs in user-space and is implemented entirely in Rust.

It dequeues tasks in FIFO order and assigns dynamic time slices, preempting and re-enqueuing tasks to achieve basic Round-Robin behavior.
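The behavior described above can be illustrated with a small, self-contained simulation. This is only a sketch and does not use the real scheduler: the `SimTask` type, the `round_robin` function, and the policy of dividing a base slice by the number of waiting tasks are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical task model: a pid plus the CPU time it still needs.
#[derive(Debug, Clone, PartialEq)]
struct SimTask {
    pid: i32,
    remaining_ns: u64,
}

/// Runs the queue to completion and returns the order in which tasks
/// were given the CPU.
fn round_robin(mut queue: VecDeque<SimTask>, base_slice_ns: u64) -> Vec<i32> {
    let mut order = Vec::new();
    // Tasks are always taken from the front of the queue (FIFO order).
    while let Some(mut task) = queue.pop_front() {
        // Assumed dynamic slice: the more tasks are waiting, the shorter
        // the slice (never shorter than 1 ns, so the loop always advances).
        let nr_waiting = queue.len() as u64;
        let slice_ns = (base_slice_ns / (nr_waiting + 1)).max(1);
        order.push(task.pid);
        // "Run" the task for one slice, then preempt it.
        task.remaining_ns = task.remaining_ns.saturating_sub(slice_ns);
        if task.remaining_ns > 0 {
            // Unfinished tasks go back to the tail of the queue.
            queue.push_back(task);
        }
    }
    order
}
```

Calling `round_robin` with two tasks where the first needs more than one slice shows the interleaving: the first task runs, is preempted and re-enqueued behind the second, and then runs again once the second has finished.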

The scheduler is designed to serve as a simple template for developers looking to implement more advanced scheduling policies.

It is based on scx_rustland_core, a framework designed to simplify the creation of user-space schedulers. The framework builds on the Linux kernel’s sched_ext feature, which allows schedulers to be implemented in BPF.

The scx_rustland_core crate offers an abstraction layer over sched_ext, enabling developers to write schedulers in Rust without needing to interact directly with low-level kernel or BPF internal details.

§scx_rustland_core API

§struct BpfScheduler

The BpfScheduler struct is the core interface for interacting with sched_ext via BPF.

  • Initialization:

    • BpfScheduler::init() registers the scheduler and initializes the BPF component.
  • Task Management:

    • dequeue_task(): Consume a task that wants to run; returns a QueuedTask object
    • select_cpu(pid: i32, prev_cpu: i32, flags: u64): Select an idle CPU for a task
    • dispatch_task(task: &DispatchedTask): Dispatch a task
  • Completion Notification:

    • notify_complete(nr_pending: u64): Give control to the BPF component and report the number of tasks that are still pending (this function can sleep)

Each task received from dequeue_task() contains the following:

struct QueuedTask {
    pub pid: i32,              // pid that uniquely identifies a task
    pub cpu: i32,              // CPU previously used by the task
    pub flags: u64,            // task’s enqueue flags
    pub sum_exec_runtime: u64, // Total cpu time in nanoseconds
    pub weight: u64,           // Task priority in the range [1..10000] (default is 100)
    pub nvcsw: u64,            // Total amount of voluntary context switches
    pub slice: u64,            // Remaining time slice budget
    pub vtime: u64,            // Current task vruntime / deadline (set by the scheduler)
}

Each task dispatched using dispatch_task() contains the following:

struct DispatchedTask {
    pub pid: i32,      // pid that uniquely identifies a task
    pub cpu: i32,      // target CPU selected by the scheduler
                       // (RL_CPU_ANY = dispatch on the first CPU available)
    pub flags: u64,    // task’s enqueue flags
    pub slice_ns: u64, // time slice in nanoseconds assigned to the task
                       // (0 = use default time slice)
    pub vtime: u64,    // this value can be used to send the task’s vruntime or deadline
                       // directly to the underlying BPF dispatcher
}
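To show how the calls listed above might fit together, here is a minimal sketch of one scheduling pass. It does not link against scx_rustland_core: `MockBpf` is a hypothetical in-memory stand-in, its method return types are simplified assumptions, the two structs are reduced copies of the documented ones, and the values of `RL_CPU_ANY` and `SLICE_NS` are placeholders chosen for this example.

```rust
use std::collections::VecDeque;

// Placeholder values for this sketch; the real constants live in the crates.
const RL_CPU_ANY: i32 = 1 << 20;
const SLICE_NS: u64 = 5_000_000;

// Reduced copies of the documented structs (only the fields used here).
struct QueuedTask { pid: i32, cpu: i32, flags: u64, vtime: u64 }

#[derive(Clone)]
struct DispatchedTask { pid: i32, cpu: i32, flags: u64, slice_ns: u64, vtime: u64 }

/// Hypothetical stand-in for BpfScheduler, backed by plain collections.
struct MockBpf {
    queued: VecDeque<QueuedTask>,
    idle_cpus: Vec<i32>,
    dispatched: Vec<DispatchedTask>,
    last_nr_pending: Option<u64>,
}

impl MockBpf {
    fn new(pids: &[i32], idle_cpus: Vec<i32>) -> Self {
        let queued = pids
            .iter()
            .map(|&pid| QueuedTask { pid, cpu: 0, flags: 0, vtime: 0 })
            .collect();
        MockBpf { queued, idle_cpus, dispatched: Vec::new(), last_nr_pending: None }
    }
    // Assumed simplification: returns None when no task is waiting.
    fn dequeue_task(&mut self) -> Option<QueuedTask> { self.queued.pop_front() }
    // Assumed convention: a negative value means no idle CPU was found.
    fn select_cpu(&mut self, _pid: i32, _prev_cpu: i32, _flags: u64) -> i32 {
        self.idle_cpus.pop().unwrap_or(-1)
    }
    fn dispatch_task(&mut self, task: &DispatchedTask) { self.dispatched.push(task.clone()); }
    fn notify_complete(&mut self, nr_pending: u64) { self.last_nr_pending = Some(nr_pending); }
}

/// One pass: drain the queue in FIFO order, then hand control back.
fn schedule_once(bpf: &mut MockBpf) {
    // Sample the backlog once so every task in this pass gets the same slice.
    let nr_waiting = bpf.queued.len() as u64;
    while let Some(task) = bpf.dequeue_task() {
        let cpu = bpf.select_cpu(task.pid, task.cpu, task.flags);
        let dispatched = DispatchedTask {
            pid: task.pid,
            // Fall back to "any CPU" when no idle CPU is available.
            cpu: if cpu >= 0 { cpu } else { RL_CPU_ANY },
            flags: task.flags,
            // Assumed policy: shorter slices when more tasks are waiting.
            slice_ns: SLICE_NS / (nr_waiting + 1),
            vtime: task.vtime,
        };
        bpf.dispatch_task(&dispatched);
    }
    // Everything was dispatched, so no tasks remain pending.
    bpf.notify_complete(0);
}
```

Because every queued task is dispatched within the pass, the sketch reports zero pending tasks to `notify_complete()`. A policy that holds tasks back in user-space would report the size of its own backlog instead.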

Other internal statistics that can be used to implement better scheduling policies:

let n: u64 = *self.bpf.nr_online_cpus_mut();       // amount of online CPUs
let n: u64 = *self.bpf.nr_running_mut();           // amount of currently running tasks
let n: u64 = *self.bpf.nr_queued_mut();            // amount of tasks queued to be scheduled
let n: u64 = *self.bpf.nr_scheduled_mut();         // amount of tasks managed by the user-space scheduler
let n: u64 = *self.bpf.nr_user_dispatches_mut();   // amount of user-space dispatches
let n: u64 = *self.bpf.nr_kernel_dispatches_mut(); // amount of kernel dispatches
let n: u64 = *self.bpf.nr_cancel_dispatches_mut(); // amount of cancelled dispatches
let n: u64 = *self.bpf.nr_bounce_dispatches_mut(); // amount of bounced dispatches
let n: u64 = *self.bpf.nr_failed_dispatches_mut(); // amount of failed dispatches
let n: u64 = *self.bpf.nr_sched_congested_mut();   // amount of scheduler congestion events
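As one possible use of these counters, a policy could shrink the time slice as the backlog grows. The sketch below is an assumption-laden illustration rather than code from the crate: `MockStats` is a hypothetical stand-in that only mimics the `&mut u64` accessor pattern shown above, and `scaled_slice_ns` with its lower bound `min_ns` is an invented helper.

```rust
/// Hypothetical stand-in exposing two counters through `_mut()` accessors.
struct MockStats {
    nr_queued: u64,
    nr_scheduled: u64,
}

impl MockStats {
    fn nr_queued_mut(&mut self) -> &mut u64 { &mut self.nr_queued }
    fn nr_scheduled_mut(&mut self) -> &mut u64 { &mut self.nr_scheduled }
}

/// Divides a base slice by the number of waiting tasks, never going
/// below `min_ns` (the lower bound is an assumption of this sketch).
fn scaled_slice_ns(stats: &mut MockStats, base_ns: u64, min_ns: u64) -> u64 {
    // Copy each counter out before taking the next mutable borrow.
    let nr_queued = *stats.nr_queued_mut();
    let nr_scheduled = *stats.nr_scheduled_mut();
    let nr_waiting = nr_queued + nr_scheduled;
    (base_ns / (nr_waiting + 1)).max(min_ns)
}
```

The lower bound keeps slices from collapsing toward zero under heavy load, where very short slices would mostly add context-switch overhead.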

Modules§

bpf 🔒
bpf_intf
bpf_skel 🔒
types

Structs§

BpfLinks
BpfMaps
BpfProgs
BpfSkel
BpfSkelBuilder
OpenBpfMaps
OpenBpfProgs
OpenBpfSkel
Scheduler 🔒
StructOps

Constants§

SLICE_NS 🔒

Functions§

main 🔒
print_warning 🔒