From: Steven Rostedt <rostedt@goodmis.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Ankur Arora <ankur.a.arora@oracle.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, x86@kernel.org, akpm@linux-foundation.org,
luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com,
hpa@zytor.com, juri.lelli@redhat.com, vincent.guittot@linaro.org,
willy@infradead.org, mgorman@suse.de, jon.grimm@amd.com,
bharata@amd.com, raghavendra.kt@amd.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
jgross@suse.com, andrew.cooper3@citrix.com,
Joel Fernandes <joel@joelfernandes.org>,
Vineeth Pillai <vineethrp@google.com>,
Suleiman Souhlal <suleiman@google.com>,
Ingo Molnar <mingo@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Clark Williams <clark.williams@gmail.com>,
bigeasy@linutronix.de, daniel.wagner@suse.com,
joseph.salisbury@oracle.com, broonie@gmail.com
Subject: Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice
Date: Tue, 4 Feb 2025 10:05:55 -0500
Message-ID: <20250204100555.1a641b9b@gandalf.local.home>
In-Reply-To: <20250204081653.18dfe905@gandalf.local.home>
On Tue, 4 Feb 2025 08:16:53 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 4 Feb 2025 07:51:00 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > > I'm so confused, WTF do you then need the lazy crap?
>
> IOW, the "lazy crap" was created to solve this very issue. The holding of
> sleeping spin locks interrupted by a scheduler tick. I'm just giving user
> space the same feature that we gave the kernel in PREEMPT_RT.
>
Also, I believe it is best to follow the current preemption method and
that's what the NEED_RESCHED_LAZY gives us.
Let's say you have a low priority program, maybe even malicious, that goes
into a loop of calling a system call that can run for almost a millisecond
without sleeping.
In PREEMPT_NONE, this low priority program can cause RT tasks a latency of
a millisecond because if an RT task wakes up as the program just enters the
system call, it will have to wait for it to exit that system call for it to
run, which might be close to that millisecond.
For PREEMPT_VOLUNTARY, the RT task only has to wait until the program hits
a might_sleep() or a cond_resched() (PREEMPT_NONE would also schedule at
the cond_resched(), but we want to get rid of those).
For PREEMPT_FULL, the program shouldn't affect any other task because its
system call will simply be preempted.
Now let's look at this new feature. It allows a task to ask for some
extended time to get out of a critical section if possible.
If we decide in the future to replace PREEMPT_NONE and PREEMPT_VOLUNTARY
with a dynamic type like:
 TYPE      | Sched Tick     | RT Wakeup   | Enter user space   |
===========+================+=============+====================+
 None      | Set LAZY       | Set LAZY    | schedule           |
-----------+----------------+-------------+--------------------+
 Voluntary?| Set LAZY       | schedule    | schedule           |
-----------+----------------+-------------+--------------------+
 Full      | schedule       | schedule    | schedule           |
-----------+----------------+-------------+--------------------+
(The "Enter user space" is when a NEED_RESCHED is set)
Where in NONE, the LAZY flag is set for both the sched tick and the RT
wakeup and it doesn't schedule until it hits user space.
In "Voluntary", the LAZY flag is set only for sched tick on SCHED_OTHER
tasks, but RT tasks will get to be scheduled immediately (depending on
preempt_disable of course).
With "Full" it will schedule whenever it can.
For the task that calls that long system call, whichever type above is in
place determines the latency of other tasks.
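
As a rough sketch of the first table, the mapping from preemption type and
event to the action taken could be modeled in C as below. All of the names
(MODEL_*, EVENT_*, ACTION_*, resched_action_for()) are made up for
illustration and are not kernel identifiers; "schedule" here stands for
setting NEED_RESCHED so the task is preempted as soon as preemption is
allowed.

```c
/* Hypothetical names for illustration only; none are kernel identifiers. */
enum preempt_model { MODEL_NONE, MODEL_VOLUNTARY, MODEL_FULL };
enum resched_event { EVENT_SCHED_TICK, EVENT_RT_WAKEUP };
enum resched_action { ACTION_SET_LAZY, ACTION_SCHEDULE };

/*
 * Model of the table: what happens when the sched tick fires or an RT
 * task wakes up while another task is running in the kernel.
 */
static enum resched_action resched_action_for(enum preempt_model model,
					      enum resched_event event)
{
	switch (model) {
	case MODEL_NONE:
		/* Both events only set LAZY; schedule on entering user space. */
		return ACTION_SET_LAZY;
	case MODEL_VOLUNTARY:
		/* Only the sched tick is lazy; an RT wakeup schedules. */
		return event == EVENT_RT_WAKEUP ? ACTION_SCHEDULE
						: ACTION_SET_LAZY;
	case MODEL_FULL:
	default:
		/* Schedule whenever it can. */
		return ACTION_SCHEDULE;
	}
}
```

In this model the latency an RT task sees from the long system call is
bounded by the system call only in the rows that return ACTION_SET_LAZY for
EVENT_RT_WAKEUP, which is just the "None" row.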
Now, if we add this feature, I want it to behave the same as a long system
call: it would only extend the time where a long system call would extend
the time. That way it wouldn't modify the typical behavior of the system
for other tasks, but it would help the performance of the task that is
requesting this feature.
With this feature:
 TYPE      | Sched Tick     | RT Wakeup   | Enter user space      |
===========+================+=============+=======================+
 None      | Set LAZY       | Set LAZY    | schedule if !LAZY     |
-----------+----------------+-------------+-----------------------+
 Voluntary?| Set LAZY       | schedule    | schedule if !LAZY     |
-----------+----------------+-------------+-----------------------+
 Full      | schedule       | schedule    | schedule              |
-----------+----------------+-------------+-----------------------+
Thus, in NONE, it would likely get to extend its time just like if it
called a long system call. This can include even making RT tasks wait a
little longer, just like they would wait on a system call.
In "Voluntary", it would only get its time slice extended if it was another
SCHED_OTHER task that is to be scheduled. But if an RT task would wake up,
it would schedule immediately regardless of whether an extended time slice
was requested or not.
In "Full", it probably makes sense to simply disable this feature (the
program would see that it is disabled when it registers the rseq), as it
would never get its time slice extended, as a system call would be
preempted immediately if it was interrupted.
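
A minimal sketch of the "with this feature" table, assuming the extension
is granted only when the task asked for it and the pending reschedule was
marked LAZY. The names (MODEL_*, SOURCE_*, resched_is_lazy(),
slice_is_extended()) are hypothetical and exist only for this example; this
models the table and is not the code in the patch.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none are kernel identifiers. */
enum preempt_model { MODEL_NONE, MODEL_VOLUNTARY, MODEL_FULL };
enum resched_source { SOURCE_SCHED_TICK, SOURCE_RT_WAKEUP };

/* True when, per the table, the pending reschedule was only marked LAZY. */
static bool resched_is_lazy(enum preempt_model model,
			    enum resched_source source)
{
	if (model == MODEL_NONE)
		return true;			/* tick and RT wakeup are lazy */
	if (model == MODEL_VOLUNTARY)
		return source == SOURCE_SCHED_TICK;	/* RT wakeup is not */
	return false;				/* Full: never lazy */
}

/*
 * On entering user space: "schedule if !LAZY". The time slice is extended
 * only if the task requested it and the reschedule is a lazy one.
 */
static bool slice_is_extended(enum preempt_model model,
			      enum resched_source source, bool requested)
{
	return requested && resched_is_lazy(model, source);
}
```

With this model, slice_is_extended() is false for every "Full" case, which
is why disabling the feature there loses nothing.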
So back to your question about why I'm tying this to the "lazy crap": it is
because I want the behavior of other tasks to not change due to one task
asking for an extended time slice.
-- Steve