linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
       [not found]       ` <20230118173130.4n2b3cs4pxiqnqd3@techsingularity.net>
@ 2023-01-19  1:15         ` Hillf Danton
  2023-01-19  8:32           ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 7+ messages in thread
From: Hillf Danton @ 2023-01-19  1:15 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Davidlohr Bueso, linux-mm, LKML

On Wed, 18 Jan 2023 17:31:30 +0000 Mel Gorman <mgorman@techsingularity.net>
> On Wed, Jan 18, 2023 at 04:25:57PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2023-01-17 16:50:21 [+0000], Mel Gorman wrote:
> > 
> > > diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
> > > index c201aadb9301..99d81e8d1f25 100644
> > > --- a/kernel/locking/rwbase_rt.c
> > > +++ b/kernel/locking/rwbase_rt.c
> > > @@ -65,6 +69,64 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
> > >  	return 0;
> > >  }
> > >  
> > > +/*
> > > + * Allow reader bias with a pending writer for a minimum of 4ms or 1 tick.
> > > + * This matches RWSEM_WAIT_TIMEOUT for the generic RWSEM implementation.
> > > + * The granularity is not exact as the lowest bit in rwbase_rt->waiter_timeout
> > > + * is used to detect recent DL / RT tasks taking a read lock.
> > > + */
> > > +#define RWBASE_RT_WAIT_TIMEOUT DIV_ROUND_UP(HZ, 250)
> > > +
> > > +static void __sched update_dlrt_reader(struct rwbase_rt *rwb)
> > > +{
> > > +	/* No update required if DL / RT tasks already identified. */
> > > +	if (rwb->waiter_timeout & 1)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * Record a DL / RT task acquiring the lock for read. This may result
> > > +	 * in indefinite writer starvation but DL / RT tasks should avoid such
> > > +	 * behaviour.
> > > +	 */
> > > +	if (rt_task(current)) {
> > > +		struct rt_mutex_base *rtm = &rwb->rtmutex;
> > > +		unsigned long flags;
> > > +
> > > +		raw_spin_lock_irqsave(&rtm->wait_lock, flags);
> > > +		rwb->waiter_timeout |= 1;
> > 
> > Let me see of I parsed the whole logic right:
> > 
> > _After_ the RT reader acquired the lock, the lowest bit is set. This may
> > be immediately if the timeout did not occur yet.
> > With this flag set, all following reader incl. SCHED_OTHER will acquire
> > the lock.
> > 
> 
> Correct. The intent was to track if there were any DL / RT tasks since
> the last write unlock in case there was a mix of SCHED_OTHER and rt_tasks
> contending on the same lock with unknown arrival times.
> 
> > If so, then I don't know why this is a good idea.
> > 
> > If _only_ the RT reader is allowed to acquire the lock while the writer
> > is waiting then it make sense to prefer the RT tasks. (So the check is
> > on current and not on the lowest bit).
> > All other (SCHED_OTHER) reader would have to block on the rtmutex after
> > the timeout. This makes sense to avoid the starvation.
> > 
> 
> Ok, fair enough.
> 
> > If we drop that "we prefer the RT reader" then it would block on the
> > RTmutex. It will _still_ be preferred over the writer because it will be
> > enqueued before the writer in the queue due to its RT priority. The only
> > downside is that it has to wait until all readers are left.
> 
> The writer has to wait until all the readers have left anyway.
> 
> > So by allowing the RT reader to always acquire the lock as long as the
> > WRITER_BIAS isn't set, we would allow to enter early while the other
> > reader are still in and after the timeout you would only have RT reader
> > going in and out. All SCHED_OTHER reader block on the RTmutex.
> > 
> > I think I like this.
> > 
> 
> If I understand you correctly, the patch becomes this;
> 
> --8<--
> locking/rwbase: Prevent indefinite writer starvation
> 
> rw_semaphore and rwlock are explicitly unfair to writers in the presense
> of readers by design with a PREEMPT_RT configuration. Commit 943f0edb754f
> ("locking/rt: Add base code for RT rw_semaphore and rwlock") notes;
> 
>         The implementation is writer unfair, as it is not feasible to do
>         priority inheritance on multiple readers, but experience has shown
>         that real-time workloads are not the typical workloads which are
>         sensitive to writer starvation.
> 
> While atypical, it's also trivial to block writers with PREEMPT_RT
> indefinitely without ever making forward progress. Since LTP-20220121,
> the dio_truncate test case went from having 1 reader to having 16 readers
> and the number of readers is sufficient to prevent the down_write ever
> succeeding while readers exist. Eventually the test is killed after 30
> minutes as a failure.
> 
> dio_truncate is not a realtime application but indefinite writer starvation
> is undesirable. The test case has one writer appending and truncating files
> A and B while multiple readers read file A. The readers and writer are
> contending for one file's inode lock which never succeeds as the readers
> keep reading until the writer is done which never happens.
> 
> This patch records a timestamp when the first writer is blocked. DT /
> RT tasks can continue to take the lock for read as long as readers exist
> indefinitely. Other readers can acquire the read lock unless a writer
> has been blocked for a minimum of 4ms. This is sufficient to allow the
> dio_truncate test case to complete within the 30 minutes timeout.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  include/linux/rwbase_rt.h  |  3 +++
>  kernel/locking/rwbase_rt.c | 45 ++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/rwbase_rt.h b/include/linux/rwbase_rt.h
> index 1d264dd08625..b969b1d9bb85 100644
> --- a/include/linux/rwbase_rt.h
> +++ b/include/linux/rwbase_rt.h
> @@ -10,12 +10,14 @@
>  
>  struct rwbase_rt {
>  	atomic_t		readers;
> +	unsigned long		waiter_timeout;
>  	struct rt_mutex_base	rtmutex;
>  };
>  
>  #define __RWBASE_INITIALIZER(name)				\
>  {								\
>  	.readers = ATOMIC_INIT(READER_BIAS),			\
> +	.waiter_timeout = 0,					\
>  	.rtmutex = __RT_MUTEX_BASE_INITIALIZER(name.rtmutex),	\
>  }
>  
> @@ -23,6 +25,7 @@ struct rwbase_rt {
>  	do {							\
>  		rt_mutex_base_init(&(rwbase)->rtmutex);		\
>  		atomic_set(&(rwbase)->readers, READER_BIAS);	\
> +		(rwbase)->waiter_timeout = 0;			\
>  	} while (0)
>  
>  
> diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
> index c201aadb9301..84c5e4e4d25b 100644
> --- a/kernel/locking/rwbase_rt.c
> +++ b/kernel/locking/rwbase_rt.c
> @@ -39,7 +39,10 @@
>   * major surgery for a very dubious value.
>   *
>   * The risk of writer starvation is there, but the pathological use cases
> - * which trigger it are not necessarily the typical RT workloads.
> + * which trigger it are not necessarily the typical RT workloads. SCHED_OTHER
> + * reader acquisitions will be forced into the slow path if a writer is
> + * blocked for more than RWBASE_RT_WAIT_TIMEOUT jiffies. New DL / RT readers
> + * can still starve a writer indefinitely.
>   *
>   * Fast-path orderings:
>   * The lock/unlock of readers can run in fast paths: lock and unlock are only
> @@ -65,6 +68,35 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
>  	return 0;
>  }
>  
> +/*
> + * Allow reader bias for SCHED_OTHER tasks with a pending writer for a
> + * minimum of 4ms or 1 tick. This matches RWSEM_WAIT_TIMEOUT for the
> + * generic RWSEM implementation.
> + */
> +#define RWBASE_RT_WAIT_TIMEOUT DIV_ROUND_UP(HZ, 250)
> +
> +/* rtmutex->wait_lock must be held. */
> +static void __sched set_writer_blocked(struct rwbase_rt *rwb)
> +{
> +	/* Record the timeout based on the the first writer to block. */
> +	if (!rwb->waiter_timeout)
> +		rwb->waiter_timeout = jiffies + RWBASE_RT_WAIT_TIMEOUT;
> +}
> +
> +static bool __sched rwbase_allow_reader_bias(struct rwbase_rt *rwb)
> +{
> +	/*
> +	 * Allow reader bias for DL / RT tasks. Such tasks should be
> +	 * designed to avoid heavy writer contention or indefinite
> +	 * starvation.
> +	 */
> +	if (rt_task(current))
> +		return true;
> +
> +	/* Allow reader bias unless a writer timeout is reached. */
> +	return time_before(jiffies, rwb->waiter_timeout);
> +}

Good to see the change in your mind since last post.

But starvation remains same given RT contenders, even if prevent is put
on your subject line.

What sense could the 4ms deadline make at best if RT contenders extend
it to a week?
What will happen if RT contenders are hurt by the starvation deadline?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-19  1:15         ` [PATCH v2] locking/rwbase: Prevent indefinite writer starvation Hillf Danton
@ 2023-01-19  8:32           ` Sebastian Andrzej Siewior
  2023-01-19 13:59             ` Hillf Danton
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-01-19  8:32 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On 2023-01-19 09:15:38 [+0800], Hillf Danton wrote:
> But starvation remains same given RT contenders, even if prevent is put
> on your subject line.
>
> What sense could the 4ms deadline make at best if RT contenders extend
> it to a week?
> What will happen if RT contenders are hurt by the starvation deadline?

As far as Mel's efforts go, I am satisfied so far. Maybe you can improve
it further afterwards.

Sebastian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-19  8:32           ` Sebastian Andrzej Siewior
@ 2023-01-19 13:59             ` Hillf Danton
  2023-01-19 16:36               ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 7+ messages in thread
From: Hillf Danton @ 2023-01-19 13:59 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On Thu, 19 Jan 2023 09:32:22 +0100 Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> As far as Mel's efforts go, I am satisfied so far.

If not because you can, could you specify why 4ms fails to cure starvation?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-19 13:59             ` Hillf Danton
@ 2023-01-19 16:36               ` Sebastian Andrzej Siewior
  2023-01-20  9:37                 ` Hillf Danton
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-01-19 16:36 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On 2023-01-19 21:59:03 [+0800], Hillf Danton wrote:
> On Thu, 19 Jan 2023 09:32:22 +0100 Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > 
> > As far as Mel's efforts go, I am satisfied so far.
> 
> If not because you can, could you specify why 4ms fails to cure starvation?

It does not fail to cure the starvation. I haven't tested it myself but
base on Mel's description and the patch it very much looks like it cures
the writer starvation.

If you don't like the 4ms, it could be 1ms or 40ms - it does not really
matter. The 4ms is aligned on the generic implementation which uses the
same value. Unless there is strong evidence to use something else I
don't see the need to diverse.

Sebastian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-19 16:36               ` Sebastian Andrzej Siewior
@ 2023-01-20  9:37                 ` Hillf Danton
  2023-01-20 18:34                   ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 7+ messages in thread
From: Hillf Danton @ 2023-01-20  9:37 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On Thu, 19 Jan 2023 17:36:22 +0100 Sebastian Andrzej Siewior
> On 2023-01-19 21:59:03 [+0800], Hillf Danton wrote:
> > On Thu, 19 Jan 2023 09:32:22 +0100 Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > > 
> > > As far as Mel's efforts go, I am satisfied so far.
> > 
> > If not because you can, could you specify why 4ms fails to cure starvation?
> 
> It does not fail to cure the starvation. I haven't tested it myself but
> base on Mel's description and the patch it very much looks like it cures
> the writer starvation.
> 
> If you don't like the 4ms, it could be 1ms or 40ms - it does not really
> matter. The 4ms is aligned on the generic implementation which uses the
> same value. Unless there is strong evidence to use something else I
> don't see the need to diverse.

I am fine with either 4ms or 40ms, or a second.

Given the cure, does it still work when reader bias for RT tasks is allowed?
If not, why keep starving waiters after they pay the 40ms price?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-20  9:37                 ` Hillf Danton
@ 2023-01-20 18:34                   ` Sebastian Andrzej Siewior
  2023-01-21  3:46                     ` Hillf Danton
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-01-20 18:34 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On 2023-01-20 17:37:11 [+0800], Hillf Danton wrote:
> 
> I am fine with either 4ms or 40ms, or a second.
> 
> Given the cure, does it still work when reader bias for RT tasks is allowed?
No.

> If not, why keep starving waiters after they pay the 40ms price?

That kind of starvation will also happen if you have only spinlock_t
locks and you say 3 RT tasks that acquire the lock back to back. And a
few SCHED_OTHER tasks. Those 3 will be always be in front of the queue
(as they skip the line) and the following SCHED_OTHER tasks will starve
and never get the lock.

So it is basically the same scenario.

Sebastian


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] locking/rwbase: Prevent indefinite writer starvation
  2023-01-20 18:34                   ` Sebastian Andrzej Siewior
@ 2023-01-21  3:46                     ` Hillf Danton
  0 siblings, 0 replies; 7+ messages in thread
From: Hillf Danton @ 2023-01-21  3:46 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Mel Gorman, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Davidlohr Bueso, linux-mm, LKML

On Fri, 20 Jan 2023 19:34:02 +0100 Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> On 2023-01-20 17:37:11 [+0800], Hillf Danton wrote:
> > 
> > I am fine with either 4ms or 40ms, or a second.
> > 
> > Given the cure, does it still work when reader bias for RT tasks is allowed?
> No.
> 
> > If not, why keep starving waiters after they pay the 40ms price?
> 
> That kind of starvation will also happen if you have only spinlock_t
> locks and you say 3 RT tasks that acquire the lock back to back. And a
> few SCHED_OTHER tasks. Those 3 will be always be in front of the queue
> (as they skip the line) and the following SCHED_OTHER tasks will starve
> and never get the lock.

Given priority, what sense could be made by keeping RT task starved?


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-01-21  3:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20230117083817.togfwc5cy4g67e5r@techsingularity.net>
     [not found] ` <Y8avJm1FQI9vB9cv@linutronix.de>
     [not found]   ` <20230117165021.t5m7c2d6frbbfzig@techsingularity.net>
     [not found]     ` <Y8gPhTGkfCbGwoUu@linutronix.de>
     [not found]       ` <20230118173130.4n2b3cs4pxiqnqd3@techsingularity.net>
2023-01-19  1:15         ` [PATCH v2] locking/rwbase: Prevent indefinite writer starvation Hillf Danton
2023-01-19  8:32           ` Sebastian Andrzej Siewior
2023-01-19 13:59             ` Hillf Danton
2023-01-19 16:36               ` Sebastian Andrzej Siewior
2023-01-20  9:37                 ` Hillf Danton
2023-01-20 18:34                   ` Sebastian Andrzej Siewior
2023-01-21  3:46                     ` Hillf Danton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox