linux-mm.kvack.org archive mirror
From: Hillf Danton <hdanton@sina.com>
To: Waiman Long <longman@redhat.com>, Eric Dumazet <edumazet@google.com>
Cc: linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] locking/rwlocks: do not starve writers
Date: Sat, 18 Jun 2022 16:43:22 +0800
Message-ID: <20220618084322.81-1-hdanton@sina.com>
In-Reply-To: <980387bb-f46d-e9ab-96b0-7293df08c447@redhat.com>

On Fri, 17 Jun 2022 14:57:55 -0400 Waiman Long wrote:
>On 6/17/22 13:45, Eric Dumazet wrote:
>> On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@redhat.com> wrote:
>>> On 6/17/22 11:24, Eric Dumazet wrote:
>>>> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
>>>>> On 6/17/22 10:57, Shakeel Butt wrote:
>>>>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>>>>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>>>>         /*
>>>>>>>>>          * Readers come here when they cannot get the lock without waiting
>>>>>>>>>          */
>>>>>>>>> -    if (unlikely(in_interrupt())) {
>>>>>>>>> -            /*
>>>>>>>>> -             * Readers in interrupt context will get the lock immediately
>>>>>>>>> -             * if the writer is just waiting (not holding the lock yet),
>>>>>>>>> -             * so spin with ACQUIRE semantics until the lock is available
>>>>>>>>> -             * without waiting in the queue.
>>>>>>>>> -             */
>>>>>>>>> -            atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>>>>> -            return;
>>>>>>>>> -    }
>>>>>>>>>         atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>>>>
>>>>>>>>>         trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>>>>> This is known to break tasklist_lock.
>>>>>>>>
>>>>>>> We certainly can't break the current usage of tasklist_lock.
>>>>>>>
>>>>>>> I am aware of this problem with networking code and am thinking about
>>>>>>> either relaxing the check to exclude softirq or providing a
>>>>>>> read_lock_unfair() variant for networking use.
>>>>>> read_lock_unfair() for networking use or tasklist_lock use?
>>>>> I meant to say read_lock_fair(), but it could also be the other way
>>>>> around. Thanks for spotting that.
>>>>>
>>>> If only tasklist_lock is problematic and needs the unfair variant,
>>>> then changing a few read_lock() for tasklist_lock will be less
>>>> invasive than ~1000 read_lock() elsewhere....
>>> After a second thought, I think the right way is to introduce a fair
>>> variant, if needed. If an arch isn't using qrwlock, the native rwlock
>>> implementation will be unfair. In that sense, unfair rwlock is the
>>> default. We will only need to change the relevant network read_lock()
>>> calls to use the fair variant which will still be unfair if qrwlock
>>> isn't used. We are not going to touch other read_lock() calls that don't
>>> care about fair or unfair.
>>>
>> Hmm... backporting this kind of invasive change to stable kernels will
>> be a daunting task.
>>
>> Were rwlocks always unfair, and we have been lucky ?
>>
> Yes, rwlocks were always unfair and always had this kind of soft
> lockup problem and scalability problem because of cacheline bouncing.
> That was the reason for creating qrwlock, which can at least provide a fair
> rwlock in task context. Now we have systems with more and more CPUs, and
> that is why you are seeing it all over again with the
> networking code.

No fair play without paying the price.

If a writer wants to play a fair game with readers, it has to wait in the
queue for at least a tick and then force readers onto the slow path by
setting _QW_WAITING.
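
To make the reader side of that concrete, here is a rough user-space sketch of
the decision a reader would make on entering the slow path under this idea. It
is not kernel code: reader_slowpath_choice() and enum reader_path are
hypothetical names invented for the illustration, and the _QW_WAITING value is
assumed to match include/asm-generic/qrwlock.h.

```c
#include <stdbool.h>

/* Assumed to match include/asm-generic/qrwlock.h; check your tree. */
#define _QW_WAITING 0x100u	/* a writer is waiting */

/* Hypothetical names, used only in this sketch. */
enum reader_path {
	READER_SPIN_UNQUEUED,	/* spin until the writer-locked byte clears */
	READER_QUEUE,		/* drop the reader bias and wait in the queue */
};

/*
 * Pure model of the reader slow-path entry: once a writer has advertised
 * itself with _QW_WAITING, every reader queues, including readers in
 * interrupt context. Otherwise interrupt-context readers keep the
 * existing queue bypass.
 */
static enum reader_path reader_slowpath_choice(unsigned int cnts, bool in_irq)
{
	if (cnts & _QW_WAITING)
		return READER_QUEUE;
	return in_irq ? READER_SPIN_UNQUEUED : READER_QUEUE;
}
```

The only behavioural difference from the current slow path in this model is
the interrupt-context case with _QW_WAITING set.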

Just food for thought for now.

Hillf

+++ b/kernel/locking/qrwlock.c
@@ -22,16 +22,20 @@ void queued_read_lock_slowpath(struct qr
 	/*
 	 * Readers come here when they cannot get the lock without waiting
 	 */
+	if (_QW_WAITING & atomic_read(&lock->cnts))
+		goto queue;
+
 	if (unlikely(in_interrupt())) {
 		/*
 		 * Readers in interrupt context will get the lock immediately
-		 * if the writer is just waiting (not holding the lock yet),
+		 * if the writer has not waited long enough,
 		 * so spin with ACQUIRE semantics until the lock is available
 		 * without waiting in the queue.
 		 */
 		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
 		return;
 	}
+queue:
 	atomic_sub(_QR_BIAS, &lock->cnts);
 
 	/*
@@ -60,6 +64,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath)
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	unsigned long start;
+	int qw;
 	int cnts;
 
 	/* Put the writer into the wait queue */
@@ -70,12 +76,20 @@ void queued_write_lock_slowpath(struct q
 	    atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))
 		goto unlock;
 
-	/* Set the waiting flag to notify readers that a writer is pending */
-	atomic_or(_QW_WAITING, &lock->cnts);
-
+	start = jiffies;
+	qw = 0;
 	/* When no more readers or writers, set the locked flag */
 	do {
-		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+		if (qw == 0 && time_after(jiffies, start + 2)) {
+			qw = _QW_WAITING;
+			/*
+			 * Set the waiting flag to notify readers that a writer is pending
+			 * only after waiting long enough - that is the price a writer
+			 * pays for fairness
+			 */
+			atomic_or(_QW_WAITING, &lock->cnts);
+		}
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == qw);
 	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
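
For anyone who wants to play with the write side outside the kernel, below is
a minimal user-space model using C11 atomics. It is a sketch under stated
assumptions, not the patch itself: writer_poll(), struct writer_wait and
ticks_after() are hypothetical names, the tick counter is passed in by the
caller instead of reading jiffies, and the flag values are assumed to match
include/asm-generic/qrwlock.h. Unlike the loop above, which blocks in
atomic_cond_read_relaxed(), the model performs one non-blocking polling step
per call.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed to match include/asm-generic/qrwlock.h; check your tree. */
#define _QW_WAITING 0x100u	/* a writer is waiting */
#define _QW_LOCKED  0x0ffu	/* a writer holds the lock */

/* Hypothetical per-writer wait state for this sketch. */
struct writer_wait {
	unsigned long start;	/* tick at which the writer began waiting */
	unsigned int qw;	/* 0 until the waiting flag has been published */
};

/* Wrap-safe "a is later than b", the same idea as the kernel's time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * One polling step of the writer. Returns true once the lock word has been
 * switched to _QW_LOCKED. The waiting flag is published only after more
 * than two ticks have passed since w->start.
 */
static bool writer_poll(atomic_uint *cnts, struct writer_wait *w,
			unsigned long now)
{
	unsigned int expected;

	if (w->qw == 0 && ticks_after(now, w->start + 2)) {
		w->qw = _QW_WAITING;
		atomic_fetch_or(cnts, _QW_WAITING);
	}

	/* Succeeds only when no readers and no other writer bits remain. */
	expected = w->qw;
	return atomic_compare_exchange_strong_explicit(cnts, &expected,
						       _QW_LOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

Note that the model re-checks the clock on every poll. In the diff above the
timeout test sits outside atomic_cond_read_relaxed(), so it may be worth
checking whether it is ever reached while readers keep the count non-zero.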

