From: Hillf Danton <hdanton@sina.com>
To: Waiman Long <longman@redhat.com>, Eric Dumazet <edumazet@google.com>
Cc: linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] locking/rwlocks: do not starve writers
Date: Sat, 18 Jun 2022 16:43:22 +0800 [thread overview]
Message-ID: <20220618084322.81-1-hdanton@sina.com> (raw)
In-Reply-To: <980387bb-f46d-e9ab-96b0-7293df08c447@redhat.com>
On Fri, 17 Jun 2022 14:57:55 -0400 Waiman Long wrote:
>On 6/17/22 13:45, Eric Dumazet wrote:
>> On Fri, Jun 17, 2022 at 7:42 PM Waiman Long <longman@redhat.com> wrote:
>>> On 6/17/22 11:24, Eric Dumazet wrote:
>>>> On Fri, Jun 17, 2022 at 5:00 PM Waiman Long <longman@redhat.com> wrote:
>>>>> On 6/17/22 10:57, Shakeel Butt wrote:
>>>>>> On Fri, Jun 17, 2022 at 7:43 AM Waiman Long <longman@redhat.com> wrote:
>>>>>>> On 6/17/22 08:07, Peter Zijlstra wrote:
>>>>>>>> On Fri, Jun 17, 2022 at 02:10:39AM -0700, Eric Dumazet wrote:
>>>>>>>>> --- a/kernel/locking/qrwlock.c
>>>>>>>>> +++ b/kernel/locking/qrwlock.c
>>>>>>>>> @@ -23,16 +23,6 @@ void queued_read_lock_slowpath(struct qrwlock *lock)
>>>>>>>>> /*
>>>>>>>>> * Readers come here when they cannot get the lock without waiting
>>>>>>>>> */
>>>>>>>>> - if (unlikely(in_interrupt())) {
>>>>>>>>> - /*
>>>>>>>>> - * Readers in interrupt context will get the lock immediately
>>>>>>>>> - * if the writer is just waiting (not holding the lock yet),
>>>>>>>>> - * so spin with ACQUIRE semantics until the lock is available
>>>>>>>>> - * without waiting in the queue.
>>>>>>>>> - */
>>>>>>>>> - atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
>>>>>>>>> - return;
>>>>>>>>> - }
>>>>>>>>> atomic_sub(_QR_BIAS, &lock->cnts);
>>>>>>>>>
>>>>>>>>> trace_contention_begin(lock, LCB_F_SPIN | LCB_F_READ);
>>>>>>>> This is known to break tasklist_lock.
>>>>>>>>
>>>>>>> We certainly can't break the current usage of tasklist_lock.
>>>>>>>
>>>>>>> I am aware of this problem with networking code and is thinking about
>>>>>>> either relaxing the check to exclude softirq or provide a
>>>>>>> read_lock_unfair() variant for networking use.
>>>>>> read_lock_unfair() for networking use or tasklist_lock use?
>>>>> I mean to say read_lock_fair(), but it could also be the other way
>>>>> around. Thanks for spotting that.
>>>>>
>>>> If only tasklist_lock is problematic and needs the unfair variant,
>>>> then changing a few read_lock() for tasklist_lock will be less
>>>> invasive than ~1000 read_lock() elsewhere....
>>> After a second thought, I think the right way is to introduce a fair
>>> variant, if needed. If an arch isn't using qrwlock, the native rwlock
>>> implementation will be unfair. In that sense, unfair rwlock is the
>>> default. We will only need to change the relevant network read_lock()
>>> calls to use the fair variant which will still be unfair if qrwlock
>>> isn't used. We are not going to touch other read_lock call that don't
>>> care about fair or unfair.
>>>
>> Hmm... backporting this kind of invasive change to stable kernels will
>> be a daunting task.
>>
>> Were rwlocks always unfair, and we have been lucky ?
>>
> Yes, rwlocks was always unfair and it always had this kind of soft
> lockup problem and scalability problem because of cacheline bouncing.
> That was reason of creating qrwlock which can at least provide a fair
> rwlock at task context. Now we have systems with more and more cpus and
> that is the reason why you are seeing it all over again with the
> networking code.
No fair play without paying the price.
If writer wants to play fair game with readers, it has to wait in queue
for at least a tick then forces readers to go the slow path by setting
_QW_WAITING.
Only for thoughts now.
Hillf
+++ b/kernel/locking/qrwlock.c
@@ -22,16 +22,20 @@ void queued_read_lock_slowpath(struct qr
/*
* Readers come here when they cannot get the lock without waiting
*/
+ if (_QW_WAITING & atomic_read(&lock->cnts))
+ goto queue;
+
if (unlikely(in_interrupt())) {
/*
* Readers in interrupt context will get the lock immediately
- * if the writer is just waiting (not holding the lock yet),
+ * if the writer is not waiting long enough,
* so spin with ACQUIRE semantics until the lock is available
* without waiting in the queue.
*/
atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
return;
}
+queue:
atomic_sub(_QR_BIAS, &lock->cnts);
/*
@@ -60,6 +64,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath)
*/
void queued_write_lock_slowpath(struct qrwlock *lock)
{
+ unsigned long start;
+ int qw;
int cnts;
/* Put the writer into the wait queue */
@@ -70,12 +76,20 @@ void queued_write_lock_slowpath(struct q
atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))
goto unlock;
- /* Set the waiting flag to notify readers that a writer is pending */
- atomic_or(_QW_WAITING, &lock->cnts);
-
+ start = jiffies;
+ qw = 0;
/* When no more readers or writers, set the locked flag */
do {
- cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+ if (qw == 0 && start + 2 < jiffies) {
+ qw = _QW_WAITING;
+ /*
+ * Set the waiting flag to notify readers that a writer is pending
+ * only after waiting long enough - that is the price writer pays
+ * for fairness
+ */
+ atomic_or(_QW_WAITING, &lock->cnts);
+ }
+ cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == qw);
} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
unlock:
arch_spin_unlock(&lock->wait_lock);
parent reply other threads:[~2022-06-18 8:43 UTC|newest]
Thread overview: expand[flat|nested] mbox.gz Atom feed
[parent not found: <980387bb-f46d-e9ab-96b0-7293df08c447@redhat.com>]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220618084322.81-1-hdanton@sina.com \
--to=hdanton@sina.com \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=longman@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox