From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B155DC6FD18 for ; Wed, 29 Mar 2023 02:01:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1DE5C6B0072; Tue, 28 Mar 2023 22:01:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1728B6B0074; Tue, 28 Mar 2023 22:01:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 02F166B0075; Tue, 28 Mar 2023 22:01:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id E5DA66B0072 for ; Tue, 28 Mar 2023 22:01:36 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id AA2551C5E37 for ; Wed, 29 Mar 2023 02:01:36 +0000 (UTC) X-FDA: 80620284192.14.7780905 Received: from mail3-166.sinamail.sina.com.cn (mail3-166.sinamail.sina.com.cn [202.108.3.166]) by imf10.hostedemail.com (Postfix) with ESMTP id 8C9BAC0015 for ; Wed, 29 Mar 2023 02:01:33 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf10.hostedemail.com: domain of hdanton@sina.com designates 202.108.3.166 as permitted sender) smtp.mailfrom=hdanton@sina.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680055295; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MgUEk5Y7tk/co2BLHh9mh2C+aqOeJoHPuZUnvWpoTyY=; b=yTHT3Bo/8xFAHinlGURV+ApLR5Cl3+qXRw1lga59JJ7lCgLT7KWUeEP3MRQGx23dA0Yx0E wx6PiTW5tkPDX1dY8qXIYjJ+mBvDGBqvEwQQ1NVXzVewGkuLgtEnP6hpIMMVla08V+wIQy b9wyUmwwXQ3DHGyrj+N6VXRVzK+uzwQ= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf10.hostedemail.com: domain of hdanton@sina.com designates 202.108.3.166 as permitted sender) smtp.mailfrom=hdanton@sina.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680055295; a=rsa-sha256; cv=none; b=GM+zCOyrRfC/F00vtqsjwGNEPNclabYMDlkJn2ib96Cg7FCWhJHCXtKwJYj2rfVSu/EVJB I7htef2Ow1UBN2TqnchzS0gOemdIysKh2V7Mfx0VPVRbIEDGmQypQ8pdQFtIYt9/MLDega TiImc1lJ8f/FOIt9zSV/mZkV94sRau8= X-SMAIL-HELO: localhost.localdomain Received: from unknown (HELO localhost.localdomain)([114.249.61.130]) by sina.com (172.16.97.23) with ESMTP id 64239BD000008025; Wed, 29 Mar 2023 10:00:50 +0800 (CST) X-Sender: hdanton@sina.com X-Auth-ID: hdanton@sina.com X-SMAIL-MID: 41076854920336 From: Hillf Danton To: Peter Zijlstra Cc: Waiman Long , Ingo Molnar , Will Deacon , Boqun Feng , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 8/8] locking/rwsem: Restore old write lock slow path behavior Date: Wed, 29 Mar 2023 10:01:19 +0800 Message-Id: <20230329020119.2673-1-hdanton@sina.com> In-Reply-To: <20230328140259.GF4253@hirez.programming.kicks-ass.net> References: <20230327202413.1955856-1-longman@redhat.com> <20230327202413.1955856-9-longman@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 8C9BAC0015 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: yno9y14cfdbiwzaadi8eimkiahfdutsz X-HE-Tag: 1680055293-818990 X-HE-Meta: U2FsdGVkX18wTMKAaYJkaRaJZpT5LjzquO7g4Ph/3JvagFKKfX53Tk+oh9UWmuf/+wDFxmMA+dTM5N1k9l8rTzZPEMqu6cvvum9RHRIIngG205YOPhzi6+ypHlfJQs3f+eyqu8uuzjawkP/seo4wxJjv18P3++CF87pFccpqrqk3+2XmcRhhrGoMVJFp8TtoH1Rr0CCZp/UH4DYmAb6Cw/m7mGnhq0ltX8aWo9ke3DIYPYujrCbBwA2a35PKXoG5OPBSN03THLNYjwWnVfLwBSTI1pnHCYKWHc6ZyLErtZpSNZg3ZEvuUYd1Z/YEc02Ks0vOpDBUnEqEwCKiDVl041sL3vk3UA6NGmIxGIpneEufRKLeAiWoEU4cFMW2Tgf313El7rRf1i5TvDiy93vxDupo/F/9Xu3G+/krSiEJkSXExL9cSnZ6NIb+k9KBMqwvKQG6G8RlIOYm9QBa2uVRqPe2imp/mMsXRmvo0s1g88kCJHp+Lbnf7kNgcsfiPp8XDHQQz59yr7gUIORuXTtn+pAj9k5TUs9DqQZhb7UOnMuilc0x1IFod2vZBQ5LE7T/fvERNekmtrbqSLD/8DyMmXAbyMvspgyTZY29qNCbpTydAToe2yeTwpx96KIHm1xTQnRWamVqhEgbbZE5GhmBzuVzCzGlJWUFfV4a2T6IFJ64Fdos/krhqnqdSKjE+lSs0//1ljKktm1vbKIJ9WgB5qMEG66Zi8gZ+gdbx5IqFh93PoNd89gQla9ZrDGsMQOwpmOYewTjqKgIdzaPL8f1EIyRt/pvo5hyk9wC3Y/MGWvk/VOvorZp8rq2BqtTXze8dMKMvLWLH56QS3wBQBWqJcaNcD0EK8EhKP5VnTm2IZd48mHwng2cHLnV9WFUSQ4QeYErv0Prckur0+SiTxLiEhflLYFeufryK8aYs0qjOKuwN7ZUyPmUERJQDTCyOXYzTY2t5AdMIkE4/ES38hs 0+WpLuEy 0kujlMgTtMyYtV4QIXb7WAuDfOgBnv/TGo0/UN98SxHSNJKykMISMAQYlGOeGnmQgn4j6I0K9H4pHTfnMSWgYf3LZyP85EbK4GNsu5UdlyDZzGmG5IufJP8j2Vp24vGpIbIk0m/oX87cjMuQamndb8Qa73ZSyKk1ko4NtJD20EsCFbNKchkic5GbxgLUYgYGlchk6ctcp77dN4D02EWnPNpSfTRxpU2Wn8y09uhfAw+rJp0vEHxfArfViOB/Zf6ZzNqEcy9JK0JLJngoNN0kzeJVfirq/f5jjJXo+ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 28 Mar 2023 16:02:59 +0200 Peter Zijlstra > On Mon, Mar 27, 2023 at 04:24:13PM -0400, Waiman Long wrote: > > kernel/locking/rwsem.c | 34 +++++++++++++++++++++++++++++++--- > > 1 file changed, 31 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c > > index 7bd26e64827a..cf9dc1e250e0 100644 > > --- a/kernel/locking/rwsem.c > > +++ b/kernel/locking/rwsem.c > > @@ -426,6 +426,7 @@ rwsem_waiter_wake(struct rwsem_waiter *waiter, struct wake_q_head *wake_q) > > static inline bool rwsem_try_write_lock(struct rw_semaphore *sem, > > struct rwsem_waiter *waiter) > > { > > + bool first = (rwsem_first_waiter(sem) == waiter); > > long count, new; > > > > lockdep_assert_held(&sem->wait_lock); > > @@ -434,6 +435,9 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem, > > do { > > new = count; > > > > + if (!first && (count & (RWSEM_FLAG_HANDOFF | RWSEM_LOCK_MASK))) > > + return false; > > + > > if (count & RWSEM_LOCK_MASK) { > > /* > > * A waiter (first or not) can set the handoff bit > > I couldn't immediately make sense of the above, and I think there's case > where not-first would still want to set handoff you're missing. > > After a few detours, I ended up with the below; does that make sense or > did I just make a bigger mess? (entirely possible due to insufficient > sleep etc..). > > > --- a/kernel/locking/rwsem.c > +++ b/kernel/locking/rwsem.c > @@ -426,12 +426,27 @@ rwsem_waiter_wake(struct rwsem_waiter *w > static inline bool rwsem_try_write_lock(struct rw_semaphore *sem, > struct rwsem_waiter *waiter) > { > + bool first = (rwsem_first_waiter(sem) == waiter); > long count, new; > > lockdep_assert_held(&sem->wait_lock); > > count = atomic_long_read(&sem->count); > do { > + /* > + * first handoff > + * > + * 0 0 | take > + * 0 1 | not take > + * 1 0 | take > + * 1 1 | take > + * > + */ > + bool handoff = count & RWSEM_FLAG_HANDOFF; > + > + if (!first && handoff) > + return false; > + > new = count; > > if (count & RWSEM_LOCK_MASK) { > @@ -440,7 +455,7 @@ static inline bool rwsem_try_write_lock( > * if it is an RT task or wait in the wait queue > * for too long. > */ > - if ((count & RWSEM_FLAG_HANDOFF) || > + if (handoff || > (!rt_task(waiter->task) && > !time_after(jiffies, waiter->timeout))) > return false; > @@ -501,11 +516,19 @@ static void rwsem_writer_wake(struct rw_ > */ > list_del(&waiter->list); > atomic_long_set(&sem->owner, (long)waiter->task); > - > - } else if (!rwsem_try_write_lock(sem, waiter)) > + rwsem_waiter_wake(waiter, wake_q); > return; > + } > > - rwsem_waiter_wake(waiter, wake_q); > + /* > + * Mark writer at the front of the queue for wakeup. > + * > + * Until the task is actually awoken later by the caller, other writers > + * are able to steal it. Readers, on the other hand, will block as they > + * will notice the queued writer. > + */ > + wake_q_add(wake_q, waiter->task); > + lockevent_inc(rwsem_wake_writer); > } > > static void rwsem_reader_wake(struct rw_semaphore *sem, > @@ -1038,6 +1061,25 @@ rwsem_waiter_wait(struct rw_semaphore *s > /* Matches rwsem_waiter_wake()'s smp_store_release(). */ > break; > } > + if (!reader) { > + /* > + * If rwsem_writer_wake() did not take the lock, we must > + * rwsem_try_write_lock() here. > + */ > + raw_spin_lock_irq(&sem->wait_lock); > + if (waiter->task && rwsem_try_write_lock(sem, waiter)) { > + waiter->task = NULL; > + raw_spin_unlock_irq(&sem->wait_lock); > + break; > + } > + raw_spin_unlock_irq(&sem->wait_lock); > + > + if (waiter->handoff_set) > + rwsem_spin_on_owner(sem); No sense made without waiter being the first one if the comment below is correct. > + > + if (!smp_load_acquire(&waiter->task)) > + break; > + } > if (signal_pending_state(state, current)) { > raw_spin_lock_irq(&sem->wait_lock); > if (waiter->task) > https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/tree/kernel/locking/rwsem.c?h=locking/rwsem&id=91deffc826935#n459 /* * We have either acquired the lock with handoff bit cleared or set * the handoff bit. Only the first waiter can have its handoff_set * set here to enable optimistic spinning in slowpath loop. */ if (new & RWSEM_FLAG_HANDOFF) { waiter->handoff_set = true; lockevent_inc(rwsem_wlock_handoff); return false; }