From: Thomas Gleixner <tglx@linutronix.de>
To: Christoph Lameter via B4 Relay, Catalin Marinas, Will Deacon,
 Peter Zijlstra, Ingo Molnar, Waiman Long, Boqun Feng
Cc: Linus Torvalds, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,
 "Christoph Lameter (Ampere)"
Subject: Re: [PATCH v2] Avoid memory barrier in read_seqcount() through load acquire
In-Reply-To: <20240819-seq_optimize-v2-1-9d0da82b022f@gentwo.org>
References: <20240819-seq_optimize-v2-1-9d0da82b022f@gentwo.org>
Date: Fri, 23 Aug 2024 23:05:30 +0200
Message-ID: <87ttfbeyqt.ffs@tglx>

On Mon, Aug 19 2024 at 11:30, Christoph Lameter via B4 Relay wrote:
> @@ -293,6 +321,18 @@ SEQCOUNT_LOCKNAME(mutex, struct mutex, true, mutex)
>   *
>   * Return: count to be passed to read_seqcount_retry()
>   */
> +#ifdef CONFIG_ARCH_HAS_ACQUIRE_RELEASE
> +#define raw_read_seqcount_begin(s)                              \
> +({                                                              \
> +        unsigned _seq;                                          \
> +                                                                \
> +        while ((_seq = seqprop_sequence_acquire(s)) & 1)        \
> +                cpu_relax();                                    \
> +                                                                \
> +        kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);            \
> +        _seq;                                                   \
> +})

So this covers only raw_read_seqcount_begin(), but not raw_read_seqcount(),
which has the same smp_rmb() inside.

All of this can be done without extra copies of the counter accessors.
Uncompiled patch below.
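For the ordering argument behind that: a single load-acquire of the
sequence count can subsume the smp_rmb(), because it orders the count
load before all subsequent loads of the protected data. Roughly, in
C11 atomics terms (an illustrative stand-alone sketch, not the kernel
code; 'seq' is a made-up stand-in for seqcount_t::sequence):

	#include <stdatomic.h>

	static _Atomic unsigned int seq;

	/* Status quo: relaxed load of the count, then a read barrier
	 * (roughly what smp_rmb() provides) before the protected data
	 * is loaded. */
	static unsigned int read_begin_rmb(void)
	{
		unsigned int s = atomic_load_explicit(&seq, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		return s;
	}

	/* With cheap load-acquire (CONFIG_ARCH_HAS_ACQUIRE_RELEASE):
	 * one acquire load provides the required ordering and no
	 * separate barrier is needed. */
	static unsigned int read_begin_acquire(void)
	{
		return atomic_load_explicit(&seq, memory_order_acquire);
	}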
It's a little larger than I initially wanted, but I had to keep the raw
READ_ONCE() for __read_seqcount_begin() to avoid inflicting the
smp_load_acquire() on its only usage site in the dcache code.

The acquire conditional in __seqprop_load_sequence() is optimized out by
the compiler as all of this is macro/__always_inline.

Thanks,

        tglx
---
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -132,6 +132,14 @@ static inline void seqcount_lockdep_read
 #define seqcount_rwlock_init(s, lock)   seqcount_LOCKNAME_init(s, lock, rwlock)
 #define seqcount_mutex_init(s, lock)    seqcount_LOCKNAME_init(s, lock, mutex)
 
+static __always_inline unsigned __seqprop_load_sequence(const seqcount_t *s, bool acquire)
+{
+        if (acquire && IS_ENABLED(CONFIG_ARCH_HAS_ACQUIRE_RELEASE))
+                return smp_load_acquire(&s->sequence);
+        else
+                return READ_ONCE(s->sequence);
+}
+
 /*
  * SEQCOUNT_LOCKNAME()  - Instantiate seqcount_LOCKNAME_t and helpers
  * seqprop_LOCKNAME_*() - Property accessors for seqcount_LOCKNAME_t
@@ -155,9 +163,10 @@ static __always_inline const seqcount_t
 }                                                                       \
                                                                         \
 static __always_inline unsigned                                         \
-__seqprop_##lockname##_sequence(const seqcount_##lockname##_t *s)       \
+__seqprop_##lockname##_sequence(const seqcount_##lockname##_t *s,       \
+                                bool acquire)                           \
 {                                                                       \
-        unsigned seq = READ_ONCE(s->seqcount.sequence);                 \
+        unsigned seq = __seqprop_load_sequence(&s->seqcount, acquire);  \
                                                                         \
         if (!IS_ENABLED(CONFIG_PREEMPT_RT))                             \
                 return seq;                                             \
@@ -170,7 +179,7 @@ static __always_inline unsigned                          \
                  * Re-read the sequence counter since the (possibly     \
                  * preempted) writer made progress.                     \
                  */                                                     \
-                seq = READ_ONCE(s->seqcount.sequence);                  \
+                seq = __seqprop_load_sequence(&s->seqcount, acquire);   \
         }                                                               \
                                                                         \
         return seq;                                                     \
@@ -206,9 +215,9 @@ static inline const seqcount_t *__seqpro
         return s;
 }
 
-static inline unsigned __seqprop_sequence(const seqcount_t *s)
+static inline unsigned __seqprop_sequence(const seqcount_t *s, bool acquire)
 {
-        return READ_ONCE(s->sequence);
+        return __seqprop_load_sequence(s, acquire);
 }
 
 static inline bool __seqprop_preemptible(const seqcount_t *s)
@@ -258,29 +267,23 @@ SEQCOUNT_LOCKNAME(mutex,        struct m
 #define seqprop_ptr(s)                  __seqprop(s, ptr)(s)
 #define seqprop_const_ptr(s)            __seqprop(s, const_ptr)(s)
-#define seqprop_sequence(s)             __seqprop(s, sequence)(s)
+#define seqprop_sequence(s, a)          __seqprop(s, sequence)(s, a)
 #define seqprop_preemptible(s)          __seqprop(s, preemptible)(s)
 #define seqprop_assert(s)               __seqprop(s, assert)(s)
 
 /**
- * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
- * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
- *
- * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
- * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
- * provided before actually loading any of the variables that are to be
- * protected in this critical section.
- *
- * Use carefully, only in critical code, and comment how the barrier is
- * provided.
+ * read_seqcount_begin_cond_acquire() - begin a seqcount_t read section
+ * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
+ * @acquire: If true, the read of the sequence count uses smp_load_acquire()
+ *           if the architecture provides and enables it.
  *
  * Return: count to be passed to read_seqcount_retry()
  */
-#define __read_seqcount_begin(s)                                        \
+#define read_seqcount_begin_cond_acquire(s, acquire)                    \
 ({                                                                      \
         unsigned __seq;                                                 \
                                                                         \
-        while ((__seq = seqprop_sequence(s)) & 1)                       \
+        while ((__seq = seqprop_sequence(s, acquire)) & 1)              \
                 cpu_relax();                                            \
                                                                         \
         kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);                    \
@@ -288,6 +291,26 @@ SEQCOUNT_LOCKNAME(mutex,        struct m
 })
 
 /**
+ * __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
+ * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
+ *
+ * __read_seqcount_begin is like read_seqcount_begin, but it neither
+ * provides a smp_rmb() barrier nor does it use smp_load_acquire() on
+ * architectures which provide it.
+ *
+ * Callers should ensure that smp_rmb() or equivalent ordering is provided
+ * before actually loading any of the variables that are to be protected in
+ * this critical section.
+ *
+ * Use carefully, only in critical code, and comment how the barrier is
+ * provided.
+ *
+ * Return: count to be passed to read_seqcount_retry()
+ */
+#define __read_seqcount_begin(s)                                        \
+        read_seqcount_begin_cond_acquire(s, false)
+
+/**
  * raw_read_seqcount_begin() - begin a seqcount_t read section w/o lockdep
  * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
  *
@@ -295,9 +318,10 @@ SEQCOUNT_LOCKNAME(mutex,        struct m
  */
 #define raw_read_seqcount_begin(s)                                      \
 ({                                                                      \
-        unsigned _seq = __read_seqcount_begin(s);                       \
+        unsigned _seq = read_seqcount_begin_cond_acquire(s, true);      \
                                                                         \
-        smp_rmb();                                                      \
+        if (!IS_ENABLED(CONFIG_ARCH_HAS_ACQUIRE_RELEASE))               \
+                smp_rmb();                                              \
         _seq;                                                           \
 })
 
@@ -326,9 +350,10 @@ SEQCOUNT_LOCKNAME(mutex,        struct m
  */
 #define raw_read_seqcount(s)                                            \
 ({                                                                      \
-        unsigned __seq = seqprop_sequence(s);                           \
+        unsigned __seq = seqprop_sequence(s, true);                     \
                                                                         \
-        smp_rmb();                                                      \
+        if (!IS_ENABLED(CONFIG_ARCH_HAS_ACQUIRE_RELEASE))               \
+                smp_rmb();                                              \
         kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);                    \
         __seq;                                                          \
 })
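For completeness: call sites do not change at all. A typical reader
loop still looks like this (sketch only; 'my_seq' and 'data' are
made-up placeholders), and with the patch above the acquire load
replaces the smp_rmb() on CONFIG_ARCH_HAS_ACQUIRE_RELEASE
architectures without touching the caller:

	static seqcount_t my_seq;
	static unsigned int data;

	static unsigned int read_data(void)
	{
		unsigned int snap, val;

		do {
			snap = raw_read_seqcount_begin(&my_seq);
			/* Data loads are ordered after the count load,
			 * either by the acquire or by smp_rmb(). */
			val = data;
		} while (read_seqcount_retry(&my_seq, snap));

		return val;
	}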