From: Joel Fernandes <joelagnelf@nvidia.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Boqun Feng <boqun.feng@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Alan Stern <stern@rowland.harvard.edu>,
John Stultz <jstultz@google.com>,
Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Frederic Weisbecker <frederic@kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Uladzislau Rezki <urezki@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>,
Ingo Molnar <mingo@redhat.com>, Waiman Long <longman@redhat.com>,
Mark Rutland <mark.rutland@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
maged.michael@gmail.com, Mateusz Guzik <mjguzik@gmail.com>,
Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>,
rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers
Date: Fri, 19 Dec 2025 10:42:56 -0500
Message-ID: <6353feeb-c2ab-4ff6-9ea6-04ae5102641d@nvidia.com>
In-Reply-To: <edbe45b3-0f85-48d2-8d4f-784781a8dee5@efficios.com>
On 12/19/2025 10:14 AM, Mathieu Desnoyers wrote:
> On 2025-12-18 19:25, Boqun Feng wrote:
> [...]
>>> And if we pre-populate the 8 slots for each cpu, and thus force
>>> fallback to overflow list:
>>>
>>> hazptr-smp-mb-8-fail 67.1 ns
>>>
>>
>> Thank you! So involving locking seems to hurt performance more than
>> per-CPU/per-task operations. This may suggest that enabling
>> PREEMPT_HAZPTR by default has an acceptable performance.
>
> Indeed, I can fold it into the hazptr patch and remove the config
> option then.
>
>>
>>> So breaking up the iteration into pieces is not just to handle
>>> busy-waiting, but also to make sure we don't increase the
>>> system latency by holding a raw spinlock (taken with rq lock
>>> held) for more than the little time needed to iterate to the next
>>> node.
>>>
>>
>> I agree that it helps reduce the latency, but I feel like with a scan
>> thread in the picture (and we don't need to busy-wait), we should use
>> a forward-progress-guaranteed way in the updater side scan, which means
>> we may need to explore other solutions for the latency (e.g.
>> fine-grained locking hashlist for the overflow list) than the generation
>> counter.
>
> Guaranteeing forward progress of synchronize even with a steady stream
> of unrelated hazard pointers addition/removal to/from the overflow list
> is something we should aim for, with or without a scan thread.
>
> As is, my current generation scheme does not guarantee this. But we can
> use liburcu RCU grace period "parity" concept as inspiration [1] and
> introduce a two-lists scheme, and have hazptr_synchronize flip the
> current "list_add" head while it iterates on the other list. There would
> be one generation counter for each of the two lists.
>
> This would be protected by holding a global mutex across
> hazptr_synchronize. hazptr_synchronize would need to iterate
> on the two lists one after the other, carefully flipping the
> current "addition list" head between the two iterations.
>
> So the worst case that can happen in terms of retry caused by
> generation counter increments is if list entries are deleted while
> the list is being traversed by hazptr_synchronize. Because there
> are no possible concurrent additions to that list, the worst case
> is that the list becomes empty, which bounds the number of retries
> to the number of list elements.
>
> Thoughts ?
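
A minimal single-threaded C sketch of the two-list idea quoted above, for
illustration only. All names (hp_overflow, hp_add, hp_del, hp_scan,
hp_synchronize_scan, unlocked_window, hp_window_pop_scanned) are hypothetical
and not taken from the patch series. The global mutex, the raw spinlock and all
memory ordering are left out, and the scan only counts matching entries where a
real hazptr_synchronize would wait for them to go away. The unlocked_window
hook stands in for the point where the lock would be dropped between nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical overflow-list entry: one published hazard pointer. */
struct hp_node {
	struct hp_node *next;
	const void *addr;
};

/* Hypothetical two-list state, one generation counter per list. */
struct hp_overflow {
	struct hp_node *head[2];
	unsigned long gen[2];	/* bumped on every removal from that list */
	int add_idx;		/* list that currently receives additions */
	/* Test hook: models the window where the lock is not held. */
	void (*unlocked_window)(struct hp_overflow *o);
};

/* Readers only ever add to the current "addition list". */
static void hp_add(struct hp_overflow *o, struct hp_node *n, const void *addr)
{
	n->addr = addr;
	n->next = o->head[o->add_idx];
	o->head[o->add_idx] = n;
}

/* Removal can hit either list and invalidates an in-flight scan of it. */
static void hp_del(struct hp_overflow *o, struct hp_node *n)
{
	for (int i = 0; i < 2; i++) {
		for (struct hp_node **pp = &o->head[i]; *pp; pp = &(*pp)->next) {
			if (*pp == n) {
				*pp = n->next;
				o->gen[i]++;
				return;
			}
		}
	}
}

/* Count entries protecting addr; restart if the list changed under us. */
static int hp_scan(struct hp_overflow *o, int idx, const void *addr,
		   unsigned int *retries)
{
	unsigned long gen;
	int busy;

restart:
	gen = o->gen[idx];
	busy = 0;
	for (struct hp_node *n = o->head[idx]; n; n = n->next) {
		if (n->addr == addr)
			busy++;
		if (o->unlocked_window)
			o->unlocked_window(o);
		if (o->gen[idx] != gen) {
			(*retries)++;
			goto restart;
		}
	}
	return busy;
}

/* Scan the idle list, flip the addition list, then scan the other one. */
static int hp_synchronize_scan(struct hp_overflow *o, const void *addr,
			       unsigned int *retries)
{
	int busy;

	assert(o->add_idx == 0 || o->add_idx == 1);
	busy = hp_scan(o, !o->add_idx, addr, retries);
	o->add_idx = !o->add_idx;	/* additions now go to the scanned list */
	busy += hp_scan(o, !o->add_idx, addr, retries);
	return busy;
}

/* Sample hook: a "concurrent" release of the first entry being scanned. */
static void hp_window_pop_scanned(struct hp_overflow *o)
{
	struct hp_node *n = o->head[!o->add_idx];

	if (n)
		hp_del(o, n);
}
```

In this sketch the list being scanned is always the one at index !add_idx, so
it can only shrink during the scan. With three entries on it and
hp_window_pop_scanned installed as the hook, the scan restarts three times and
then sees an empty list, which is the one-retry-per-element bound described in
the quoted text.
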
IMHO the overflow case is "special" and should not happen often; if it does,
things are "bad" anyway. I am not sure this kind of complexity is worth it
unless we know HP forward progress is a real problem. Also, since HP acquires
are short-lived, how likely are we to stay stuck in a temporary shortage of
slots?

Perhaps the forward-progress problem should be rephrased as follows: if a
reader hits an overflow slot, it should probably be able to get a non-overflow
slot soon, even if hazard pointer slots are over-subscribed.
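
To make that concrete, here is a rough single-threaded C sketch of slot
acquisition with an overflow fallback. It is an assumption-laden illustration,
not the code from the series: hp_cpu_slots, hp_ctx, hp_acquire and hp_release
are hypothetical names, the overflow list itself is reduced to a counter, and
the barriers a real implementation needs are omitted. Only the slot count of 8
comes from the benchmark quoted earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HP_NR_SLOTS 8	/* per-CPU slot count used in the quoted benchmark */

/* Hypothetical per-CPU state: fixed slots plus an overflow hit counter. */
struct hp_cpu_slots {
	const void *slot[HP_NR_SLOTS];
	unsigned long overflow_hits;
};

/* Hypothetical reader context: slot index, or -1 when on the overflow path. */
struct hp_ctx {
	int idx;
};

/* Take the first free slot; report false if we had to fall back. */
static bool hp_acquire(struct hp_cpu_slots *c, const void *addr,
		       struct hp_ctx *ctx)
{
	for (int i = 0; i < HP_NR_SLOTS; i++) {
		if (!c->slot[i]) {
			c->slot[i] = addr;
			ctx->idx = i;
			return true;
		}
	}
	/* All slots busy: a real implementation would add to the overflow list. */
	ctx->idx = -1;
	c->overflow_hits++;
	return false;
}

/* Free the slot; overflow-list removal is not modelled in this sketch. */
static void hp_release(struct hp_cpu_slots *c, struct hp_ctx *ctx)
{
	if (ctx->idx >= 0)
		c->slot[ctx->idx] = NULL;
	ctx->idx = -1;
}
```

In this model a ninth reader on a full CPU takes the overflow path, but as soon
as any of the eight holders releases, the next acquire lands in a regular slot
again.
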
Thanks.