Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers

From: Joel Fernandes <joelagnelf@nvidia.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Frederic Weisbecker <frederic@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	John Stultz <jstultz@google.com>,
	Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang1211@gmail.com>,
	Ingo Molnar <mingo@redhat.com>, Waiman Long <longman@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>,
	maged.michael@gmail.com, Mateusz Guzik <mjguzik@gmail.com>,
	Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>,
	rcu@vger.kernel.org, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [RFC PATCH v4 3/4] hazptr: Implement Hazard Pointers
Date: Thu, 8 Jan 2026 18:34:49 -0500
Message-ID: <b045dc42-233f-4bb9-8619-6a688c05b7ae@nvidia.com>
In-Reply-To: <fc0eaac9-930f-4692-b913-80e48dcdd301@efficios.com>



On 1/8/2026 11:45 AM, Mathieu Desnoyers wrote:
> On 2026-01-08 11:34, Frederic Weisbecker wrote:
>> On Fri, Dec 19, 2025 at 09:22:19AM -0500, Mathieu Desnoyers wrote:
>>> On 2025-12-18 19:43, Boqun Feng wrote:
>>>> On Thu, Dec 18, 2025 at 12:35:18PM -0500, Mathieu Desnoyers wrote:
>>>> [...]
>>>>>> Could you utilize this[1] to see a
>>>>>> comparison of the reader-side performance against RCU/SRCU?
>>>>>
>>>>> Good point! Let's see.
>>>>>
>>>>> On an AMD 2x EPYC 9654 96-Core Processor with 192 cores,
>>>>> hyperthreading disabled,
>>>>> CONFIG_PREEMPT=y,
>>>>> CONFIG_PREEMPT_RCU=y,
>>>>> CONFIG_PREEMPT_HAZPTR=y.
>>>>>
>>>>> scale_type                 ns
>>>>> -----------------------------
>>>>> hazptr-smp-mb             13.1   <- this implementation
>>>>> hazptr-barrier            11.5   <- replace smp_mb() on acquire with
>>>>>                                     barrier(), requires IPIs on synchronize.
>>>>> hazptr-smp-mb-hlist       12.7   <- replace per-task hp context and per-cpu
>>>>>                                     overflow lists by hlist.
>>>>> rcu                       17.0
>>>>
>>>> Hmm.. now looking back, how is it possible that hazptr is faster than
>>>> RCU on the reader-side? Because a grace period was happening and
>>>> triggered rcu_read_unlock_special()? This is actually more interesting.
>>>
>>> So I could be entirely misreading the code, but we have:
>>>
>>> rcu_flavor_sched_clock_irq():
>>> [...]
>>>          /* If GP is oldish, ask for help from rcu_read_unlock_special(). */
>>>          if (rcu_preempt_depth() > 0 &&
>>>              __this_cpu_read(rcu_data.core_needs_qs) &&
>>>              __this_cpu_read(rcu_data.cpu_no_qs.b.norm) &&
>>>              !t->rcu_read_unlock_special.b.need_qs &&
>>>              time_after(jiffies, rcu_state.gp_start + HZ))
>>>                  t->rcu_read_unlock_special.b.need_qs = true;
>>>
>>> which means we set need_qs = true as a result of observing
>>> cpu_no_qs.b.norm == true.
>>>
>>> This is sufficient to trigger calls (plural) to rcu_read_unlock_special()
>>> from __rcu_read_unlock().
>>>
>>> But then if we look at rcu_preempt_deferred_qs_irqrestore()
>>> which we would expect to clear the rcu_read_unlock_special.b.need_qs
>>> state, we have this:
>>>
>>>          special = t->rcu_read_unlock_special;
>>>          if (!special.s && !rdp->cpu_no_qs.b.exp) {
>>>                  local_irq_restore(flags);
>>>                  return;
>>>          }
>>>          t->rcu_read_unlock_special.s = 0;
>>>
>>> which skips over clearing the state unless there is an expedited
>>> grace period required.
>>>
>>> So unless I'm missing something, we should _also_ clear that state
>>> when it's invoked after rcu_flavor_sched_clock_irq(), so that subsequent
>>> __rcu_read_unlock() calls won't all call into rcu_read_unlock_special().
>>>
>>> I'm adding a big warning about sleep deprivation and possibly
>>> misunderstanding the whole thing. What am I missing?
>>
>> As far as I can tell, this skips clearing the state if the state is
>> already cleared. Or am I even more sleep deprived than you? :o)
> 
> No, you are right. The (!x && !y) pattern confused me, but the
> code is correct. Good thing I've put a warning about sleep
> deprivation. ;-)
> 
> Sorry for the noise.

Right, I think this can happen when one rcu_flavor_sched_clock_irq() has set
special.b.need_qs, and a later rcu_flavor_sched_clock_irq() then races with the
reader's rcu_read_unlock(), interrupting it before rcu_read_unlock_special()
can disable interrupts.

rcu_read_unlock()
 -> rcu_read_lock_nesting--;
  -> nesting == 0 and special is set.

   <interrupted by sched clock>
      -> rcu_flavor_sched_clock_irq()
         -> rcu_preempt_deferred_qs_irqrestore
            -> clear t->rcu_read_unlock_special.s
   <interrupt returned>

     -> rcu_read_unlock_special()
       -> local_irq_save(flags);  // too late
          -> rcu_preempt_deferred_qs_irqrestore
             -> Early return.

thanks,

 - Joel
