linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Boqun Feng <boqun.feng@gmail.com>
Cc: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>,
	linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	linux-mm@kvack.org, lkmm@lists.linux.dev,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Josh Triplett <josh@joshtriplett.org>,
	"Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	rostedt <rostedt@goodmis.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang1211@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
	Waiman Long <longman@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Kent Overstreet <kent.overstreet@gmail.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	maged.michael@gmail.com,
	Neeraj Upadhyay <neeraj.upadhyay@amd.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers
Date: Fri, 27 Sep 2024 13:15:19 -0400	[thread overview]
Message-ID: <7e1c8a5e-c110-414c-8fb2-022eacc2bd4a@efficios.com> (raw)
In-Reply-To: <CAHk-=wgQyXOt_HjDZHNqWMmyvv74xLAcMw88grfp4HkKoS2vLw@mail.gmail.com>

On 2024-09-27 18:44, Linus Torvalds wrote:
> On Thu, 26 Sept 2024 at 18:38, Boqun Feng <boqun.feng@gmail.com> wrote:
>>
>> Note that ADDRESS_EQ() only hide first parameter, so this should be ADDRESS_EQ(b, a).
> 
> Yeah, please stop making things unnecessarily complicated.
> 
> Just use a barrier(). Please. Stop these stupid games until you can
> show why it matters.

The barrier() is ineffective at fixing the issue.
It does not prevent the compiler CSE from losing the
address dependency:

int fct_2_volatile_barriers(void)
{
     int *a, *b;

     do {
         a = READ_ONCE(p);
         asm volatile ("" : : : "memory");
         b = READ_ONCE(p);
     } while (a != b);
     asm volatile ("" : : : "memory");  <----- where you would like your barrier
     return *b;
}

With gcc 14.2 (arm64):

fct_2_volatile_barriers:
         adrp    x0, .LANCHOR0
         add     x0, x0, :lo12:.LANCHOR0
.L2:
         ldr     x1, [x0]    <------ x1 populated by first load.
         ldr     x2, [x0]
         cmp     x1, x2
         bne     .L2
         ldr     w0, [x1]    <------ x1 is used for access which should depend on b.
         ret

On weakly-ordered architectures, this lets CPU speculation
use the result from the first load to speculate
"ldr w0, [x1]" before "ldr x2, [x0]".
Based on the RCU documentation, the control dependency
does not prevent the CPU from speculating loads.

There are a few ways to fix this:

- Compile everything with -fno-cse-follow-jumps.
- Make the compiler unaware of the relationship between the
   address equality and address-dependent use of b. This can
   be done either using ADDRESS_EQ() or asm goto.

I prefer ADDRESS_EQ() because it is arch-agnostic. I don't
care that it adds one more register movement instruction.

> And by "why it matters" I mean "major difference in code generation",
> not some "it uses one more register and has to spill" kind of small
> detail.

Why it matters is because gcc generates code that does not
preserve address dependency of the second READ_ONCE().

> At this point, I'm not even convinced the whole hazard pointer
> approach makes sense. And you're not helping by making it more
> complicated than it needs to be.

I'm preparing a small series that aims to show how a minimal
hazard pointer implementation can help improve common scenarios:

- Guarantee object existence on pointer dereference to increment
   a reference count:
   - replace locking used for that purpose in drivers (e.g. usb),
   - replace RCU + inc_not_zero pattern,
- rtmutex: I suspect we can improve situations where locks need
   to be taken in reverse dependency chain order by guaranteeing
   existence of first and second locks in traversal order,
   allowing them to be locked in the correct order (which is
   reverse from traversal order) rather than try-lock+retry on
   nested lock. This can be done with hazard pointers without
   requiring object reclaim to be delayed by an RCU grace period.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



  reply	other threads:[~2024-09-27 17:17 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-17 14:33 [RFC PATCH 0/4] Add hazard pointers to kernel Boqun Feng
2024-09-17 14:33 ` [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers Boqun Feng
2024-09-18  8:27   ` Mathieu Desnoyers
2024-09-18 15:17   ` Alan Huang
2024-09-19  6:56     ` Boqun Feng
2024-09-19 18:07       ` Jonas Oberhauser
2024-09-19  0:12   ` Jann Horn
2024-09-19 20:30     ` Jonas Oberhauser
2024-09-20  7:43       ` Jonas Oberhauser
2024-09-19  6:39   ` Lai Jiangshan
2024-09-19  7:10     ` Boqun Feng
2024-09-19 12:33       ` Alan Huang
2024-09-19 13:57       ` Alan Huang
2024-09-19 18:58         ` Boqun Feng
2024-09-19 19:53           ` Alan Huang
2024-09-19 16:10       ` Alan Huang
2024-09-19 14:00   ` Jonas Oberhauser
2024-09-20  7:41   ` Jonas Oberhauser
2024-09-25 10:02     ` Boqun Feng
2024-09-25 10:11       ` Jonas Oberhauser
2024-09-25 10:45         ` Boqun Feng
2024-09-25 11:59           ` Mathieu Desnoyers
2024-09-25 12:16             ` Boqun Feng
2024-09-25 12:47               ` Mathieu Desnoyers
2024-09-25 13:10                 ` Mathieu Desnoyers
2024-09-25 13:20                   ` Mathieu Desnoyers
2024-09-26  6:16                     ` Mathieu Desnoyers
2024-09-26 15:53                       ` Jonas Oberhauser
2024-09-26 16:12                         ` Linus Torvalds
2024-09-26 16:40                           ` Jonas Oberhauser
2024-09-26 16:54                             ` Linus Torvalds
2024-09-27  0:01                               ` Boqun Feng
2024-09-27  1:30                                 ` Mathieu Desnoyers
2024-09-27  1:37                                   ` Boqun Feng
2024-09-27  4:28                                     ` Boqun Feng
2024-09-27 10:59                                       ` Mathieu Desnoyers
2024-09-27 14:43                                         ` Mathieu Desnoyers
2024-09-27 15:22                                           ` Mathieu Desnoyers
2024-09-27 16:06                                       ` Alan Huang
2024-09-27 16:44                                     ` Linus Torvalds
2024-09-27 17:15                                       ` Mathieu Desnoyers [this message]
2024-09-27 17:23                                         ` Linus Torvalds
2024-09-27 17:51                                           ` Mathieu Desnoyers
2024-09-27 18:13                                             ` Linus Torvalds
2024-09-27 19:12                                               ` Jonas Oberhauser
2024-09-27 19:28                                                 ` Linus Torvalds
2024-09-27 20:24                                                   ` Linus Torvalds
2024-09-27 20:02                                               ` Mathieu Desnoyers
2024-09-27  1:20                           ` Mathieu Desnoyers
2024-09-27  4:38                             ` Boqun Feng
2024-09-27 19:23                               ` Jonas Oberhauser
2024-09-27 20:10                                 ` Mathieu Desnoyers
2024-09-27 22:18                                   ` Jonas Oberhauser
2024-09-28 22:10                                     ` Alan Huang
2024-09-28 23:12                                       ` Alan Huang
2024-09-25 12:19             ` Jonas Oberhauser
2024-09-17 14:34 ` [RFC PATCH 2/4] refscale: Add benchmarks for hazptr Boqun Feng
2024-09-17 14:34 ` [RFC PATCH 3/4] refscale: Add benchmarks for percpu_ref Boqun Feng
2024-09-17 14:34 ` [RFC PATCH 4/4] WIP: hazptr: Add hazptr test sample Boqun Feng
2024-09-18  7:18 ` [RFC PATCH 0/4] Add hazard pointers to kernel Linus Torvalds
2024-09-18 22:44   ` Neeraj Upadhyay
2024-09-19  6:46     ` Linus Torvalds
2024-09-20  5:00       ` Neeraj Upadhyay
2024-09-19 14:30     ` Mateusz Guzik
2024-09-19 14:14   ` Christoph Hellwig
2024-09-19 14:21     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7e1c8a5e-c110-414c-8fb2-022eacc2bd4a@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=jonas.oberhauser@huaweicloud.com \
    --cc=josh@joshtriplett.org \
    --cc=kent.overstreet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkmm@lists.linux.dev \
    --cc=longman@redhat.com \
    --cc=maged.michael@gmail.com \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=neeraj.upadhyay@amd.com \
    --cc=neeraj.upadhyay@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qiang.zhang1211@gmail.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox