* [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
@ 2024-09-27 20:33 Mathieu Desnoyers
2024-09-27 23:05 ` Mathieu Desnoyers
2024-09-27 23:10 ` Boqun Feng
0 siblings, 2 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2024-09-27 20:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-kernel, Mathieu Desnoyers, Boqun Feng, Greg Kroah-Hartman,
Sebastian Andrzej Siewior, Paul E. McKenney, Will Deacon,
Peter Zijlstra, Alan Stern, John Stultz, Neeraj Upadhyay,
Frederic Weisbecker, Joel Fernandes, Josh Triplett,
Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
Vlastimil Babka, maged.michael, Mateusz Guzik, rcu, linux-mm,
lkmm
Compiler CSE and SSA GVN optimizations can cause the address dependency
of addresses returned by rcu_dereference to be lost when comparing those
pointers with either constants or previously loaded pointers.
Introduce ptr_eq() to compare two addresses while preserving the address
dependencies for later use of the address. It should be used when
comparing an address returned by rcu_dereference().
This is needed to prevent the compiler CSE and SSA GVN optimizations
from replacing the registers holding @a or @b based on their
equality, which does not preserve address dependencies and allows the
following misordering speculations:
- If @b is a constant, the compiler can issue the loads which depend
on @a before loading @a.
- If @b is a register populated by a prior load, weakly-ordered
CPUs can speculate loads which depend on @a before loading @a.
The same logic applies with @a and @b swapped.
The compiler barrier() is ineffective at fixing this issue.
It does not prevent the compiler CSE from losing the address dependency:
int fct_2_volatile_barriers(void)
{
int *a, *b;
do {
a = READ_ONCE(p);
asm volatile ("" : : : "memory");
b = READ_ONCE(p);
} while (a != b);
asm volatile ("" : : : "memory"); <----- barrier()
return *b;
}
With gcc 14.2 (arm64):
fct_2_volatile_barriers:
adrp x0, .LANCHOR0
add x0, x0, :lo12:.LANCHOR0
.L2:
ldr x1, [x0] <------ x1 populated by first load.
ldr x2, [x0]
cmp x1, x2
bne .L2
ldr w0, [x1] <------ x1 is used for access which should depend on b.
ret
On weakly-ordered architectures, this lets CPU speculation use the
result from the first load to speculate "ldr w0, [x1]" before
"ldr x2, [x0]".
Based on the RCU documentation, the control dependency does not prevent
the CPU from speculating loads.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: John Stultz <jstultz@google.com>
Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: maged.michael@gmail.com
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: rcu@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: lkmm@lists.linux.dev
---
include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 2df665fa2964..f26705c267e8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
__asm__ ("" : "=r" (var) : "0" (var))
#endif
+/*
+ * Compare two addresses while preserving the address dependencies for
+ * later use of the address. It should be used when comparing an address
+ * returned by rcu_dereference().
+ *
+ * This is needed to prevent the compiler CSE and SSA GVN optimizations
+ * from replacing the registers holding @a or @b based on their
+ * equality, which does not preserve address dependencies and allows the
+ * following misordering speculations:
+ *
+ * - If @b is a constant, the compiler can issue the loads which depend
+ * on @a before loading @a.
+ * - If @b is a register populated by a prior load, weakly-ordered
+ * CPUs can speculate loads which depend on @a before loading @a.
+ *
+ * The same logic applies with @a and @b swapped.
+ *
+ * Return value: true if pointers are equal, false otherwise.
+ *
+ * The compiler barrier() is ineffective at fixing this issue. It does
+ * not prevent the compiler CSE from losing the address dependency:
+ *
+ * int fct_2_volatile_barriers(void)
+ * {
+ * int *a, *b;
+ *
+ * do {
+ * a = READ_ONCE(p);
+ * asm volatile ("" : : : "memory");
+ * b = READ_ONCE(p);
+ * } while (a != b);
+ * asm volatile ("" : : : "memory"); <-- barrier()
+ * return *b;
+ * }
+ *
+ * With gcc 14.2 (arm64):
+ *
+ * fct_2_volatile_barriers:
+ * adrp x0, .LANCHOR0
+ * add x0, x0, :lo12:.LANCHOR0
+ * .L2:
+ * ldr x1, [x0] <-- x1 populated by first load.
+ * ldr x2, [x0]
+ * cmp x1, x2
+ * bne .L2
+ * ldr w0, [x1] <-- x1 is used for access which should depend on b.
+ * ret
+ *
+ * On weakly-ordered architectures, this lets CPU speculation use the
+ * result from the first load to speculate "ldr w0, [x1]" before
+ * "ldr x2, [x0]".
+ * Based on the RCU documentation, the control dependency does not
+ * prevent the CPU from speculating loads.
+ */
+static __always_inline
+int ptr_eq(const volatile void *a, const volatile void *b)
+{
+ OPTIMIZER_HIDE_VAR(a);
+ OPTIMIZER_HIDE_VAR(b);
+ return a == b;
+}
+
#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
/**
--
2.39.2
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
2024-09-27 20:33 [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency Mathieu Desnoyers
@ 2024-09-27 23:05 ` Mathieu Desnoyers
2024-09-27 23:20 ` Boqun Feng
2024-09-28 11:32 ` Paul E. McKenney
2024-09-27 23:10 ` Boqun Feng
1 sibling, 2 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2024-09-27 23:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-kernel, Boqun Feng, Greg Kroah-Hartman,
Sebastian Andrzej Siewior, Paul E. McKenney, Will Deacon,
Peter Zijlstra, Alan Stern, John Stultz, Neeraj Upadhyay,
Frederic Weisbecker, Joel Fernandes, Josh Triplett,
Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
Vlastimil Babka, maged.michael, Mateusz Guzik, rcu, linux-mm,
lkmm
On 2024-09-27 22:33, Mathieu Desnoyers wrote:
[...]
> ---
> include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 62 insertions(+)
>
I'm wondering if this really belongs in compiler.h, or if it's so
RCU/HP specific that it should be implemented in rcupdate.h ?
[... ]
> +static __always_inline
> +int ptr_eq(const volatile void *a, const volatile void *b)
And perhaps rename this to rcu_ptr_eq() ?
Thanks,
Mathieu
> +{
> + OPTIMIZER_HIDE_VAR(a);
> + OPTIMIZER_HIDE_VAR(b);
> + return a == b;
> +}
> +
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
2024-09-27 23:05 ` Mathieu Desnoyers
@ 2024-09-27 23:20 ` Boqun Feng
2024-09-28 11:32 ` Paul E. McKenney
1 sibling, 0 replies; 5+ messages in thread
From: Boqun Feng @ 2024-09-27 23:20 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Linus Torvalds, linux-kernel, Greg Kroah-Hartman,
Sebastian Andrzej Siewior, Paul E. McKenney, Will Deacon,
Peter Zijlstra, Alan Stern, John Stultz, Neeraj Upadhyay,
Frederic Weisbecker, Joel Fernandes, Josh Triplett,
Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
Vlastimil Babka, maged.michael, Mateusz Guzik, rcu, linux-mm,
lkmm
On Fri, Sep 27, 2024 at 07:05:53PM -0400, Mathieu Desnoyers wrote:
> On 2024-09-27 22:33, Mathieu Desnoyers wrote:
> [...]
>
> > ---
> > include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 62 insertions(+)
> >
>
> I'm wondering if this really belongs in compiler.h, or if it's so
> RCU/HP specific that it should be implemented in rcupdate.h ?
>
> [... ]
> > +static __always_inline
> > +int ptr_eq(const volatile void *a, const volatile void *b)
>
> And perhaps rename this to rcu_ptr_eq() ?
>
I think the current place and name is fine, yes RCU is the most
important address dependency user, but there are other users as well.
Regards,
Boqun
> Thanks,
>
> Mathieu
>
> > +{
> > + OPTIMIZER_HIDE_VAR(a);
> > + OPTIMIZER_HIDE_VAR(b);
> > + return a == b;
> > +}
> > +
>
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
2024-09-27 23:05 ` Mathieu Desnoyers
2024-09-27 23:20 ` Boqun Feng
@ 2024-09-28 11:32 ` Paul E. McKenney
1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2024-09-28 11:32 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Linus Torvalds, linux-kernel, Boqun Feng, Greg Kroah-Hartman,
Sebastian Andrzej Siewior, Will Deacon, Peter Zijlstra,
Alan Stern, John Stultz, Neeraj Upadhyay, Frederic Weisbecker,
Joel Fernandes, Josh Triplett, Uladzislau Rezki, Steven Rostedt,
Lai Jiangshan, Zqiang, Ingo Molnar, Waiman Long, Mark Rutland,
Thomas Gleixner, Vlastimil Babka, maged.michael, Mateusz Guzik,
rcu, linux-mm, lkmm
On Fri, Sep 27, 2024 at 07:05:53PM -0400, Mathieu Desnoyers wrote:
> On 2024-09-27 22:33, Mathieu Desnoyers wrote:
> [...]
>
> > ---
> > include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 62 insertions(+)
> >
>
> I'm wondering if this really belongs in compiler.h, or if it's so
> RCU/HP specific that it should be implemented in rcupdate.h ?
>
> [... ]
> > +static __always_inline
> > +int ptr_eq(const volatile void *a, const volatile void *b)
>
> And perhaps rename this to rcu_ptr_eq() ?
For either location or name:
Acked-by: Paul E. McKenney <paulmck@kernel.org>
There are non-RCU uses, but RCU is currently by far the most common.
Thanx, Paul
> Thanks,
>
> Mathieu
>
> > +{
> > + OPTIMIZER_HIDE_VAR(a);
> > + OPTIMIZER_HIDE_VAR(b);
> > + return a == b;
> > +}
> > +
>
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency
2024-09-27 20:33 [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency Mathieu Desnoyers
2024-09-27 23:05 ` Mathieu Desnoyers
@ 2024-09-27 23:10 ` Boqun Feng
1 sibling, 0 replies; 5+ messages in thread
From: Boqun Feng @ 2024-09-27 23:10 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Linus Torvalds, linux-kernel, Greg Kroah-Hartman,
Sebastian Andrzej Siewior, Paul E. McKenney, Will Deacon,
Peter Zijlstra, Alan Stern, John Stultz, Neeraj Upadhyay,
Frederic Weisbecker, Joel Fernandes, Josh Triplett,
Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
Vlastimil Babka, maged.michael, Mateusz Guzik, rcu, linux-mm,
lkmm, Gary Guo
[Cc Gary]
On Fri, Sep 27, 2024 at 04:33:34PM -0400, Mathieu Desnoyers wrote:
> Compiler CSE and SSA GVN optimizations can cause the address dependency
> of addresses returned by rcu_dereference to be lost when comparing those
> pointers with either constants or previously loaded pointers.
>
> Introduce ptr_eq() to compare two addresses while preserving the address
> dependencies for later use of the address. It should be used when
> comparing an address returned by rcu_dereference().
>
> This is needed to prevent the compiler CSE and SSA GVN optimizations
> from replacing the registers holding @a or @b based on their
> equality, which does not preserve address dependencies and allows the
> following misordering speculations:
>
> - If @b is a constant, the compiler can issue the loads which depend
> on @a before loading @a.
> - If @b is a register populated by a prior load, weakly-ordered
> CPUs can speculate loads which depend on @a before loading @a.
>
> The same logic applies with @a and @b swapped.
>
> The compiler barrier() is ineffective at fixing this issue.
> It does not prevent the compiler CSE from losing the address dependency:
>
> int fct_2_volatile_barriers(void)
> {
> int *a, *b;
>
> do {
> a = READ_ONCE(p);
> asm volatile ("" : : : "memory");
> b = READ_ONCE(p);
> } while (a != b);
> asm volatile ("" : : : "memory"); <----- barrier()
> return *b;
> }
>
> With gcc 14.2 (arm64):
>
> fct_2_volatile_barriers:
> adrp x0, .LANCHOR0
> add x0, x0, :lo12:.LANCHOR0
> .L2:
> ldr x1, [x0] <------ x1 populated by first load.
> ldr x2, [x0]
> cmp x1, x2
> bne .L2
> ldr w0, [x1] <------ x1 is used for access which should depend on b.
> ret
>
> On weakly-ordered architectures, this lets CPU speculation use the
> result from the first load to speculate "ldr w0, [x1]" before
> "ldr x2, [x0]".
> Based on the RCU documentation, the control dependency does not prevent
> the CPU from speculating loads.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Suggested-by: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: John Stultz <jstultz@google.com>
> Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Zqiang <qiang.zhang1211@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Waiman Long <longman@redhat.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: maged.michael@gmail.com
> Cc: Mateusz Guzik <mjguzik@gmail.com>
> Cc: rcu@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: lkmm@lists.linux.dev
> ---
> include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 62 insertions(+)
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 2df665fa2964..f26705c267e8 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> __asm__ ("" : "=r" (var) : "0" (var))
> #endif
>
> +/*
> + * Compare two addresses while preserving the address dependencies for
> + * later use of the address. It should be used when comparing an address
> + * returned by rcu_dereference().
> + *
> + * This is needed to prevent the compiler CSE and SSA GVN optimizations
> + * from replacing the registers holding @a or @b based on their
> + * equality, which does not preserve address dependencies and allows the
> + * following misordering speculations:
> + *
> + * - If @b is a constant, the compiler can issue the loads which depend
> + * on @a before loading @a.
> + * - If @b is a register populated by a prior load, weakly-ordered
> + * CPUs can speculate loads which depend on @a before loading @a.
> + *
> + * The same logic applies with @a and @b swapped.
> + *
> + * Return value: true if pointers are equal, false otherwise.
> + *
> + * The compiler barrier() is ineffective at fixing this issue. It does
> + * not prevent the compiler CSE from losing the address dependency:
> + *
> + * int fct_2_volatile_barriers(void)
> + * {
> + * int *a, *b;
> + *
> + * do {
> + * a = READ_ONCE(p);
> + * asm volatile ("" : : : "memory");
> + * b = READ_ONCE(p);
> + * } while (a != b);
> + * asm volatile ("" : : : "memory"); <-- barrier()
> + * return *b;
> + * }
> + *
> + * With gcc 14.2 (arm64):
> + *
> + * fct_2_volatile_barriers:
> + * adrp x0, .LANCHOR0
> + * add x0, x0, :lo12:.LANCHOR0
> + * .L2:
> + * ldr x1, [x0] <-- x1 populated by first load.
> + * ldr x2, [x0]
> + * cmp x1, x2
> + * bne .L2
> + * ldr w0, [x1] <-- x1 is used for access which should depend on b.
> + * ret
> + *
> + * On weakly-ordered architectures, this lets CPU speculation use the
> + * result from the first load to speculate "ldr w0, [x1]" before
> + * "ldr x2, [x0]".
> + * Based on the RCU documentation, the control dependency does not
> + * prevent the CPU from speculating loads.
> + */
> +static __always_inline
> +int ptr_eq(const volatile void *a, const volatile void *b)
> +{
> + OPTIMIZER_HIDE_VAR(a);
> + OPTIMIZER_HIDE_VAR(b);
> + return a == b;
> +}
> +
This is better than what I proposed, thank you!
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Could you also make some documentation changes at the "Be very careful
about comparing pointers obtained from..." paragraph in
Documentation/RCU/rcu_dereference.rst? Since 'ptr_eq' is a good tool in
those cases mentioned there. Thanks.
Regards,
Boqun
> #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
>
> /**
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-09-28 11:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-27 20:33 [RFC PATCH] compiler.h: Introduce ptr_eq() to preserve address dependency Mathieu Desnoyers
2024-09-27 23:05 ` Mathieu Desnoyers
2024-09-27 23:20 ` Boqun Feng
2024-09-28 11:32 ` Paul E. McKenney
2024-09-27 23:10 ` Boqun Feng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox