From: David Laight <David.Laight@ACULAB.COM>
To: 'Alan Stern' <stern@rowland.harvard.edu>
Cc: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
"Paul E. McKenney" <paulmck@kernel.org>,
Will Deacon <will@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Boqun Feng <boqun.feng@gmail.com>,
John Stultz <jstultz@google.com>,
Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>,
Frederic Weisbecker <frederic@kernel.org>,
"Joel Fernandes" <joel@joelfernandes.org>,
Josh Triplett <josh@joshtriplett.org>,
Uladzislau Rezki <urezki@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>,
Ingo Molnar <mingo@redhat.com>, Waiman Long <longman@redhat.com>,
"Mark Rutland" <mark.rutland@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
"maged.michael@gmail.com" <maged.michael@gmail.com>,
Mateusz Guzik <mjguzik@gmail.com>, Gary Guo <gary@garyguo.net>,
"rcu@vger.kernel.org" <rcu@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"lkmm@lists.linux.dev" <lkmm@lists.linux.dev>
Subject: RE: [PATCH 1/2] compiler.h: Introduce ptr_eq() to preserve address dependency
Date: Wed, 2 Oct 2024 15:24:45 +0000
Message-ID: <e39c6e5975f345c4b1a97145e207dee4@AcuMS.aculab.com>
In-Reply-To: <d192cf63-a274-4721-968e-a2c098db523b@rowland.harvard.edu>
From: 'Alan Stern'
> Sent: 02 October 2024 15:15
>
> On Wed, Oct 02, 2024 at 08:13:15AM +0000, David Laight wrote:
> > From: 'Alan Stern'
> > > Sent: 01 October 2024 23:57
> > >
> > > On Tue, Oct 01, 2024 at 05:11:05PM +0000, David Laight wrote:
> > > > From: Alan Stern
> > > > > Sent: 30 September 2024 19:53
> > > > >
> > > > > On Mon, Sep 30, 2024 at 07:05:06PM +0200, Jonas Oberhauser wrote:
> > > > > >
> > > > > >
> > > > > > Am 9/30/2024 um 6:43 PM schrieb Alan Stern:
> > > > > > > On Mon, Sep 30, 2024 at 01:26:53PM +0200, Jonas Oberhauser wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > Am 9/28/2024 um 4:49 PM schrieb Alan Stern:
> > > > > > > >
> > > > > > > > I should also point out that it is not enough to prevent the compiler from
> > > > > > > > using @a instead of @b.
> > > > > > > >
> > > > > > > > It must also be prevented from assigning @b=@a, which it is often allowed to
> > > > > > > > do after finding @a==@b.
> > > > > > >
> > > > > > > Wouldn't that be a bug?
> > > > > >
> > > > > > That's why I said that it is often allowed to do it. In your case it
> > > > > > wouldn't, but it is often possible when a and b are non-atomic &
> > > > > > non-volatile (and haven't escaped, and I believe sometimes even then).
> > > > > >
> > > > > > It happens for example here with GCC 14.1.0 -O3:
> > > > > >
> > > > > > > int fct_hide(void)
> > > > > > > {
> > > > > > > 	int *a, *b;
> > > > > > >
> > > > > > > 	do {
> > > > > > > 		a = READ_ONCE(p);
> > > > > > > 		asm volatile ("" : : : "memory");
> > > > > > > 		b = READ_ONCE(p);
> > > > > > > 	} while (a != b);
> > > > > > > 	OPTIMIZER_HIDE_VAR(b);
> > > > > > > 	return *b;
> > > > > > > }
> > > > > >
> > > > > >
> > > > > >
> > > > > > ldr r1, [r2]
> > > > > > ldr r3, [r2]
> > > > > > cmp r1, r3
> > > > > > bne .L6
> > > > > > mov r3, r1 // nay...
> > > > >
> > > > > A totally unnecessary instruction, which accomplishes nothing other than
> > > > > to waste time, space, and energy. But nonetheless, allowed -- I agree.
> > > > >
> > > > > The people in charge of GCC's optimizer might like to hear about this,
> > > > > if they're not already aware of it...
> > > > >
> > > > > > ldr r0, [r3] // yay!
> > > > > > bx lr
> > > > >
> > > > > One could argue that in this example the compiler _has_ used *a instead
> > > > > of *b. However, such an argument would have more force if we had
> > > > > described what we are talking about more precisely.
> > > >
> > > > The 'mov r3, r1' has nothing to do with 'a'.
> > >
> > > What do you mean by that? At this point in the program, a is the
> > > variable whose value is stored in r1 and b is the variable whose value
> > > is stored in r3. "mov r3, r1" copies the value from r1 into r3 and is
> > > therefore equivalent to executing "b = a". (That is why I said one
> > > could argue that the "return *b" statement uses the value of *a.) Thus
> > > it very much does have something to do with "a".
> >
> > After the cmp and bne r1 and r3 have the same value.
> > The compiler tracks that and will use either register later.
> > That can never matter.
>
> The whole point of this thread is that sometimes it _does_ matter. Not
> on x86, but on weakly ordered architectures where using the wrong
> register will bypass a dependency and allow the CPU to speculatively
> load values earlier than the programmer wants it to.
>
> > Remember the compiler tracks values (in pseudo/internal registers)
> > not variables.
> >
> > > > It is a more general problem that OPTIMIZER_HIDE_VAR() pretty much
> > > > always ends up allocating a different internal 'register' for the
> > > > output and then allocating a separate physical register.
> > >
> > > What output are you referring to? Does OPTIMIZER_HIDE_VAR() have an
> > > output? If it does, the source program above ignores it, discarding any
> > > returned value.
> >
> > Look up OPTIMIZER_HIDE_VAR(x): it is basically x = f(x) where f() is
> > the identity operation:
> > 	asm ("" : "+r"(x))
> > I'll bet that gcc allocates a separate internal/pseudo register
> > for the result so wants to do y = f(x).
> > Probably generating y = x; y = f(y);
> > (The 'mov' might be after the asm, but I think that would get
> > optimised away - the listing file might help.)
> >
> > So here the compiler has just decided to reuse the register that
> > held the other of a/b for the extra temporary.
>
> I think you've got this backward. As mentioned above, a is originally
> in r1 and b is in r3. The source says OPTIMIZER_HIDE_VAR(b), so you're
> saying that gcc should be copying r3 into a separate internal/pseudo
> register. But instead it's copying r1.

I think I know what you are trying to do, but that code doesn't achieve it.
Whether something can be made to work is another matter, but that code
can't ever work.

Inside if (a == b) the compiler is free to use the same register
for references to a and b, because it knows they have the same value.

Possibly something like:
	c = b;
	OPTIMIZER_HIDE_VAR(c);
	if (a == c) {
		... *b ...
	}
will ensure that there isn't a speculative load from *a.
You'll get at least one register-register move, but those are safe.

Otherwise you'll need to put the condition inside an asm block.
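
For reference, a minimal userspace sketch of that idea applied to the
fct_hide() loop quoted above. The READ_ONCE() and OPTIMIZER_HIDE_VAR()
definitions here are simplified stand-ins (assumptions, GCC/Clang only),
not the kernel's own headers, and read_stable(), value and p are
hypothetical names used only for illustration. The comparison is made
against a hidden copy c, so the compiler learns a == c but nothing about
a and b, and the dereference stays on b. I haven't checked what every
compiler version generates for this, so look at the listing before
relying on it.

```c
/* Simplified stand-ins for the kernel macros (assumptions, GCC/Clang only). */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define OPTIMIZER_HIDE_VAR(var)	__asm__ ("" : "+r" (var))

/* Hypothetical shared pointer and its target, for illustration only. */
static int value = 42;
static int *p = &value;

/* Hypothetical helper: re-read p until two reads agree, then load via b. */
static int read_stable(void)
{
	int *a, *b, *c;

	do {
		a = READ_ONCE(p);
		__asm__ __volatile__ ("" : : : "memory");
		b = READ_ONCE(p);
		c = b;
		/* After this the compiler no longer knows that c == b. */
		OPTIMIZER_HIDE_VAR(c);
	} while (a != c);

	/* The loop exit proves a == c only, so b cannot be replaced by a. */
	return *b;
}
```

With p pointing at value as above, read_stable() just returns value; the
interesting part is which register the final load uses in the generated
code on a weakly ordered architecture.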
David
>
> Alan
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)