ksummit.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Hannes Reinecke <hare@suse.com>
Cc: jakub@redhat.com, parri.andrea@gmail.com,
	ksummit-discuss@lists.linuxfoundation.org, peterz@infradead.org,
	Alan Stern <stern@rowland.harvard.edu>,
	ramana.radhakrishnan@arm.com, luc.maranget@inria.fr,
	j.alglave@ucl.ac.uk
Subject: Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
Date: Tue, 26 Jul 2016 15:40:35 -0700	[thread overview]
Message-ID: <20160726224035.GD7094@linux.vnet.ibm.com> (raw)
In-Reply-To: <f7c305f7-939a-b66c-4d6b-333e41c23017@suse.com>

On Tue, Jul 26, 2016 at 05:23:36PM +0200, Hannes Reinecke wrote:
> On 07/26/2016 03:10 PM, Alan Stern wrote:
> >On Tue, 26 Jul 2016, Hannes Reinecke wrote:
> >
> >>I have been playing around with RCUs and memory barriers quite a lot
> >>recently, and found some really 'odd' use-cases in the kernel which
> >>would benefit from improvements here.
> >
> >Could you post one or two examples?  It would be interesting to see
> >what they involve.
> >
> I have been working on a performance regression when calling
> 'dm_suspend/dm_resume' repeatedly for several (hundreds) devices.
> That boiled down to the patch introducing srcu in the device mapper core
> with commit 83d5e5b0af907 (dm: optimize use SRCU and RCU).
> Looking at it the code they do things like:
> 
> 	set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
> 	if (map)
> 		synchronize_srcu(&md->io_barrier);
> 
> where the srcu is used to ensure the code has left the critical
> sections. However, if the memory pointed to by the srcu isn't
> actually freed why we could easily drop the 'synchronize_srcu' call.
> But that would require that
> a) the set_bit() above is indeed atomic
> and
> b) there's no need to call 'synchronize_rcu' if you're not actually
> freeing memory but rather fiddle pointers.
> Both are somewhat shady areas where the documentation nor usage
> reveals some obvious insights.

The set_bit() function is guaranteed to execute atomically, but it does
not guarantee any ordering against prior accesses.  The synchronize_rcu()
does provide ordering against the set_bit(), to subsequent accesses are
covered, but only in the case where "map" is non-zero.

For RCU, it does depend on the use case.  For example, there are some
rare but real cases where synchronize_rcu() is required even if you are
not freeing memory:

https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdf

> On another example I've been doing performance patches to the lpfc
> driver (cf my talk at VAULT this year), where I've replaced most
> spinlocks with atomics and bitops.
> Which should work as well, only that it's still a bit unclear to me
> if an when you need barriers in addition to atomic resp bitops.

If the bitop returns a value, you don't need additional barriers.
Otherwise ...
 ... you need smp_mb__before_atomic() to order prior accesses
against the bitop and smp_mb__after_atomic() to order subsequent
accesses against the bitop.  If you need the bitop to be ordered
against both prior and subsequent accesses, then you need both
smp_mb__before_atomic() and smp_mb__after_atomic().

> And if you need barriers, which variant would be most appropriate?
> The __before or the __after variant?

 ... you need smp_mb__before_atomic() to order prior accesses
against the bitop and smp_mb__after_atomic() to order subsequent
accesses against the bitop.  If you need the bitop to be ordered
against both prior and subsequent accesses, then you need both
smp_mb__before_atomic() and smp_mb__after_atomic().

> Also, what happens to bitops on bitfields longer than an unsigned long?
> Are they still atomic?

>From what I can see, yes, sort of.  The "sort of" part is due to the fact
that bitops on widely separated bits would be would avoid interfering
with each other, but on the other hand, there would be no cause-and-effect
relationship between them, either.  Furthermore, processes reading the
bits set might disagree on the order in which they were set.

All that aside, please note that the initial memory model is limited
to memory reference, barriers, and RCU.  We do not yet have locking or
read-modify-write atomic operations.  We have to start somewhere!

							Thanx, Paul

  reply	other threads:[~2016-07-26 22:40 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-22 10:34 David Howells
2016-07-22 16:44 ` Paul E. McKenney
2016-07-25 17:14   ` Luis R. Rodriguez
2016-07-26  6:09     ` Hannes Reinecke
2016-07-26 13:10       ` Alan Stern
2016-07-26 13:35         ` Paul E. McKenney
2016-07-29  1:06           ` Steven Rostedt
2016-07-26 15:23         ` Hannes Reinecke
2016-07-26 22:40           ` Paul E. McKenney [this message]
2016-07-23 20:21 ` Benjamin Herrenschmidt
2016-07-26 15:11 ` David Woodhouse
2016-07-28 10:41   ` Will Deacon
2016-08-02 13:42   ` Peter Zijlstra
2016-08-03  8:49     ` Will Deacon
2016-07-26 15:20 ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160726224035.GD7094@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=hare@suse.com \
    --cc=j.alglave@ucl.ac.uk \
    --cc=jakub@redhat.com \
    --cc=ksummit-discuss@lists.linuxfoundation.org \
    --cc=luc.maranget@inria.fr \
    --cc=parri.andrea@gmail.com \
    --cc=peterz@infradead.org \
    --cc=ramana.radhakrishnan@arm.com \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox