* [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
@ 2016-07-22 10:34 David Howells
From: David Howells @ 2016-07-22 10:34 UTC
To: ksummit-discuss; +Cc: jakub, peterz, ramana.radhakrishnan
Earlier this year we had a discussion of the possibilities of using ISO C++11
atomic operations inside the kernel to implement kernel atomic ops of sorts
various, posted here:
Subject: [RFC PATCH 00/15] Provide atomics and bitops implemented with ISO C++11 atomics
Date: Wed, 18 May 2016 16:10:37 +0100
Is it worth getting together to discuss these in person in one of the tech
slots - especially if there are some gcc or llvm people available at plumbers
who could join in?
Further, Paul McKenney and others are assembling a memory model description.
Do we want to consider loosening up the kernel memory model?
Currently, for example, locks imply semi-permeable barriers that are not tied
to the lock variable - but, as I understand it, this may not hold true on all
arches, and on those arches an extra memory barrier would be required. For
example, { LOCK(A), UNLOCK(A), LOCK(B), UNLOCK(B) } would not imply a full
memory barrier in the middle.
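The LOCK/UNLOCK sequence above can be sketched in userspace with C11 <stdatomic.h>. This is only an illustration under stated assumptions, not kernel code: sketch_lock, sketch_lock_acquire(), sketch_lock_release() and two_critical_sections() are hypothetical names, LOCK is modelled as an acquire exchange and UNLOCK as a release store. The explicit fence between UNLOCK(A) and LOCK(B) marks the place where an extra full barrier would have to go if one were needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel lock; not kernel code. */
struct sketch_lock {
	atomic_bool locked;
};

/* LOCK: ACQUIRE semantics. Later accesses cannot move before it, but
 * earlier accesses may still drift into the critical section. */
static void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
		;	/* spin until the previous owner releases */
}

/* UNLOCK: RELEASE semantics. Earlier accesses cannot move after it, but
 * later accesses may still drift into the critical section. */
static void sketch_lock_release(struct sketch_lock *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

static struct sketch_lock lock_a, lock_b;
static int x, y;

static void two_critical_sections(void)
{
	sketch_lock_acquire(&lock_a);
	x = 1;
	sketch_lock_release(&lock_a);

	/* UNLOCK(A) followed by LOCK(B) is a RELEASE followed by an ACQUIRE,
	 * which by itself is not a full barrier. If full ordering is wanted
	 * here, it has to be requested explicitly. */
	atomic_thread_fence(memory_order_seq_cst);

	sketch_lock_acquire(&lock_b);
	y = 1;
	sketch_lock_release(&lock_b);
}
```

Whether a C11 seq_cst fence is an adequate stand-in for the kernel's full barrier is part of what the proposed session would need to settle, so the fence here should be read as a placeholder.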
Also, we have read and write memory barrier constructs - but not all CPUs have
such things, some instead have acquire and release. Would it be worth having
higher level constructs that encapsulate what you're trying to do and select
the most appropriate barrier from the available choices? For instance, we
have a fair few circular buffers, with linux/circ_buf.h providing some useful
bits. Should we provide circular buffering barrier constructs?
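One possible shape for such a construct, sketched in userspace C11 rather than kernel code: a single-producer/single-consumer ring whose index updates use release stores and acquire loads, so that the choice of barrier is hidden inside ring_put() and ring_get(). The names struct ring, ring_count(), ring_put() and ring_get() are hypothetical. Only the index arithmetic is borrowed from CIRC_CNT() in linux/circ_buf.h, which likewise assumes a power-of-two size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u	/* must be a power of two, as circ_buf.h assumes */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

/* Hypothetical single-producer/single-consumer ring. */
struct ring {
	int buf[RING_SIZE];
	atomic_uint head;	/* written only by the producer */
	atomic_uint tail;	/* written only by the consumer */
};

/* Same arithmetic as CIRC_CNT() in linux/circ_buf.h. */
static unsigned int ring_count(unsigned int head, unsigned int tail)
{
	return (head - tail) & (RING_SIZE - 1);
}

/* Producer side: returns false if the ring is full. */
static bool ring_put(struct ring *r, int item)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release of tail, so the slot has
	 * been read before it is overwritten. */
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (ring_count(head, tail) == RING_SIZE - 1)
		return false;
	r->buf[head] = item;
	/* Release publishes the item before the new head becomes visible. */
	atomic_store_explicit(&r->head, (head + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_get(struct ring *r, int *item)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release of head, so the item is
	 * visible before it is read. */
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (ring_count(head, tail) == 0)
		return false;
	*item = r->buf[tail];
	/* Release tells the producer the slot may be reused. */
	atomic_store_explicit(&r->tail, (tail + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}
```

On an architecture with native acquire/release instructions a compiler can use them directly here, while on one with only read and write barriers it would fall back to those. That is the kind of selection the paragraph above asks about.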
If we do this, Will Deacon, Peter Zijlstra and Paul McKenney should definitely
be there. I would suggest Jakub Jelinek and Ramana Radhakrishnan as gcc
representatives. I don't know anyone from LLVM.
David
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Paul E. McKenney @ 2016-07-22 16:44 UTC
To: David Howells
Cc: jakub, parri.andrea, ksummit-discuss, peterz, stern, ramana.radhakrishnan, luc.maranget, j.alglave

On Fri, Jul 22, 2016 at 11:34:35AM +0100, David Howells wrote:
> Earlier this year we had a discussion of the possibilities of using ISO C++11
> atomic operations inside the kernel to implement kernel atomic ops of sorts
> various, posted here:
>
>     Subject: [RFC PATCH 00/15] Provide atomics and bitops implemented with ISO C++11 atomics
>     Date: Wed, 18 May 2016 16:10:37 +0100
>
> Is it worth getting together to discuss these in person in one of the tech
> slots - especially if there are some gcc or llvm people available at plumbers
> who could join in?
>
>
> Further, Paul McKenney and others are assembling a memory model description.

We are attempting to automate memory-barriers.txt, so that you could
provide fragments of C code, and the tool would tell you whether a given
outcome happens always, sometimes, or never.  The current prototype
handles memory accesses, memory barriers, and RCU, but not yet locking
or read-modify-write atomics (though there is some vestigial support for
RMW atomics).  We are currently playing whack-a-mole with odd corner cases
of various architectures' memory models.  We are therefore also working
on ways of handling the resulting uncertainty.  Good clean fun!  ;-)

> Do we want to consider loosening up the kernel memory model?
>
> Currently, for example, locks imply semi-permeable barriers that are not tied
> to the lock variable - but, as I understand it, this may not hold true on all
> arches, and on those arches an extra memory barrier would be required.  For
> example, { LOCK(A), UNLOCK(A), LOCK(B), UNLOCK(B) } would not imply a full
> memory barrier in the middle.

Agreed, the current version of memory-barriers.txt documents the fact
that an unlock/lock sequence does not imply a full memory barrier:

    Similarly, the reverse case of a RELEASE followed by an ACQUIRE
    does not imply a full memory barrier.

If we are going to talk about changing this, we should most definitely
include a powerpc arch maintainer.

> Also, we have read and write memory barrier constructs - but not all CPUs have
> such things, some instead have acquire and release.  Would it be worth having
> higher level constructs that encapsulate what you're trying to do and select
> the most appropriate barrier from the available choices?  For instance, we
> have a fair few circular buffers, with linux/circ_buf.h providing some useful
> bits.  Should we provide circular buffering barrier constructs?
>
>
> If we do this, Will Deacon, Peter Zijlstra and Paul McKenney should definitely
> be there.  I would suggest Jakub Jelinek and Ramana Radhakrishnan as gcc
> representatives.  I don't know anyone from LLVM.

Alan Stern should be included, as he is the author of the most recent
version of the prototype formalized memory model (including the current
extremely nice formal model of RCU!), Luc Maranget as the author of
the previous version, Jade Alglave as author of the first version and
founder of this project, and Andrea Parri for his many contributions to
recent models.

Shouldn't we also include experts on other SMP architectures?

Thanx, Paul
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Luis R. Rodriguez @ 2016-07-25 17:14 UTC
To: Paul E. McKenney
Cc: jakub, parri.andrea, ksummit-discuss, peterz, stern, ramana.radhakrishnan, luc.maranget, j.alglave

On Fri, Jul 22, 2016 at 09:44:11AM -0700, Paul E. McKenney wrote:
> On Fri, Jul 22, 2016 at 11:34:35AM +0100, David Howells wrote:
> > Earlier this year we had a discussion of the possibilities of using ISO C++11
> > atomic operations inside the kernel to implement kernel atomic ops of sorts
> > various, posted here:
> >
> >     Subject: [RFC PATCH 00/15] Provide atomics and bitops implemented with ISO C++11 atomics
> >     Date: Wed, 18 May 2016 16:10:37 +0100
> >
> > Is it worth getting together to discuss these in person in one of the tech
> > slots - especially if there are some gcc or llvm people available at plumbers
> > who could join in?
> >
> >
> > Further, Paul McKenney and others are assembling a memory model description.
>
> We are attempting to automate memory-barriers.txt, so that you could
> provide fragments of C code, and the tool would tell you whether a given
> outcome happens always, sometimes, or never.  The current prototype
> handles memory accesses, memory barriers, and RCU, but not yet locking
> or read-modify-write atomics (though there is some vestigial support for
> RMW atomics).  We are currently playing whack-a-mole with odd corner cases
> of various architectures' memory models.  We are therefore also working
> on ways of handling the resulting uncertainty.  Good clean fun!  ;-)

Consider me interested in this discussion, patches, etc.

  Luis
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Hannes Reinecke @ 2016-07-26 6:09 UTC
To: Luis R. Rodriguez, Paul E. McKenney
Cc: jakub, parri.andrea, ksummit-discuss, peterz, stern, ramana.radhakrishnan, luc.maranget, j.alglave

On 07/25/2016 07:14 PM, Luis R. Rodriguez wrote:
> On Fri, Jul 22, 2016 at 09:44:11AM -0700, Paul E. McKenney wrote:
>> On Fri, Jul 22, 2016 at 11:34:35AM +0100, David Howells wrote:
>>> Earlier this year we had a discussion of the possibilities of using ISO C++11
>>> atomic operations inside the kernel to implement kernel atomic ops of sorts
>>> various, posted here:
>>>
>>>     Subject: [RFC PATCH 00/15] Provide atomics and bitops implemented with ISO C++11 atomics
>>>     Date: Wed, 18 May 2016 16:10:37 +0100
>>>
>>> Is it worth getting together to discuss these in person in one of the tech
>>> slots - especially if there are some gcc or llvm people available at plumbers
>>> who could join in?
>>>
>>>
>>> Further, Paul McKenney and others are assembling a memory model description.
>>
>> We are attempting to automate memory-barriers.txt, so that you could
>> provide fragments of C code, and the tool would tell you whether a given
>> outcome happens always, sometimes, or never.  The current prototype
>> handles memory accesses, memory barriers, and RCU, but not yet locking
>> or read-modify-write atomics (though there is some vestigial support for
>> RMW atomics).  We are currently playing whack-a-mole with odd corner cases
>> of various architectures' memory models.  We are therefore also working
>> on ways of handling the resulting uncertainty.  Good clean fun!  ;-)
>
> Consider me interested in this discussion, patches, etc.
>
Same here.
I have been playing around with RCUs and memory barriers quite a lot
recently, and found some really 'odd' use-cases in the kernel which
would benefit from improvements here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@suse.com                         +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Alan Stern @ 2016-07-26 13:10 UTC
To: Hannes Reinecke
Cc: jakub, parri.andrea, j.alglave, ksummit-discuss, peterz, ramana.radhakrishnan, luc.maranget

On Tue, 26 Jul 2016, Hannes Reinecke wrote:

> I have been playing around with RCUs and memory barriers quite a lot
> recently, and found some really 'odd' use-cases in the kernel which
> would benefit from improvements here.

Could you post one or two examples?  It would be interesting to see
what they involve.

Alan Stern
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Paul E. McKenney @ 2016-07-26 13:35 UTC
To: Alan Stern
Cc: jakub, parri.andrea, ksummit-discuss, peterz, ramana.radhakrishnan, luc.maranget, j.alglave

On Tue, Jul 26, 2016 at 09:10:32AM -0400, Alan Stern wrote:
> On Tue, 26 Jul 2016, Hannes Reinecke wrote:
>
> > I have been playing around with RCUs and memory barriers quite a lot
> > recently, and found some really 'odd' use-cases in the kernel which
> > would benefit from improvements here.
>
> Could you post one or two examples?  It would be interesting to see
> what they involve.

I would be interested as well!

Thanx, Paul
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Steven Rostedt @ 2016-07-29 1:06 UTC
To: Paul E. McKenney
Cc: jakub, parri.andrea, ksummit-discuss, peterz, Alan Stern, ramana.radhakrishnan, luc.maranget, j.alglave

On Tue, 26 Jul 2016 06:35:51 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Jul 26, 2016 at 09:10:32AM -0400, Alan Stern wrote:
> > On Tue, 26 Jul 2016, Hannes Reinecke wrote:
> >
> > > I have been playing around with RCUs and memory barriers quite a lot
> > > recently, and found some really 'odd' use-cases in the kernel which
> > > would benefit from improvements here.
> >
> > Could you post one or two examples?  It would be interesting to see
> > what they involve.
>
> I would be interested as well!
>

Me too, as tracing plays a bit of magic with memory models as well.

-- Steve
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Hannes Reinecke @ 2016-07-26 15:23 UTC
To: Alan Stern
Cc: jakub, parri.andrea, j.alglave, ksummit-discuss, peterz, ramana.radhakrishnan, luc.maranget

On 07/26/2016 03:10 PM, Alan Stern wrote:
> On Tue, 26 Jul 2016, Hannes Reinecke wrote:
>
>> I have been playing around with RCUs and memory barriers quite a lot
>> recently, and found some really 'odd' use-cases in the kernel which
>> would benefit from improvements here.
>
> Could you post one or two examples?  It would be interesting to see
> what they involve.
>
I have been working on a performance regression when calling
'dm_suspend/dm_resume' repeatedly for several hundred devices.
That boiled down to the patch introducing srcu in the device mapper core
with commit 83d5e5b0af907 (dm: optimize use SRCU and RCU).
Looking at the code, they do things like:

    set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
    if (map)
        synchronize_srcu(&md->io_barrier);

where the srcu is used to ensure the code has left the critical
sections. However, if the memory pointed to by the srcu isn't
actually freed, we could easily drop the 'synchronize_srcu' call.
But that would require that
a) the set_bit() above is indeed atomic
and
b) there's no need to call 'synchronize_rcu' if you're not actually
freeing memory but rather fiddling with pointers.
Both are somewhat shady areas where neither the documentation nor
usage reveals any obvious insights.

As another example, I've been doing performance patches to the lpfc
driver (cf. my talk at VAULT this year), where I've replaced most
spinlocks with atomics and bitops.
Which should work as well, only that it's still a bit unclear to me
if and when you need barriers in addition to atomics or bitops.
And if you need barriers, which variant would be most appropriate?
The __before or the __after variant?
Also, what happens to bitops on bitfields longer than an unsigned long?
Are they still atomic?

Cheers,

Hannes
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Paul E. McKenney @ 2016-07-26 22:40 UTC
To: Hannes Reinecke
Cc: jakub, parri.andrea, ksummit-discuss, peterz, Alan Stern, ramana.radhakrishnan, luc.maranget, j.alglave

On Tue, Jul 26, 2016 at 05:23:36PM +0200, Hannes Reinecke wrote:
> On 07/26/2016 03:10 PM, Alan Stern wrote:
> >On Tue, 26 Jul 2016, Hannes Reinecke wrote:
> >
> >>I have been playing around with RCUs and memory barriers quite a lot
> >>recently, and found some really 'odd' use-cases in the kernel which
> >>would benefit from improvements here.
> >
> >Could you post one or two examples?  It would be interesting to see
> >what they involve.
> >
> I have been working on a performance regression when calling
> 'dm_suspend/dm_resume' repeatedly for several hundred devices.
> That boiled down to the patch introducing srcu in the device mapper core
> with commit 83d5e5b0af907 (dm: optimize use SRCU and RCU).
> Looking at the code, they do things like:
>
>     set_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags);
>     if (map)
>         synchronize_srcu(&md->io_barrier);
>
> where the srcu is used to ensure the code has left the critical
> sections. However, if the memory pointed to by the srcu isn't
> actually freed, we could easily drop the 'synchronize_srcu' call.
> But that would require that
> a) the set_bit() above is indeed atomic
> and
> b) there's no need to call 'synchronize_rcu' if you're not actually
> freeing memory but rather fiddling with pointers.
> Both are somewhat shady areas where neither the documentation nor
> usage reveals any obvious insights.

The set_bit() function is guaranteed to execute atomically, but it does
not guarantee any ordering against prior accesses.  The synchronize_srcu()
does provide ordering against the set_bit(), so subsequent accesses are
covered, but only in the case where "map" is non-zero.

For RCU, it does depend on the use case.  For example, there are some
rare but real cases where synchronize_rcu() is required even if you are
not freeing memory:

    https://www.usenix.org/legacy/event/atc11/tech/final_files/Triplett.pdf

> As another example, I've been doing performance patches to the lpfc
> driver (cf. my talk at VAULT this year), where I've replaced most
> spinlocks with atomics and bitops.
> Which should work as well, only that it's still a bit unclear to me
> if and when you need barriers in addition to atomics or bitops.

If the bitop returns a value, you don't need additional barriers.
Otherwise ...

> And if you need barriers, which variant would be most appropriate?
> The __before or the __after variant?

... you need smp_mb__before_atomic() to order prior accesses against the
bitop and smp_mb__after_atomic() to order subsequent accesses against
the bitop.  If you need the bitop to be ordered against both prior and
subsequent accesses, then you need both smp_mb__before_atomic() and
smp_mb__after_atomic().

> Also, what happens to bitops on bitfields longer than an unsigned long?
> Are they still atomic?

From what I can see, yes, sort of.  The "sort of" part is due to the
fact that bitops on widely separated bits would avoid interfering with
each other, but on the other hand, there would be no cause-and-effect
relationship between them, either.  Furthermore, processes reading the
bits set might disagree on the order in which they were set.

All that aside, please note that the initial memory model is limited to
memory references, barriers, and RCU.  We do not yet have locking or
read-modify-write atomic operations.  We have to start somewhere!

Thanx, Paul
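The bitop rules in the message above can be mirrored in userspace C11 as a rough analogy. Everything below is hypothetical (the sketch_ names, the 256-bit map size) and none of it is the kernel implementation: a relaxed fetch-or stands in for a void bitop such as set_bit(), seq_cst fences stand in for smp_mb__before_atomic() and smp_mb__after_atomic(), and a seq_cst fetch-or stands in for a value-returning bitop. The bitmap spans several unsigned long words, so each operation is atomic only with respect to the one word it touches.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NBITS	256	/* deliberately wider than one unsigned long */
#define SKETCH_NWORDS	(SKETCH_NBITS / BITS_PER_WORD)

/* Hypothetical multi-word bitmap, one atomic word per BITS_PER_WORD bits. */
static atomic_ulong sketch_bitmap[SKETCH_NWORDS];

/* Analogue of a void bitop: atomic on its word, but otherwise unordered. */
static void sketch_set_bit(unsigned int nr, atomic_ulong *map)
{
	assert(nr < SKETCH_NBITS);
	atomic_fetch_or_explicit(&map[nr / BITS_PER_WORD],
				 1UL << (nr % BITS_PER_WORD),
				 memory_order_relaxed);
}

/* Analogue of smp_mb__before_atomic(); set_bit(); smp_mb__after_atomic(). */
static void sketch_set_bit_ordered(unsigned int nr, atomic_ulong *map)
{
	atomic_thread_fence(memory_order_seq_cst);	/* orders prior accesses */
	sketch_set_bit(nr, map);
	atomic_thread_fence(memory_order_seq_cst);	/* orders later accesses */
}

/* Analogue of a value-returning bitop, which needs no extra barriers. */
static bool sketch_test_and_set_bit(unsigned int nr, atomic_ulong *map)
{
	unsigned long mask = 1UL << (nr % BITS_PER_WORD);
	unsigned long old;

	assert(nr < SKETCH_NBITS);
	old = atomic_fetch_or_explicit(&map[nr / BITS_PER_WORD], mask,
				       memory_order_seq_cst);
	return (old & mask) != 0;
}

/* Plain read of one bit. */
static bool sketch_test_bit(unsigned int nr, atomic_ulong *map)
{
	unsigned long word;

	assert(nr < SKETCH_NBITS);
	word = atomic_load_explicit(&map[nr / BITS_PER_WORD],
				    memory_order_relaxed);
	return (word >> (nr % BITS_PER_WORD)) & 1UL;
}
```

Two sketch_set_bit() calls on bits in different words touch different atomic objects, which matches the "sort of" above: they do not interfere, but nothing orders them against each other, and two readers may see them appear in different orders.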
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Benjamin Herrenschmidt @ 2016-07-23 20:21 UTC
To: David Howells, ksummit-discuss; +Cc: jakub, peterz, ramana.radhakrishnan

On Fri, 2016-07-22 at 11:34 +0100, David Howells wrote:
> If we do this, Will Deacon, Peter Zijlstra and Paul McKenney should definitely
> be there.  I would suggest Jakub Jelinek and Ramana Radhakrishnan as gcc
> representatives.  I don't know anyone from LLVM.

I'd like to be there too.

Cheers,
Ben.
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: David Woodhouse @ 2016-07-26 15:11 UTC
To: David Howells, ksummit-discuss; +Cc: jakub, peterz, ramana.radhakrishnan

On Fri, 2016-07-22 at 11:34 +0100, David Howells wrote:
> Further, Paul McKenney and others are assembling a memory model description.
> Do we want to consider loosening up the kernel memory model?

It's not clear that 'loosening up' is what we're after.

In Seoul last year, weren't we looking at things like readl_relaxed()
and lamenting the fact that they do actually still have strong enough
requirements that they can't *really* be very relaxed on Power and
ARM64 at all, because they're basically being used with the assumption
of Intel-like semantics.

The cheap answer is "well, it sucks to be on POWER or ARM64 because
then readl_relaxed() has to be as slow as readl() is".

But it would be good to follow up on that properly, and maybe introduce
a variant which *can* be implemented across more architectures. Is that
what Paul is working on, that you mention above?

-- 
dwmw2
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Will Deacon @ 2016-07-28 10:41 UTC
To: David Woodhouse; +Cc: jakub, peterz, ksummit-discuss, ramana.radhakrishnan

On Tue, Jul 26, 2016 at 04:11:21PM +0100, David Woodhouse wrote:
> On Fri, 2016-07-22 at 11:34 +0100, David Howells wrote:
> > Further, Paul McKenney and others are assembling a memory model description.
> > Do we want to consider loosening up the kernel memory model?
>
> It's not clear that 'loosening up' is what we're after.
>
> In Seoul last year, weren't we looking at things like readl_relaxed()
> and lamenting the fact that they do actually still have strong enough
> requirements that they can't *really* be very relaxed on Power and
> ARM64 at all, because they're basically being used with the assumption
> of Intel-like semantics.
>
> The cheap answer is "well, it sucks to be on POWER or ARM64 because
> then readl_relaxed() has to be as slow as readl() is".

I wasn't in Seoul, but I think some people got the wrong end of the stick
about the relaxed accessors and I'd be interested in trying to address
some of that, at least from the arm64 point-of-view. Paul has been busy
writing up a summary for LWN, but I don't think it's quite ready yet.

Having said that, the memory model work that I'm aware of focusses
completely on SMP synchronisation and I don't think it's particularly
helpful to throw I/O into the mix just yet.

Will
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Peter Zijlstra @ 2016-08-02 13:42 UTC
To: David Woodhouse; +Cc: jakub, ksummit-discuss, ramana.radhakrishnan

On Tue, Jul 26, 2016 at 04:11:21PM +0100, David Woodhouse wrote:
> On Fri, 2016-07-22 at 11:34 +0100, David Howells wrote:
> > Further, Paul McKenney and others are assembling a memory model description.
> > Do we want to consider loosening up the kernel memory model?
>
> It's not clear that 'loosening up' is what we're after.

Linus (who should also very much be present for this) always argues
against relaxing ordering.
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: Will Deacon @ 2016-08-03 8:49 UTC
To: Peter Zijlstra; +Cc: jakub, ksummit-discuss, ramana.radhakrishnan

On Tue, Aug 02, 2016 at 03:42:23PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 26, 2016 at 04:11:21PM +0100, David Woodhouse wrote:
> > On Fri, 2016-07-22 at 11:34 +0100, David Howells wrote:
> > > Further, Paul McKenney and others are assembling a memory model description.
> > > Do we want to consider loosening up the kernel memory model?
> >
> > It's not clear that 'loosening up' is what we're after.
>
> Linus (who should also very much be present for this) always argues
> against relaxing ordering.

Even if he had an inexplicable change of heart, how on Earth do you go
about validating the existing codebase against a new memory model? It's
one thing to show that the relaxation is strictly a relaxation (and
therefore the existing backend implementations remain sound), but quite
another to show that locking implementations don't fall apart, or the
guarantees that no longer hold aren't relied upon someplace.

One might consider adding more atomic operations. For example, the
release/acquire stuff we grew recently is "weaker" than full fences but,
unlike the C11 atomics, the release/acquire primitives complement full
fences. Conversely, having two sets of fences (C11 and kernel) or two sets
of release/acquire, each with subtly different semantics sounds like an
awful maintenance burden and not somewhere we should be going.

That's why I'd be interested in building kernel-compatible operations
using C11 atomics (relaxed accesses and fences) for asm/generic, but not
more than that, at least outside of arch/*.

Will
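The last suggestion above, kernel-style operations assembled only from C11 relaxed accesses and fences, might look roughly like the following userspace sketch. It is an illustration under stated assumptions, not a proposal for asm/generic: sketch_atomic_t, sketch_smp_mb(), sketch_atomic_add_return(), sketch_store_release() and sketch_load_acquire() are hypothetical names, and whether a C11 seq_cst fence really provides everything the kernel expects of smp_mb() is one of the open questions in this thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* The sketch assumes int atomics are always lock-free on the target. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "sketch assumes lock-free atomic_int");

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct {
	atomic_int counter;
} sketch_atomic_t;

/* Stand-in for smp_mb(), built from a C11 fence. */
static inline void sketch_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Fully ordered value-returning RMW: fence, relaxed RMW, fence. */
static inline int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	int old;

	sketch_smp_mb();
	old = atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed);
	sketch_smp_mb();
	return old + i;
}

/* Release store: release fence followed by a relaxed store. */
static inline void sketch_store_release(sketch_atomic_t *v, int val)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&v->counter, val, memory_order_relaxed);
}

/* Acquire load: relaxed load followed by an acquire fence. */
static inline int sketch_load_acquire(sketch_atomic_t *v)
{
	int val = atomic_load_explicit(&v->counter, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);
	return val;
}
```

Keeping every access relaxed and expressing all ordering through fences leaves a single set of ordering primitives to reason about, which seems to be the point of restricting the C11 usage this way.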
* Re: [Ksummit-discuss] [TECH TOPIC] Memory model, using ISO C++11 atomic ops
From: David Howells @ 2016-07-26 15:20 UTC
To: David Woodhouse; +Cc: jakub, peterz, ksummit-discuss, ramana.radhakrishnan

David Woodhouse <dwmw2@infradead.org> wrote:

> In Seoul last year, weren't we looking at things like readl_relaxed()
> and lamenting the fact that they do actually still have strong enough
> requirements that they can't *really* be very relaxed on Power and
> ARM64 at all, because they're basically being used with the assumption
> of Intel-like semantics.

I don't recall that.  Possibly that was a track I wasn't in.

> The cheap answer is "well, it sucks to be on POWER or ARM64 because
> then readl_relaxed() has to be as slow as readl() is".

Does the memory model for CPU/device interactions have to be the same as that
for CPU/CPU interactions?  I guess with respect to locks, it does so that two
processors who both want to access a device don't trample over each other.

> Is that what Paul is working on, that you mention above?

Paul is working on a general overall description.

David