[PATCH 0/2] rwsem: performance enhancements for systems with many cores

* [PATCH 0/2] rwsem: performance enhancements for systems with many cores
@ 2013-06-21 23:51 Tim Chen
  2013-06-22  0:00 ` Davidlohr Bueso
  0 siblings, 1 reply; 5+ messages in thread
From: Tim Chen @ 2013-06-21 23:51 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: Andrea Arcangeli, Alex Shi, Andi Kleen, Michel Lespinasse,
	Davidlohr Bueso, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, Tim Chen, linux-kernel, linux-mm

In this patchset, we introduce two optimizations to the read-write
semaphore (rwsem). The first patch reduces cache bouncing of the
sem->count field by pre-reading sem->count and avoiding the cmpxchg
when possible. The second patch introduces optimistic spinning logic,
similar to that in the mutex code, for writer lock acquisition of the
rwsem.
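
For illustration only, the pre-read idea in the first patch can be
sketched in userspace C11 roughly as below. This is not the kernel code:
the struct, the two-state count encoding, and all sketch_* names are
hypothetical, and the real rwsem count also tracks readers and waiters.
The point is only that a plain load can be satisfied from a shared copy
of the cache line, while a cmpxchg requests exclusive ownership of the
line even when it is going to fail.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-state encoding; the kernel's count field is richer. */
#define SKETCH_UNLOCKED     0L
#define SKETCH_WRITE_LOCKED 1L

struct sketch_rwsem {
	atomic_long count;
};

static void sketch_init(struct sketch_rwsem *sem)
{
	atomic_init(&sem->count, SKETCH_UNLOCKED);
}

/* Returns true if the write lock was taken, false otherwise. */
static bool sketch_down_write_trylock(struct sketch_rwsem *sem)
{
	long expected = SKETCH_UNLOCKED;

	/*
	 * Pre-read: if the lock is visibly held, bail out without issuing
	 * a cmpxchg, so the cache line is not pulled in exclusive state
	 * for an attempt that would fail anyway.
	 */
	if (atomic_load_explicit(&sem->count, memory_order_relaxed) !=
	    SKETCH_UNLOCKED)
		return false;

	/* The lock looked free; now pay for the atomic read-modify-write. */
	return atomic_compare_exchange_strong_explicit(&sem->count,
						       &expected,
						       SKETCH_WRITE_LOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void sketch_up_write(struct sketch_rwsem *sem)
{
	atomic_store_explicit(&sem->count, SKETCH_UNLOCKED,
			      memory_order_release);
}
```

In this sketch a second trylock on a held semaphore returns false from
the pre-read alone; the cmpxchg is only reached when the lock looked
free a moment earlier.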

With the two patches combined, Davidlohr Bueso tested aim7 workloads
on an 8-socket, 80-core system and saw improvements of
alltests (+14.5%), custom (+17%), disk (+11%), high_systime
(+5%), shared (+15%) and short (+4%), most of them after around 500
users, when i_mmap was implemented as an rwsem.

Feedback on the effectiveness of these tweaks on other workloads
would be appreciated.

Alex Shi (1):
  rwsem: check the lock before cmpxchg in down_write_trylock and
    rwsem_do_wake

Tim Chen (1):
  rwsem: do optimistic spinning for writer lock acquisition

 Makefile                    |    2 +-
 include/asm-generic/rwsem.h |    8 +-
 include/linux/rwsem.h       |    3 +
 init/Kconfig                |    9 +++
 kernel/rwsem.c              |   29 +++++++-
 lib/rwsem.c                 |  169 ++++++++++++++++++++++++++++++++++++++-----
 6 files changed, 195 insertions(+), 25 deletions(-)

-- 
1.7.4.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 0/2] rwsem: performance enhancements for systems with many cores
  2013-06-21 23:51 [PATCH 0/2] rwsem: performance enhancements for systems with many cores Tim Chen
@ 2013-06-22  0:00 ` Davidlohr Bueso
  2013-06-22  0:25   ` Michel Lespinasse
  0 siblings, 1 reply; 5+ messages in thread
From: Davidlohr Bueso @ 2013-06-22  0:00 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ingo Molnar, Andrew Morton, Andrea Arcangeli, Alex Shi,
	Andi Kleen, Michel Lespinasse, Matthew R Wilcox, Dave Hansen,
	Peter Zijlstra, Rik van Riel, linux-kernel, linux-mm

On Fri, 2013-06-21 at 16:51 -0700, Tim Chen wrote:
> In this patchset, we introduce two optimizations to the read-write
> semaphore (rwsem). The first patch reduces cache bouncing of the
> sem->count field by pre-reading sem->count and avoiding the cmpxchg
> when possible. The second patch introduces optimistic spinning logic,
> similar to that in the mutex code, for writer lock acquisition of the
> rwsem.
> 
> With the two patches combined, Davidlohr Bueso tested aim7 workloads
> on an 8-socket, 80-core system and saw improvements of
> alltests (+14.5%), custom (+17%), disk (+11%), high_systime
> (+5%), shared (+15%) and short (+4%), most of them after around 500
> users, when i_mmap was implemented as an rwsem.
> 
> Feedback on the effectiveness of these tweaks on other workloads
> would be appreciated.

Tim, I was really hoping to send all this in one big bundle. I was doing
some further testing (enabling hyperthreading and some Oracle runs);
fortunately, everything looks OK and we are getting actual improvements
on large boxes.

That said, how about I send you my i_mmap rwsem patchset for a v2 of
this patchset?

Thanks,
Davidlohr

> 
> 
> Alex Shi (1):
>   rwsem: check the lock before cmpxchg in down_write_trylock and
>     rwsem_do_wake
> 
> Tim Chen (1):
>   rwsem: do optimistic spinning for writer lock acquisition
> 
>  Makefile                    |    2 +-
>  include/asm-generic/rwsem.h |    8 +-
>  include/linux/rwsem.h       |    3 +
>  init/Kconfig                |    9 +++
>  kernel/rwsem.c              |   29 +++++++-
>  lib/rwsem.c                 |  169 ++++++++++++++++++++++++++++++++++++++-----
>  6 files changed, 195 insertions(+), 25 deletions(-)
> 



* Re: [PATCH 0/2] rwsem: performance enhancements for systems with many cores
  2013-06-22  0:00 ` Davidlohr Bueso
@ 2013-06-22  0:25   ` Michel Lespinasse
  2013-06-22  0:43     ` Davidlohr Bueso
  0 siblings, 1 reply; 5+ messages in thread
From: Michel Lespinasse @ 2013-06-22  0:25 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Tim Chen, Ingo Molnar, Andrew Morton, Andrea Arcangeli, Alex Shi,
	Andi Kleen, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, linux-kernel, linux-mm

On Fri, Jun 21, 2013 at 5:00 PM, Davidlohr Bueso <davidlohr.bueso@hp.com> wrote:
> On Fri, 2013-06-21 at 16:51 -0700, Tim Chen wrote:
>> In this patchset, we introduce two optimizations to the read-write
>> semaphore (rwsem). The first patch reduces cache bouncing of the
>> sem->count field by pre-reading sem->count and avoiding the cmpxchg
>> when possible. The second patch introduces optimistic spinning logic,
>> similar to that in the mutex code, for writer lock acquisition of the
>> rwsem.
>>
>> With the two patches combined, Davidlohr Bueso tested aim7 workloads
>> on an 8-socket, 80-core system and saw improvements of
>> alltests (+14.5%), custom (+17%), disk (+11%), high_systime
>> (+5%), shared (+15%) and short (+4%), most of them after around 500
>> users, when i_mmap was implemented as an rwsem.
>>
>> Feedback on the effectiveness of these tweaks on other workloads
>> would be appreciated.
>
> Tim, I was really hoping to send all this in one big bundle. I was doing
> some further testing (enabling hyperthreading and some Oracle runs);
> fortunately, everything looks OK and we are getting actual improvements
> on large boxes.
>
> That said, how about I send you my i_mmap rwsem patchset for a v2 of
> this patchset?

I'm a bit confused about the state of these patchsets - it looks like
I'm only copied on half of the conversations. Should I wait for a v2
here, or should I hunt down Alex's version of things, or... ?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


* Re: [PATCH 0/2] rwsem: performance enhancements for systems with many cores
  2013-06-22  0:25   ` Michel Lespinasse
@ 2013-06-22  0:43     ` Davidlohr Bueso
  2013-06-24 17:47       ` Tim Chen
  0 siblings, 1 reply; 5+ messages in thread
From: Davidlohr Bueso @ 2013-06-22  0:43 UTC (permalink / raw)
  To: Michel Lespinasse
  Cc: Tim Chen, Ingo Molnar, Andrew Morton, Andrea Arcangeli, Alex Shi,
	Andi Kleen, Matthew R Wilcox, Dave Hansen, Peter Zijlstra,
	Rik van Riel, linux-kernel, linux-mm

On Fri, 2013-06-21 at 17:25 -0700, Michel Lespinasse wrote:
> On Fri, Jun 21, 2013 at 5:00 PM, Davidlohr Bueso <davidlohr.bueso@hp.com> wrote:
> > On Fri, 2013-06-21 at 16:51 -0700, Tim Chen wrote:
> >> In this patchset, we introduce two optimizations to the read-write
> >> semaphore (rwsem). The first patch reduces cache bouncing of the
> >> sem->count field by pre-reading sem->count and avoiding the cmpxchg
> >> when possible. The second patch introduces optimistic spinning logic,
> >> similar to that in the mutex code, for writer lock acquisition of the
> >> rwsem.
> >>
> >> With the two patches combined, Davidlohr Bueso tested aim7 workloads
> >> on an 8-socket, 80-core system and saw improvements of
> >> alltests (+14.5%), custom (+17%), disk (+11%), high_systime
> >> (+5%), shared (+15%) and short (+4%), most of them after around 500
> >> users, when i_mmap was implemented as an rwsem.
> >>
> >> Feedback on the effectiveness of these tweaks on other workloads
> >> would be appreciated.
> >
> > Tim, I was really hoping to send all this in one big bundle. I was doing
> > some further testing (enabling hyperthreading and some Oracle runs);
> > fortunately, everything looks OK and we are getting actual improvements
> > on large boxes.
> >
> > That said, how about I send you my i_mmap rwsem patchset for a v2 of
> > this patchset?
> 
> I'm a bit confused about the state of these patchsets - it looks like
> I'm only copied on half of the conversations. Should I wait for a v2
> here, or should I hunt down Alex's version of things, or... ?

Except for some internal patch logistics, you haven't been left out of
any conversations :)

My original plan was to send out, in one patchset: 

- rwsem optimizations from Alex (patch 1/2 here, which should
actually be 4 patches) +
- rwsem optimistic spinning (patch 2/2 here) +
- i_mmap_mutex to rwsem conversion (5 more patches)

Now, I realize that the i_mmap stuff might not be welcome in a
rwsem-specific optimizations patchset like this one, but I think it's
relevant to include everything in a single bundle, as it really shows the
performance boosts and it's what I have been using to measure the
original negative rwsem performance compared to a mutex.

If folks don't agree, I can always send it as a separate patchset.

Thanks,
Davidlohr




* Re: [PATCH 0/2] rwsem: performance enhancements for systems with many cores
  2013-06-22  0:43     ` Davidlohr Bueso
@ 2013-06-24 17:47       ` Tim Chen
  0 siblings, 0 replies; 5+ messages in thread
From: Tim Chen @ 2013-06-24 17:47 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Michel Lespinasse, Ingo Molnar, Andrew Morton, Andrea Arcangeli,
	Alex Shi, Andi Kleen, Matthew R Wilcox, Dave Hansen,
	Peter Zijlstra, Rik van Riel, linux-kernel, linux-mm

On Fri, 2013-06-21 at 17:43 -0700, Davidlohr Bueso wrote:
> On Fri, 2013-06-21 at 17:25 -0700, Michel Lespinasse wrote:
> > On Fri, Jun 21, 2013 at 5:00 PM, Davidlohr Bueso <davidlohr.bueso@hp.com> wrote:
> > > On Fri, 2013-06-21 at 16:51 -0700, Tim Chen wrote:
> > >> In this patchset, we introduce two optimizations to the read-write
> > >> semaphore (rwsem). The first patch reduces cache bouncing of the
> > >> sem->count field by pre-reading sem->count and avoiding the cmpxchg
> > >> when possible. The second patch introduces optimistic spinning logic,
> > >> similar to that in the mutex code, for writer lock acquisition of the
> > >> rwsem.
> > >>
> > >> With the two patches combined, Davidlohr Bueso tested aim7 workloads
> > >> on an 8-socket, 80-core system and saw improvements of
> > >> alltests (+14.5%), custom (+17%), disk (+11%), high_systime
> > >> (+5%), shared (+15%) and short (+4%), most of them after around 500
> > >> users, when i_mmap was implemented as an rwsem.
> > >>
> > >> Feedback on the effectiveness of these tweaks on other workloads
> > >> would be appreciated.
> > >
> > > Tim, I was really hoping to send all this in one big bundle. I was doing
> > > some further testing (enabling hyperthreading and some Oracle runs);
> > > fortunately, everything looks OK and we are getting actual improvements
> > > on large boxes.
> > >
> > > That said, how about I send you my i_mmap rwsem patchset for a v2 of
> > > this patchset?
> > 
> > I'm a bit confused about the state of these patchsets - it looks like
> > I'm only copied on half of the conversations. Should I wait for a v2
> > here, or should I hunt down Alex's version of things, or... ?
> 
> Except for some internal patch logistics, you haven't been left out of
> any conversations :)
> 
> My original plan was to send out, in one patchset: 
> 
> - rwsem optimizations from Alex (patch 1/2 here, which should
> actually be 4 patches) +
> - rwsem optimistic spinning (patch 2/2 here) +
> - i_mmap_mutex to rwsem conversion (5 more patches)
> 
> Now, I realize that the i_mmap stuff might not be welcome in a
> rwsem-specific optimizations patchset like this one, but I think it's
> relevant to include everything in a single bundle, as it really shows the
> performance boosts and it's what I have been using to measure the
> original negative rwsem performance compared to a mutex.
> 
> If folks don't agree, I can always send it as a separate patchset.

I think the i_mmap_mutex conversion should probably be a separate
patch set.  There are probably a lot of i_mmap-specific issues
that need to be considered.

I'll send a v2 of the patchset that restructures Alex's changes into
4 patches and incorporates review comments.

Thanks.

Tim


