From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: Aaron Lu <aaron.lu@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Matthew Wilcox <willy@infradead.org>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Tariq Toukan <tariqt@mellanox.com>,
Yosef Lev <levyossi@icloud.com>,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: Re: [RFC PATCH 0/9] Improve zone lock scalability using Daniel Jordan's list work
Date: Fri, 21 Sep 2018 10:45:36 -0700
Message-ID: <20180921174536.7igaoi36rg76auy4@ca-dmjordan1.us.oracle.com>
In-Reply-To: <20180911053616.6894-1-aaron.lu@intel.com>
On Tue, Sep 11, 2018 at 01:36:07PM +0800, Aaron Lu wrote:
> At this year's MM summit[0], Daniel Jordan and others proposed an
> innovative technique that lets multiple threads concurrently
> list_del() entries at any position in a list and list_add() entries
> at the head of the list, without taking a lock.
>
> People think this technique may be useful for improving zone lock
> scalability, so here is my attempt.
Nice, this uses the smp_list_* functions well in spite of the limitations you
encountered with them here.
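(For anyone who wasn't at the summit, here's a minimal sketch of the
head-insertion half of the idea, reduced to a singly linked list so a
single cmpxchg suffices.  This is illustrative only, not the actual
smp_list_add()/smp_list_del(), which operate on doubly linked lists and
are more involved; it just shows why every concurrent adder ends up
touching the same head cacheline:

	struct slist {
		struct slist *next;
	};

	static void slist_push_head(struct slist **head, struct slist *node)
	{
		struct slist *first;

		do {
			first = READ_ONCE(*head);
			node->next = first;
			/* retry if another thread pushed in the meantime */
		} while (cmpxchg(head, first, node) != first);
	}

Every failed cmpxchg is another round trip for the head's cacheline.)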
> Performance-wise, on a 2-socket Intel Skylake server with 56 cores/112
> threads, running will-it-scale/page_fault1 in process mode (higher is
> better):
>
> kernel  performance       zone lock contention
> patch1   9219349          76.99%
> patch7   2461133 -73.3%   54.46% (another 34.66% on smp_list_add())
> patch8  11712766 +27.0%   68.14%
> patch9  11386980 +23.5%   67.18%
Is "zone lock contention" the percentage that readers and writers combined
spent waiting? I'm curious to see read and write wait time broken out, since
it seems there are writers (very likely on the allocation side) spoiling the
parallelism we get with the read lock.
If the contention is from allocation, I wonder whether it's feasible to make
that path use the SMP list functions too.  Something like
smp_list_cut_position combined with the page clusters from [*] to cut off a
chunk of the list.  There are many details to keep in mind there, though,
like having to unset PageBuddy on the pages in that chunk while other tasks
may be concurrently merging pages that are part of it.
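To make that concrete, here's a hand-wavy sketch of what grabbing a
cluster off the free list might look like, assuming
smp_list_cut_position() keeps list_cut_position()'s signature (the walk
to the cut point and the PageBuddy handling are exactly the racy parts
I'm glossing over):

	static void take_cluster(struct free_area *area, int migratetype,
				 unsigned int count, struct list_head *batch)
	{
		struct list_head *head = &area->free_list[migratetype];
		struct list_head *cut = head;
		struct page *page;
		unsigned int i;

		/* find the entry 'count' links in; racy, illustration only */
		for (i = 0; i < count; i++)
			cut = READ_ONCE(cut->next);

		/* detach the first 'count' pages onto 'batch' in one shot */
		smp_list_cut_position(batch, head, cut);

		/* pages can still be merge targets until this is done */
		list_for_each_entry(page, batch, lru)
			__ClearPageBuddy(page);
	}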
Or maybe what's needed is a more scalable data structure than an array of
lists, since contention on the heads seems to be the limiting factor. A simple
list that keeps the pages in most-recently-used order (except when adding to
the list tail) is good for cache warmth, but I wonder how helpful that is when
all CPUs can allocate from the front.  Having multiple places to put pages of
a given order/migratetype would ease the contention.
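Purely as illustration, the shape I have in mind is several heads per
order/migratetype, with adders spread out by CPU so they don't all hit
the same cacheline (all names made up):

	#define FREE_LIST_SHARDS	4

	struct sharded_free_area {
		struct list_head  free_list[MIGRATE_TYPES][FREE_LIST_SHARDS];
		atomic_long_t     nr_free;
	};

	/* caller has preemption disabled, e.g. under the zone read lock */
	static inline struct list_head *
	shard_head(struct sharded_free_area *area, int migratetype)
	{
		unsigned int shard = smp_processor_id() % FREE_LIST_SHARDS;

		return &area->free_list[migratetype][shard];
	}

The tradeoff is that allocators may have to search all the shards before
concluding an order/migratetype is empty, and the MRU cache-warmth
property gets weaker.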
> Though lock contention was reduced a lot by patch7, performance dropped
> considerably due to severe cache bouncing on the free list head among
> multiple threads freeing pages at the same time, because every page free
> needs to add the page to the free list head.
It could be beneficial to take an MCS-style approach in smp_list_splice/add so
that multiple waiters aren't bouncing the same cacheline around.  This is
something I'm planning to try on lru_lock.
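Roughly, each adder would queue itself with an xchg on a tail pointer
and spin on its own node while the thread at the front splices
everyone's pages, so waiters burn cycles on a private cacheline instead
of the list head.  A sketch of the shape (not the real
smp_list_splice(), glossing over memory ordering, and assuming all
modifications of 'head' go through this path so the combiner has it to
itself):

	struct add_waiter {
		struct add_waiter  *next;
		struct list_head   *pages;  /* what this thread adds */
		int                 done;
	};

	static void combined_splice(struct add_waiter **tail,
				    struct add_waiter *me,
				    struct list_head *head)
	{
		struct add_waiter *prev, *node, *next;

		me->next = NULL;
		me->done = 0;

		prev = xchg(tail, me);
		if (prev) {
			WRITE_ONCE(prev->next, me);
			/* spin on our own node, not on 'head' */
			while (!READ_ONCE(me->done))
				cpu_relax();
			return;
		}

		/* we got here first: splice for ourselves and all waiters */
		for (node = me; ; node = next) {
			list_splice(node->pages, head);
			next = READ_ONCE(node->next);
			if (!next) {
				/* no visible waiter: try to close the queue */
				if (cmpxchg(tail, node, NULL) == node) {
					if (node != me)
						WRITE_ONCE(node->done, 1);
					return;
				}
				/* a new waiter raced in; wait for its link */
				while (!(next = READ_ONCE(node->next)))
					cpu_relax();
			}
			/* next is read before done is set, since setting
			 * done lets the waiter's stack node go away */
			if (node != me)
				WRITE_ONCE(node->done, 1);
		}
	}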
Daniel
[*] https://lkml.kernel.org/r/20180509085450.3524-1-aaron.lu@intel.com