linux-mm.kvack.org archive mirror
From: "Huang, Ying" <ying.huang@intel.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	 Arjan Van De Ven <arjan@linux.intel.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Mel Gorman <mgorman@techsingularity.net>,
	 Vlastimil Babka <vbabka@suse.cz>,
	David Hildenbrand <david@redhat.com>,
	 Johannes Weiner <jweiner@redhat.com>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	 Pavel Tatashin <pasha.tatashin@soleen.com>,
	 Matthew Wilcox <willy@infradead.org>
Subject: Re: [RFC 0/6] mm: improve page allocator scalability via splitting zones
Date: Fri, 12 May 2023 10:55:21 +0800
Message-ID: <87r0rm8die.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <ZF0ET82ajDbFrIw/@dhcp22.suse.cz> (Michal Hocko's message of "Thu, 11 May 2023 17:05:51 +0200")

Hi, Michal,

Thanks for your comments!

Michal Hocko <mhocko@suse.com> writes:

> On Thu 11-05-23 14:56:01, Huang Ying wrote:
>> The patchset is based on upstream v6.3.
>> 
>> More and more cores are put in one physical CPU (usually one NUMA node
>> too).  In 2023, one high-end server CPU has 56, 64, or more cores.
>> Even more cores per physical CPU are planned for future CPUs.  Yet
>> all cores in one physical CPU contend for page allocation on
>> one zone in most cases.  This causes heavy zone lock contention in
>> some workloads, and the situation will get worse in the
>> future.
>> 
>> For example, on a 2-socket Intel server machine with 224 logical
>> CPUs, if the kernel is built with `make -j224`, the zone lock
>> contention cycles% can reach up to about 12.7%.
>> 
>> To improve the scalability of page allocation, in this series we
>> generally create one zone instance for about every 256 GB of memory
>> of a zone type.  That is, one large zone type will be split into
>> multiple zone instances.  Then, different logical CPUs will prefer
>> different zone instances based on the logical CPU number.  So the
>> total number of logical CPUs contending on one zone will be reduced.
>> Thus the scalability is improved.
>
> It is not really clear to me why you need a new zone for all this rather
> than partitioning free lists internally within the zone? Essentially to
> increase the current two level system to 3: per cpu caches, per cpu
> arenas and global fallback.

Sorry, I didn't get your idea here.  What are per-CPU arenas?  How do
they differ from per-CPU caches (PCP)?

> I am also missing some information on why pcp caches tuning is not
> sufficient.

PCP does improve page allocation scalability greatly!  But it
doesn't help much for workloads that allocate pages on one CPU and
free them on different CPUs.  PCP tuning can also improve the page
allocation scalability of a given workload greatly.  But it's not
trivial to find the best tuning parameters for various workloads and
for their run-time states (a workload may have different loads and
memory requirements at different times).  And we may run different
workloads on different logical CPUs of the system, which also makes it
hard to find the best PCP tuning globally.  It would be better to find
a solution that improves page allocation scalability out of the box or
automatically.  Do you agree?
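
To illustrate the cross-CPU case, here is a toy user-space model.  It
is only a sketch, not kernel code and not part of the patchset.  It
assumes a simplified PCP: a per-CPU page count that is refilled from
the zone in batches when empty and drained back in batches when it
grows past a high mark, with every refill or drain taking the zone
lock once.  The names (Zone, Pcp, same_cpu, cross_cpu) and the BATCH
and HIGH values are made up for the example.

```python
# Toy model of per-CPU page caches (PCP) in front of a single zone lock.
# BATCH and HIGH are illustrative values, not kernel defaults.

BATCH = 32   # pages moved between a PCP list and the zone per lock hold
HIGH = 96    # a PCP list is drained by BATCH once it holds more than HIGH


class Zone:
    """Counts how often the (imaginary) zone lock is taken."""

    def __init__(self):
        self.lock_acquisitions = 0

    def refill(self, pcp):
        # Take the zone lock once and move BATCH pages to the PCP list.
        self.lock_acquisitions += 1
        pcp.count += BATCH

    def drain(self, pcp):
        # Take the zone lock once and move BATCH pages back to the zone.
        self.lock_acquisitions += 1
        pcp.count -= BATCH


class Pcp:
    """Simplified per-CPU page cache: only tracks the number of pages."""

    def __init__(self, zone):
        self.zone = zone
        self.count = 0

    def alloc_page(self):
        if self.count == 0:
            self.zone.refill(self)
        self.count -= 1

    def free_page(self):
        self.count += 1
        if self.count > HIGH:
            self.zone.drain(self)


def same_cpu(pages):
    """Allocate and free every page on the same CPU."""
    zone = Zone()
    cpu0 = Pcp(zone)
    for _ in range(pages):
        cpu0.alloc_page()
        cpu0.free_page()
    return zone.lock_acquisitions


def cross_cpu(pages):
    """Allocate every page on CPU 0 and free it on CPU 1."""
    zone = Zone()
    cpu0, cpu1 = Pcp(zone), Pcp(zone)
    for _ in range(pages):
        cpu0.alloc_page()
        cpu1.free_page()
    return zone.lock_acquisitions


if __name__ == "__main__":
    print("same CPU: ", same_cpu(3200))
    print("cross CPU:", cross_cpu(3200))
```

In this model, same_cpu(3200) takes the zone lock once, while
cross_cpu(3200) takes it 197 times: the allocating CPU's list keeps
running empty and the freeing CPU's list keeps overflowing, so both
sides go back to the zone about once per batch.  Larger BATCH and HIGH
values lower that count but do not remove the pattern.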

Best Regards,
Huang, Ying

