From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Barry Song <21cnbao@gmail.com>
Cc: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Huang Ying <ying.huang@intel.com>,
Mel Gorman <mgorman@techsingularity.net>,
Ryan Roberts <ryan.roberts@arm.com>,
Barry Song <v-songbaohua@oppo.com>,
Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Jonathan Corbet <corbet@lwn.net>, Yang Shi <shy828301@gmail.com>,
Yu Zhao <yuzhao@google.com>, <linux-mm@kvack.org>
Subject: Re: [PATCH rfc 0/3] mm: allow more high-order pages stored on PCP lists
Date: Tue, 16 Apr 2024 12:50:00 +0800
Message-ID: <ab4f688b-b86a-47c0-9049-0bb33489d4f7@huawei.com>
In-Reply-To: <CAGsJ_4x5AvffOynnJTm-DPeQO=Wb3X3OKKHi4bPq1E7b8bo+xg@mail.gmail.com>
On 2024/4/16 8:21, Barry Song wrote:
> On Tue, Apr 16, 2024 at 12:18 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>>
>>
>> On 2024/4/15 18:52, David Hildenbrand wrote:
>>> On 15.04.24 10:59, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/4/15 16:18, Barry Song wrote:
>>>>> On Mon, Apr 15, 2024 at 8:12 PM Kefeng Wang
>>>>> <wangkefeng.wang@huawei.com> wrote:
>>>>>>
>>>>>> Both file pages and anonymous pages now support large folios, so
>>>>>> high-order pages below PMD_ORDER will also be allocated frequently,
>>>>>> which can increase zone lock contention. Allowing high-order pages
>>>>>> on the PCP lists can reduce that contention, but as commit
>>>>>> 44042b449872 ("mm/page_alloc: allow high-order pages to be stored
>>>>>> on the per-cpu lists") pointed out, it may not win in every
>>>>>> scenario. Add a new sysfs control to enable or disable storing
>>>>>> specified high-order pages on the PCP lists; the orders in
>>>>>> (PAGE_ALLOC_COSTLY_ORDER, PMD_ORDER) are not stored on the PCP
>>>>>> lists by default.
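(As background, here is an illustrative stand-alone sketch of how the per-cpu
list indexing currently behaves, based on the behaviour described in commits
44042b449872 and 5d0a661d808f rather than on the kernel source: orders up to
PAGE_ALLOC_COSTLY_ORDER get one list per migratetype, and PMD-sized pages share
a single extra list. The constants and the order_to_list() helper name are my
own assumptions for illustration, using arm64 with 4K pages.)

#include <stdio.h>

#define PAGE_ALLOC_COSTLY_ORDER 3
#define MIGRATE_PCPTYPES        3  /* unmovable, movable, reclaimable */
#define PMD_ORDER               9  /* 2M PMD / 4K pages */
#define NR_LOWORDER_PCP_LISTS   (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))

/* illustrative only: map (migratetype, order) to a per-cpu list index */
static int order_to_list(int migratetype, int order)
{
        if (order > PAGE_ALLOC_COSTLY_ORDER)
                return NR_LOWORDER_PCP_LISTS;   /* single shared THP list */
        return MIGRATE_PCPTYPES * order + migratetype;
}

int main(void)
{
        for (int order = 0; order <= PAGE_ALLOC_COSTLY_ORDER; order++)
                for (int mt = 0; mt < MIGRATE_PCPTYPES; mt++)
                        printf("order %d, migratetype %d -> list %d\n",
                               order, mt, order_to_list(mt, order));
        printf("order %d (PMD) -> list %d\n", PMD_ORDER,
               order_to_list(0, PMD_ORDER));
        return 0;
}

The point of the sketch is that, today, orders 4..8 have no per-cpu list at
all, which is what the series above changes.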
>>>>>
>>>>> This is precisely something Baolin and I have discussed and intended
>>>>> to implement[1],
>>>>> but unfortunately, we haven't had the time to do so.
>>>>
>>>> Indeed, it is the same thing. Recently we have been working on
>>>> unixbench/lmbench optimization. I tested multi-size THP for anonymous
>>>> memory by hard-coding PAGE_ALLOC_COSTLY_ORDER from 3 to 4[1]; it shows
>>>> some improvement, but not for all cases, and the results are not very
>>>> stable, so I re-implemented it according to the user requirement so
>>>> that it can be enabled dynamically.
>>>
>>> I'm wondering, though, if this is really a suitable candidate for a
>>> sysctl toggle. Can anybody really come up with an educated guess for
>>> these values?
>>
>> Not sure whether this is suitable for sysctl, but anon mTHP is already
>> enabled through a per-size sysfs interface; we could trace __alloc_pages()
>> and collect per-order statistics to decide which high orders to enable on
>> the PCP lists.
>>
>>>
>>> Especially reading "Benchmarks Score shows a little improvement (0.28%)"
>>> and "it may not win in all the scenes", to me it mostly sounds like
>>> "minimal impact" -- so who cares?
>>
>> Even though the lock contention is eliminated, the performance
>> improvement is very limited (it may even just be fluctuation). This is
>> not a good test case to demonstrate the improvement; it only shows the
>> zone-lock issue. We need to find a better test case, maybe something on
>> Android (which heavily uses 64K and has no PMD THP), or perhaps LKP
>> could help?
>>
>> I will try to find another test case that shows the benefit.
>
> Hi Kefeng,
>
> I wonder if you will see some major improvement for 64KiB mTHP with the
> microbenchmark below, which I just wrote -- for example, in the perf
> profile and the time it takes to finish the program.
>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> #define DATA_SIZE (2UL * 1024 * 1024)
>
> int main(int argc, char **argv)
> {
>         /* make 32 concurrent alloc and free of mTHP */
>         fork(); fork(); fork(); fork(); fork();
>
>         for (int i = 0; i < 100000; i++) {
>                 void *addr = mmap(NULL, DATA_SIZE, PROT_READ | PROT_WRITE,
>                                   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>                 if (addr == MAP_FAILED) {
>                         perror("mmap failed");
>                         return -1;
>                 }
>                 memset(addr, 0x11, DATA_SIZE);
>                 munmap(addr, DATA_SIZE);
>         }
>
>         return 0;
> }
>
1) PCP disabled
          run1    run2    run3    run4    run5   average
real    200.41  202.18  203.16  201.54  200.91   201.64
user      6.49    6.21    6.25    6.31    6.35     6.322
sys     193.30  195.39  196.30  194.65  194.01   194.73

2) PCP enabled
          run1    run2    run3    run4    run5   average    vs 1)
real    198.25  199.26  195.51  199.28  189.12   196.284   -2.66%
user      6.21    6.02    6.02    6.28    6.21     6.148    -2.75%
sys     191.46  192.64  188.96  192.47  182.39   189.584   -2.64%

For the above test, the run time is reduced by roughly 2.6%~2.8% with PCP
enabled.
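(The percentages are relative to the PCP-disabled averages, e.g. for real
time: (201.64 - 196.284) / 201.64 ≈ 2.66%.)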
I also re-ran page_fault1 (anonymous memory) from will-it-scale:
1) PCP enabled
tasks processes processes_idle threads threads_idle linear
0 0 100 0 100 0
1 1416915 98.95 1418128 98.95 1418128
20 5327312 79.22 3821312 94.36 28362560
40 9437184 58.58 4463657 94.55 56725120
60 8120003 38.16 4736716 94.61 85087680
80 7356508 18.29 4847824 94.46 113450240
100 7256185 1.48 4870096 94.61 141812800
2) PCP disabled
tasks processes processes_idle threads threads_idle linear
0 0 100 0 100 0
1 1365398 98.95 1354502 98.95 1365398
20 5174918 79.22 3722368 94.65 27307960
40 9094265 58.58 4427267 94.82 54615920
60 8021606 38.18 4572896 94.93 81923880
80 7497318 18.2 4637062 94.76 109231840
100 6819897 1.47 4654521 94.63 136539800
------------------------------------
1) vs 2): with PCP enabled, the result improves by 3.86%.
3) PCP re-enabled
tasks processes processes_idle threads threads_idle linear
0 0 100 0 100 0
1 1419036 98.96 1428403 98.95 1428403
20 5356092 79.23 3851849 94.41 28568060
40 9437184 58.58 4512918 94.63 57136120
60 8252342 38.16 4659552 94.68 85704180
80 7414899 18.26 4790576 94.77 114272240
100 7062902 1.46 4759030 94.64 142840300
4) PCP re-disabled
tasks processes processes_idle threads threads_idle linear
0 0 100 0 100 0
1 1352649 98.95 1354806 98.95 1354806
20 5172924 79.22 3719292 94.64 27096120
40 9174505 58.59 4310649 94.93 54192240
60 8021606 38.17 4552960 94.81 81288360
80 7497318 18.18 4671638 94.81 108384480
100 6823926 1.47 4725955 94.64 135480600
------------------------------------
3) vs 4): with PCP enabled, the result improves by 5.43%.
Average improvement: (3.86% + 5.43%) / 2 = 4.645%
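(For reference, page_fault1 keeps mapping, touching and unmapping an anonymous
region in every task, so with 64K mTHP enabled the write faults are served
with order-4 folios and the constant allocate/free churn lands on the page
allocator. From memory it is roughly equivalent to the stand-alone sketch
below; the 128MB size and the bounded loop are approximations of the
will-it-scale testcase, not a verbatim copy.)

#include <assert.h>
#include <unistd.h>
#include <sys/mman.h>

#define MEMSIZE (128UL * 1024 * 1024)

int main(void)
{
        long pgsize = sysconf(_SC_PAGESIZE);

        /* each will-it-scale task runs a loop like this, counting iterations */
        for (int iter = 0; iter < 100; iter++) {
                char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                assert(c != MAP_FAILED);

                /* touch every page so each one takes a page fault */
                for (unsigned long i = 0; i < MEMSIZE; i += pgsize)
                        c[i] = 0;

                munmap(c, MEMSIZE);
        }
        return 0;
}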
>>
>>>
>>> How much is the cost vs. benefit of just having one sane system
>>> configuration?
>>>
>>
>> For arm64 with 4K pages this means five more high orders (4~8) and five
>> more PCP lists. For high-order pages we assume most of them are movable,
>> but that may not be the case, so enabling this by default could cause
>> more fragmentation; see 5d0a661d808f ("mm/page_alloc: use only one PCP
>> list for THP-sized allocations").
>>
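(For reference, assuming arm64 with 4K pages: PMD_ORDER = PMD_SHIFT -
PAGE_SHIFT = 21 - 12 = 9 and PAGE_ALLOC_COSTLY_ORDER = 3, so the orders
strictly in between are 4, 5, 6, 7 and 8 -- the five extra orders mentioned
above.)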