From: Zhenhua Huang <quic_zhenhuah@quicinc.com>
To: Michal Hocko <mhocko@suse.com>
Cc: <akpm@linux-foundation.org>, <mgorman@techsingularity.net>,
	<vbabka@suse.cz>, <linux-mm@kvack.org>,
	<quic_tingweiz@quicinc.com>
Subject: Re: [RESEND PATCH] mm:page_alloc.c: lower the order requirement of should_reclaim_retry
Date: Mon, 19 Sep 2022 19:24:32 +0800	[thread overview]
Message-ID: <daa7f5c3-806d-c2fc-3bfd-957708e52867@quicinc.com> (raw)
In-Reply-To: <Yygk3qSH02Otbpgs@dhcp22.suse.cz>

Thanks, Michal, for the comments!

On 2022/9/19 16:14, Michal Hocko wrote:
> On Mon 19-09-22 11:00:55, Zhenhua Huang wrote:
>> When a driver continuously allocates order-3 pages, it can very easily
>> OOM even when there are plenty of reclaimable pages. A test module was
>> used to reproduce this issue; several key ftrace events are shown below:
>>
>>   insmod-6968  [005] ....  321.306007: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0 wmark_check=0
>>   insmod-6968  [005] ....  321.306009: compact_retry: order=3 priority=COMPACT_PRIO_SYNC_LIGHT compaction_result=withdrawn retries=0 max_retries=16 should_retry=1
>>   insmod-6968  [004] ....  321.308220: mm_compaction_try_to_compact_pages: order=3 gfp_mask=GFP_KERNEL priority=0
>>   insmod-6968  [004] ....  321.308964: mm_compaction_end: zone_start=0x80000 migrate_pfn=0xaa800 free_pfn=0x80800 zone_end=0x940000, mode=sync status=complete
>>   insmod-6968  [004] ....  321.308971: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539830 available=592776 min_wmark=21227 no_progress_loops=0 wmark_check=0
>>   insmod-6968  [004] ....  321.308973: compact_retry: order=3 priority=COMPACT_PRIO_SYNC_FULL compaction_result=failed retries=0 max_retries=16 should_retry=0
>>
>> There are ~2GB of reclaimable pages (reclaimable=539988), but the VM
>> decides not to reclaim any more:
>>   insmod-6968  [005] ....  321.306007: reclaim_retry_zone: node=0 zone=Normal order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0 wmark_check=0
>>
>> From the meminfo dump at OOM time, there were NO qualifying order >= 3
>> pages (the only order >= 3 blocks are marked (C), i.e. CMA, which does
>> not qualify) that could meet should_reclaim_retry's requirement:
>> Normal  : 24671*4kB (UMEC) 13807*8kB (UMEC) 8214*16kB (UEC) 190*32kB (C)
>> 94*64kB (C) 28*128kB (C) 16*256kB (C) 7*512kB (C) 5*1024kB (C) 7*2048kB (C)
>> 46*4096kB (C) = 571796kB
>>
>> should_reclaim_retry aborts early because it checks for pages of the
>> requested order already sitting in the free_list, and order-3 blocks
>> fragment easily. Since enough free pages are the precondition for
>> compaction to make progress, it may not be right to stop reclaiming
>> while lots of page cache remains. Relax the order by one to fix this
>> issue.
> 
> For higher-order requests we rely on should_compact_retry, which backs
> off based on the compaction feedback. I would recommend looking into
> why the compaction fails.
I think the reason for the compaction failure is that there are not
enough free pages. As the ftrace events show, free pages (including CMA)
amounted to only 592856 - 539988 = 52868 pages (reclaimable=539988,
available=592856).
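
For scale (simple arithmetic, assuming 4kB pages):

	52868 free pages       * 4kB = ~206MB free
	539988 reclaimable pages * 4kB = ~2.06GB reclaimable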

There are also restrictions such as suitable_migration_target() for free
pages and suitable_migration_source() for movable pages, so the eligible
targets are fewer still.

> 
> Also this patch doesn't really explain why it should work and honestly
> it doesn't really make much sense to me either.
Sorry, my fault. IMO, the reason it should work is, taking this order-3
allocation as an example: we can perform direct reclaim more times
because the free_list holds only order-2 pages (8214*16kB (UEC)), and
this change *lowers* the order requirement to match. The check I have
relaxed is should_reclaim_retry -> __zone_watermark_ok:
	for (o = order; o < MAX_ORDER; o++) {
		struct free_area *area = &z->free_area[o];
...
		for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
			if (!free_area_empty(area, mt))
				return true;
		}

An order-2 page is much more easily available, so the VM has a better
chance of returning true from should_reclaim_retry.
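
To make that concrete, here is a minimal userspace sketch (not kernel
code; the names order_check/nr_free are mine, and the per-order counts
are hand-copied from the meminfo dump above) of the order walk that
__zone_watermark_ok performs:

	#include <stdbool.h>
	#include <stdio.h>

	#define MAX_ORDER 11

	/*
	 * Non-CMA free blocks per order, read off the meminfo dump above:
	 * 24671*4kB, 13807*8kB, 8214*16kB, nothing non-CMA above order 2.
	 */
	static const unsigned long nr_free[MAX_ORDER] = {
		24671, 13807, 8214, 0, 0, 0, 0, 0, 0, 0, 0
	};

	/* Succeed only if some free list at >= the requested order is non-empty. */
	static bool order_check(unsigned int order)
	{
		for (unsigned int o = order; o < MAX_ORDER; o++) {
			if (nr_free[o])
				return true;
		}
		return false;
	}

	int main(void)
	{
		/* Current code checks the full order-3 request: fails (prints 0). */
		printf("order 3 check: %d\n", order_check(3));
		/* With the patch, order ? order - 1 : 0 gives order 2: passes (prints 1). */
		printf("order 2 check: %d\n", order_check(2));
		return 0;
	}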


> 
>> With the change applied, the meminfo output at the first OOM shows that
>> the page cache was nearly exhausted:
>>
>> Normal free: 462956kB min:8644kB low:44672kB high:50844kB
>> reserved_highatomic:4096KB active_anon:48kB inactive_anon:12kB
>> active_file:508kB inactive_file:552kB unevictable:109016kB writepending:160kB
>> present:7111680kB managed:6175004kB mlocked:107784kB pagetables:78732kB
>> bounce:0kB free_pcp:996kB local_pcp:0kB free_cma:376412kB
>>
>> Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
>> ---
>>   mm/page_alloc.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 36b2021..b4ca6d1 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4954,8 +4954,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
>>   		/*
>>   		 * Would the allocation succeed if we reclaimed all
>>   		 * reclaimable pages?
>> +		 * considering fragmentation, enough free pages are the
>> +		 * fundamental of compaction:
>> +		 * lower the order requirement by one
>>   		 */
>> -		wmark = __zone_watermark_ok(zone, order, min_wmark,
>> +		wmark = __zone_watermark_ok(zone, order ? order - 1 : 0, min_wmark,
>>   				ac->highest_zoneidx, alloc_flags, available);
>>   		trace_reclaim_retry_zone(z, order, reclaimable,
>>   				available, min_wmark, *no_progress_loops, wmark);
>> -- 
>> 2.7.4
> 

BTW, regarding the test case: I continuously allocated order-3 pages
with the GFP_KERNEL flag to reproduce it. If you have other
suggestions/ideas, please let me know. Thank you.
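
For reference, a minimal sketch of that kind of test module (a
hypothetical reproducer, not the exact module I used; it deliberately
leaks the pages so memory pressure keeps building):

	// SPDX-License-Identifier: GPL-2.0
	#include <linux/module.h>
	#include <linux/gfp.h>
	#include <linux/mm.h>
	#include <linux/sched.h>

	static int __init order3_stress_init(void)
	{
		unsigned long count = 0;
		struct page *page;

		/*
		 * Keep allocating order-3 pages with GFP_KERNEL until the
		 * allocator fails or the OOM killer fires. The pages are
		 * intentionally never freed in this sketch.
		 */
		while ((page = alloc_pages(GFP_KERNEL, 3))) {
			count++;
			cond_resched();
		}
		pr_info("order3_stress: failed after %lu order-3 allocations\n",
			count);
		return 0;
	}

	static void __exit order3_stress_exit(void)
	{
	}

	module_init(order3_stress_init);
	module_exit(order3_stress_exit);
	MODULE_LICENSE("GPL");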

Thanks,
Zhenhua

