From: Vlastimil Babka <vbabka@suse.cz>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Mel Gorman <mgorman@techsingularity.net>,
Matt Fleming <mfleming@cloudflare.com>,
David Rientjes <rientjes@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Link Lin <linkl@google.com>
Subject: Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic
Date: Sun, 27 Oct 2024 21:36:39 +0100
Message-ID: <8459b884-5877-41bd-a882-546e046b9dad@suse.cz>
In-Reply-To: <CAOUHufaS-dGAPGs1Y1=imW_nusaTDeysN3qfJc9-76DBVEHScQ@mail.gmail.com>

On 10/27/24 21:17, Yu Zhao wrote:
> On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 10/26/24 05:36, Yu Zhao wrote:
>> > OOM kills due to vastly overestimated free highatomic reserves were
>> > observed:
>> >
>> > ... invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0 ...
>> > Node 0 Normal free:1482936kB boost:0kB min:410416kB low:739404kB high:1068392kB reserved_highatomic:1073152KB ...
>> > Node 0 Normal: 1292*4kB (ME) 1920*8kB (E) 383*16kB (UE) 220*32kB (ME) 340*64kB (E) 2155*128kB (UE) 3243*256kB (UE) 615*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 1477408kB
>> >
>> > The second line above shows that the OOM kill was due to the following
>> > condition:
>> >
>> > free (1482936kB) - reserved_highatomic (1073152kB) = 409784KB < min (410416kB)
>> >
>> > And the third line shows there were no free pages in any
>> > MIGRATE_HIGHATOMIC pageblocks, which otherwise would show up as type
>> > 'H'. Therefore __zone_watermark_unusable_free() underestimated the
>> > usable free memory by over 1GB, which resulted in the unnecessary OOM
>> > kill above.
>> >
>> > The comments in __zone_watermark_unusable_free() warns about the
>> > potential risk, i.e.,
>> >
>> > If the caller does not have rights to reserves below the min
>> > watermark then subtract the high-atomic reserves. This will
>> > over-estimate the size of the atomic reserve but it avoids a search.
>> >
>> > However, it is possible to keep track of free pages in reserved
>> > highatomic pageblocks with a new per-zone counter nr_free_highatomic
>> > protected by the zone lock, to avoid a search when calculating the
>>
>> It's only possible to track this reliably since the "mm: page_alloc:
>> freelist migratetype hygiene" patchset was merged, which explains why
>> nr_reserved_highatomic was used until now, even if it's imprecise.
>
> I just refreshed my memory by quickly going through the discussion
> around that series and didn't find anything that helps me understand
> the above. More pointers please?

For example:

- a page sits on a pcplist, on the MIGRATE_MOVABLE list
- we reserve its pageblock as highatomic, which does nothing to the page on
  the pcplist
- the page is then flushed from the pcplist to the zone freelist, but it
  still remembers that it was MIGRATE_MOVABLE. It merges with one or more
  buddies from the now-highatomic list, and the resulting order-X page ends
  up on the movable freelist despite being in a highatomic pageblock. The
  free highatomic counter no longer matches what is actually on the
  freelists.

The series addressed various scenarios like that, where a page could end up
on the wrong freelist.
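
The sequence above can be illustrated with a toy model. This is only a sketch
under stated assumptions, not kernel code: ToyZone, scenario() and the
recheck_type flag are hypothetical names, the model covers a single pageblock,
counts pages instead of tracking orders, and assumes that buddy removal during
a merge is accounted under whatever type the flusher believes the page has.

```python
MOVABLE, HIGHATOMIC = "movable", "highatomic"


class ToyZone:
    """Toy model of free-page accounting for one pageblock (not kernel code)."""

    def __init__(self):
        self.pageblock_type = MOVABLE
        # Number of free pages linked on each buddy freelist.
        self.freelist = {MOVABLE: 0, HIGHATOMIC: 0}
        # Hypothetical counter in the spirit of nr_free_highatomic.
        self.nr_free_highatomic = 0
        # Pages on the per-cpu list; each entry is the type cached at free time.
        self.pcplist = []

    def _add(self, mt, pages):
        # Link pages on freelist `mt` and account them under `mt`.
        self.freelist[mt] += pages
        if mt == HIGHATOMIC:
            self.nr_free_highatomic += pages

    def free_to_pcp(self):
        # The page remembers the pageblock type seen when it was freed.
        self.pcplist.append(self.pageblock_type)

    def free_to_buddy(self):
        self._add(self.pageblock_type, 1)

    def reserve_highatomic(self):
        # Free buddy pages move to the highatomic list; the pcplist is untouched.
        moved = self.freelist[MOVABLE]
        self.freelist[MOVABLE] = 0
        self.pageblock_type = HIGHATOMIC
        self._add(HIGHATOMIC, moved)

    def flush_pcp(self, recheck_type):
        while self.pcplist:
            cached = self.pcplist.pop()
            # Either trust the stale cached type or re-read the pageblock type.
            mt = self.pageblock_type if recheck_type else cached
            # Merge with at most one free buddy, which really sits on the
            # list matching the pageblock's current type.
            buddies = min(1, self.freelist[self.pageblock_type])
            self.freelist[self.pageblock_type] -= buddies
            # Assumption: the removal is accounted under `mt`, not the real list.
            if mt == HIGHATOMIC:
                self.nr_free_highatomic -= buddies
            self._add(mt, 1 + buddies)


def scenario(recheck_type):
    zone = ToyZone()
    zone.free_to_pcp()         # page parked on the pcplist as MOVABLE
    zone.free_to_buddy()       # its buddy is free on the movable freelist
    zone.reserve_highatomic()  # pageblock becomes highatomic
    zone.flush_pcp(recheck_type)
    return zone.freelist[HIGHATOMIC], zone.nr_free_highatomic
```

In this model scenario(False) returns (0, 1): no pages are left on the
highatomic freelist, yet the counter still claims one. scenario(True), where
the flush re-reads the pageblock type, returns (2, 2), so the counter and the
freelist agree.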