From: Yu Zhao <yuzhao@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Mel Gorman <mgorman@techsingularity.net>,
Matt Fleming <mfleming@cloudflare.com>,
David Rientjes <rientjes@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Link Lin <linkl@google.com>
Subject: Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic
Date: Sun, 27 Oct 2024 14:51:42 -0600
Message-ID: <CAOUHufbHVXNZpW1mVhuF+4p8PbPq44w4chQX7Q6QYVDCjSqa1Q@mail.gmail.com>
In-Reply-To: <8459b884-5877-41bd-a882-546e046b9dad@suse.cz>

On Sun, Oct 27, 2024 at 2:36 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/27/24 21:17, Yu Zhao wrote:
> > On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 10/26/24 05:36, Yu Zhao wrote:
> >> > OOM kills due to vastly overestimated free highatomic reserves were
> >> > observed:
> >> >
> >> > ... invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0 ...
> >> > Node 0 Normal free:1482936kB boost:0kB min:410416kB low:739404kB high:1068392kB reserved_highatomic:1073152KB ...
> >> > Node 0 Normal: 1292*4kB (ME) 1920*8kB (E) 383*16kB (UE) 220*32kB (ME) 340*64kB (E) 2155*128kB (UE) 3243*256kB (UE) 615*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 1477408kB
> >> >
> >> > The second line above shows that the OOM kill was due to the following
> >> > condition:
> >> >
> >> > free (1482936kB) - reserved_highatomic (1073152kB) = 409784KB < min (410416kB)
> >> >
> >> > And the third line shows there were no free pages in any
> >> > MIGRATE_HIGHATOMIC pageblocks, which otherwise would show up as type
> >> > 'H'. Therefore __zone_watermark_unusable_free() underestimated the
> >> > usable free memory by over 1GB, which resulted in the unnecessary OOM
> >> > kill above.
> >> >
> >> > The comment in __zone_watermark_unusable_free() warns about the
> >> > potential risk, i.e.,
> >> >
> >> > If the caller does not have rights to reserves below the min
> >> > watermark then subtract the high-atomic reserves. This will
> >> > over-estimate the size of the atomic reserve but it avoids a search.
> >> >
> >> > However, it is possible to keep track of free pages in reserved
> >> > highatomic pageblocks with a new per-zone counter nr_free_highatomic
> >> > protected by the zone lock, to avoid a search when calculating the
> >>
> >> It's only possible to track this reliably since the "mm: page_alloc:
> >> freelist migratetype hygiene" patchset was merged, which explains why
> >> nr_reserved_highatomic was used until now, even if it's imprecise.
> >
> > I just refreshed my memory by quickly going through the discussion
> > around that series and didn't find anything that helps me understand
> > the above. More pointers please?
>
> For example:
>
> - a page is on pcplist in MIGRATE_MOVABLE list
> - we reserve its pageblock as highatomic, which does nothing to the page on
> the pcplist
> - page above is flushed from pcplist to zone freelist, but it remembers it
> was MIGRATE_MOVABLE, merges with another buddy/buddies from the
> now-highatomic list, the resulting order-X page ends up on the movable
> freelist despite being in highatomic pageblock. The counter of free
> highatomic is now wrong wrt the freelist reality

This is the part I don't follow: how is it wrong w.r.t. the freelist
reality? The new nr_free_highatomic should reflect exactly how many
pages are on free_list[MIGRATE_HIGHATOMIC], because it's updated
accordingly.

(My current understanding is that, in this case, the reservation
itself is messed up, i.e., under-reserved.)

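To make the two readings concrete, here is a toy userspace model of the scenario quoted above. It is only a sketch: toy_zone, toy_add_to_free_list(), toy_del_from_free_list() and toy_replay() are made-up names, the accounting rule (count by the list a page is placed on) is my reading of the patch, and the "merged page lands on the movable list" step is the old behavior as described in the example, not something this code derives.

```c
#include <assert.h>

/* Toy model only: names echo the kernel's, but none of this is kernel code. */
enum toy_migratetype { TOY_MOVABLE, TOY_HIGHATOMIC, TOY_TYPES };

struct toy_zone {
	unsigned long nr_on_list[TOY_TYPES];	/* pages sitting on each free list */
	unsigned long nr_free_highatomic;	/* stand-in for the proposed counter */
};

/* Add 1 << order pages to a list; accounting follows the list, not the pageblock. */
static void toy_add_to_free_list(struct toy_zone *z, enum toy_migratetype mt,
				 unsigned int order)
{
	z->nr_on_list[mt] += 1UL << order;
	if (mt == TOY_HIGHATOMIC)
		z->nr_free_highatomic += 1UL << order;
}

/* Remove 1 << order pages from a list, with the mirror-image accounting. */
static void toy_del_from_free_list(struct toy_zone *z, enum toy_migratetype mt,
				   unsigned int order)
{
	assert(z->nr_on_list[mt] >= 1UL << order);
	z->nr_on_list[mt] -= 1UL << order;
	if (mt == TOY_HIGHATOMIC)
		z->nr_free_highatomic -= 1UL << order;
}

/* Replay the quoted scenario inside one pageblock that was just reserved. */
static struct toy_zone toy_replay(void)
{
	struct toy_zone z = { { 0 }, 0 };

	/* An order-0 buddy is already free on the highatomic list. */
	toy_add_to_free_list(&z, TOY_HIGHATOMIC, 0);

	/*
	 * The pcplist flush frees a page that still remembers MOVABLE.
	 * It merges with the buddy, so the buddy leaves the highatomic list...
	 */
	toy_del_from_free_list(&z, TOY_HIGHATOMIC, 0);

	/* ...and the merged order-1 page is placed on the movable list. */
	toy_add_to_free_list(&z, TOY_MOVABLE, 1);

	return z;
}
```

In this model the counter and free_list[MIGRATE_HIGHATOMIC] agree (both are zero) after the replay, while two pages that physically sit in a highatomic pageblock are on the movable list. That is what I mean by "under-reserved" rather than "miscounted".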
> The series has addressed various scenarios like that, where page can end up
> on the wrong freelist.
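
For reference, the arithmetic from the OOM report quoted at the top of this thread can be checked with the sketch below. It is deliberately simplified: toy_usable_free_kb() and toy_below_min() are hypothetical helpers, and the real watermark check also accounts for things this ignores (lowmem reserves, allocation order, and so on). The "0 kB of free highatomic" input is taken from the report showing no 'H' entries on the free lists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: free memory usable by a caller without reserve access. */
static long toy_usable_free_kb(long free_kb, long unusable_kb)
{
	return free_kb - unusable_kb;
}

/* Simplified min-watermark test; ignores lowmem reserves and order. */
static bool toy_below_min(long free_kb, long unusable_kb, long min_kb)
{
	return toy_usable_free_kb(free_kb, unusable_kb) < min_kb;
}

/* Values copied from the "Node 0 Normal" line of the OOM report. */
static const long toy_free_kb = 1482936;
static const long toy_min_kb = 410416;
static const long toy_reserved_highatomic_kb = 1073152;
static const long toy_free_highatomic_kb = 0;	/* no 'H' pages were listed */
```

Subtracting reserved_highatomic leaves 409784 kB, which is below min, so toy_below_min(toy_free_kb, toy_reserved_highatomic_kb, toy_min_kb) is true. Subtracting only the pages that are actually free in highatomic pageblocks leaves the full 1482936 kB, and the same check passes.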