From: Yu Zhao <yuzhao@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Mel Gorman <mgorman@techsingularity.net>,
Matt Fleming <mfleming@cloudflare.com>,
David Rientjes <rientjes@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Link Lin <linkl@google.com>
Subject: Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic
Date: Mon, 28 Oct 2024 11:54:27 -0600 [thread overview]
Message-ID: <CAOUHufZB8zCwO4nT3aeZWxJO99SD9vUJxhkGedWrmmz3J-Sczw@mail.gmail.com> (raw)
In-Reply-To: <fb1db044-e5da-4a77-b0ba-9a059a5f5ad9@suse.cz>
On Mon, Oct 28, 2024 at 5:01 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 10/28/24 1:24 AM, Yu Zhao wrote:
> > On Sun, Oct 27, 2024 at 3:05 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>
> >> On 10/27/24 21:51, Yu Zhao wrote:
> >>> On Sun, Oct 27, 2024 at 2:36 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>
> >>>> On 10/27/24 21:17, Yu Zhao wrote:
> >>>>> On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> >>>>>>
> >>>>
> >>>> For example:
> >>>>
> >>>> - a page is on pcplist in MIGRATE_MOVABLE list
> >>>> - we reserve its pageblock as highatomic, which does nothing to the page on
> >>>> the pcplist
> >>>> - page above is flushed from pcplist to zone freelist, but it remembers it
> >>>> was MIGRATE_MOVABLE, merges with another buddy/buddies from the
> >>>> now-highatomic list, the resulting order-X page ends up on the movable
> >>>> freelist despite being in highatomic pageblock. The counter of free
> >>>> highatomic is now wrong wrt the freelist reality
> >>>
> >>> This is the part I don't follow: how is it wrong w.r.t. the freelist
> >>> reality? The new nr_free_highatomic should reflect how many pages are
> >>> exactly on free_list[MIGRATE_HIGHATOMIC], because it's updated
> >>> accordingly.
> >>
> >> You'd have to try implementing your change in the kernel without that
> >> migratetype hygiene series, and see how it would either not work, or you'd
> >> end up implementing the series as part of that.
> >
> > A proper backport would need to track the MT of the free_list a page
> > is deleted from, and I see no reason why in such a proper backport
> > "the counter could drift easily" or "the counter of free highatomic is
> > now wrong wrt the freelist reality". So I assume you actually mean
> > this patch can't be backported cleanly? (Which I do agree.)
>
> Yes you're right. But since we don't plan to backport it beyond 6.12,
> sorry for sidetracking the discussion unnecessarily. More importantly,
> is it possible to change the implementation as I suggested?
The only reason I didn't fold account_highatomic_freepages() into
account_freepages() is because the former must be called under the
zone lock, which is also how the latter is called but not as a
requirement.
I understand where you come from when suggesting a new per-cpu counter
for free highatomic. I have to disagree with that because 1) free
highatomic is relatively small and drifting might defeat its purpose;
2) per-cpu memory is among the top kernel memory overhead in our fleet
-- it really adds up. So I prefer not to use per-cpu counters unless
necessary.
So if it's ok with you, I'll just fold account_highatomic_freepages()
into account_freepages(), but keep the counter as per zone, not per
cpu.
> [1] Hooking
> to __del_page_from_free_list() and __add_to_free_list() means extra work
> in every loop iteration in expand() and __free_one_page(). The
> migratetype hygiene should ensure it's not necessary to intercept every
> freelist add/move and hooking to account_freepages() should be
> sufficient and in line with the intended design.
Agreed.
next prev parent reply other threads:[~2024-10-28 17:55 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-26 3:36 Yu Zhao
2024-10-26 4:24 ` Andrew Morton
2024-10-26 4:40 ` Yu Zhao
2024-10-27 19:40 ` Vlastimil Babka
2024-10-27 20:03 ` Yu Zhao
2024-10-26 5:35 ` David Rientjes
2024-10-27 19:53 ` Vlastimil Babka
2024-10-27 20:17 ` Yu Zhao
2024-10-27 20:36 ` Vlastimil Babka
2024-10-27 20:51 ` Yu Zhao
2024-10-27 21:05 ` Vlastimil Babka
2024-10-28 0:24 ` Yu Zhao
2024-10-28 11:04 ` Vlastimil Babka
2024-10-28 17:54 ` Yu Zhao [this message]
2024-10-28 18:29 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOUHufZB8zCwO4nT3aeZWxJO99SD9vUJxhkGedWrmmz3J-Sczw@mail.gmail.com \
--to=yuzhao@google.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linkl@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mfleming@cloudflare.com \
--cc=mgorman@techsingularity.net \
--cc=rientjes@google.com \
--cc=vbabka@suse.cz \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox