From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>
Cc: Chris Mason <clm@fb.com>, Kiryl Shutsemau <kirill@shutemov.name>,
Brendan Jackman <jackmanb@google.com>,
Michal Hocko <mhocko@suse.com>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-team@meta.com
Subject: [PATCH v2 4/4] mm/page_alloc: Batch page freeing in free_frozen_page_commit
Date: Wed, 24 Sep 2025 13:44:08 -0700
Message-ID: <20250924204409.1706524-5-joshua.hahnjy@gmail.com>
In-Reply-To: <20250924204409.1706524-1-joshua.hahnjy@gmail.com>
Before returning, free_frozen_page_commit calls free_pcppages_bulk using
nr_pcp_free to determine how many pages can appropriately be freed,
based on the tunable parameters stored in pcp. While this number is an
accurate representation of how many pages should be freed in total, it
is not an appropriate number of pages to free in a single
free_pcppages_bulk call: we have seen the value consistently go above
2000 in the Meta fleet on larger machines.

As such, perform batched page freeing in free_frozen_page_commit by
calling free_pcppages_bulk in chunks of at most pcp->batch pages. To
ensure that other processes are not starved of the pcp (and zone) lock,
release the pcp lock between batches.

Note that because free_frozen_page_commit now re-takes the pcp spinlock
inside the function (and the trylock can fail), it may return with the
pcp lock released. To handle this, return true if the pcp is locked on
exit and false otherwise.

In addition, since free_frozen_page_commit must now know which UP flags
were saved when the pcp lock was taken, and must be able to report new
UP flags back to its callers, add a new unsigned long *UP_flags
parameter to keep track of this.
Suggested-by: Chris Mason <clm@fb.com>
Co-developed-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
mm/page_alloc.c | 47 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 37 insertions(+), 10 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 467e524a99df..50589ff0a9c6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2821,11 +2821,19 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
return high;
}
-static void free_frozen_page_commit(struct zone *zone,
+/*
+ * Tune pcp alloc factor and adjust count & free_count. Free pages to bring the
+ * pcp's watermarks below high.
+ *
+ * May return a freed pcp, if during page freeing the pcp spinlock cannot be
+ * reacquired. Return true if pcp is locked, false otherwise.
+ */
+static bool free_frozen_page_commit(struct zone *zone,
struct per_cpu_pages *pcp, struct page *page, int migratetype,
- unsigned int order, fpi_t fpi_flags)
+ unsigned int order, fpi_t fpi_flags, unsigned long *UP_flags)
{
int high, batch;
+ int to_free, to_free_batched;
int pindex;
bool free_high = false;
@@ -2864,17 +2872,34 @@ static void free_frozen_page_commit(struct zone *zone,
* Do not attempt to take a zone lock. Let pcp->count get
* over high mark temporarily.
*/
- return;
+ return true;
}
high = nr_pcp_high(pcp, zone, batch, free_high);
- if (pcp->count >= high) {
- free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
- pcp, pindex);
+ to_free = nr_pcp_free(pcp, batch, high, free_high);
+ while (to_free > 0 && pcp->count >= high) {
+ to_free_batched = min(to_free, batch);
+ free_pcppages_bulk(zone, to_free_batched, pcp, pindex);
if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
zone_watermark_ok(zone, 0, high_wmark_pages(zone),
ZONE_MOVABLE, 0))
clear_bit(ZONE_BELOW_HIGH, &zone->flags);
+
+ high = nr_pcp_high(pcp, zone, batch, free_high);
+ to_free -= to_free_batched;
+ if (pcp->count >= high) {
+ pcp_spin_unlock(pcp);
+ pcp_trylock_finish(*UP_flags);
+
+ pcp_trylock_prepare(*UP_flags);
+ pcp = pcp_spin_trylock(zone->per_cpu_pageset);
+ if (!pcp) {
+ pcp_trylock_finish(*UP_flags);
+ return false;
+ }
+ }
}
+
+ return true;
}
/*
@@ -2922,8 +2947,9 @@ static void __free_frozen_pages(struct page *page, unsigned int order,
pcp_trylock_prepare(UP_flags);
pcp = pcp_spin_trylock(zone->per_cpu_pageset);
if (pcp) {
- free_frozen_page_commit(zone, pcp, page, migratetype, order, fpi_flags);
- pcp_spin_unlock(pcp);
+ if (free_frozen_page_commit(zone, pcp, page, migratetype, order,
+ fpi_flags, &UP_flags))
+ pcp_spin_unlock(pcp);
} else {
free_one_page(zone, page, pfn, order, fpi_flags);
}
@@ -3022,8 +3048,9 @@ void free_unref_folios(struct folio_batch *folios)
migratetype = MIGRATE_MOVABLE;
trace_mm_page_free_batched(&folio->page);
- free_frozen_page_commit(zone, pcp, &folio->page, migratetype,
- order, FPI_NONE);
+ if (!free_frozen_page_commit(zone, pcp, &folio->page,
+ migratetype, order, FPI_NONE, &UP_flags))
+ pcp = NULL;
}
if (pcp) {
--
2.47.3