* [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene
@ 2023-08-21 18:33 Johannes Weiner
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
` (7 more replies)
0 siblings, 8 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
This is a breakout series from the huge page allocator patches[1].
While testing and benchmarking the series incrementally, as per
reviewer request, it became apparent that there are several sources of
freelist migratetype violations that later patches in the series hid.
Those violations occur when pages of one migratetype end up on the
freelists of another type. This encourages incompatible page mixing
down the line, where allocation requests ask for one migratetype but
receive pages of another. This defeats mobility grouping.
The series addresses those causes. The last patch adds type checks on
all freelist movements to rule out any violations. I used these checks
to identify the violations fixed up in the preceding patches.
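For reference, the checks are of roughly the following shape - a
simplified version of what the last patch wires into the freelist
helpers (the accounting consolidation that patch also does is omitted
here):

    static inline void add_to_free_list(struct page *page, struct zone *zone,
                                        unsigned int order, int migratetype,
                                        bool tail)
    {
            struct free_area *area = &zone->free_area[order];

            /* The block's type must match the freelist it is placed on */
            VM_WARN_ONCE(get_pageblock_migratetype(page) != migratetype,
                         "page type is %lu, passed migratetype is %d (nr=%d)\n",
                         get_pageblock_migratetype(page), migratetype, 1 << order);

            if (tail)
                    list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
            else
                    list_add(&page->buddy_list, &area->free_list[migratetype]);
            area->nr_free++;
    }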
The series is a breakout, but has merit on its own: less type mixing
means better grouping, which means less work for compaction, which in
turn means a higher THP success rate and lower allocation latencies.
The results can be
seen in a mixed workload that stresses the machine with a kernel build
job while periodically attempting to allocate batches of THP. The data
is averaged over 50 consecutive defconfig builds:
VANILLA PATCHED-CLEANLISTS
Hugealloc Time median 14642.00 ( +0.00%) 10506.00 ( -28.25%)
Hugealloc Time min 4820.00 ( +0.00%) 4783.00 ( -0.77%)
Hugealloc Time max 6786868.00 ( +0.00%) 6556624.00 ( -3.39%)
Kbuild Real time 240.03 ( +0.00%) 241.45 ( +0.59%)
Kbuild User time 1195.49 ( +0.00%) 1195.69 ( +0.02%)
Kbuild System time 96.44 ( +0.00%) 97.03 ( +0.61%)
THP fault alloc 11490.00 ( +0.00%) 11802.30 ( +2.72%)
THP fault fallback 782.62 ( +0.00%) 478.88 ( -38.76%)
THP fault fail rate % 6.38 ( +0.00%) 3.90 ( -33.52%)
Direct compact stall 297.70 ( +0.00%) 224.56 ( -24.49%)
Direct compact fail 265.98 ( +0.00%) 191.56 ( -27.87%)
Direct compact success 31.72 ( +0.00%) 33.00 ( +3.91%)
Direct compact success rate % 13.11 ( +0.00%) 17.26 ( +29.43%)
Compact daemon scanned migrate 1673661.58 ( +0.00%) 1591682.18 ( -4.90%)
Compact daemon scanned free 2711252.80 ( +0.00%) 2615217.78 ( -3.54%)
Compact direct scanned migrate 384998.62 ( +0.00%) 261689.42 ( -32.03%)
Compact direct scanned free 966308.94 ( +0.00%) 667459.76 ( -30.93%)
Compact migrate scanned daemon % 80.86 ( +0.00%) 83.34 ( +3.02%)
Compact free scanned daemon % 74.41 ( +0.00%) 78.26 ( +5.10%)
Alloc stall 338.06 ( +0.00%) 440.72 ( +30.28%)
Pages kswapd scanned 1356339.42 ( +0.00%) 1402313.42 ( +3.39%)
Pages kswapd reclaimed 581309.08 ( +0.00%) 587956.82 ( +1.14%)
Pages direct scanned 56384.18 ( +0.00%) 141095.04 ( +150.24%)
Pages direct reclaimed 17055.54 ( +0.00%) 22427.96 ( +31.50%)
Pages scanned kswapd % 96.38 ( +0.00%) 93.60 ( -2.86%)
Swap out 41528.00 ( +0.00%) 47969.92 ( +15.51%)
Swap in 6541.42 ( +0.00%) 9093.30 ( +39.01%)
File refaults 127666.50 ( +0.00%) 135766.84 ( +6.34%)
The series is based on v6.5-rc7.
include/linux/mm.h | 18 +-
include/linux/page-isolation.h | 2 +-
include/linux/vmstat.h | 8 -
mm/debug_page_alloc.c | 12 +-
mm/internal.h | 5 -
mm/page_alloc.c | 382 +++++++++++++++++++++------------------
mm/page_isolation.c | 23 ++-
7 files changed, 230 insertions(+), 220 deletions(-)
[1] https://lore.kernel.org/lkml/20230418191313.268131-1-hannes@cmpxchg.org/
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 20:14 ` Zi Yan
2023-08-21 20:29 ` Zi Yan
2023-08-21 18:33 ` [PATCH 2/8] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
` (6 subsequent siblings)
7 siblings, 2 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
Save a pfn_to_page() lookup when the pfn is right there already.
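For context, the two helpers differ only in where the pfn comes from -
roughly the following, as a sketch rather than the verbatim
definitions:

    /* generic helper: derives the pfn from the page itself */
    mt = get_pageblock_migratetype(page);

    /* pfn-based helper: caller supplies the pfn it already has */
    mt = get_pfnblock_migratetype(page, pfn);

Passing the known pfn skips the page_to_pfn() inside the generic
helper, which is what this patch does in __free_one_page() and
split_free_page().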
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 977bb4d5e8e1..e430ac45df7c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -824,7 +824,7 @@ static inline void __free_one_page(struct page *page,
* pageblock isolation could cause incorrect freepage or CMA
* accounting or HIGHATOMIC accounting.
*/
- int buddy_mt = get_pageblock_migratetype(buddy);
+ int buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);
if (migratetype != buddy_mt
&& (!migratetype_is_mergeable(migratetype) ||
@@ -900,7 +900,7 @@ int split_free_page(struct page *free_page,
goto out;
}
- mt = get_pageblock_migratetype(free_page);
+ mt = get_pfnblock_migratetype(free_page, free_page_pfn);
if (likely(!is_migrate_isolate(mt)))
__mod_zone_freepage_state(zone, -(1UL << order), mt);
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/8] mm: page_alloc: remove pcppage migratetype caching
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 3/8] mm: page_alloc: fix highatomic landing on the wrong buddy list Johannes Weiner
` (5 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
The idea behind the cache is to save get_pageblock_migratetype()
lookups during bulk freeing. A microbenchmark suggests this isn't
helping, though. The pcp migratetype can get stale, which means that
bulk freeing has an extra branch to check if the pageblock was
isolated while on the pcp.
While the variance overlaps, the cache write and the extra branch
appear to make this a net negative. The following test allocates and
frees batches of 10,000 pages (~3x the pcp high marks, to trigger
flushing):
Before:
8,668.48 msec task-clock # 99.735 CPUs utilized ( +- 2.90% )
19 context-switches # 4.341 /sec ( +- 3.24% )
0 cpu-migrations # 0.000 /sec
17,440 page-faults # 3.984 K/sec ( +- 2.90% )
41,758,692,473 cycles # 9.541 GHz ( +- 2.90% )
126,201,294,231 instructions # 5.98 insn per cycle ( +- 2.90% )
25,348,098,335 branches # 5.791 G/sec ( +- 2.90% )
33,436,921 branch-misses # 0.26% of all branches ( +- 2.90% )
0.0869148 +- 0.0000302 seconds time elapsed ( +- 0.03% )
After:
8,444.81 msec task-clock # 99.726 CPUs utilized ( +- 2.90% )
22 context-switches # 5.160 /sec ( +- 3.23% )
0 cpu-migrations # 0.000 /sec
17,443 page-faults # 4.091 K/sec ( +- 2.90% )
40,616,738,355 cycles # 9.527 GHz ( +- 2.90% )
126,383,351,792 instructions # 6.16 insn per cycle ( +- 2.90% )
25,224,985,153 branches # 5.917 G/sec ( +- 2.90% )
32,236,793 branch-misses # 0.25% of all branches ( +- 2.90% )
0.0846799 +- 0.0000412 seconds time elapsed ( +- 0.05% )
A side effect is that this also ensures that pages whose pageblock
gets stolen while on the pcplist end up on the right freelist and we
don't perform potentially type-incompatible buddy merges (or skip
merges when we shouldn't), which is likely beneficial to long-term
fragmentation management, although the effects would be harder to
measure. Settle for simpler and faster code as justification here.
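With the cache gone, the bulk free loop simply looks the type up from
the pageblock at free time. Assembled from the hunks below (the
count/pcp bookkeeping between the list_del and the free is elided),
the loop body becomes roughly:

    do {
            unsigned long pfn;
            int mt;

            page = list_last_entry(list, struct page, pcp_list);
            pfn = page_to_pfn(page);
            mt = get_pfnblock_migratetype(page, pfn);

            /* must delete to avoid corrupting pcp list */
            list_del(&page->pcp_list);

            /* MIGRATE_ISOLATE page should not go to pcplists */
            VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);

            __free_one_page(page, pfn, zone, order, mt, FPI_NONE);
            trace_mm_page_pcpu_drain(page, order, mt);
    } while (count > 0 && !list_empty(list));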
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 61 ++++++++++++-------------------------------------
1 file changed, 14 insertions(+), 47 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e430ac45df7c..20973887999b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,24 +204,6 @@ EXPORT_SYMBOL(node_states);
gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
-/*
- * A cached value of the page's pageblock's migratetype, used when the page is
- * put on a pcplist. Used to avoid the pageblock migratetype lookup when
- * freeing from pcplists in most cases, at the cost of possibly becoming stale.
- * Also the migratetype set in the page does not necessarily match the pcplist
- * index, e.g. page might have MIGRATE_CMA set but be on a pcplist with any
- * other index - this ensures that it will be put on the correct CMA freelist.
- */
-static inline int get_pcppage_migratetype(struct page *page)
-{
- return page->index;
-}
-
-static inline void set_pcppage_migratetype(struct page *page, int migratetype)
-{
- page->index = migratetype;
-}
-
#ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
unsigned int pageblock_order __read_mostly;
#endif
@@ -1213,7 +1195,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
int min_pindex = 0;
int max_pindex = NR_PCP_LISTS - 1;
unsigned int order;
- bool isolated_pageblocks;
struct page *page;
/*
@@ -1226,7 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
pindex = pindex - 1;
spin_lock_irqsave(&zone->lock, flags);
- isolated_pageblocks = has_isolate_pageblock(zone);
while (count > 0) {
struct list_head *list;
@@ -1249,10 +1229,12 @@ static void free_pcppages_bulk(struct zone *zone, int count,
order = pindex_to_order(pindex);
nr_pages = 1 << order;
do {
+ unsigned long pfn;
int mt;
page = list_last_entry(list, struct page, pcp_list);
- mt = get_pcppage_migratetype(page);
+ pfn = page_to_pfn(page);
+ mt = get_pfnblock_migratetype(page, pfn);
/* must delete to avoid corrupting pcp list */
list_del(&page->pcp_list);
@@ -1261,11 +1243,8 @@ static void free_pcppages_bulk(struct zone *zone, int count,
/* MIGRATE_ISOLATE page should not go to pcplists */
VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
- /* Pageblock could have been isolated meanwhile */
- if (unlikely(isolated_pageblocks))
- mt = get_pageblock_migratetype(page);
- __free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
+ __free_one_page(page, pfn, zone, order, mt, FPI_NONE);
trace_mm_page_pcpu_drain(page, order, mt);
} while (count > 0 && !list_empty(list));
}
@@ -1611,7 +1590,6 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
continue;
del_page_from_free_list(page, zone, current_order);
expand(zone, page, order, current_order, migratetype);
- set_pcppage_migratetype(page, migratetype);
trace_mm_page_alloc_zone_locked(page, order, migratetype,
pcp_allowed_order(order) &&
migratetype < MIGRATE_PCPTYPES);
@@ -2181,7 +2159,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
* pages are ordered properly.
*/
list_add_tail(&page->pcp_list, list);
- if (is_migrate_cma(get_pcppage_migratetype(page)))
+ if (is_migrate_cma(get_pageblock_migratetype(page)))
__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
-(1 << order));
}
@@ -2340,19 +2318,6 @@ void drain_all_pages(struct zone *zone)
__drain_all_pages(zone, false);
}
-static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
- unsigned int order)
-{
- int migratetype;
-
- if (!free_pages_prepare(page, order, FPI_NONE))
- return false;
-
- migratetype = get_pfnblock_migratetype(page, pfn);
- set_pcppage_migratetype(page, migratetype);
- return true;
-}
-
static int nr_pcp_free(struct per_cpu_pages *pcp, int high, int batch,
bool free_high)
{
@@ -2440,7 +2405,7 @@ void free_unref_page(struct page *page, unsigned int order)
unsigned long pfn = page_to_pfn(page);
int migratetype;
- if (!free_unref_page_prepare(page, pfn, order))
+ if (!free_pages_prepare(page, order, FPI_NONE))
return;
/*
@@ -2450,7 +2415,7 @@ void free_unref_page(struct page *page, unsigned int order)
* areas back if necessary. Otherwise, we may have to free
* excessively into the page allocator
*/
- migratetype = get_pcppage_migratetype(page);
+ migratetype = get_pfnblock_migratetype(page, pfn);
if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
if (unlikely(is_migrate_isolate(migratetype))) {
free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
@@ -2486,7 +2451,8 @@ void free_unref_page_list(struct list_head *list)
/* Prepare pages for freeing */
list_for_each_entry_safe(page, next, list, lru) {
unsigned long pfn = page_to_pfn(page);
- if (!free_unref_page_prepare(page, pfn, 0)) {
+
+ if (!free_pages_prepare(page, 0, FPI_NONE)) {
list_del(&page->lru);
continue;
}
@@ -2495,7 +2461,7 @@ void free_unref_page_list(struct list_head *list)
* Free isolated pages directly to the allocator, see
* comment in free_unref_page.
*/
- migratetype = get_pcppage_migratetype(page);
+ migratetype = get_pfnblock_migratetype(page, pfn);
if (unlikely(is_migrate_isolate(migratetype))) {
list_del(&page->lru);
free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
@@ -2504,10 +2470,11 @@ void free_unref_page_list(struct list_head *list)
}
list_for_each_entry_safe(page, next, list, lru) {
+ unsigned long pfn = page_to_pfn(page);
struct zone *zone = page_zone(page);
list_del(&page->lru);
- migratetype = get_pcppage_migratetype(page);
+ migratetype = get_pfnblock_migratetype(page, pfn);
/*
* Either different zone requiring a different pcp lock or
@@ -2530,7 +2497,7 @@ void free_unref_page_list(struct list_head *list)
pcp = pcp_spin_trylock(zone->per_cpu_pageset);
if (unlikely(!pcp)) {
pcp_trylock_finish(UP_flags);
- free_one_page(zone, page, page_to_pfn(page),
+ free_one_page(zone, page, pfn,
0, migratetype, FPI_NONE);
locked_zone = NULL;
continue;
@@ -2705,7 +2672,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
}
}
__mod_zone_freepage_state(zone, -(1 << order),
- get_pcppage_migratetype(page));
+ get_pageblock_migratetype(page));
spin_unlock_irqrestore(&zone->lock, flags);
} while (check_new_pages(page, order));
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 3/8] mm: page_alloc: fix highatomic landing on the wrong buddy list
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
2023-08-21 18:33 ` [PATCH 2/8] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
` (4 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
The following triggers from a custom debug check:
[ 89.401754] page type is 3, passed migratetype is 1 (nr=8)
[ 89.407930] WARNING: CPU: 2 PID: 75 at mm/page_alloc.c:706 __free_one_page+0x5ea/0x6b0
[ 89.415847] Modules linked in:
[ 89.418902] CPU: 2 PID: 75 Comm: kswapd0 Not tainted 6.5.0-rc1-00013-g42be896e9f77-dirty #233
[ 89.427415] Hardware name: Micro-Star International Co., Ltd. MS-7B98/Z390-A PRO (MS-7B98), BIOS 1.80 12/25/2019
[ 89.437572] RIP: 0010:__free_one_page+0x5ea/0x6b0
[ 89.442271] Code: <snip>
[ 89.461003] RSP: 0000:ffffc900001acea8 EFLAGS: 00010092
[ 89.466221] RAX: 0000000000000036 RBX: 0000000000000003 RCX: 0000000000000000
[ 89.473349] RDX: 0000000000000106 RSI: 0000000000000027 RDI: 00000000ffffffff
[ 89.480478] RBP: ffffffff82ca4780 R08: 0000000000000001 R09: 0000000000000000
[ 89.487601] R10: ffffffff8285d1e0 R11: ffffffff8285d1e0 R12: 0000000000000000
[ 89.494725] R13: 0000000000063448 R14: ffffea00018d1200 R15: 0000000000063401
[ 89.501853] FS: 0000000000000000(0000) GS:ffff88806e680000(0000) knlGS:0000000000000000
[ 89.509930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.515671] CR2: 00007fc66488b006 CR3: 00000000190b5001 CR4: 00000000003706e0
[ 89.522798] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 89.529924] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 89.537048] Call Trace:
[ 89.539498] <IRQ>
[ 89.541517] ? __free_one_page+0x5ea/0x6b0
[ 89.545619] ? __warn+0x7d/0x130
[ 89.548852] ? __free_one_page+0x5ea/0x6b0
[ 89.552946] ? report_bug+0x18d/0x1c0
[ 89.556607] ? handle_bug+0x3a/0x70
[ 89.560097] ? exc_invalid_op+0x13/0x60
[ 89.563933] ? asm_exc_invalid_op+0x16/0x20
[ 89.568113] ? __free_one_page+0x5ea/0x6b0
[ 89.572210] ? __free_one_page+0x5ea/0x6b0
[ 89.576306] ? refill_obj_stock+0xf5/0x1c0
[ 89.580399] free_one_page.constprop.0+0x5c/0xe0
This is a HIGHATOMIC page being freed to the MOVABLE buddy list.
Highatomic pages have their own buddy freelists, but not their own
pcplists. free_unref_page() adjusts the migratetype so they can
hitchhike on the MOVABLE pcplist. However, when the pcp trylock then
fails, they're fed directly to the buddy list - with the incorrect
type.
Use MIGRATE_MOVABLE only for the pcp, not for the buddy bypass.
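Assembled from the hunks below (locals and the trylock teardown
elided), free_unref_page() then keeps the block's real type for the
buddy path and only overrides the pcp index:

    migratetype = pcpmigratetype = get_pfnblock_migratetype(page, pfn);
    if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
            if (unlikely(is_migrate_isolate(migratetype))) {
                    free_one_page(page_zone(page), page, pfn, order,
                                  migratetype, FPI_NONE);
                    return;
            }
            /* hitchhike on the MOVABLE pcplist, remember the real type */
            pcpmigratetype = MIGRATE_MOVABLE;
    }

    zone = page_zone(page);
    pcp_trylock_prepare(UP_flags);
    pcp = pcp_spin_trylock(zone->per_cpu_pageset);
    if (pcp) {
            free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
            pcp_spin_unlock(pcp);
    } else {
            /* buddy bypass: must use the block's actual type */
            free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
    }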
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 20973887999b..a5e36d186893 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2403,7 +2403,7 @@ void free_unref_page(struct page *page, unsigned int order)
struct per_cpu_pages *pcp;
struct zone *zone;
unsigned long pfn = page_to_pfn(page);
- int migratetype;
+ int migratetype, pcpmigratetype;
if (!free_pages_prepare(page, order, FPI_NONE))
return;
@@ -2415,20 +2415,20 @@ void free_unref_page(struct page *page, unsigned int order)
* areas back if necessary. Otherwise, we may have to free
* excessively into the page allocator
*/
- migratetype = get_pfnblock_migratetype(page, pfn);
+ migratetype = pcpmigratetype = get_pfnblock_migratetype(page, pfn);
if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
if (unlikely(is_migrate_isolate(migratetype))) {
free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
return;
}
- migratetype = MIGRATE_MOVABLE;
+ pcpmigratetype = MIGRATE_MOVABLE;
}
zone = page_zone(page);
pcp_trylock_prepare(UP_flags);
pcp = pcp_spin_trylock(zone->per_cpu_pageset);
if (pcp) {
- free_unref_page_commit(zone, pcp, page, migratetype, order);
+ free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
pcp_spin_unlock(pcp);
} else {
free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
` (2 preceding siblings ...)
2023-08-21 18:33 ` [PATCH 3/8] mm: page_alloc: fix highatomic landing on the wrong buddy list Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 20:41 ` Zi Yan
2023-08-21 18:33 ` [PATCH 5/8] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
` (3 subsequent siblings)
7 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
The buddy allocator coalesces compatible blocks during freeing, but it
doesn't update the types of the subblocks to match. When an allocation
later breaks the chunk down again, its pieces will be put on freelists
of the wrong type. This encourages incompatible page mixing (ask for
one type, get another), and thus long-term fragmentation.
Update the subblocks when merging a larger chunk, such that a later
expand() will maintain freelist type hygiene.
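Concretely, in __free_one_page(), when a merge of pageblock_order or
larger crosses blocks of different but mergeable types, the buddy's
pageblock(s) are relabeled before proceeding (the core of the hunk
below):

    if (migratetype != buddy_mt) {
            if (!migratetype_is_mergeable(migratetype) ||
                !migratetype_is_mergeable(buddy_mt))
                    goto done_merging;
            /*
             * Match buddy type. This ensures that an expand()
             * down the line puts the sub-blocks on the right
             * freelists.
             */
            set_pageblock_migratetype(buddy, migratetype);
    }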
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a5e36d186893..6c9f565b2613 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -438,6 +438,17 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
page_to_pfn(page), MIGRATETYPE_MASK);
}
+static void change_pageblock_range(struct page *pageblock_page,
+ int start_order, int migratetype)
+{
+ int nr_pageblocks = 1 << (start_order - pageblock_order);
+
+ while (nr_pageblocks--) {
+ set_pageblock_migratetype(pageblock_page, migratetype);
+ pageblock_page += pageblock_nr_pages;
+ }
+}
+
#ifdef CONFIG_DEBUG_VM
static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
{
@@ -808,10 +819,17 @@ static inline void __free_one_page(struct page *page,
*/
int buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);
- if (migratetype != buddy_mt
- && (!migratetype_is_mergeable(migratetype) ||
- !migratetype_is_mergeable(buddy_mt)))
- goto done_merging;
+ if (migratetype != buddy_mt) {
+ if (!migratetype_is_mergeable(migratetype) ||
+ !migratetype_is_mergeable(buddy_mt))
+ goto done_merging;
+ /*
+ * Match buddy type. This ensures that
+ * an expand() down the line puts the
+ * sub-blocks on the right freelists.
+ */
+ set_pageblock_migratetype(buddy, migratetype);
+ }
}
/*
@@ -1687,17 +1705,6 @@ int move_freepages_block(struct zone *zone, struct page *page,
num_movable);
}
-static void change_pageblock_range(struct page *pageblock_page,
- int start_order, int migratetype)
-{
- int nr_pageblocks = 1 << (start_order - pageblock_order);
-
- while (nr_pageblocks--) {
- set_pageblock_migratetype(pageblock_page, migratetype);
- pageblock_page += pageblock_nr_pages;
- }
-}
-
/*
* When we are falling back to another migratetype during allocation, try to
* steal extra free pages from the same pageblocks to satisfy further
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 5/8] mm: page_alloc: move free pages when converting block during isolation
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
` (3 preceding siblings ...)
2023-08-21 18:33 ` [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 6/8] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
` (2 subsequent siblings)
7 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
When claiming a block during compaction isolation, move any remaining
free pages to the correct freelists as well, instead of stranding them
on the wrong list. Otherwise, this encourages incompatible page mixing
down the line, and thus long-term fragmentation.
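In __isolate_free_page(), the type change and the freelist move now go
together (from the hunk below; move_freepages_block() still takes the
optional num_movable pointer at this point in the series):

    if (migratetype_is_mergeable(mt)) {
            set_pageblock_migratetype(page, MIGRATE_MOVABLE);
            move_freepages_block(zone, page, MIGRATE_MOVABLE, NULL);
    }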
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6c9f565b2613..6a4004f07123 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2586,9 +2586,12 @@ int __isolate_free_page(struct page *page, unsigned int order)
* Only change normal pageblocks (i.e., they can merge
* with others)
*/
- if (migratetype_is_mergeable(mt))
+ if (migratetype_is_mergeable(mt)) {
set_pageblock_migratetype(page,
MIGRATE_MOVABLE);
+ move_freepages_block(zone, page,
+ MIGRATE_MOVABLE, NULL);
+ }
}
}
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 6/8] mm: page_alloc: fix move_freepages_block() range error
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
` (4 preceding siblings ...)
2023-08-21 18:33 ` [PATCH 5/8] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 7/8] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
2023-08-21 18:33 ` [PATCH 8/8] mm: page_alloc: consolidate free page accounting Johannes Weiner
7 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
When a block is partially outside the zone of the cursor page,
move_freepages_block() cuts the range at the passed-in page instead of
at the zone start. This can leave large parts of the block behind, which
encourages incompatible page mixing down the line (ask for one type,
get another), and thus long-term fragmentation.
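With the fix, a block that extends below the zone start is clamped to
the zone boundary instead of to whichever page happened to be passed
in:

    start_pfn = pageblock_start_pfn(pfn);
    end_pfn = pageblock_end_pfn(pfn) - 1;

    /* Do not cross zone boundaries */
    if (!zone_spans_pfn(zone, start_pfn))
            start_pfn = zone->zone_start_pfn;
    if (!zone_spans_pfn(zone, end_pfn))
            return 0;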
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6a4004f07123..6fcda8e96f16 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1697,7 +1697,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
/* Do not cross zone boundaries */
if (!zone_spans_pfn(zone, start_pfn))
- start_pfn = pfn;
+ start_pfn = zone->zone_start_pfn;
if (!zone_spans_pfn(zone, end_pfn))
return 0;
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 7/8] mm: page_alloc: fix freelist movement during block conversion
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
` (5 preceding siblings ...)
2023-08-21 18:33 ` [PATCH 6/8] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 8/8] mm: page_alloc: consolidate free page accounting Johannes Weiner
7 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
Currently, page block type conversion during fallbacks, atomic
reservations and isolation can strand various amounts of free pages on
incorrect freelists.
For example, fallback stealing moves free pages in the block to the
new type's freelists, but then may not actually claim the block for
that type if there aren't enough compatible pages already allocated.
In all cases, free page moving might fail if the block straddles more
than one zone, in which case no free pages are moved at all, but the
block type is changed anyway.
This is detrimental to type hygiene on the freelists. It encourages
incompatible page mixing down the line (ask for one type, get another)
and thus contributes to long-term fragmentation.
Split the process into a proper transaction: check first if conversion
will happen, then try to move the free pages, and only if that was
successful convert the block to the new type.
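Callers then follow the same pattern: attempt the freelist move first,
and only commit the block type change if it succeeded. For example,
the highatomic reservation becomes (from the hunk below):

    /* Only reserve normal pageblocks (i.e., they can merge with others) */
    if (migratetype_is_mergeable(mt)) {
            if (move_freepages_block(zone, page, MIGRATE_HIGHATOMIC) != -1) {
                    set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
                    zone->nr_reserved_highatomic += pageblock_nr_pages;
            }
    }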
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
include/linux/page-isolation.h | 3 +-
mm/page_alloc.c | 176 ++++++++++++++++++++-------------
mm/page_isolation.c | 22 +++--
3 files changed, 121 insertions(+), 80 deletions(-)
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ac34392823a..8550b3c91480 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -34,8 +34,7 @@ static inline bool is_migrate_isolate(int migratetype)
#define REPORT_FAILURE 0x2
void set_pageblock_migratetype(struct page *page, int migratetype);
-int move_freepages_block(struct zone *zone, struct page *page,
- int migratetype, int *num_movable);
+int move_freepages_block(struct zone *zone, struct page *page, int migratetype);
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
int migratetype, int flags, gfp_t gfp_flags);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6fcda8e96f16..42b62832323f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1646,9 +1646,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
* Note that start_page and end_pages are not aligned on a pageblock
* boundary. If alignment is required, use move_freepages_block()
*/
-static int move_freepages(struct zone *zone,
- unsigned long start_pfn, unsigned long end_pfn,
- int migratetype, int *num_movable)
+static int move_freepages(struct zone *zone, unsigned long start_pfn,
+ unsigned long end_pfn, int migratetype)
{
struct page *page;
unsigned long pfn;
@@ -1658,14 +1657,6 @@ static int move_freepages(struct zone *zone,
for (pfn = start_pfn; pfn <= end_pfn;) {
page = pfn_to_page(pfn);
if (!PageBuddy(page)) {
- /*
- * We assume that pages that could be isolated for
- * migration are movable. But we don't actually try
- * isolating, as that would be expensive.
- */
- if (num_movable &&
- (PageLRU(page) || __PageMovable(page)))
- (*num_movable)++;
pfn++;
continue;
}
@@ -1683,26 +1674,62 @@ static int move_freepages(struct zone *zone,
return pages_moved;
}
-int move_freepages_block(struct zone *zone, struct page *page,
- int migratetype, int *num_movable)
+static bool prep_move_freepages_block(struct zone *zone, struct page *page,
+ unsigned long *start_pfn,
+ unsigned long *end_pfn,
+ int *num_free, int *num_movable)
{
- unsigned long start_pfn, end_pfn, pfn;
-
- if (num_movable)
- *num_movable = 0;
+ unsigned long pfn, start, end;
pfn = page_to_pfn(page);
- start_pfn = pageblock_start_pfn(pfn);
- end_pfn = pageblock_end_pfn(pfn) - 1;
+ start = pageblock_start_pfn(pfn);
+ end = pageblock_end_pfn(pfn) - 1;
/* Do not cross zone boundaries */
- if (!zone_spans_pfn(zone, start_pfn))
- start_pfn = zone->zone_start_pfn;
- if (!zone_spans_pfn(zone, end_pfn))
- return 0;
+ if (!zone_spans_pfn(zone, start))
+ start = zone->zone_start_pfn;
+ if (!zone_spans_pfn(zone, end))
+ return false;
+
+ *start_pfn = start;
+ *end_pfn = end;
+
+ if (num_free) {
+ *num_free = 0;
+ *num_movable = 0;
+ for (pfn = start; pfn <= end;) {
+ page = pfn_to_page(pfn);
+ if (PageBuddy(page)) {
+ int nr = 1 << buddy_order(page);
+
+ *num_free += nr;
+ pfn += nr;
+ continue;
+ }
+ /*
+ * We assume that pages that could be isolated for
+ * migration are movable. But we don't actually try
+ * isolating, as that would be expensive.
+ */
+ if (PageLRU(page) || __PageMovable(page))
+ (*num_movable)++;
+ pfn++;
+ }
+ }
+
+ return true;
+}
- return move_freepages(zone, start_pfn, end_pfn, migratetype,
- num_movable);
+int move_freepages_block(struct zone *zone, struct page *page,
+ int migratetype)
+{
+ unsigned long start_pfn, end_pfn;
+
+ if (!prep_move_freepages_block(zone, page, &start_pfn, &end_pfn,
+ NULL, NULL))
+ return -1;
+
+ return move_freepages(zone, start_pfn, end_pfn, migratetype);
}
/*
@@ -1776,33 +1803,36 @@ static inline bool boost_watermark(struct zone *zone)
}
/*
- * This function implements actual steal behaviour. If order is large enough,
- * we can steal whole pageblock. If not, we first move freepages in this
- * pageblock to our migratetype and determine how many already-allocated pages
- * are there in the pageblock with a compatible migratetype. If at least half
- * of pages are free or compatible, we can change migratetype of the pageblock
- * itself, so pages freed in the future will be put on the correct free list.
+ * This function implements actual steal behaviour. If order is large enough, we
+ * can claim the whole pageblock for the requested migratetype. If not, we check
+ * the pageblock for constituent pages; if at least half of the pages are free
+ * or compatible, we can still claim the whole block, so pages freed in the
+ * future will be put on the correct free list. Otherwise, we isolate exactly
+ * the order we need from the fallback block and leave its migratetype alone.
*/
static void steal_suitable_fallback(struct zone *zone, struct page *page,
- unsigned int alloc_flags, int start_type, bool whole_block)
+ int current_order, int order, int start_type,
+ unsigned int alloc_flags, bool whole_block)
{
- unsigned int current_order = buddy_order(page);
int free_pages, movable_pages, alike_pages;
- int old_block_type;
+ unsigned long start_pfn, end_pfn;
+ int block_type;
- old_block_type = get_pageblock_migratetype(page);
+ block_type = get_pageblock_migratetype(page);
/*
* This can happen due to races and we want to prevent broken
* highatomic accounting.
*/
- if (is_migrate_highatomic(old_block_type))
+ if (is_migrate_highatomic(block_type))
goto single_page;
/* Take ownership for orders >= pageblock_order */
if (current_order >= pageblock_order) {
+ del_page_from_free_list(page, zone, current_order);
change_pageblock_range(page, current_order, start_type);
- goto single_page;
+ expand(zone, page, order, current_order, start_type);
+ return;
}
/*
@@ -1817,8 +1847,11 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
if (!whole_block)
goto single_page;
- free_pages = move_freepages_block(zone, page, start_type,
- &movable_pages);
+ /* moving whole block can fail due to zone boundary conditions */
+ if (!prep_move_freepages_block(zone, page, &start_pfn, &end_pfn,
+ &free_pages, &movable_pages))
+ goto single_page;
+
/*
* Determine how many pages are compatible with our allocation.
* For movable allocation, it's the number of movable pages which
@@ -1834,29 +1867,27 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
* vice versa, be conservative since we can't distinguish the
* exact migratetype of non-movable pages.
*/
- if (old_block_type == MIGRATE_MOVABLE)
+ if (block_type == MIGRATE_MOVABLE)
alike_pages = pageblock_nr_pages
- (free_pages + movable_pages);
else
alike_pages = 0;
}
- /* moving whole block can fail due to zone boundary conditions */
- if (!free_pages)
- goto single_page;
-
/*
* If a sufficient number of pages in the block are either free or of
* comparable migratability as our allocation, claim the whole block.
*/
if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
- page_group_by_mobility_disabled)
+ page_group_by_mobility_disabled) {
+ move_freepages(zone, start_pfn, end_pfn, start_type);
set_pageblock_migratetype(page, start_type);
-
- return;
+ block_type = start_type;
+ }
single_page:
- move_to_free_list(page, zone, current_order, start_type);
+ del_page_from_free_list(page, zone, current_order);
+ expand(zone, page, order, current_order, block_type);
}
/*
@@ -1921,9 +1952,10 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
mt = get_pageblock_migratetype(page);
/* Only reserve normal pageblocks (i.e., they can merge with others) */
if (migratetype_is_mergeable(mt)) {
- zone->nr_reserved_highatomic += pageblock_nr_pages;
- set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
- move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
+ if (move_freepages_block(zone, page, MIGRATE_HIGHATOMIC) != -1) {
+ set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
+ zone->nr_reserved_highatomic += pageblock_nr_pages;
+ }
}
out_unlock:
@@ -1948,7 +1980,7 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
struct zone *zone;
struct page *page;
int order;
- bool ret;
+ int ret;
for_each_zone_zonelist_nodemask(zone, z, zonelist, ac->highest_zoneidx,
ac->nodemask) {
@@ -1997,10 +2029,14 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
* of pageblocks that cannot be completely freed
* may increase.
*/
+ ret = move_freepages_block(zone, page, ac->migratetype);
+ /*
+ * Reserving this block already succeeded, so this should
+ * not fail on zone boundaries.
+ */
+ WARN_ON_ONCE(ret == -1);
set_pageblock_migratetype(page, ac->migratetype);
- ret = move_freepages_block(zone, page, ac->migratetype,
- NULL);
- if (ret) {
+ if (ret > 0) {
spin_unlock_irqrestore(&zone->lock, flags);
return ret;
}
@@ -2021,7 +2057,7 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
* deviation from the rest of this file, to make the for loop
* condition simpler.
*/
-static __always_inline bool
+static __always_inline struct page *
__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
unsigned int alloc_flags)
{
@@ -2068,7 +2104,7 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
goto do_steal;
}
- return false;
+ return NULL;
find_smallest:
for (current_order = order; current_order <= MAX_ORDER;
@@ -2089,13 +2125,14 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
do_steal:
page = get_page_from_free_area(area, fallback_mt);
- steal_suitable_fallback(zone, page, alloc_flags, start_migratetype,
- can_steal);
+ /* take off list, maybe claim block, expand remainder */
+ steal_suitable_fallback(zone, page, current_order, order,
+ start_migratetype, alloc_flags, can_steal);
trace_mm_page_alloc_extfrag(page, order, current_order,
start_migratetype, fallback_mt);
- return true;
+ return page;
}
@@ -2123,15 +2160,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
return page;
}
}
-retry:
+
page = __rmqueue_smallest(zone, order, migratetype);
if (unlikely(!page)) {
if (alloc_flags & ALLOC_CMA)
page = __rmqueue_cma_fallback(zone, order);
-
- if (!page && __rmqueue_fallback(zone, order, migratetype,
- alloc_flags))
- goto retry;
+ else
+ page = __rmqueue_fallback(zone, order, migratetype,
+ alloc_flags);
}
return page;
}
@@ -2586,12 +2622,10 @@ int __isolate_free_page(struct page *page, unsigned int order)
* Only change normal pageblocks (i.e., they can merge
* with others)
*/
- if (migratetype_is_mergeable(mt)) {
- set_pageblock_migratetype(page,
- MIGRATE_MOVABLE);
- move_freepages_block(zone, page,
- MIGRATE_MOVABLE, NULL);
- }
+ if (migratetype_is_mergeable(mt) &&
+ move_freepages_block(zone, page,
+ MIGRATE_MOVABLE) != -1)
+ set_pageblock_migratetype(page, MIGRATE_MOVABLE);
}
}
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 6599cc965e21..f5e4d8676b36 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -178,15 +178,18 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
migratetype, isol_flags);
if (!unmovable) {
- unsigned long nr_pages;
+ int nr_pages;
int mt = get_pageblock_migratetype(page);
+ nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE);
+ /* Block spans zone boundaries? */
+ if (nr_pages == -1) {
+ spin_unlock_irqrestore(&zone->lock, flags);
+ return -EBUSY;
+ }
+ __mod_zone_freepage_state(zone, -nr_pages, mt);
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
zone->nr_isolate_pageblock++;
- nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE,
- NULL);
-
- __mod_zone_freepage_state(zone, -nr_pages, mt);
spin_unlock_irqrestore(&zone->lock, flags);
return 0;
}
@@ -206,7 +209,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
static void unset_migratetype_isolate(struct page *page, int migratetype)
{
struct zone *zone;
- unsigned long flags, nr_pages;
+ unsigned long flags;
bool isolated_page = false;
unsigned int order;
struct page *buddy;
@@ -252,7 +255,12 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
* allocation.
*/
if (!isolated_page) {
- nr_pages = move_freepages_block(zone, page, migratetype, NULL);
+ int nr_pages = move_freepages_block(zone, page, migratetype);
+ /*
+ * Isolating this block already succeeded, so this
+ * should not fail on zone boundaries.
+ */
+ WARN_ON_ONCE(nr_pages == -1);
__mod_zone_freepage_state(zone, nr_pages, migratetype);
}
set_pageblock_migratetype(page, migratetype);
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 8/8] mm: page_alloc: consolidate free page accounting
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
` (6 preceding siblings ...)
2023-08-21 18:33 ` [PATCH 7/8] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
@ 2023-08-21 18:33 ` Johannes Weiner
2023-08-23 22:40 ` kernel test robot
7 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 18:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
Free page accounting currently happens a bit too high up the call
stack, where it has to deal with guard pages, compaction capturing,
block stealing and even page isolation. This is subtle and fragile,
and makes it difficult to hack on the code.
Push the accounting down to where pages enter and leave the physical
freelists, where all these higher-level exceptions are of no concern.
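Concretely, the freelist add/move/del helpers now do the NR_FREE_PAGES
and NR_FREE_CMA_PAGES bookkeeping themselves, via a single helper
(from the diff below):

    static inline void account_freepages(struct page *page, struct zone *zone,
                                         int nr_pages, int migratetype)
    {
            if (is_migrate_isolate(migratetype))
                    return;

            __mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);

            if (is_migrate_cma(migratetype))
                    __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
    }

add_to_free_list(), move_to_free_list() and del_page_from_free_list()
call this with the appropriate sign, so the higher-level paths no
longer need __mod_zone_freepage_state().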
v2:
- fix CONFIG_DEBUG_PAGEALLOC build (Mel)
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
include/linux/mm.h | 18 ++---
include/linux/page-isolation.h | 3 +-
include/linux/vmstat.h | 8 --
mm/debug_page_alloc.c | 12 +--
mm/internal.h | 5 --
mm/page_alloc.c | 131 ++++++++++++++++++---------------
mm/page_isolation.c | 7 +-
7 files changed, 88 insertions(+), 96 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 406ab9ea818f..950c400ac53b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3550,24 +3550,22 @@ static inline bool page_is_guard(struct page *page)
return PageGuard(page);
}
-bool __set_page_guard(struct zone *zone, struct page *page, unsigned int order,
- int migratetype);
+bool __set_page_guard(struct zone *zone, struct page *page, unsigned int order);
static inline bool set_page_guard(struct zone *zone, struct page *page,
- unsigned int order, int migratetype)
+ unsigned int order)
{
if (!debug_guardpage_enabled())
return false;
- return __set_page_guard(zone, page, order, migratetype);
+ return __set_page_guard(zone, page, order);
}
-void __clear_page_guard(struct zone *zone, struct page *page, unsigned int order,
- int migratetype);
+void __clear_page_guard(struct zone *zone, struct page *page, unsigned int order);
static inline void clear_page_guard(struct zone *zone, struct page *page,
- unsigned int order, int migratetype)
+ unsigned int order)
{
if (!debug_guardpage_enabled())
return;
- __clear_page_guard(zone, page, order, migratetype);
+ __clear_page_guard(zone, page, order);
}
#else /* CONFIG_DEBUG_PAGEALLOC */
@@ -3577,9 +3575,9 @@ static inline unsigned int debug_guardpage_minorder(void) { return 0; }
static inline bool debug_guardpage_enabled(void) { return false; }
static inline bool page_is_guard(struct page *page) { return false; }
static inline bool set_page_guard(struct zone *zone, struct page *page,
- unsigned int order, int migratetype) { return false; }
+ unsigned int order) { return false; }
static inline void clear_page_guard(struct zone *zone, struct page *page,
- unsigned int order, int migratetype) {}
+ unsigned int order) {}
#endif /* CONFIG_DEBUG_PAGEALLOC */
#ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 8550b3c91480..901915747960 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -34,7 +34,8 @@ static inline bool is_migrate_isolate(int migratetype)
#define REPORT_FAILURE 0x2
void set_pageblock_migratetype(struct page *page, int migratetype);
-int move_freepages_block(struct zone *zone, struct page *page, int migratetype);
+int move_freepages_block(struct zone *zone, struct page *page,
+ int old_mt, int new_mt);
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
int migratetype, int flags, gfp_t gfp_flags);
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index fed855bae6d8..a4eae03f6094 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -487,14 +487,6 @@ static inline void node_stat_sub_folio(struct folio *folio,
mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
}
-static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pages,
- int migratetype)
-{
- __mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
- if (is_migrate_cma(migratetype))
- __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
-}
-
extern const char * const vmstat_text[];
static inline const char *zone_stat_name(enum zone_stat_item item)
diff --git a/mm/debug_page_alloc.c b/mm/debug_page_alloc.c
index f9d145730fd1..03a810927d0a 100644
--- a/mm/debug_page_alloc.c
+++ b/mm/debug_page_alloc.c
@@ -32,8 +32,7 @@ static int __init debug_guardpage_minorder_setup(char *buf)
}
early_param("debug_guardpage_minorder", debug_guardpage_minorder_setup);
-bool __set_page_guard(struct zone *zone, struct page *page, unsigned int order,
- int migratetype)
+bool __set_page_guard(struct zone *zone, struct page *page, unsigned int order)
{
if (order >= debug_guardpage_minorder())
return false;
@@ -41,19 +40,12 @@ bool __set_page_guard(struct zone *zone, struct page *page, unsigned int order,
__SetPageGuard(page);
INIT_LIST_HEAD(&page->buddy_list);
set_page_private(page, order);
- /* Guard pages are not available for any usage */
- if (!is_migrate_isolate(migratetype))
- __mod_zone_freepage_state(zone, -(1 << order), migratetype);
return true;
}
-void __clear_page_guard(struct zone *zone, struct page *page, unsigned int order,
- int migratetype)
+void __clear_page_guard(struct zone *zone, struct page *page, unsigned int order)
{
__ClearPageGuard(page);
-
set_page_private(page, 0);
- if (!is_migrate_isolate(migratetype))
- __mod_zone_freepage_state(zone, (1 << order), migratetype);
}
diff --git a/mm/internal.h b/mm/internal.h
index a7d9e980429a..d86fd621880e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -865,11 +865,6 @@ static inline bool is_migrate_highatomic(enum migratetype migratetype)
return migratetype == MIGRATE_HIGHATOMIC;
}
-static inline bool is_migrate_highatomic_page(struct page *page)
-{
- return get_pageblock_migratetype(page) == MIGRATE_HIGHATOMIC;
-}
-
void setup_zone_pageset(struct zone *zone);
struct migration_target_control {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 42b62832323f..e7e790a64237 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -676,24 +676,36 @@ compaction_capture(struct capture_control *capc, struct page *page,
}
#endif /* CONFIG_COMPACTION */
-/* Used for pages not on another list */
-static inline void add_to_free_list(struct page *page, struct zone *zone,
- unsigned int order, int migratetype)
+static inline void account_freepages(struct page *page, struct zone *zone,
+ int nr_pages, int migratetype)
{
- struct free_area *area = &zone->free_area[order];
+ if (is_migrate_isolate(migratetype))
+ return;
- list_add(&page->buddy_list, &area->free_list[migratetype]);
- area->nr_free++;
+ __mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
+
+ if (is_migrate_cma(migratetype))
+ __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
}
/* Used for pages not on another list */
-static inline void add_to_free_list_tail(struct page *page, struct zone *zone,
- unsigned int order, int migratetype)
+static inline void add_to_free_list(struct page *page, struct zone *zone,
+ unsigned int order, int migratetype,
+ bool tail)
{
struct free_area *area = &zone->free_area[order];
- list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
+ VM_WARN_ONCE(get_pageblock_migratetype(page) != migratetype,
+ "page type is %lu, passed migratetype is %d (nr=%d)\n",
+ get_pageblock_migratetype(page), migratetype, 1 << order);
+
+ if (tail)
+ list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
+ else
+ list_add(&page->buddy_list, &area->free_list[migratetype]);
area->nr_free++;
+
+ account_freepages(page, zone, 1 << order, migratetype);
}
/*
@@ -702,16 +714,28 @@ static inline void add_to_free_list_tail(struct page *page, struct zone *zone,
* allocation again (e.g., optimization for memory onlining).
*/
static inline void move_to_free_list(struct page *page, struct zone *zone,
- unsigned int order, int migratetype)
+ unsigned int order, int old_mt, int new_mt)
{
struct free_area *area = &zone->free_area[order];
- list_move_tail(&page->buddy_list, &area->free_list[migratetype]);
+ /* Free page moving can fail, so it happens before the type update */
+ VM_WARN_ONCE(get_pageblock_migratetype(page) != old_mt,
+ "page type is %lu, passed migratetype is %d (nr=%d)\n",
+ get_pageblock_migratetype(page), old_mt, 1 << order);
+
+ list_move_tail(&page->buddy_list, &area->free_list[new_mt]);
+
+ account_freepages(page, zone, -(1 << order), old_mt);
+ account_freepages(page, zone, 1 << order, new_mt);
}
static inline void del_page_from_free_list(struct page *page, struct zone *zone,
- unsigned int order)
+ unsigned int order, int migratetype)
{
+ VM_WARN_ONCE(get_pageblock_migratetype(page) != migratetype,
+ "page type is %lu, passed migratetype is %d (nr=%d)\n",
+ get_pageblock_migratetype(page), migratetype, 1 << order);
+
/* clear reported state and update reported page count */
if (page_reported(page))
__ClearPageReported(page);
@@ -720,6 +744,8 @@ static inline void del_page_from_free_list(struct page *page, struct zone *zone,
__ClearPageBuddy(page);
set_page_private(page, 0);
zone->free_area[order].nr_free--;
+
+ account_freepages(page, zone, -(1 << order), migratetype);
}
static inline struct page *get_page_from_free_area(struct free_area *area,
@@ -793,23 +819,21 @@ static inline void __free_one_page(struct page *page,
VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
VM_BUG_ON(migratetype == -1);
- if (likely(!is_migrate_isolate(migratetype)))
- __mod_zone_freepage_state(zone, 1 << order, migratetype);
-
VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
VM_BUG_ON_PAGE(bad_range(zone, page), page);
while (order < MAX_ORDER) {
- if (compaction_capture(capc, page, order, migratetype)) {
- __mod_zone_freepage_state(zone, -(1 << order),
- migratetype);
+ int buddy_mt;
+
+ if (compaction_capture(capc, page, order, migratetype))
return;
- }
buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
if (!buddy)
goto done_merging;
+ buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);
+
if (unlikely(order >= pageblock_order)) {
/*
* We want to prevent merge between freepages on pageblock
@@ -837,9 +861,9 @@ static inline void __free_one_page(struct page *page,
* merge with it and move up one order.
*/
if (page_is_guard(buddy))
- clear_page_guard(zone, buddy, order, migratetype);
+ clear_page_guard(zone, buddy, order);
else
- del_page_from_free_list(buddy, zone, order);
+ del_page_from_free_list(buddy, zone, order, buddy_mt);
combined_pfn = buddy_pfn & pfn;
page = page + (combined_pfn - pfn);
pfn = combined_pfn;
@@ -856,10 +880,7 @@ static inline void __free_one_page(struct page *page,
else
to_tail = buddy_merge_likely(pfn, buddy_pfn, page, order);
- if (to_tail)
- add_to_free_list_tail(page, zone, order, migratetype);
- else
- add_to_free_list(page, zone, order, migratetype);
+ add_to_free_list(page, zone, order, migratetype, to_tail);
/* Notify page reporting subsystem of freed page */
if (!(fpi_flags & FPI_SKIP_REPORT_NOTIFY))
@@ -901,10 +922,8 @@ int split_free_page(struct page *free_page,
}
mt = get_pfnblock_migratetype(free_page, free_page_pfn);
- if (likely(!is_migrate_isolate(mt)))
- __mod_zone_freepage_state(zone, -(1UL << order), mt);
+ del_page_from_free_list(free_page, zone, order, mt);
- del_page_from_free_list(free_page, zone, order);
for (pfn = free_page_pfn;
pfn < free_page_pfn + (1UL << order);) {
int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
@@ -1433,10 +1452,10 @@ static inline void expand(struct zone *zone, struct page *page,
* Corresponding page table entries will not be touched,
* pages will stay not present in virtual address space
*/
- if (set_page_guard(zone, &page[size], high, migratetype))
+ if (set_page_guard(zone, &page[size], high))
continue;
- add_to_free_list(&page[size], zone, high, migratetype);
+ add_to_free_list(&page[size], zone, high, migratetype, false);
set_buddy_order(&page[size], high);
}
}
@@ -1606,7 +1625,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
page = get_page_from_free_area(area, migratetype);
if (!page)
continue;
- del_page_from_free_list(page, zone, current_order);
+ del_page_from_free_list(page, zone, current_order, migratetype);
expand(zone, page, order, current_order, migratetype);
trace_mm_page_alloc_zone_locked(page, order, migratetype,
pcp_allowed_order(order) &&
@@ -1647,7 +1666,7 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
* boundary. If alignment is required, use move_freepages_block()
*/
static int move_freepages(struct zone *zone, unsigned long start_pfn,
- unsigned long end_pfn, int migratetype)
+ unsigned long end_pfn, int old_mt, int new_mt)
{
struct page *page;
unsigned long pfn;
@@ -1666,7 +1685,7 @@ static int move_freepages(struct zone *zone, unsigned long start_pfn,
VM_BUG_ON_PAGE(page_zone(page) != zone, page);
order = buddy_order(page);
- move_to_free_list(page, zone, order, migratetype);
+ move_to_free_list(page, zone, order, old_mt, new_mt);
pfn += 1 << order;
pages_moved += 1 << order;
}
@@ -1721,7 +1740,7 @@ static bool prep_move_freepages_block(struct zone *zone, struct page *page,
}
int move_freepages_block(struct zone *zone, struct page *page,
- int migratetype)
+ int old_mt, int new_mt)
{
unsigned long start_pfn, end_pfn;
@@ -1729,7 +1748,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
NULL, NULL))
return -1;
- return move_freepages(zone, start_pfn, end_pfn, migratetype);
+ return move_freepages(zone, start_pfn, end_pfn, old_mt, new_mt);
}
/*
@@ -1829,7 +1848,7 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
/* Take ownership for orders >= pageblock_order */
if (current_order >= pageblock_order) {
- del_page_from_free_list(page, zone, current_order);
+ del_page_from_free_list(page, zone, current_order, block_type);
change_pageblock_range(page, current_order, start_type);
expand(zone, page, order, current_order, start_type);
return;
@@ -1880,13 +1899,13 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
*/
if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
page_group_by_mobility_disabled) {
- move_freepages(zone, start_pfn, end_pfn, start_type);
+ move_freepages(zone, start_pfn, end_pfn, block_type, start_type);
set_pageblock_migratetype(page, start_type);
block_type = start_type;
}
single_page:
- del_page_from_free_list(page, zone, current_order);
+ del_page_from_free_list(page, zone, current_order, block_type);
expand(zone, page, order, current_order, block_type);
}
@@ -1952,7 +1971,8 @@ static void reserve_highatomic_pageblock(struct page *page, struct zone *zone,
mt = get_pageblock_migratetype(page);
/* Only reserve normal pageblocks (i.e., they can merge with others) */
if (migratetype_is_mergeable(mt)) {
- if (move_freepages_block(zone, page, MIGRATE_HIGHATOMIC) != -1) {
+ if (move_freepages_block(zone, page,
+ mt, MIGRATE_HIGHATOMIC) != -1) {
set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
zone->nr_reserved_highatomic += pageblock_nr_pages;
}
@@ -1995,11 +2015,13 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
spin_lock_irqsave(&zone->lock, flags);
for (order = 0; order <= MAX_ORDER; order++) {
struct free_area *area = &(zone->free_area[order]);
+ int mt;
page = get_page_from_free_area(area, MIGRATE_HIGHATOMIC);
if (!page)
continue;
+ mt = get_pageblock_migratetype(page);
/*
* In page freeing path, migratetype change is racy so
* we can counter several free pages in a pageblock
@@ -2007,7 +2029,7 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
* from highatomic to ac->migratetype. So we should
* adjust the count once.
*/
- if (is_migrate_highatomic_page(page)) {
+ if (is_migrate_highatomic(mt)) {
/*
* It should never happen but changes to
* locking could inadvertently allow a per-cpu
@@ -2029,7 +2051,8 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
* of pageblocks that cannot be completely freed
* may increase.
*/
- ret = move_freepages_block(zone, page, ac->migratetype);
+ ret = move_freepages_block(zone, page, mt,
+ ac->migratetype);
/*
* Reserving this block already succeeded, so this should
* not fail on zone boundaries.
@@ -2202,12 +2225,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
* pages are ordered properly.
*/
list_add_tail(&page->pcp_list, list);
- if (is_migrate_cma(get_pageblock_migratetype(page)))
- __mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
- -(1 << order));
}
-
- __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
spin_unlock_irqrestore(&zone->lock, flags);
return i;
@@ -2604,11 +2622,9 @@ int __isolate_free_page(struct page *page, unsigned int order)
watermark = zone->_watermark[WMARK_MIN] + (1UL << order);
if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
return 0;
-
- __mod_zone_freepage_state(zone, -(1UL << order), mt);
}
- del_page_from_free_list(page, zone, order);
+ del_page_from_free_list(page, zone, order, mt);
/*
* Set the pageblock if the isolated page is at least half of a
@@ -2623,7 +2639,7 @@ int __isolate_free_page(struct page *page, unsigned int order)
* with others)
*/
if (migratetype_is_mergeable(mt) &&
- move_freepages_block(zone, page,
+ move_freepages_block(zone, page, mt,
MIGRATE_MOVABLE) != -1)
set_pageblock_migratetype(page, MIGRATE_MOVABLE);
}
@@ -2715,8 +2731,6 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
return NULL;
}
}
- __mod_zone_freepage_state(zone, -(1 << order),
- get_pageblock_migratetype(page));
spin_unlock_irqrestore(&zone->lock, flags);
} while (check_new_pages(page, order));
@@ -6488,8 +6502,9 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
BUG_ON(page_count(page));
BUG_ON(!PageBuddy(page));
+ VM_WARN_ON(get_pageblock_migratetype(page) != MIGRATE_ISOLATE);
order = buddy_order(page);
- del_page_from_free_list(page, zone, order);
+ del_page_from_free_list(page, zone, order, MIGRATE_ISOLATE);
pfn += (1 << order);
}
spin_unlock_irqrestore(&zone->lock, flags);
@@ -6540,11 +6555,12 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page,
current_buddy = page + size;
}
- if (set_page_guard(zone, current_buddy, high, migratetype))
+ if (set_page_guard(zone, current_buddy, high))
continue;
if (current_buddy != target) {
- add_to_free_list(current_buddy, zone, high, migratetype);
+ add_to_free_list(current_buddy, zone, high,
+ migratetype, false);
set_buddy_order(current_buddy, high);
page = next_page;
}
@@ -6572,12 +6588,11 @@ bool take_page_off_buddy(struct page *page)
int migratetype = get_pfnblock_migratetype(page_head,
pfn_head);
- del_page_from_free_list(page_head, zone, page_order);
+ del_page_from_free_list(page_head, zone, page_order,
+ migratetype);
break_down_buddy_pages(zone, page_head, page, 0,
page_order, migratetype);
SetPageHWPoisonTakenOff(page);
- if (!is_migrate_isolate(migratetype))
- __mod_zone_freepage_state(zone, -1, migratetype);
ret = true;
break;
}
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index f5e4d8676b36..b0705e709973 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -181,13 +181,12 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
int nr_pages;
int mt = get_pageblock_migratetype(page);
- nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE);
+ nr_pages = move_freepages_block(zone, page, mt, MIGRATE_ISOLATE);
/* Block spans zone boundaries? */
if (nr_pages == -1) {
spin_unlock_irqrestore(&zone->lock, flags);
return -EBUSY;
}
- __mod_zone_freepage_state(zone, -nr_pages, mt);
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
zone->nr_isolate_pageblock++;
spin_unlock_irqrestore(&zone->lock, flags);
@@ -255,13 +254,13 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
* allocation.
*/
if (!isolated_page) {
- int nr_pages = move_freepages_block(zone, page, migratetype);
+ int nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE,
+ migratetype);
/*
* Isolating this block already succeeded, so this
* should not fail on zone boundaries.
*/
WARN_ON_ONCE(nr_pages == -1);
- __mod_zone_freepage_state(zone, nr_pages, migratetype);
}
set_pageblock_migratetype(page, migratetype);
if (isolated_page)
--
2.41.0
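
The common thread in the hunks above is that the freelist helpers (del_page_from_free_list(), move_freepages_block(), add_to_free_list()) now receive the block's current migratetype explicitly, and the per-callsite __mod_zone_freepage_state() adjustments disappear, so list membership and the free-page counters are kept consistent in one place. Below is a self-contained toy model of that invariant; it is not kernel code, and every name in it (toy_zone, toy_move_block, and so on) is invented for illustration.

	/*
	 * Toy model of freelist migratetype hygiene: every block sits on the
	 * freelist matching its recorded pageblock type, and moving a block
	 * updates list membership and the per-type counters in one helper.
	 * All names are invented; this is not the kernel implementation.
	 */
	#include <assert.h>
	#include <stdio.h>

	enum toy_migratetype { TOY_MOVABLE, TOY_UNMOVABLE, TOY_HIGHATOMIC, TOY_NR_TYPES };

	struct toy_zone {
		/* number of free blocks currently on each per-type freelist */
		unsigned long nr_free[TOY_NR_TYPES];
		/* the type recorded for each pageblock (index = block number) */
		enum toy_migratetype block_type[8];
	};

	/*
	 * Move one block to a new freelist. The caller passes the old type
	 * explicitly (as the patched move_freepages_block() does), so the
	 * helper keeps the counters in sync instead of relying on scattered
	 * accounting at the call sites.
	 */
	static void toy_move_block(struct toy_zone *zone, int block,
				   enum toy_migratetype old_mt,
				   enum toy_migratetype new_mt)
	{
		assert(zone->block_type[block] == old_mt); /* hygiene check */
		zone->nr_free[old_mt]--;
		zone->nr_free[new_mt]++;
		zone->block_type[block] = new_mt;
	}

	int main(void)
	{
		struct toy_zone zone = { .nr_free = { [TOY_MOVABLE] = 8 } };

		for (int i = 0; i < 8; i++)
			zone.block_type[i] = TOY_MOVABLE;

		/* reserve block 3, analogous to reserve_highatomic_pageblock() */
		toy_move_block(&zone, 3, TOY_MOVABLE, TOY_HIGHATOMIC);

		/* give it back, analogous to unreserve_highatomic_pageblock() */
		toy_move_block(&zone, 3, TOY_HIGHATOMIC, TOY_MOVABLE);

		printf("movable=%lu highatomic=%lu\n",
		       zone.nr_free[TOY_MOVABLE], zone.nr_free[TOY_HIGHATOMIC]);
		return 0;
	}

The assert() plays the role of a freelist type check: moving a block whose recorded type does not match what the caller claims trips it immediately, which is exactly the kind of violation the explicit old/new migratetype arguments are meant to rule out.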
* Re: [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
@ 2023-08-21 20:14 ` Zi Yan
2023-08-21 20:29 ` Zi Yan
1 sibling, 0 replies; 16+ messages in thread
From: Zi Yan @ 2023-08-21 20:14 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
On 21 Aug 2023, at 14:33, Johannes Weiner wrote:
> Save a pfn_to_page() lookup when the pfn is right there already.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/page_alloc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi
* Re: [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
2023-08-21 20:14 ` Zi Yan
@ 2023-08-21 20:29 ` Zi Yan
2023-08-21 21:22 ` Johannes Weiner
1 sibling, 1 reply; 16+ messages in thread
From: Zi Yan @ 2023-08-21 20:29 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
On 21 Aug 2023, at 14:33, Johannes Weiner wrote:
> Save a pfn_to_page() lookup when the pfn is right there already.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/page_alloc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Just noticed that it is already done in:
https://lkml.kernel.org/r/20230811115945.3423894-3-shikemeng@huaweicloud.com
--
Best Regards,
Yan, Zi
* Re: [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks
2023-08-21 18:33 ` [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
@ 2023-08-21 20:41 ` Zi Yan
2023-08-21 21:20 ` Johannes Weiner
0 siblings, 1 reply; 16+ messages in thread
From: Zi Yan @ 2023-08-21 20:41 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
On 21 Aug 2023, at 14:33, Johannes Weiner wrote:
> The buddy allocator coalesces compatible blocks during freeing, but it
> doesn't update the types of the subblocks to match. When an allocation
> later breaks the chunk down again, its pieces will be put on freelists
> of the wrong type. This encourages incompatible page mixing (ask for
> one type, get another), and thus long-term fragmentation.
>
> Update the subblocks when merging a larger chunk, such that a later
> expand() will maintain freelist type hygiene.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/page_alloc.c | 37 ++++++++++++++++++++++---------------
> 1 file changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a5e36d186893..6c9f565b2613 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -438,6 +438,17 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
> page_to_pfn(page), MIGRATETYPE_MASK);
> }
>
> +static void change_pageblock_range(struct page *pageblock_page,
> + int start_order, int migratetype)
> +{
> + int nr_pageblocks = 1 << (start_order - pageblock_order);
> +
> + while (nr_pageblocks--) {
> + set_pageblock_migratetype(pageblock_page, migratetype);
> + pageblock_page += pageblock_nr_pages;
> + }
> +}
> +
Is this code move included by accident?
> #ifdef CONFIG_DEBUG_VM
> static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
> {
> @@ -808,10 +819,17 @@ static inline void __free_one_page(struct page *page,
> */
> int buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);
>
> - if (migratetype != buddy_mt
> - && (!migratetype_is_mergeable(migratetype) ||
> - !migratetype_is_mergeable(buddy_mt)))
> - goto done_merging;
> + if (migratetype != buddy_mt) {
> + if (!migratetype_is_mergeable(migratetype) ||
> + !migratetype_is_mergeable(buddy_mt))
> + goto done_merging;
> + /*
> + * Match buddy type. This ensures that
> + * an expand() down the line puts the
> + * sub-blocks on the right freelists.
> + */
> + set_pageblock_migratetype(buddy, migratetype);
> + }
> }
>
> /*
> @@ -1687,17 +1705,6 @@ int move_freepages_block(struct zone *zone, struct page *page,
> num_movable);
> }
>
> -static void change_pageblock_range(struct page *pageblock_page,
> - int start_order, int migratetype)
> -{
> - int nr_pageblocks = 1 << (start_order - pageblock_order);
> -
> - while (nr_pageblocks--) {
> - set_pageblock_migratetype(pageblock_page, migratetype);
> - pageblock_page += pageblock_nr_pages;
> - }
> -}
> -
> /*
> * When we are falling back to another migratetype during allocation, try to
> * steal extra free pages from the same pageblocks to satisfy further
> --
> 2.41.0
--
Best Regards,
Yan, Zi
* Re: [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks
2023-08-21 20:41 ` Zi Yan
@ 2023-08-21 21:20 ` Johannes Weiner
0 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 21:20 UTC (permalink / raw)
To: Zi Yan; +Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
On Mon, Aug 21, 2023 at 04:41:44PM -0400, Zi Yan wrote:
> On 21 Aug 2023, at 14:33, Johannes Weiner wrote:
>
> > The buddy allocator coalesces compatible blocks during freeing, but it
> > doesn't update the types of the subblocks to match. When an allocation
> > later breaks the chunk down again, its pieces will be put on freelists
> > of the wrong type. This encourages incompatible page mixing (ask for
> > one type, get another), and thus long-term fragmentation.
> >
> > Update the subblocks when merging a larger chunk, such that a later
> > expand() will maintain freelist type hygiene.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> > mm/page_alloc.c | 37 ++++++++++++++++++++++---------------
> > 1 file changed, 22 insertions(+), 15 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index a5e36d186893..6c9f565b2613 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -438,6 +438,17 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
> > page_to_pfn(page), MIGRATETYPE_MASK);
> > }
> >
> > +static void change_pageblock_range(struct page *pageblock_page,
> > + int start_order, int migratetype)
> > +{
> > + int nr_pageblocks = 1 << (start_order - pageblock_order);
> > +
> > + while (nr_pageblocks--) {
> > + set_pageblock_migratetype(pageblock_page, migratetype);
> > + pageblock_page += pageblock_nr_pages;
> > + }
> > +}
> > +
>
> Is this code move included by accident?
Ah, yes, my bad.
I used to call change_pageblock_range() at the end of the merge,
before adding the coalesced chunk to the freelist, for which I needed
this further up. Then I changed it to deal with individual buddies
instead, and forgot to drop this part.
I'll remove it in the next version.
* Re: [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available
2023-08-21 20:29 ` Zi Yan
@ 2023-08-21 21:22 ` Johannes Weiner
0 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-21 21:22 UTC (permalink / raw)
To: Zi Yan; +Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, linux-mm, linux-kernel
On Mon, Aug 21, 2023 at 04:29:36PM -0400, Zi Yan wrote:
> On 21 Aug 2023, at 14:33, Johannes Weiner wrote:
>
> > Save a pfn_to_page() lookup when the pfn is right there already.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> > mm/page_alloc.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Just notice that it is already done in:
> https://lkml.kernel.org/r/20230811115945.3423894-3-shikemeng@huaweicloud.com
Even better :) I'll rebase on top of this.
* Re: [PATCH 8/8] mm: page_alloc: consolidate free page accounting
2023-08-21 18:33 ` [PATCH 8/8] mm: page_alloc: consolidate free page accounting Johannes Weiner
@ 2023-08-23 22:40 ` kernel test robot
2023-08-24 1:34 ` Johannes Weiner
0 siblings, 1 reply; 16+ messages in thread
From: kernel test robot @ 2023-08-23 22:40 UTC (permalink / raw)
To: Johannes Weiner, Andrew Morton
Cc: llvm, oe-kbuild-all, Linux Memory Management List,
Vlastimil Babka, Mel Gorman, linux-kernel
Hi Johannes,
kernel test robot noticed the following build errors:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.5-rc7]
[cannot apply to akpm-mm/mm-everything next-20230823]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Johannes-Weiner/mm-page_alloc-use-get_pfnblock_migratetype-where-pfn-available/20230822-024104
base: linus/master
patch link: https://lore.kernel.org/r/20230821183733.106619-9-hannes%40cmpxchg.org
patch subject: [PATCH 8/8] mm: page_alloc: consolidate free page accounting
config: x86_64-randconfig-075-20230823 (https://download.01.org/0day-ci/archive/20230824/202308240628.YoW5rQTu-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce: (https://download.01.org/0day-ci/archive/20230824/202308240628.YoW5rQTu-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308240628.YoW5rQTu-lkp@intel.com/
All errors (new ones prefixed by >>):
>> mm/page_alloc.c:6702:2: error: call to undeclared function '__mod_zone_freepage_state'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
__mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
^
mm/page_alloc.c:6702:2: note: did you mean '__mod_zone_page_state'?
include/linux/vmstat.h:319:20: note: '__mod_zone_page_state' declared here
static inline void __mod_zone_page_state(struct zone *zone,
^
mm/page_alloc.c:6754:2: error: call to undeclared function '__mod_zone_freepage_state'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
__mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
^
2 errors generated.
vim +/__mod_zone_freepage_state +6702 mm/page_alloc.c
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6681
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6682 static bool try_to_accept_memory_one(struct zone *zone)
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6683 {
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6684 unsigned long flags;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6685 struct page *page;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6686 bool last;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6687
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6688 if (list_empty(&zone->unaccepted_pages))
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6689 return false;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6690
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6691 spin_lock_irqsave(&zone->lock, flags);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6692 page = list_first_entry_or_null(&zone->unaccepted_pages,
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6693 struct page, lru);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6694 if (!page) {
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6695 spin_unlock_irqrestore(&zone->lock, flags);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6696 return false;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6697 }
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6698
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6699 list_del(&page->lru);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6700 last = list_empty(&zone->unaccepted_pages);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6701
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 @6702 __mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6703 __mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6704 spin_unlock_irqrestore(&zone->lock, flags);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6705
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6706 accept_page(page, MAX_ORDER);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6707
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6708 __free_pages_ok(page, MAX_ORDER, FPI_TO_TAIL);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6709
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6710 if (last)
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6711 static_branch_dec(&zones_with_unaccepted_pages);
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6712
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6713 return true;
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6714 }
dcdfdd40fa82b6 Kirill A. Shutemov 2023-06-06 6715
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 8/8] mm: page_alloc: consolidate free page accounting
2023-08-23 22:40 ` kernel test robot
@ 2023-08-24 1:34 ` Johannes Weiner
0 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2023-08-24 1:34 UTC (permalink / raw)
To: kernel test robot
Cc: Andrew Morton, llvm, oe-kbuild-all, Linux Memory Management List,
Vlastimil Babka, Mel Gorman, linux-kernel
On Thu, Aug 24, 2023 at 06:40:58AM +0800, kernel test robot wrote:
> >> mm/page_alloc.c:6702:2: error: call to undeclared function '__mod_zone_freepage_state'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
> __mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
> ^
> mm/page_alloc.c:6702:2: note: did you mean '__mod_zone_page_state'?
> include/linux/vmstat.h:319:20: note: '__mod_zone_page_state' declared here
> static inline void __mod_zone_page_state(struct zone *zone,
> ^
> mm/page_alloc.c:6754:2: error: call to undeclared function '__mod_zone_freepage_state'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
> __mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
> ^
> 2 errors generated.
Ah, that's in the new unaccepted memory bits. I'll fix those up in v2.
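
For readers following along: the removed __mod_zone_freepage_state() was, in v6.5, a thin vmstat wrapper that adjusted NR_FREE_PAGES and, for CMA migratetypes, NR_FREE_CMA_PAGES. A minimal interim sketch of the kind of fixup the build error calls for, keeping the existing accounting at this call site and assuming no other changes to the unaccepted-memory path (this is not the actual v2), would be:

 	list_del(&page->lru);
 	last = list_empty(&zone->unaccepted_pages);

-	__mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	/* sketch: open-code the old wrapper; MIGRATE_MOVABLE is not a CMA type */
+	__mod_zone_page_state(zone, NR_FREE_PAGES, -MAX_ORDER_NR_PAGES);
 	__mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);

The second call site reported at mm/page_alloc.c:6754 would get the mirror-image change with the positive delta. Whether this manual adjustment survives the consolidated accounting at all is for the real v2 to decide.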
Thread overview: 16+ messages
2023-08-21 18:33 [PATCH 0/8] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
2023-08-21 18:33 ` [PATCH 1/8] mm: page_alloc: use get_pfnblock_migratetype where pfn available Johannes Weiner
2023-08-21 20:14 ` Zi Yan
2023-08-21 20:29 ` Zi Yan
2023-08-21 21:22 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 2/8] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
2023-08-21 18:33 ` [PATCH 3/8] mm: page_alloc: fix highatomic landing on the wrong buddy list Johannes Weiner
2023-08-21 18:33 ` [PATCH 4/8] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
2023-08-21 20:41 ` Zi Yan
2023-08-21 21:20 ` Johannes Weiner
2023-08-21 18:33 ` [PATCH 5/8] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
2023-08-21 18:33 ` [PATCH 6/8] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
2023-08-21 18:33 ` [PATCH 7/8] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
2023-08-21 18:33 ` [PATCH 8/8] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-08-23 22:40 ` kernel test robot
2023-08-24 1:34 ` Johannes Weiner