linux-mm.kvack.org archive mirror
From: Wei Yang <richard.weiyang@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Marc Hartmayer <mhartmay@linux.ibm.com>,
	linux-mm@kvack.org, linux-s390@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: change all pageblocks migrate type on coalescing
Date: Mon, 15 Dec 2025 00:37:01 +0000
Message-ID: <20251215003701.2fdyuxvwquin4ovk@master>
In-Reply-To: <5e79bed1-598d-4e34-8f1e-87b6dba52bf8@suse.cz>

On Fri, Dec 12, 2025 at 04:46:46PM +0100, Vlastimil Babka wrote:
>On 12/12/25 16:14, Alexander Gordeev wrote:
>> When a page is freed, it coalesces with its buddy into a
>> higher-order page for as long as possible. When the buddy's
>> migrate type differs, it is expected to be updated to match
>> that of the page being freed.
>> 
>> However, only the first pageblock of the buddy page is updated,
>> while the rest of the pageblocks are left unchanged.
>> 
>> That causes warnings in later expand() and other code paths
>> (like below), because the migrate type of the free list
>> holding the page no longer matches the migrate types of the
>> pageblocks that the page spans.
>> 
>> The issue is first exposed with commit e0932b6c1f94 ("mm:
>> page_alloc: consolidate free page accounting"), where the
>> warnings were introduced, but it is observed in earlier
>> versions if similar warnings are added.
>> 
>> [  308.986589] ------------[ cut here ]------------
>> [  308.987227] page type is 0, passed migratetype is 1 (nr=256)
>> [  308.987275] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:812 expand+0x23c/0x270
>> [  308.987293] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E)
>> [  308.987439] Unloaded tainted modules: hmac_s390(E):2
>> [  308.987650] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G            E       6.18.0-gcc-bpf-debug #431 PREEMPT
>> [  308.987657] Tainted: [E]=UNSIGNED_MODULE
>> [  308.987661] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0)
>> [  308.987666] Krnl PSW : 0404f00180000000 00000349976fa600 (expand+0x240/0x270)
>> [  308.987676]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
>> [  308.987682] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88
>> [  308.987688]            0000000000000005 0000034980000005 000002be803ac000 0000023efe6c8300
>> [  308.987692]            0000000000000008 0000034998d57290 000002be00000100 0000023e00000008
>> [  308.987696]            0000000000000000 0000000000000000 00000349976fa5fc 000002c99b1eb6f0
>> [  308.987708] Krnl Code: 00000349976fa5f0: c020008a02f2	larl	%r2,000003499883abd4
>>                           00000349976fa5f6: c0e5ffe3f4b5	brasl	%r14,0000034997378f60
>>                          #00000349976fa5fc: af000000		mc	0,0
>>                          >00000349976fa600: a7f4ff4c		brc	15,00000349976fa498
>>                           00000349976fa604: b9040026		lgr	%r2,%r6
>>                           00000349976fa608: c0300088317f	larl	%r3,0000034998800906
>>                           00000349976fa60e: c0e5fffdb6e1	brasl	%r14,00000349976b13d0
>>                           00000349976fa614: af000000		mc	0,0
>> [  308.987734] Call Trace:
>> [  308.987738]  [<00000349976fa600>] expand+0x240/0x270
>> [  308.987744] ([<00000349976fa5fc>] expand+0x23c/0x270)
>> [  308.987749]  [<00000349976ff95e>] rmqueue_bulk+0x71e/0x940
>> [  308.987754]  [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0
>> [  308.987759]  [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40
>> [  308.987763]  [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0
>> [  308.987768]  [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400
>> [  308.987774]  [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220
>> [  308.987781]  [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0
>> [  308.987786]  [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0
>> [  308.987791]  [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240
>> [  308.987799]  [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210
>> [  308.987804]  [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500
>> [  308.987809]  [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0
>> [  308.987813]  [<000003499734d70e>] do_exception+0x1de/0x540
>> [  308.987822]  [<0000034998387390>] __do_pgm_check+0x130/0x220
>> [  308.987830]  [<000003499839a934>] pgm_check_handler+0x114/0x160
>> [  308.987838] 3 locks held by mempig_verify/5224:
>> [  308.987842]  #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0
>> [  308.987859]  #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40
>> [  308.987871]  #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940
>> [  308.987886] Last Breaking-Event-Address:
>> [  308.987890]  [<0000034997379096>] __warn_printk+0x136/0x140
>> [  308.987897] irq event stamp: 52330356
>> [  308.987901] hardirqs last  enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220
>> [  308.987907] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0
>> [  308.987913] softirqs last  enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530
>> [  308.987922] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140
>> [  308.987929] ---[ end trace 0000000000000000 ]---
>> [  308.987936] ------------[ cut here ]------------
>> [  308.987940] page type is 0, passed migratetype is 1 (nr=256)
>> [  308.987951] WARNING: CPU: 1 PID: 5224 at mm/page_alloc.c:860 __del_page_from_free_list+0x1be/0x1e0
>> [  308.987960] Modules linked in: algif_hash(E) af_alg(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) loop(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) ctcm(E) fsm(E) diag288_wdt(E) watchdog(E) zfcp(E) scsi_transport_fc(E) ghash_s390(E) prng(E) aes_s390(E) des_generic(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha_common(E) paes_s390(E) crypto_engine(E) pkey_cca(E) pkey_ep11(E) zcrypt(E) rng_core(E) pkey_pckmo(E) pkey(E) autofs4(E)
>> [  308.988070] Unloaded tainted modules: hmac_s390(E):2
>> [  308.988087] CPU: 1 UID: 0 PID: 5224 Comm: mempig_verify Kdump: loaded Tainted: G        W   E       6.18.0-gcc-bpf-debug #431 PREEMPT
>> [  308.988095] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
>> [  308.988100] Hardware name: IBM 3906 M04 704 (z/VM 7.3.0)
>> [  308.988105] Krnl PSW : 0404f00180000000 00000349976f9e32 (__del_page_from_free_list+0x1c2/0x1e0)
>> [  308.988118]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
>> [  308.988127] Krnl GPRS: 0000034980000004 0000000000000005 0000000000000030 000003499a0e6d88
>> [  308.988133]            0000000000000005 0000034980000005 0000034998d57290 0000023efe6c8300
>> [  308.988139]            0000000000000001 0000000000000008 000002be00000100 000002be803ac000
>> [  308.988144]            0000000000000000 0000000000000001 00000349976f9e2e 000002c99b1eb728
>> [  308.988153] Krnl Code: 00000349976f9e22: c020008a06d9	larl	%r2,000003499883abd4
>>                           00000349976f9e28: c0e5ffe3f89c	brasl	%r14,0000034997378f60
>>                          #00000349976f9e2e: af000000		mc	0,0
>>                          >00000349976f9e32: a7f4ff4e		brc	15,00000349976f9cce
>>                           00000349976f9e36: b904002b		lgr	%r2,%r11
>>                           00000349976f9e3a: c030008a06e7	larl	%r3,000003499883ac08
>>                           00000349976f9e40: c0e5fffdbac8	brasl	%r14,00000349976b13d0
>>                           00000349976f9e46: af000000		mc	0,0
>> [  308.988184] Call Trace:
>> [  308.988188]  [<00000349976f9e32>] __del_page_from_free_list+0x1c2/0x1e0
>> [  308.988195] ([<00000349976f9e2e>] __del_page_from_free_list+0x1be/0x1e0)
>> [  308.988202]  [<00000349976ff946>] rmqueue_bulk+0x706/0x940
>> [  308.988208]  [<00000349976ffd7e>] __rmqueue_pcplist+0x1fe/0x2a0
>> [  308.988214]  [<0000034997700966>] rmqueue.isra.0+0xb46/0xf40
>> [  308.988221]  [<0000034997703ec8>] get_page_from_freelist+0x198/0x8d0
>> [  308.988227]  [<0000034997706fa8>] __alloc_frozen_pages_noprof+0x198/0x400
>> [  308.988233]  [<00000349977536f8>] alloc_pages_mpol+0xb8/0x220
>> [  308.988240]  [<0000034997753bf6>] folio_alloc_mpol_noprof+0x26/0xc0
>> [  308.988247]  [<0000034997753e4c>] vma_alloc_folio_noprof+0x6c/0xa0
>> [  308.988253]  [<0000034997775b22>] vma_alloc_anon_folio_pmd+0x42/0x240
>> [  308.988260]  [<000003499777bfea>] __do_huge_pmd_anonymous_page+0x3a/0x210
>> [  308.988267]  [<00000349976cb08e>] __handle_mm_fault+0x4de/0x500
>> [  308.988273]  [<00000349976cb14c>] handle_mm_fault+0x9c/0x3a0
>> [  308.988279]  [<000003499734d70e>] do_exception+0x1de/0x540
>> [  308.988286]  [<0000034998387390>] __do_pgm_check+0x130/0x220
>> [  308.988293]  [<000003499839a934>] pgm_check_handler+0x114/0x160
>> [  308.988300] 3 locks held by mempig_verify/5224:
>> [  308.988305]  #0: 0000023ea44c1e08 (vm_lock){++++}-{0:0}, at: lock_vma_under_rcu+0xb2/0x2a0
>> [  308.988322]  #1: 0000023ee4d41b18 (&pcp->lock){+.+.}-{2:2}, at: rmqueue.isra.0+0xad6/0xf40
>> [  308.988334]  #2: 0000023efe6c8998 (&zone->lock){..-.}-{2:2}, at: rmqueue_bulk+0x5a/0x940
>> [  308.988346] Last Breaking-Event-Address:
>> [  308.988350]  [<0000034997379096>] __warn_printk+0x136/0x140
>> [  308.988356] irq event stamp: 52330356
>> [  308.988360] hardirqs last  enabled at (52330355): [<000003499838742e>] __do_pgm_check+0x1ce/0x220
>> [  308.988366] hardirqs last disabled at (52330356): [<000003499839932e>] _raw_spin_lock_irqsave+0x9e/0xe0
>> [  308.988373] softirqs last  enabled at (52329882): [<0000034997383786>] handle_softirqs+0x2c6/0x530
>> [  308.988380] softirqs last disabled at (52329859): [<0000034997382f86>] __irq_exit_rcu+0x126/0x140
>> [  308.988388] ---[ end trace 0000000000000000 ]---
>> 
>> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>> Closes: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/
>> Fixes: e0932b6c1f94 ("mm: page_alloc: consolidate free page accounting")
>> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
>
>Hm I guess we haven't seen this before because it's common that
>pageblock_order is just one below MAX_ORDER so we're only merging two
>pageblocks. But your arch/config must be different to expose it. In any case
>LGTM, thanks.

This one was well hidden. I didn't spot it when reading the code.
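
To make the point about pageblock_order concrete, here is a minimal
user-space sketch (not kernel code) of how many pageblocks a buddy of a
given order spans. The helper name pageblocks_in_buddy() is made up for
illustration; the shift is the same one change_pageblock_range() uses in
the patch.

```c
#include <assert.h>

/*
 * Number of whole pageblocks spanned by a free buddy of the given order.
 * Hypothetical helper, for illustration only.
 */
static int pageblocks_in_buddy(int order, int pageblock_order)
{
	assert(order >= 0 && pageblock_order >= 0);
	if (order < pageblock_order)
		return 0;	/* buddy is smaller than one pageblock */
	return 1 << (order - pageblock_order);
}
```

If pageblock_order is one below the maximum order, the only merge at or
above pageblock_order involves a buddy of exactly one pageblock, so
retagging just the first pageblock happens to be enough. With a larger
gap, for example pageblock_order 8 and a maximum order of 10 (assumed
values, chosen only to illustrate), an order-9 buddy spans two
pageblocks and the second one is missed.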

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

>
>Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
>> ---
>>  mm/page_alloc.c | 24 ++++++++++++------------
>>  1 file changed, 12 insertions(+), 12 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index ed82ee55e66a..6e644f2744c2 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -913,6 +913,17 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
>>  			NULL) != NULL;
>>  }
>>  
>> +static void change_pageblock_range(struct page *pageblock_page,
>> +				   int start_order, int migratetype)
>> +{
>> +	int nr_pageblocks = 1 << (start_order - pageblock_order);
>> +
>> +	while (nr_pageblocks--) {
>> +		set_pageblock_migratetype(pageblock_page, migratetype);
>> +		pageblock_page += pageblock_nr_pages;
>> +	}
>> +}
>> +
>>  /*
>>   * Freeing function for a buddy system allocator.
>>   *
>> @@ -999,7 +1010,7 @@ static inline void __free_one_page(struct page *page,
>>  			 * expand() down the line puts the sub-blocks
>>  			 * on the right freelists.
>>  			 */
>> -			set_pageblock_migratetype(buddy, migratetype);
>> +			change_pageblock_range(buddy, order, migratetype);
>>  		}
>>  
>>  		combined_pfn = buddy_pfn & pfn;
>> @@ -2146,17 +2157,6 @@ bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *pag
>>  
>>  #endif /* CONFIG_MEMORY_ISOLATION */
>>  
>> -static void change_pageblock_range(struct page *pageblock_page,
>> -					int start_order, int migratetype)
>> -{
>> -	int nr_pageblocks = 1 << (start_order - pageblock_order);
>> -
>> -	while (nr_pageblocks--) {
>> -		set_pageblock_migratetype(pageblock_page, migratetype);
>> -		pageblock_page += pageblock_nr_pages;
>> -	}
>> -}
>> -
>>  static inline bool boost_watermark(struct zone *zone)
>>  {
>>  	unsigned long max_boost;
>
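
The effect of the one-line change in __free_one_page() can be sketched
in user space as follows. This is only a toy model under stated
assumptions: struct sim_zone, the SIM_* constants, and the retag_*() and
count_mismatched() helpers are all hypothetical stand-ins, with one
array slot per pageblock instead of the real pageblock flags.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for two MIGRATE_* types; values match the log above (0 and 1). */
enum { SIM_MT_A = 0, SIM_MT_B = 1 };

#define SIM_PAGEBLOCK_ORDER	8	/* assumed, for illustration */
#define SIM_NR_PAGEBLOCKS	4

/* Toy zone: one migratetype per pageblock. */
struct sim_zone {
	int mt[SIM_NR_PAGEBLOCKS];
};

/* Old behaviour: only the first pageblock of the buddy is retagged. */
static void retag_first_only(struct sim_zone *z, size_t first_block,
			     int order, int mt)
{
	(void)order;
	z->mt[first_block] = mt;
}

/* Patched behaviour: retag every pageblock the buddy spans. */
static void retag_range(struct sim_zone *z, size_t first_block,
			int order, int mt)
{
	int nr;

	assert(order >= SIM_PAGEBLOCK_ORDER);
	nr = 1 << (order - SIM_PAGEBLOCK_ORDER);
	assert(first_block + (size_t)nr <= SIM_NR_PAGEBLOCKS);
	while (nr--)
		z->mt[first_block++] = mt;
}

/* Count pageblocks in [first, first + nr) whose type differs from mt. */
static int count_mismatched(const struct sim_zone *z, size_t first,
			    int nr, int mt)
{
	int bad = 0;

	while (nr--)
		if (z->mt[first++] != mt)
			bad++;
	return bad;
}
```

Take a zone whose pageblocks are tagged { 1, 1, 0, 0 } and free an
order-9 page of type 1 at block 0, so that it merges with the order-9
buddy at block 2. After retag_first_only() block 3 still reads 0, which
is the kind of mismatch the "page type is 0, passed migratetype is 1"
warnings report. After retag_range() all four blocks agree.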

-- 
Wei Yang
Help you, Help me

