From: "Marc Hartmayer" <mhartmay@linux.ibm.com>
To: Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
linux-mm@kvack.org, linux-s390@vger.kernel.org,
Heiko Carstens <hca@linux.ibm.com>
Subject: Re: page type is 0, migratetype passed is 2 (nr=256)
Date: Tue, 20 May 2025 12:23:42 +0200 [thread overview]
Message-ID: <87zff7r369.fsf@linux.ibm.com> (raw)
In-Reply-To: <20250512171429.GB615800@cmpxchg.org>
On Mon, May 12, 2025 at 01:14 PM -0400, Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Mon, May 12, 2025 at 12:35:39PM -0400, Zi Yan wrote:
>> On 12 May 2025, at 12:16, Lorenzo Stoakes wrote:
>>
>> > +cc Zi
>> >
>> > Hi Marc,
>> >
>> > I noticed this same bug as reported in [0], but only for a _very_ recent
>> > patch series by Zi, which is only present in mm-new, which is the most
>> > unstable mm branch right now :)
>> >
>> > So I wonder if related or a coincidence caused by something else?
>>
>> Unless Marc's branch has my "make MIGRATE_ISOLATE a standalone bit" patchset,
>> it should be caused by something else.
>>
>> A bisect would be very helpful.
>>
>> >
>> > This is triggered by the mm self-test (in tools/testing/selftests/mm, you
>> > can just make -jXX there) transhuge-stress, invoked as:
>> >
>> > $ sudo ./transhuge-stress -d 20
>> >
>> > The stack traces do look very different though so perhaps unrelated?
>>
>> The warning is triggered, in both cases, when a pageblock with MIGRATE_UNMOVABLE(0)
>> is moved to MIGRATE_RECLAIMABLE(2). The pageblock is supposed to already have
>> MIGRATE_RECLAIMABLE(2) before the move.
>
> The weird thing is that the warning is from expand(), when the broken
> up chunks are put *back*. Marc, can you confirm that this is the only
> warning in dmesg, and there aren't any before this one?
Yep, I’ve just checked: it was the first warning, and `panic_on_warn` is
set to 1.
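For reference, the warning itself comes from a consistency check in the
free-list bookkeeping: every page put on a buddy free list is expected to
sit in a pageblock whose migratetype matches that list. A rough sketch of
the check, simplified from the mm/page_alloc.c code in this kernel (the
exact shape may differ between versions):

/*
 * Simplified sketch only -- see __add_to_free_list() and friends in
 * mm/page_alloc.c for the real thing.
 */
static inline void __add_to_free_list(struct page *page, struct zone *zone,
				      unsigned int order, int migratetype,
				      bool tail)
{
	struct free_area *area = &zone->free_area[order];

	/* MIGRATE_UNMOVABLE == 0, MIGRATE_MOVABLE == 1, MIGRATE_RECLAIMABLE == 2 */
	VM_WARN_ONCE(get_pageblock_migratetype(page) != migratetype,
		     "page type is %lu, passed migratetype is %d (nr=%d)\n",
		     get_pageblock_migratetype(page), migratetype, 1 << order);

	if (tail)
		list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
	else
		list_add(&page->buddy_list, &area->free_list[migratetype]);
	area->nr_free++;
}

So in the trace below, a pageblock still marked MIGRATE_UNMOVABLE(0) is
being put back on the MIGRATE_MOVABLE(1) free list while expand() splits a
larger buddy.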
I managed to reproduce a similar crash using 6.15.0-rc7 (this time THP
seems to be involved):
…
root@qemus390x:~# [ 40.442403] ------------[ cut here ]------------
[ 40.442471] page type is 0, passed migratetype is 1 (nr=256)
[ 40.442525] WARNING: CPU: 0 PID: 350 at mm/page_alloc.c:669 expand (mm/page_alloc.c:669 (discriminator 2) mm/page_alloc.c:1572 (discriminator 2))
[ 40.442558] Modules linked in: pkey_pckmo(E) pkey(E) diag288_wdt(E) watchdog(E) s390_trng(E) virtio_console(E) rng_core(E) vmw_vsock_virtio_transport(E) vmw_vsock_virtio_transport_common(E) vsock(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha256_s390(E) sha1_s390(E) sha_common(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) vfio(E) sch_fq_codel(E) drm(E) i2c_core(E) drm_panel_orientation_quirks(E) nfnetlink(E) autofs4(E)
[ 40.442651] Unloaded tainted modules: hmac_s390(E):1
[ 40.442677] CPU: 0 UID: 0 PID: 350 Comm: mempig_verify Tainted: G E 6.15.0-rc7-11557-ga01c92c55b53 #1 PREEMPT
[ 40.442683] Tainted: [E]=UNSIGNED_MODULE
[ 40.442687] Hardware name: IBM 3931 A01 701 (KVM/Linux)
[ 40.442692] Krnl PSW : 0404d00180000000 000002ff929af40c expand (mm/page_alloc.c:669 (discriminator 10) mm/page_alloc.c:1572 (discriminator 10))
[ 40.442696] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 40.442699] Krnl GPRS: 000002ff80000004 0000000000000005 0000000000000030 0000000000000000
[ 40.442701] 0000000000000005 0000027f80000005 0000000000000100 0000000000000008
[ 40.442703] 000002ff93f99290 000001f63a415900 0000027500000008 00000275829f4000
[ 40.442704] 0000000000000000 0000000000000008 000002ff929af408 0000027f928c36f8
[ 40.442722] Krnl Code: 000002ff929af3fc: c02000883f4b larl %r2,000002ff93ab7292
Code starting with the faulting instruction
===========================================
[ 40.442722] 000002ff929af402: c0e5ffe7bd17 brasl %r14,000002ff926a6e30
[ 40.442722] #000002ff929af408: af000000 mc 0,0
[ 40.442722] >000002ff929af40c: a7f4ff49 brc 15,000002ff929af29e
[ 40.442722] 000002ff929af410: b904002b lgr %r2,%r11
[ 40.442722] 000002ff929af414: c03000881980 larl %r3,000002ff93ab2714
[ 40.442722] 000002ff929af41a: c0e5fffdd883 brasl %r14,000002ff9296a520
[ 40.442722] 000002ff929af420: af000000 mc 0,0
[ 40.442736] Call Trace:
[ 40.442738] expand (mm/page_alloc.c:669 (discriminator 10) mm/page_alloc.c:1572 (discriminator 10))
[ 40.442741] expand (mm/page_alloc.c:669 (discriminator 2) mm/page_alloc.c:1572 (discriminator 2))
[ 40.442743] rmqueue_bulk (mm/page_alloc.c:1587 mm/page_alloc.c:1758 mm/page_alloc.c:2311 mm/page_alloc.c:2364)
[ 40.442745] __rmqueue_pcplist (mm/page_alloc.c:3086)
[ 40.442748] rmqueue.isra.0 (mm/page_alloc.c:3124 mm/page_alloc.c:3155)
[ 40.442751] get_page_from_freelist (mm/page_alloc.c:3683)
[ 40.442754] __alloc_frozen_pages_noprof (mm/page_alloc.c:4967 (discriminator 1))
[ 40.442756] alloc_pages_mpol (mm/mempolicy.c:2290)
[ 40.442764] folio_alloc_mpol_noprof (mm/mempolicy.c:2322)
[ 40.442766] vma_alloc_folio_noprof (mm/mempolicy.c:2355 (discriminator 1))
[ 40.442769] vma_alloc_anon_folio_pmd (mm/huge_memory.c:1167 (discriminator 1))
[ 40.442773] __do_huge_pmd_anonymous_page (mm/huge_memory.c:1227 (discriminator 1))
[ 40.442775] __handle_mm_fault (mm/memory.c:5862 mm/memory.c:6111)
[ 40.442781] handle_mm_fault (mm/memory.c:6321)
[ 40.442783] do_exception (arch/s390/mm/fault.c:298)
[ 40.442792] __do_pgm_check (arch/s390/kernel/traps.c:345)
[ 40.442802] pgm_check_handler (arch/s390/kernel/entry.S:334)
[ 40.442805] Last Breaking-Event-Address:
[ 40.442806] __warn_printk (kernel/panic.c:801)
[ 40.442818] Kernel panic - not syncing: kernel: panic_on_warn set ...
[ 40.442822] CPU: 0 UID: 0 PID: 350 Comm: mempig_verify Tainted: G E 6.15.0-rc7-11557-ga01c92c55b53 #1 PREEMPT
[ 40.442825] Tainted: [E]=UNSIGNED_MODULE
[ 40.442826] Hardware name: IBM 3931 A01 701 (KVM/Linux)
[ 40.442827] Call Trace:
[ 40.442828] dump_stack_lvl (lib/dump_stack.c:122)
[ 40.442831] panic (kernel/panic.c:372)
[ 40.442833] check_panic_on_warn (kernel/panic.c:247)
[ 40.442836] __warn (kernel/panic.c:751)
[ 40.443057] report_bug (lib/bug.c:176 lib/bug.c:215)
[ 40.443064] monitor_event_exception (arch/s390/kernel/traps.c:227 (discriminator 1))
[ 40.443067] __do_pgm_check (arch/s390/kernel/traps.c:345)
[ 40.443071] pgm_check_handler (arch/s390/kernel/entry.S:334)
[ 40.443074] expand (mm/page_alloc.c:669 (discriminator 10) mm/page_alloc.c:1572 (discriminator 10))
[ 40.443077] expand (mm/page_alloc.c:669 (discriminator 2) mm/page_alloc.c:1572 (discriminator 2))
[ 40.443080] rmqueue_bulk (mm/page_alloc.c:1587 mm/page_alloc.c:1758 mm/page_alloc.c:2311 mm/page_alloc.c:2364)
[ 40.443087] __rmqueue_pcplist (mm/page_alloc.c:3086)
[ 40.443090] rmqueue.isra.0 (mm/page_alloc.c:3124 mm/page_alloc.c:3155)
[ 40.443093] get_page_from_freelist (mm/page_alloc.c:3683)
[ 40.443097] __alloc_frozen_pages_noprof (mm/page_alloc.c:4967 (discriminator 1))
[ 40.443100] alloc_pages_mpol (mm/mempolicy.c:2290)
[ 40.443104] folio_alloc_mpol_noprof (mm/mempolicy.c:2322)
[ 40.443110] vma_alloc_folio_noprof (mm/mempolicy.c:2355 (discriminator 1))
[ 40.443114] vma_alloc_anon_folio_pmd (mm/huge_memory.c:1167 (discriminator 1))
[ 40.443117] __do_huge_pmd_anonymous_page (mm/huge_memory.c:1227 (discriminator 1))
[ 40.443120] __handle_mm_fault (mm/memory.c:5862 mm/memory.c:6111)
[ 40.443123] handle_mm_fault (mm/memory.c:6321)
[ 40.443126] do_exception (arch/s390/mm/fault.c:298)
[ 40.443129] __do_pgm_check (arch/s390/kernel/traps.c:345)
[ 40.443132] pgm_check_handler (arch/s390/kernel/entry.S:334)
This time, the setup is even simpler:
1. Start a 2 GB QEMU/KVM guest.
2. Run a memory stress test inside the guest.
I run this test in a loop (starting and shutting down the VM each time),
and after many iterations the bug occurs.
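For completeness, the guest workload is nothing special; something along
the lines of the sketch below (a hypothetical minimal stand-in, not the
actual mempig_verify tool) is enough to drive the kind of anonymous THP
faults seen in the trace above:

/* Hypothetical stand-in for the memory stress step: map a large
 * anonymous region, ask for THP backing, and dirty/verify it in a
 * loop so the guest keeps taking huge page faults. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = (size_t)1536 << 20;	/* ~1.5 GiB in a 2 GiB guest */
	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	madvise(buf, len, MADV_HUGEPAGE);	/* best effort, may be a no-op */

	for (unsigned int pass = 0; ; pass++) {
		memset(buf, pass & 0xff, len);		/* fault in and dirty */
		for (size_t i = 0; i < len; i += 4096)	/* verify */
			if (buf[i] != (unsigned char)(pass & 0xff))
				abort();
	}
	return 0;
}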
[…snip…]