linux-mm.kvack.org archive mirror
From: Zi Yan <ziy@nvidia.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: <linux-mm@kvack.org>, <akpm@linux-foundation.org>,
	<vbabka@suse.cz>, <chrisl@kernel.org>, <kasong@tencent.com>,
	<hughd@google.com>, <ryncsn@gmail.com>, <stable@vger.kernel.org>,
	David Hildenbrand <david@kernel.org>, <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>, <mhocko@suse.com>,
	<hannes@cmpxchg.org>, <jackmanb@google.com>,
	Kairui Song <ryncsn@gmail.com>
Subject: Re: [PATCH] mm/page_alloc: clear page->private in split_page() for tail pages
Date: Fri, 06 Feb 2026 15:49:37 -0500
Message-ID: <4C3D8E3E-D9D6-4475-A122-FA0D930D7DAD@nvidia.com>
In-Reply-To: <7C7CDFE7-914C-46CE-A127-B7D34304C166@nvidia.com>

On 6 Feb 2026, at 14:58, Zi Yan wrote:

> On 6 Feb 2026, at 13:33, Zi Yan wrote:
>
>> Hit send too soon, sorry about that.
>>
>> On 6 Feb 2026, at 13:29, Zi Yan wrote:
>>
>>> On 6 Feb 2026, at 13:21, Mikhail Gavrilov wrote:
>>>
>>>> Hi, Yan
>>>>
>>>> On Fri, Feb 6, 2026 at 11:08 PM Zi Yan <ziy@nvidia.com> wrote:
>>>>>
>>>>> Do you have a reproducer for this issue?
>>>>
>>>> Yes, I have a stress test that reliably reproduces the crash.
>>>> It cycles swapon/swapoff on 8GB zram under memory pressure:
>>>> https://gist.github.com/NTMan/4ed363793ebd36bd702a39283f06cee1
>>
>> Got it.
>>
>> Merging replies from Kairui from another email:
>>
>> This patch is from previous discussion:
>> https://lore.kernel.org/linux-mm/CABXGCsO3XcXt5GDae7d74ynC6P6G2gLw3ZrwAYvSQ3PwP0mGXA@mail.gmail.com/
>>
>> It looks odd to me too. That bug starts with vmalloc dropping
>> __GFP_COMP in commit 3b8000ae185c: with __GFP_COMP, the allocator
>> does clear ->private on the tail pages at allocation time via
>> prep_compound_page(). Without __GFP_COMP, those ->private fields are
>> left as they are.
>>
>>>>
>>>>> Last time I checked page->private usage, I found that users clear ->private before freeing a page.
>>>>> I wonder which one I was missing.
>>>>
>>>> The issue is not about freeing - it's about allocation.
>>
>> I assume everyone zeros a ->private they have used, head or tail, so
>> pages entering the buddy allocator all have zeroed ->private.
>>
>>>> When buddy allocator merges/splits pages, it uses page->private to store order.
>>>> When a high-order page is later allocated and split via split_page(),
>>>> tail pages still have their old page->private values.
>>
>> No, in __free_one_page(), if a free page is merged to a higher order,
>> it is deleted from the free list and its ->private is zeroed. There should
>> not be any non-zero ->private.
>>
>>>> The path is:
>>>> 1. Page freed → free_pages_prepare() does NOT clear page->private
>>
>> Right. The code assumes page->private is zero for all pages: the head,
>> and the tails if it is a compound page.
>>
>>>> 2. Page goes to buddy allocator → buddy uses page->private for order
>>>> 3. Page allocated as high-order → post_alloc_hook() only clears head
>>>> page's private
>>>> 4. split_page() called → tail pages keep stale page->private
>>>>
>>>>> Clearing ->private in split_page() looks like a hack instead of a fix.
>>>>
>>>> I discussed this with Kairui Song earlier in the thread. We considered:
>>>>
>>>> 1. Fix in post_alloc_hook() - would need to clear all pages, not just head
>>>> 2. Fix in swapfile.c - doesn't work because stale value could
>>>> accidentally equal SWP_CONTINUED
>>>> 3. Fix in split_page() - ensures pages are properly initialized for
>>>> independent use
>>>>
>>>> The comment in vmalloc.c says split pages should be usable
>>>> independently ("some use page->mapping, page->lru, etc."), so
>>>> split_page() initializing the pages seems appropriate.
>>>>
>>>> But I agree post_alloc_hook() might be a cleaner place. Would you
>>>> prefer a patch there instead?
>>
>> I think it is better to find out which code causes a non-zero ->private
>> at page free time.
>
> Hi Mikhail,
>
> Do you mind sharing the kernel config? I am trying to reproduce it locally
> but have had no luck (iteration 111 and counting) so far.
>

It seems that I reproduced it locally after enabling KASAN, and page owner
seems to show that KASAN code is causing the issue. I added the patch
below to call dump_page() and dump_stack() whenever a page being freed has
a non-zero ->private. It is on top of 6.19-rc7.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cbf758e27aa2..2151c847c35d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1402,6 +1402,10 @@ __always_inline bool free_pages_prepare(struct page *page,
 #endif
                }
                for (i = 1; i < (1 << order); i++) {
+                       if ((page + i)->private) {
+                               dump_page(page + i, "non zero private");
+                               dump_stack();
+                       }
                        if (compound)
                                bad += free_tail_page_prepare(page, page + i);
                        if (is_check_pages_enabled()) {

The kernel dump below shows that the page with the non-zero ->private was
both allocated from kasan_save_stack() and freed from kasan_save_stack().

So fix kasan instead? ;)

qemu-vm login: [   59.753874] zram: Added device: zram0
[   61.112878] zram0: detected capacity change from 0 to 16777216
[   61.131201] Adding 8388604k swap on /dev/zram0.  Priority:100 extents:1 across:8388604k SS
[   71.001984] zram0: detected capacity change from 16777216 to 0
[   71.089084] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff888131a9da00 pfn:0x131a9d
[   71.090751] flags: 0x100000000000000(node=0|zone=2)
[   71.091643] raw: 0100000000000000 dead000000000100 dead000000000122 0000000000000000
[   71.092913] raw: ffff888131a9da00 0000000000100000 00000000ffffffff 0000000000000000
[   71.094336] page dumped because: non zero private
[   71.095000] page_owner tracks the page as allocated
[   71.095871] page last allocated via order 2, migratetype Unmovable, gfp_mask 0x92cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_NOMEMALLOC), pid 834, tgid 834 (rmmod), ts 71089064250, free_ts 67872485904
[   71.099315]  get_page_from_freelist+0x79b/0x3fa0
[   71.100216]  __alloc_frozen_pages_noprof+0x245/0x2160
[   71.101177]  alloc_pages_mpol+0x14c/0x360
[   71.101704]  alloc_pages_noprof+0xfa/0x320
[   71.102497]  stack_depot_save_flags+0x81c/0x8e0
[   71.103100]  kasan_save_stack+0x3f/0x50
[   71.103624]  kasan_save_track+0x17/0x60
[   71.104168]  __kasan_slab_alloc+0x63/0x80
[   71.104694]  kmem_cache_alloc_lru_noprof+0x143/0x550
[   71.105385]  __d_alloc+0x2f/0x850
[   71.105831]  d_alloc_parallel+0xcd/0xc50
[   71.106395]  __lookup_slow+0xec/0x320
[   71.106880]  lookup_slow+0x4f/0x80
[   71.107463]  lookup_noperm_positive_unlocked+0x7d/0xb0
[   71.108173]  debugfs_lookup+0x74/0xe0
[   71.108660]  debugfs_lookup_and_remove+0xa/0x70
[   71.109363] page last free pid 808 tgid 808 stack trace:
[   71.110058]  register_dummy_stack+0x6d/0xb0
[   71.110749]  init_page_owner+0x2e/0x680
[   71.111296]  page_ext_init+0x485/0x4b0
[   71.111902]  mm_core_init+0x157/0x170
[   71.112422] CPU: 1 UID: 0 PID: 834 Comm: rmmod Not tainted 6.19.0-rc7-dirty #201 PREEMPT(voluntary)
[   71.112427] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
[   71.112431] Call Trace:
[   71.112434]  <TASK>
[   71.112436]  dump_stack_lvl+0x4d/0x70
[   71.112441]  __free_frozen_pages+0xef3/0x1100
[   71.112444]  stack_depot_save_flags+0x4d6/0x8e0
[   71.112447]  ? __d_alloc+0x2f/0x850
[   71.112450]  kasan_save_stack+0x3f/0x50
[   71.112454]  ? kasan_save_stack+0x30/0x50
[   71.112458]  ? kasan_save_track+0x17/0x60
[   71.112461]  ? __kasan_slab_alloc+0x63/0x80
[   71.112464]  ? kmem_cache_alloc_lru_noprof+0x143/0x550
[   71.112469]  ? __d_alloc+0x2f/0x850
[   71.112473]  ? d_alloc_parallel+0xcd/0xc50
[   71.112477]  ? __lookup_slow+0xec/0x320
[   71.112480]  ? lookup_slow+0x4f/0x80
[   71.112484]  ? lookup_noperm_positive_unlocked+0x7d/0xb0
[   71.112488]  ? debugfs_lookup+0x74/0xe0
[   71.112492]  ? debugfs_lookup_and_remove+0xa/0x70
[   71.112495]  ? kmem_cache_destroy+0xbe/0x1a0
[   71.112500]  ? zs_destroy_pool+0x145/0x200 [zsmalloc]
[   71.112506]  ? zram_reset_device+0x210/0x5e0 [zram]
[   71.112514]  ? zram_remove.part.0.cold+0x8f/0x37f [zram]
[   71.112522]  ? idr_for_each+0x10b/0x200
[   71.112526]  ? destroy_devices+0x21/0x57 [zram]
[   71.112533]  ? __do_sys_delete_module+0x33f/0x500
[   71.112537]  ? do_syscall_64+0xa4/0xf80
[   71.112541]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   71.112548]  kasan_save_track+0x17/0x60
[   71.112551]  __kasan_slab_alloc+0x63/0x80
[   71.112556]  kmem_cache_alloc_lru_noprof+0x143/0x550
[   71.112560]  ? kernfs_put.part.0+0x14d/0x340
[   71.112564]  __d_alloc+0x2f/0x850
[   71.112568]  ? destroy_devices+0x21/0x57 [zram]
[   71.112575]  ? __do_sys_delete_module+0x33f/0x500
[   71.112579]  d_alloc_parallel+0xcd/0xc50
[   71.112583]  ? __pfx_d_alloc_parallel+0x10/0x10
[   71.112586]  __lookup_slow+0xec/0x320
[   71.112590]  ? __pfx___lookup_slow+0x10/0x10
[   71.112594]  ? down_read+0x132/0x240
[   71.112598]  ? __pfx_down_read+0x10/0x10
[   71.112601]  ? __d_lookup+0x17b/0x1e0
[   71.112605]  lookup_slow+0x4f/0x80
[   71.112610]  lookup_noperm_positive_unlocked+0x7d/0xb0
[   71.112614]  debugfs_lookup+0x74/0xe0
[   71.112618]  debugfs_lookup_and_remove+0xa/0x70
[   71.112623]  kmem_cache_destroy+0xbe/0x1a0
[   71.112626]  zs_destroy_pool+0x145/0x200 [zsmalloc]
[   71.112631]  ? __pfx_zram_remove_cb+0x10/0x10 [zram]
[   71.112638]  zram_reset_device+0x210/0x5e0 [zram]
[   71.112645]  ? __pfx_zram_remove_cb+0x10/0x10 [zram]
[   71.112651]  ? __pfx_zram_remove_cb+0x10/0x10 [zram]
[   71.112658]  zram_remove.part.0.cold+0x8f/0x37f [zram]
[   71.112665]  ? __pfx_zram_remove_cb+0x10/0x10 [zram]
[   71.112671]  idr_for_each+0x10b/0x200
[   71.112675]  ? kasan_save_track+0x25/0x60
[   71.112678]  ? __pfx_idr_for_each+0x10/0x10
[   71.112681]  ? kfree+0x16e/0x490
[   71.112685]  destroy_devices+0x21/0x57 [zram]
[   71.112692]  __do_sys_delete_module+0x33f/0x500
[   71.112696]  ? __pfx___do_sys_delete_module+0x10/0x10
[   71.112702]  do_syscall_64+0xa4/0xf80
[   71.112706]  entry_SYSCALL_64_after_hwframe+0x77/0x7f



Best Regards,
Yan, Zi



Thread overview: 42+ messages
     [not found] <CABXGCs03XcXt5GDae7d74ynC6P6G2gLw3ZrwAYvSQ3PwP0mGXA@mail.gmail.com>
2026-02-06 17:40 ` Mikhail Gavrilov
2026-02-06 18:08   ` Zi Yan
2026-02-06 18:21     ` Mikhail Gavrilov
2026-02-06 18:29       ` Zi Yan
2026-02-06 18:33         ` Zi Yan
2026-02-06 19:58           ` Zi Yan
2026-02-06 20:49             ` Zi Yan [this message]
2026-02-06 22:16               ` Mikhail Gavrilov
2026-02-06 22:37                 ` Mikhail Gavrilov
2026-02-06 23:06                   ` Zi Yan
2026-02-07  3:28                     ` Zi Yan
2026-02-07 14:25                       ` Mikhail Gavrilov
2026-02-07 14:32                         ` Zi Yan
2026-02-07 15:03                           ` Mikhail Gavrilov
2026-02-07 15:06                             ` Zi Yan
2026-02-07 15:37                               ` [PATCH v2] mm/page_alloc: clear page->private in free_pages_prepare() Mikhail Gavrilov
2026-02-07 16:12                                 ` Zi Yan
2026-02-07 17:36                                   ` [PATCH v3] " Mikhail Gavrilov
2026-02-07 22:02                                     ` David Hildenbrand (Arm)
2026-02-07 22:08                                       ` David Hildenbrand (Arm)
2026-02-09 11:17                                         ` Vlastimil Babka
2026-02-09 15:46                                           ` David Hildenbrand (Arm)
2026-02-09 16:00                                             ` Zi Yan
2026-02-09 16:03                                               ` David Hildenbrand (Arm)
2026-02-09 16:05                                                 ` Zi Yan
2026-02-09 16:06                                                   ` David Hildenbrand (Arm)
2026-02-09 16:08                                                     ` Zi Yan
2026-02-07 23:00                                       ` Zi Yan
2026-02-09 16:16                                         ` David Hildenbrand (Arm)
2026-02-09 16:20                                           ` David Hildenbrand (Arm)
2026-02-09 16:33                                             ` Zi Yan
2026-02-09 17:36                                               ` David Hildenbrand (Arm)
2026-02-09 17:44                                                 ` Zi Yan
2026-02-09 19:39                                                   ` David Hildenbrand (Arm)
2026-02-09 19:42                                                     ` Zi Yan
2026-02-10  1:20                                                       ` Baolin Wang
2026-02-10  2:12                                                         ` Zi Yan
2026-02-10  2:25                                                           ` Baolin Wang
2026-02-10  2:32                                                             ` Zi Yan
2026-02-09 19:46                                     ` David Hildenbrand (Arm)
2026-02-09 11:11                                 ` [PATCH v2] " Vlastimil Babka
2026-02-06 18:24     ` [PATCH] mm/page_alloc: clear page->private in split_page() for tail pages Kairui Song
