linux-mm.kvack.org archive mirror
From: Zi Yan <ziy@nvidia.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, chrisl@kernel.org,
	kasong@tencent.com, hughd@google.com, stable@vger.kernel.org,
	David Hildenbrand <david@kernel.org>,
	surenb@google.com, Matthew Wilcox <willy@infradead.org>,
	mhocko@suse.com, hannes@cmpxchg.org, jackmanb@google.com,
	vbabka@suse.cz, Kairui Song <ryncsn@gmail.com>
Subject: Re: [PATCH] mm/page_alloc: clear page->private in split_page() for tail pages
Date: Sat, 07 Feb 2026 09:32:12 -0500	[thread overview]
Message-ID: <247E7FE9-E089-43D1-882B-81C7134C2FFE@nvidia.com> (raw)
In-Reply-To: <CABXGCsNyt6DB=SX9JWD=-WK_BiHhbXaCPNV-GOM8GskKJVAn_A@mail.gmail.com>

On 7 Feb 2026, at 9:25, Mikhail Gavrilov wrote:

> On Sat, Feb 7, 2026 at 8:28 AM Zi Yan <ziy@nvidia.com> wrote:
>>
>> OK, it seems that both slub and shmem do not reset ->private when freeing
>> pages/folios. A tail page's private is non-zero because, when a page with
>> non-zero private is freed and gets merged with a lower buddy, nothing in
>> that code path sets its private back to 0.
>>
>> The patch below seems to fix the issue: I am at iteration 104 and counting.
>> I also put a VM_BUG_ON(page->private) in free_pages_prepare() and it has not
>> triggered either.
>>
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index ec6c01378e9d..546e193ef993 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -2437,8 +2437,10 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>>  failed_nolock:
>>         if (skip_swapcache)
>>                 swapcache_clear(si, folio->swap, folio_nr_pages(folio));
>> -       if (folio)
>> +       if (folio) {
>> +               folio->swap.val = 0;
>>                 folio_put(folio);
>> +       }
>>         put_swap_device(si);
>>
>>         return error;
>> diff --git a/mm/slub.c b/mm/slub.c
>> index f77b7407c51b..2cdab6d66e1a 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3311,6 +3311,7 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
>>
>>         __slab_clear_pfmemalloc(slab);
>>         page->mapping = NULL;
>> +       page->private = 0;
>>         __ClearPageSlab(page);
>>         mm_account_reclaimed_pages(pages);
>>         unaccount_slab(slab, order, s);
>>
>>
>>
>> But I am not sure that is all. Maybe the patch below, on top, is needed to find all
>> violators while still keeping the system running. I would also like to hear from others
>> on whether page->private should be reset or not before free_pages_prepare().
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index cbf758e27aa2..9058f94b0667 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1430,6 +1430,8 @@ __always_inline bool free_pages_prepare(struct page *page,
>>
>>         page_cpupid_reset_last(page);
>>         page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
>> +       VM_WARN_ON_ONCE(page->private);
>> +       page->private = 0;
>>         reset_page_owner(page, order);
>>         page_table_check_free(page, order);
>>         pgalloc_tag_sub(page, 1 << order);
>>
>>
>> --
>> Best Regards,
>> Yan, Zi
>
> I tested your patch. The VM_WARN_ON_ONCE caught another violator - TTM
> (GPU memory manager):

Thanks. As a fix, I think we could either combine the two patches above into one and
drop the VM_WARN_ON_ONCE(), or just send the second one without the VM_WARN_ON_ONCE().
I can send a separate patch later that fixes all users that do not reset ->private
and reintroduces the VM_WARN_ON_ONCE().

WDYT?

>  ------------[ cut here ]------------
>  WARNING: mm/page_alloc.c:1433 at __free_pages_ok+0xe1e/0x12c0,
> CPU#16: gnome-shell/5841
>  Modules linked in: overlay uinput rfcomm snd_seq_dummy snd_hrtimer
> xt_mark xt_cgroup xt_MASQUERADE ip6t_REJECT ipt_REJECT nft_compat
> nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr uhid bnep sunrpc amd_atl
> intel_rapl_msr intel_rapl_common mt7921e mt7921_common mt792x_lib
> mt76_connac_lib btusb mt76 btmtk btrtl btbcm btintel vfat edac_mce_amd
> spd5118 bluetooth fat snd_hda_codec_atihdmi asus_ec_sensors mac80211
> snd_hda_codec_hdmi kvm_amd snd_hda_intel uvcvideo snd_usb_audio
> snd_hda_codec uvc videobuf2_vmalloc kvm videobuf2_memops snd_hda_core
> joydev videobuf2_v4l2 snd_intel_dspcfg videobuf2_common
> snd_usbmidi_lib videodev snd_intel_sdw_acpi snd_ump irqbypass
> snd_hwdep asus_nb_wmi mc snd_rawmidi rapl snd_seq asus_wmi cfg80211
> sparse_keymap snd_seq_device platform_profile wmi_bmof pcspkr snd_pcm
> snd_timer rfkill igc snd
>   libarc4 i2c_piix4 soundcore k10temp i2c_smbus gpio_amdpt
> gpio_generic nfnetlink zram lz4hc_compress lz4_compress amdgpu amdxcp
> i2c_algo_bit drm_ttm_helper ttm drm_exec drm_panel_backlight_quirks
> gpu_sched drm_suballoc_helper nvme video nvme_core drm_buddy
> ghash_clmulni_intel drm_display_helper nvme_keyring nvme_auth cec
> sp5100_tco hkdf wmi uas usb_storage fuse ntsync i2c_dev
>  CPU: 16 UID: 1000 PID: 5841 Comm: gnome-shell Tainted: G        W
>       6.19.0-rc8-f14faaf3a1fb-with-fix-reset-private-when-freeing+ #82
> PREEMPT(lazy)
>  Tainted: [W]=WARN
>  Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING
> WIFI, BIOS 3602 11/13/2025
>  RIP: 0010:__free_pages_ok+0xe1e/0x12c0
>  Code: ef 48 89 c6 e8 f3 59 ff ff 83 44 24 20 01 49 ba 00 00 00 00 00
> fc ff df e9 71 fe ff ff 41 c7 45 30 ff ff ff ff e9 f5 f4 ff ff <0f> 0b
> e9 73 f5 ff ff e8 86 4c 0e 00 e9 02 fb ff ff 48 c7 44 24 30
>  RSP: 0018:ffffc9000e0cf878 EFLAGS: 00010206
>  RAX: dffffc0000000000 RBX: 0000000000000f80 RCX: 1ffffd40028c6000
>  RDX: 1ffffd40028c6005 RSI: 0000000000000004 RDI: ffffea0014630038
>  RBP: ffffea0014630028 R08: ffffffff9e58e2de R09: 1ffffd40028c6006
>  R10: fffff940028c6007 R11: fffff940028c6007 R12: ffffffffa27376d8
>  R13: ffffea0014630000 R14: ffff889054e559c0 R15: 0000000000000000
>  FS:  00007f510f914000(0000) GS:ffff8890317a8000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00005607eaf70168 CR3: 00000001dfd6a000 CR4: 0000000000f50ef0
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ttm_pool_unmap_and_free+0x30c/0x520 [ttm]
>   ? dma_resv_iter_first_unlocked+0x2f9/0x470
>   ttm_pool_free_range+0xef/0x160 [ttm]
>   ? __pfx_drm_gem_close_ioctl+0x10/0x10
>   ttm_pool_free+0x70/0xe0 [ttm]
>   ? rcu_is_watching+0x15/0xe0
>   ttm_tt_unpopulate+0xa2/0x2d0 [ttm]
>   ttm_bo_cleanup_memtype_use+0xec/0x200 [ttm]
>   ttm_bo_release+0x371/0xb00 [ttm]
>   ? __pfx_ttm_bo_release+0x10/0x10 [ttm]
>   ? drm_vma_node_revoke+0x1a/0x1e0
>   ? local_clock+0x15/0x30
>   ? __pfx_drm_gem_close_ioctl+0x10/0x10
>   drm_gem_object_release_handle+0xcd/0x1f0
>   drm_gem_handle_delete+0x6a/0xc0
>   ? drm_dev_exit+0x35/0x50
>   drm_ioctl_kernel+0x172/0x2e0
>   ? __lock_release.isra.0+0x1a2/0x370
>   ? __pfx_drm_ioctl_kernel+0x10/0x10
>   drm_ioctl+0x571/0xb50
>   ? __pfx_drm_gem_close_ioctl+0x10/0x10
>   ? __pfx_drm_ioctl+0x10/0x10
>   ? rcu_is_watching+0x15/0xe0
>   ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
>   ? trace_hardirqs_on+0x18/0x140
>   ? lockdep_hardirqs_on+0x90/0x130
>   ? __raw_spin_unlock_irqrestore+0x5d/0x80
>   ? __raw_spin_unlock_irqrestore+0x46/0x80
>   amdgpu_drm_ioctl+0xd3/0x190 [amdgpu]
>   __x64_sys_ioctl+0x13c/0x1d0
>   ? syscall_trace_enter+0x15c/0x2a0
>   do_syscall_64+0x9c/0x4e0
>   ? __lock_release.isra.0+0x1a2/0x370
>   ? do_user_addr_fault+0x87a/0xf60
>   ? fpregs_assert_state_consistent+0x8f/0x100
>   ? trace_hardirqs_on_prepare+0x101/0x140
>   ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
>   ? irqentry_exit+0x99/0x600
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>  RIP: 0033:0x7f5113af889d
>  Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00
> 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2
> 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
>  RSP: 002b:00007fff83c100c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>  RAX: ffffffffffffffda RBX: 00005607ed127c50 RCX: 00007f5113af889d
>  RDX: 00007fff83c10150 RSI: 0000000040086409 RDI: 000000000000000e
>  RBP: 00007fff83c10110 R08: 00005607ead46d50 R09: 0000000000000000
>  R10: 0000000000000031 R11: 0000000000000246 R12: 00007fff83c10150
>  R13: 0000000040086409 R14: 000000000000000e R15: 00005607ead46d50
>   </TASK>
>  irq event stamp: 5186663
>  hardirqs last  enabled at (5186669): [<ffffffff9dc9ce6e>]
> __up_console_sem+0x7e/0x90
>  hardirqs last disabled at (5186674): [<ffffffff9dc9ce53>]
> __up_console_sem+0x63/0x90
>  softirqs last  enabled at (5186538): [<ffffffff9da5325b>]
> handle_softirqs+0x54b/0x810
>  softirqs last disabled at (5186531): [<ffffffff9da53654>]
> __irq_exit_rcu+0x124/0x240
>  ---[ end trace 0000000000000000 ]---
>
> So there are more violators than just slub and shmem.
> I also tested the post_alloc_hook() fix (clearing page->private for
> all pages at allocation) - 1600+ iterations without crash.
> Given multiple violators, maybe a defensive fix (either in
> split_page() which is already in mm-unstable, or in post_alloc_hook())
> is the right approach, rather than hunting down each violator?
>
> --
> Best Regards,
> Mike Gavrilov.


--
Best Regards,
Yan, Zi


Thread overview: 42+ messages
     [not found] <CABXGCs03XcXt5GDae7d74ynC6P6G2gLw3ZrwAYvSQ3PwP0mGXA@mail.gmail.com>
2026-02-06 17:40 ` Mikhail Gavrilov
2026-02-06 18:08   ` Zi Yan
2026-02-06 18:21     ` Mikhail Gavrilov
2026-02-06 18:29       ` Zi Yan
2026-02-06 18:33         ` Zi Yan
2026-02-06 19:58           ` Zi Yan
2026-02-06 20:49             ` Zi Yan
2026-02-06 22:16               ` Mikhail Gavrilov
2026-02-06 22:37                 ` Mikhail Gavrilov
2026-02-06 23:06                   ` Zi Yan
2026-02-07  3:28                     ` Zi Yan
2026-02-07 14:25                       ` Mikhail Gavrilov
2026-02-07 14:32                         ` Zi Yan [this message]
2026-02-07 15:03                           ` Mikhail Gavrilov
2026-02-07 15:06                             ` Zi Yan
2026-02-07 15:37                               ` [PATCH v2] mm/page_alloc: clear page->private in free_pages_prepare() Mikhail Gavrilov
2026-02-07 16:12                                 ` Zi Yan
2026-02-07 17:36                                   ` [PATCH v3] " Mikhail Gavrilov
2026-02-07 22:02                                     ` David Hildenbrand (Arm)
2026-02-07 22:08                                       ` David Hildenbrand (Arm)
2026-02-09 11:17                                         ` Vlastimil Babka
2026-02-09 15:46                                           ` David Hildenbrand (Arm)
2026-02-09 16:00                                             ` Zi Yan
2026-02-09 16:03                                               ` David Hildenbrand (Arm)
2026-02-09 16:05                                                 ` Zi Yan
2026-02-09 16:06                                                   ` David Hildenbrand (Arm)
2026-02-09 16:08                                                     ` Zi Yan
2026-02-07 23:00                                       ` Zi Yan
2026-02-09 16:16                                         ` David Hildenbrand (Arm)
2026-02-09 16:20                                           ` David Hildenbrand (Arm)
2026-02-09 16:33                                             ` Zi Yan
2026-02-09 17:36                                               ` David Hildenbrand (Arm)
2026-02-09 17:44                                                 ` Zi Yan
2026-02-09 19:39                                                   ` David Hildenbrand (Arm)
2026-02-09 19:42                                                     ` Zi Yan
2026-02-10  1:20                                                       ` Baolin Wang
2026-02-10  2:12                                                         ` Zi Yan
2026-02-10  2:25                                                           ` Baolin Wang
2026-02-10  2:32                                                             ` Zi Yan
2026-02-09 19:46                                     ` David Hildenbrand (Arm)
2026-02-09 11:11                                 ` [PATCH v2] " Vlastimil Babka
2026-02-06 18:24     ` [PATCH] mm/page_alloc: clear page->private in split_page() for tail pages Kairui Song
