linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzbot+7d917f67c05066cec295@syzkaller.appspotmail.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Jann Horn <jannh@google.com>
Subject: Re: [PATCH v1] mm/pagewalk: fix usage of pmd_leaf()/pud_leaf() without present check
Date: Wed, 16 Oct 2024 13:05:54 +0200
Message-ID: <0436c217-0afc-45e6-949b-2291ee1ebc6e@redhat.com>
In-Reply-To: <c364015e-ab37-411d-b2e9-4e7b10effdf5@bytedance.com>

On 16.10.24 12:58, Qi Zheng wrote:
> 
> 
> On 2024/10/15 21:13, David Hildenbrand wrote:
>> On 15.10.24 13:12, David Hildenbrand wrote:
>>> pmd_leaf()/pud_leaf() only implies a pmd_present()/pud_present() check on
>>> some architectures. We really should check for
>>> pmd_present()/pud_present() first.
>>>
>>> This should explain the report we got on ppc64 (which has
>>> CONFIG_PGTABLE_HAS_HUGE_LEAVES set in the config) that triggered:
>>>      VM_WARN_ON_ONCE(pmd_leaf(pmdp_get_lockless(pmdp)));
>>>
>>> Likely we had a PMD migration entry for which pmd_leaf() did not
>>> trigger. We raced with restoring the PMD migration entry, and suddenly
>>> saw a pmd_leaf(). In this case, pte_offset_map_lock() saved us from more
>>> trouble, because it rechecks the PMD value, but we would not have
>>> processed the migration entry -- which is not too bad because the only
>>> user of FW_MIGRATION is KSM for unsharing, and KSM only applies to small
>>> folios.
>>>
>>> Further, we shouldn't re-read the PMD/PUD value for our warning, the
>>> primary purpose of the VM_WARN_ON_ONCE() is to find spurious use of
>>> pmd_leaf()/pud_leaf() without CONFIG_PGTABLE_HAS_HUGE_LEAVES.
>>>
>>> As a side note, we are currently not implementing FW_MIGRATION support
>>> for PUD migration entries, which likely should exist due to hugetlb. Add
>>> a TODO so this won't fall through the cracks if more FW_MIGRATION users
>>> get added.
>>>
>>> Fixes: aa39ca6940f1 ("mm/pagewalk: introduce folio_walk_start() + folio_walk_end()")
>>> Reported-by: syzbot+7d917f67c05066cec295@syzkaller.appspotmail.com
>>> Closes: https://lkml.kernel.org/r/670d3248.050a0220.3e960.0064.GAE@google.com
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Jann Horn <jannh@google.com>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> ---
>>
>> Was able to write a quick reproducer and verify that the issue no longer
>> triggers with this fix.
>>
>> https://gitlab.com/davidhildenbrand/scratchspace/-/blob/main/reproducers/move-pages-pmd-leaf.c
>>
>> Without this fix after a couple of seconds in a VM with 2 NUMA nodes:
>>
>> [   54.333753] ------------[ cut here ]------------
>> [   54.334901] WARNING: CPU: 20 PID: 1704 at mm/pagewalk.c:815 folio_walk_start+0x48f/0x6e0
>> [   54.336455] Modules linked in: ...
>> [   54.345009] CPU: 20 UID: 0 PID: 1704 Comm: move-pages-pmd- Not tainted 6.12.0-rc2+ #81
>> [   54.346529] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
>> [   54.348191] RIP: 0010:folio_walk_start+0x48f/0x6e0
>> [   54.349134] Code: b5 ad 48 8d 35 00 00 00 00 e8 6d 59 d7 ff e8 08 74 da ff e9 9c fe ff ff 4c 8b 7c 24 08 4c 89 ff e8 26 2b be 00 e9 8a fe ff ff <0f> 0b e9 ec fe ff ff f7 c2 ff 0f 00 00 0f 85 81 fe ff ff 48 8b 02
>> [   54.352660] RSP: 0018:ffffb7e4c430bc78 EFLAGS: 00010282
>> [   54.353679] RAX: 80000002a3e008e7 RBX: ffff9946039aa580 RCX: ffff994380000000
>> [   54.355056] RDX: ffff994606aec000 RSI: 00007f004b000000 RDI: 0000000000000000
>> [   54.356440] RBP: 00007f004b000000 R08: 0000000000000591 R09: 0000000000000001
>> [   54.357820] R10: 0000000000000200 R11: 0000000000000001 R12: ffffb7e4c430bd10
>> [   54.359198] R13: ffff994606aec2c0 R14: 0000000000000002 R15: ffff994604a89b00
>> [   54.360564] FS:  00007f004ae006c0(0000) GS:ffff9947f7400000(0000) knlGS:0000000000000000
>> [   54.362111] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   54.363242] CR2: 00007f004adffe58 CR3: 0000000281e12005 CR4: 0000000000770ef0
>> [   54.364615] PKRU: 55555554
>> [   54.365153] Call Trace:
>> [   54.365646]  <TASK>
>> [   54.366073]  ? __warn.cold+0xb7/0x14d
>> [   54.366796]  ? folio_walk_start+0x48f/0x6e0
>> [   54.367628]  ? report_bug+0xff/0x140
>> [   54.368324]  ? handle_bug+0x58/0x90
>> [   54.369019]  ? exc_invalid_op+0x17/0x70
>> [   54.369771]  ? asm_exc_invalid_op+0x1a/0x20
>> [   54.370606]  ? folio_walk_start+0x48f/0x6e0
>> [   54.371415]  ? folio_walk_start+0x9e/0x6e0
>> [   54.372227]  do_pages_move+0x1c5/0x680
>> [   54.372972]  kernel_move_pages+0x1a1/0x2b0
>> [   54.373804]  __x64_sys_move_pages+0x25/0x30
> 
> It would be better to add this call stack to the commit message, which
> can help people find this fix patch when they encounter the same problem. ;)

The commit is not part of a released kernel, though, and a lore search
would find this thread until the fix is included.

Before it's included, the commit message won't really be helpful :)

But sure, @Andrew, can we include that in the commit?

> 
> Otherwise, LGTM.
> 
> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
> 

Thanks!

-- 
Cheers,

David / dhildenb
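
For readers of the archive, the ordering problem described in the quoted
commit message can be sketched in userspace C. This is only a toy model
under stated assumptions, not the kernel's pagewalk code: toy_pmd_t, the
TOY_* bit positions and the toy_* helpers are all hypothetical names made
up for illustration. It assumes an architecture where the leaf test looks
at a single bit and does not imply a present test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy entry layout: an assumption for illustration, not any real PMD format. */
typedef uint64_t toy_pmd_t;
#define TOY_PRESENT ((toy_pmd_t)1 << 0)
#define TOY_LEAF    ((toy_pmd_t)1 << 7)

enum toy_kind { TOY_NONE, TOY_NON_PRESENT, TOY_LEAF_MAPPING, TOY_TABLE };

static bool toy_pmd_none(toy_pmd_t pmd)    { return pmd == 0; }
static bool toy_pmd_present(toy_pmd_t pmd) { return (pmd & TOY_PRESENT) != 0; }
/* Models an architecture whose leaf test does not imply a present test. */
static bool toy_pmd_leaf(toy_pmd_t pmd)    { return (pmd & TOY_LEAF) != 0; }

/* Leaf test without a present test: non-present entries get misclassified. */
static enum toy_kind toy_classify_leaf_only(toy_pmd_t pmd)
{
	if (toy_pmd_none(pmd))
		return TOY_NONE;
	return toy_pmd_leaf(pmd) ? TOY_LEAF_MAPPING : TOY_TABLE;
}

/* Present test first; the leaf bit is only interpreted for present entries. */
static enum toy_kind toy_classify(toy_pmd_t pmd)
{
	if (toy_pmd_none(pmd))
		return TOY_NONE;
	if (!toy_pmd_present(pmd))
		return TOY_NON_PRESENT;	/* e.g. a migration entry */
	return toy_pmd_leaf(pmd) ? TOY_LEAF_MAPPING : TOY_TABLE;
}

/* Small self-check; 0x1000 stands in for a non-present (migration-like) entry. */
static void toy_demo(void)
{
	toy_pmd_t migration_like = 0x1000;	/* present and leaf bits both clear */

	/* Without the present test, the entry looks like a page table pointer. */
	assert(toy_classify_leaf_only(migration_like) == TOY_TABLE);
	/* With the present test first, it is recognized as non-present. */
	assert(toy_classify(migration_like) == TOY_NON_PRESENT);
}
```

In this toy model a non-present entry whose payload happens to overlap the
leaf bit (0x80) would be misread the other way, as a leaf mapping, which is
why the present test has to come before any interpretation of the leaf bit.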


