From: Suren Baghdasaryan <surenb@google.com>
To: Sasha Levin <sashal@kernel.org>
Cc: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
peterx@redhat.com, aarcange@redhat.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm/userfaultfd: fix missing PTE unmap for non-migration entries
Date: Tue, 8 Jul 2025 09:34:48 -0700
Message-ID: <CAJuCfpF3K49Z8uevF6M9FZX-tFgJDCkCi54iL=xwDuQB2RMqoA@mail.gmail.com>
In-Reply-To: <aG0_-79QiMEk3N-R@lappy>
On Tue, Jul 8, 2025 at 8:57 AM Sasha Levin <sashal@kernel.org> wrote:
>
> On Tue, Jul 08, 2025 at 08:39:47AM -0700, Suren Baghdasaryan wrote:
> >On Tue, Jul 8, 2025 at 8:33 AM Sasha Levin <sashal@kernel.org> wrote:
> >>
> >> On Tue, Jul 08, 2025 at 05:10:44PM +0200, David Hildenbrand wrote:
> >> >On 01.07.25 02:57, Andrew Morton wrote:
> >> >>On Sun, 29 Jun 2025 23:19:58 -0400 Sasha Levin <sashal@kernel.org> wrote:
> >> >>
> >> >>>When handling non-swap entries in move_pages_pte(), the error handling
> >> >>>for entries that are NOT migration entries fails to unmap the page table
> >> >>>entries before jumping to the error handling label.
> >> >>>
> >> >>>This results in a kmap/kunmap imbalance which on CONFIG_HIGHPTE systems
> >> >>>triggers a WARNING in kunmap_local_indexed() because the kmap stack is
> >> >>>corrupted.
> >> >>>
> >> >>>Example call trace on ARM32 (CONFIG_HIGHPTE enabled):
> >> >>> WARNING: CPU: 1 PID: 633 at mm/highmem.c:622 kunmap_local_indexed+0x178/0x17c
> >> >>> Call trace:
> >> >>> kunmap_local_indexed from move_pages+0x964/0x19f4
> >> >>> move_pages from userfaultfd_ioctl+0x129c/0x2144
> >> >>> userfaultfd_ioctl from sys_ioctl+0x558/0xd24
> >> >>>
> >> >>>The issue was introduced with the UFFDIO_MOVE feature but became more
> >> >>>frequent with the addition of guard pages (commit 7c53dfbdb024 ("mm: add
> >> >>>PTE_MARKER_GUARD PTE marker")) which made the non-migration entry code
> >> >>>path more commonly executed during userfaultfd operations.
> >> >>>
> >> >>>Fix this by ensuring PTEs are properly unmapped in all non-swap entry
> >> >>>paths before jumping to the error handling label, not just for migration
> >> >>>entries.
> >> >>
> >> >>I don't get it.
> >> >>
> >> >>>--- a/mm/userfaultfd.c
> >> >>>+++ b/mm/userfaultfd.c
> >> >>>@@ -1384,14 +1384,15 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
> >> >>> 		entry = pte_to_swp_entry(orig_src_pte);
> >> >>> 		if (non_swap_entry(entry)) {
> >> >>>+			pte_unmap(src_pte);
> >> >>>+			pte_unmap(dst_pte);
> >> >>>+			src_pte = dst_pte = NULL;
> >> >>> 			if (is_migration_entry(entry)) {
> >> >>>-				pte_unmap(src_pte);
> >> >>>-				pte_unmap(dst_pte);
> >> >>>-				src_pte = dst_pte = NULL;
> >> >>> 				migration_entry_wait(mm, src_pmd, src_addr);
> >> >>> 				err = -EAGAIN;
> >> >>>-			} else
> >> >>>+			} else {
> >> >>> 				err = -EFAULT;
> >> >>>+			}
> >> >>> 			goto out;
> >> >>
> >> >>where we have
> >> >>
> >> >>out:
> >> >>	...
> >> >>	if (dst_pte)
> >> >>		pte_unmap(dst_pte);
> >> >>	if (src_pte)
> >> >>		pte_unmap(src_pte);
> >> >
> >> >AI slop?
> >>
> >> Nah, this one is sadly all me :(
> >>
> >> I was trying to resolve some of the issues found with linus-next on
> >> LKFT, and misunderstood the code. Funnily enough, I thought that the
> >> change above "fixed" it by making the warnings go away, but that is
> >> clearly the wrong thing to do, so I went back to the drawing board...
> >>
> >> If you're curious, here's the issue: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-43418-g558c6dd4d863/testrun/29030370/suite/log-parser-test/test/exception-warning-cpu-pid-at-mmhighmem-kunmap_local_indexed/details/
> >
> >Any way to symbolize that Call trace? I can't find build artefacts to
> >extract vmlinux image...
>
> The build artifacts are at
> https://storage.tuxsuite.com/public/linaro/lkft/builds/2zSrTao2x4P640QKIx18JUuFdc1/
> but I couldn't get it to do the right thing. I'm guessing that I need
> some magical arm32 toolchain bits that I don't carry:
>
> cat tr.txt | ./scripts/decode_stacktrace.sh vmlinux
> <4>[ 38.566145] ------------[ cut here ]------------
> <4>[ 38.566392] WARNING: CPU: 1 PID: 637 at mm/highmem.c:622 kunmap_local_indexed+0x198/0x1a4
> <4>[ 38.569398] Modules linked in: nfnetlink ip_tables x_tables
> <4>[ 38.570481] CPU: 1 UID: 0 PID: 637 Comm: uffd-unit-tests Not tainted 6.16.0-rc4 #1 NONE
> <4>[ 38.570815] Hardware name: Generic DT based system
> <4>[ 38.571073] Call trace:
> <4>[ 38.571239] unwind_backtrace from show_stack (arch/arm64/kernel/stacktrace.c:465)
> <4>[ 38.571602] show_stack from dump_stack_lvl (lib/dump_stack.c:118 (discriminator 1))
> <4>[ 38.571805] dump_stack_lvl from __warn (kernel/panic.c:791)
> <4>[ 38.572002] __warn from warn_slowpath_fmt+0xa8/0x174
> <4>[ 38.572290] warn_slowpath_fmt from kunmap_local_indexed+0x198/0x1a4
> <4>[ 38.572520] kunmap_local_indexed from move_pages_pte+0xc40/0xf48
> <4>[ 38.572970] move_pages_pte from move_pages+0x428/0x5bc
> <4>[ 38.573189] move_pages from userfaultfd_ioctl+0x900/0x1ec0
> <4>[ 38.573376] userfaultfd_ioctl from sys_ioctl+0xd24/0xd90
> <4>[ 38.573581] sys_ioctl from ret_fast_syscall+0x0/0x5c
> <4>[ 38.573810] Exception stack(0xf9d69fa8 to 0xf9d69ff0)
> <4>[ 38.574546] 9fa0: 00001000 00000005 00000005 c028aa05 b2d3ecd8 b2d3ecc8
> <4>[ 38.574919] 9fc0: 00001000 00000005 b2d3ece0 00000036 b2d3ed84 b2d3ed50 b2d3ed7c b2d3ed58
> <4>[ 38.575131] 9fe0: 00000036 b2d3ecb0 b6df1861 b6d5f736
> <4>[ 38.575511] ---[ end trace 0000000000000000 ]---
Ah, I know what's going on. 6.13-rc7, which is used in this test, does
not have my fix 927e926d72d9 ("userfaultfd: fix PTE unmapping
stack-allocated PTE copies") (see
https://elixir.bootlin.com/linux/v6.13.7/source/mm/userfaultfd.c#L1284).
It was backported into 6.13-rc8. Without it, the code tries to unmap a
stack-allocated copy of a mapped PTE rather than the mapping itself,
which will fail when CONFIG_HIGHPTE is enabled. So it makes sense that
it is failing on arm32.
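
To illustrate the failure mode, here is a rough userspace sketch. It is
not kernel code: every model_* name is made up for this example, and the
check in model_unmap() is only a simplified stand-in for what
kunmap_local_indexed() verifies. The assumption is that temporary
mappings form a stack, so the pointer handed to the unmap call has to be
the one the map call returned, not the address of a value copied from it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
#define MODEL_SLOTS 4

typedef unsigned long model_pte_t;

static model_pte_t model_page[MODEL_SLOTS];  /* stands in for a highmem PTE page */
static model_pte_t *model_stack[MODEL_SLOTS]; /* stack of live temporary mappings */
static int model_depth;
static int model_warnings;                   /* counts what would be WARNINGs */

/* Stand-in for mapping a PTE: push a temporary mapping and return it. */
static model_pte_t *model_map(size_t idx)
{
	assert(idx < MODEL_SLOTS && model_depth < MODEL_SLOTS);
	model_stack[model_depth++] = &model_page[idx];
	return &model_page[idx];
}

/* Stand-in for pte_unmap(): the pointer must be the newest live mapping. */
static void model_unmap(model_pte_t *p)
{
	if (model_depth == 0 || model_stack[model_depth - 1] != p) {
		model_warnings++;	/* pointer was never mapped: imbalance */
		return;
	}
	model_depth--;
}

/* Buggy pattern: unmap the address of a stack copy of the entry. */
static int model_buggy_path(void)
{
	model_pte_t *src = model_map(0);
	model_pte_t copy = *src;	/* value copy lives on the C stack */
	int warned;

	model_warnings = 0;
	model_unmap(&copy);		/* wrong: &copy is not a mapping */
	warned = model_warnings;
	model_unmap(src);		/* clean up so the model stays balanced */
	return warned;
}

/* Fixed pattern: read through the copy, but unmap the mapped pointer. */
static int model_fixed_path(void)
{
	model_pte_t *src = model_map(0);
	model_pte_t copy = *src;

	(void)copy;
	model_warnings = 0;
	model_unmap(src);
	return model_warnings;
}
```

In this model, model_buggy_path() records one warning and
model_fixed_path() records none. On configurations without
CONFIG_HIGHPTE the real pte_unmap() does not tear down a temporary
mapping, which would explain why the problem only shows up on arm32
here.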
>
> --
> Thanks,
> Sasha