linux-mm.kvack.org archive mirror
From: Jakub Acs <acsjakub@amazon.de>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: <linux-mm@kvack.org>, Hugh Dickins <hughd@google.com>,
	Jann Horn <jannh@google.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	<akpm@linux-foundation.org>, <david@redhat.com>,
	<xu.xin16@zte.com.cn>, <chengming.zhou@linux.dev>,
	<peterx@redhat.com>, <axelrasmussen@google.com>,
	<linux-kernel@vger.kernel.org>, <stable@vger.kernel.org>
Subject: Re: [PATCH v3 1/2] mm/ksm: fix flag-dropping behavior in ksm_madvise
Date: Fri, 7 Nov 2025 09:49:38 +0000
Message-ID: <20251107094938.GA71570@dev-dsk-acsjakub-1b-6f9934e2.eu-west-1.amazon.com>
In-Reply-To: <13c7242e-3a40-469b-9e99-8a65a21449bb@suse.cz>

On Thu, Nov 06, 2025 at 11:39:28AM +0100, Vlastimil Babka wrote:
> On 10/1/25 11:03, Jakub Acs wrote:
> > syzkaller discovered the following crash: (kernel BUG)
> > 
> > [   44.607039] ------------[ cut here ]------------
> > [   44.607422] kernel BUG at mm/userfaultfd.c:2067!
> > [   44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
> > [   44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
> > [   44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> > [   44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
> > 
> > <snip other registers, drop unreliable trace>
> > 
> > [   44.617726] Call Trace:
> > [   44.617926]  <TASK>
> > [   44.619284]  userfaultfd_release+0xef/0x1b0
> > [   44.620976]  __fput+0x3f9/0xb60
> > [   44.621240]  fput_close_sync+0x110/0x210
> > [   44.622222]  __x64_sys_close+0x8f/0x120
> > [   44.622530]  do_syscall_64+0x5b/0x2f0
> > [   44.622840]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [   44.623244] RIP: 0033:0x7f365bb3f227
> > 
> > The kernel panics because it detects a UFFD inconsistency during
> > userfaultfd_release_all(): a VMA that has a valid
> > vma->vm_userfaultfd_ctx pointer, but no UFFD flags in vma->vm_flags.
> > 
> > The inconsistency is caused in ksm_madvise(): when a user calls madvise()
> > with MADV_UNMERGEABLE on a VMA that is registered for UFFD in MINOR
> > mode, it accidentally clears all flags stored in the upper 32 bits of
> > vma->vm_flags.
> > 
> > Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int
> > and int are 32-bit wide. This setup causes the following mishap during
> > the &= ~VM_MERGEABLE assignment.
> > 
> > VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
> > After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
> > promoted to unsigned long before the & operation. This promotion fills
> > upper 32 bits with leading 0s, as we're doing unsigned conversion (and
> > even for a signed conversion, this wouldn't help as the leading bit is
> > 0). The & operation thus ends up AND-ing vm_flags with
> > 0x0000'0000'7fff'ffff instead of the intended 0xffff'ffff'7fff'ffff and
> > hence accidentally clears the upper 32 bits of its value.
> > 
> > Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the
> > BIT() macro.
> > 
> > Note: other VM_* flags are not affected:
> > This only happens to the VM_MERGEABLE flag, as the other VM_* flags are
> > all constants of type int and after ~ operation, they end up with
> > leading 1 and are thus converted to unsigned long with leading 1s.
> > 
> > Note 2:
> > After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
> > no longer a kernel BUG, but a WARNING at the same place:
> > 
> > [   45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
> > 
> > but the root-cause (flag-drop) remains the same.
> > 
> > Fixes: 7677f7fd8be76 ("userfaultfd: add minor fault registration mode")
> 
> Late to the party, but it seems to me the correct Fixes: should be
> f8af4da3b4c1 ("ksm: the mm interface to ksm")
> which introduced the flag and the buggy clearing code, no?
> 
> Commit 7677f7fd8be76 is just one that notices it, right? But there are other
> flags in >32 bit area, including pkeys etc. Sounds rather dangerous if they
> can be cleared using a madvise.
> 
> So we can't amend the Fixes: now but maybe could advise stable to backport
> for even older versions than based on 7677f7fd8be76 ?
> 

Good point. It was a bit tricky to determine the correct Fixes: tag, as
there were several candidates:
- the commit that initially introduced VM_MERGEABLE as a constant with
  a different inferred type from the other vm_flags constants
- the commit that first started using the upper 32 bits of vm_flags and
  did not make sure the constants were defined safely
- f8af4da3b4c1 indeed, as the one that makes the drop actually possible
- 7677f7fd8be76, which shows us a path where the drop manifests

Looking back, I agree f8af4da3b4c1 is the better option, but as you
said, that won't be changed now.

Nevertheless, I'll send the backports after a round of kselftests,
thanks for pointing this out.

Have a good day,
Jakub