linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	 Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>,
	 Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	 Kees Cook <kees@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 Vineet Gupta <vgupta@kernel.org>,
	Russell King <linux@armlinux.org.uk>,
	 Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Brian Cain <bcain@kernel.org>,
	 Huacai Chen <chenhuacai@kernel.org>,
	WANG Xuerui <kernel@xen0n.name>,
	 Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Dinh Nguyen <dinguyen@kernel.org>,
	 Madhavan Srinivasan <maddy@linux.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	 Nicholas Piggin <npiggin@gmail.com>,
	Christophe Leroy <chleroy@kernel.org>,
	 Paul Walmsley <pjw@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	 Albert Ou <aou@eecs.berkeley.edu>,
	Alexandre Ghiti <alex@ghiti.fr>,
	 Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	 Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	 Sven Schnelle <svens@linux.ibm.com>,
	Thomas Gleixner <tglx@kernel.org>,
	 Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
	 Richard Weinberger <richard@nod.at>,
	Anton Ivanov <anton.ivanov@cambridgegreys.com>,
	 Johannes Berg <johannes@sipsolutions.net>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Xu Xin <xu.xin16@zte.com.cn>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	Michal Hocko <mhocko@suse.com>, Paul Moore <paul@paul-moore.com>,
	 Stephen Smalley <stephen.smalley.work@gmail.com>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	 linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	 linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org,  linuxppc-dev@lists.ozlabs.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	 linux-um@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	selinux@vger.kernel.org
Subject: Re: [PATCH v3 22/23] mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t
Date: Fri, 20 Mar 2026 11:08:08 +0000	[thread overview]
Message-ID: <7e22cc48-aa04-406d-b4d0-8ebb182b34b9@lucifer.local> (raw)
In-Reply-To: <0b5765da-67e9-4e2e-99d8-08501730bf76@kernel.org>

On Fri, Mar 20, 2026 at 11:39:58AM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/18/26 16:50, Lorenzo Stoakes (Oracle) wrote:
> > Update the vma_modify_flags() and vma_modify_flags_uffd() functions to
> > accept a vma_flags_t parameter rather than a vm_flags_t one, and propagate
> > the changes as needed to implement this change.
> >
> > Finally, update the VMA tests to reflect this.
> >
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> > --- a/mm/mlock.c
> > +++ b/mm/mlock.c
> > @@ -415,13 +415,14 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
> >   * @vma - vma containing range to be mlock()ed or munlock()ed
> >   * @start - start address in @vma of the range
> >   * @end - end of range in @vma
> > - * @newflags - the new set of flags for @vma.
> > + * @new_vma_flags - the new set of flags for @vma.
> >   *
> >   * Called for mlock(), mlock2() and mlockall(), to set @vma VM_LOCKED;
> >   * called for munlock() and munlockall(), to clear VM_LOCKED from @vma.
> >   */
> >  static void mlock_vma_pages_range(struct vm_area_struct *vma,
> > -	unsigned long start, unsigned long end, vm_flags_t newflags)
> > +	unsigned long start, unsigned long end,
> > +	vma_flags_t *new_vma_flags)
> >  {
> >  	static const struct mm_walk_ops mlock_walk_ops = {
> >  		.pmd_entry = mlock_pte_range,
> > @@ -439,18 +440,18 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
> >  	 * combination should not be visible to other mmap_lock users;
> >  	 * but WRITE_ONCE so rmap walkers must see VM_IO if VM_LOCKED.
> >  	 */
> > -	if (newflags & VM_LOCKED)
> > -		newflags |= VM_IO;
> > +	if (vma_flags_test(new_vma_flags, VMA_LOCKED_BIT))
> > +		vma_flags_set(new_vma_flags, VMA_IO_BIT);
> >  	vma_start_write(vma);
> > -	vm_flags_reset_once(vma, newflags);
> > +	WRITE_ONCE(vma->flags, *new_vma_flags);
>
> It's not clear to me, how is switching from vm_flags_t to vma_flags_t
> allowing us to simply do WRITE_ONCE() instead of the full logic of
> vm_flags_reset_once()? Won't it fail to compile once once flags are more
> than single word? Or worse, will compile but silently allow tearing?

We only care about tearing in the flags that can be contained within a
system word, but true we should probably do this more carefully, as I did
for vm_flags_reset_once().

I will reimplement this as a new, hideous, helper function.

I am not a fan of this being a thing to handle races, it's a hack. But I
guess that should be addressed separately.

>
> >
> >  	lru_add_drain();
> >  	walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL);
> >  	lru_add_drain();
> >
> > -	if (newflags & VM_IO) {
> > -		newflags &= ~VM_IO;
> > -		vm_flags_reset_once(vma, newflags);
> > +	if (vma_flags_test(new_vma_flags, VMA_IO_BIT)) {
> > +		vma_flags_clear(new_vma_flags, VMA_IO_BIT);
> > +		WRITE_ONCE(vma->flags, *new_vma_flags);
>
> Ditto.

Yup.

>
> >  	}
> >  }
> >
> > @@ -467,20 +468,22 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	       struct vm_area_struct **prev, unsigned long start,
> >  	       unsigned long end, vm_flags_t newflags)
> >  {
> > +	vma_flags_t new_vma_flags = legacy_to_vma_flags(newflags);
> > +	const vma_flags_t old_vma_flags = vma->flags;
> >  	struct mm_struct *mm = vma->vm_mm;
> >  	int nr_pages;
> >  	int ret = 0;
> > -	vm_flags_t oldflags = vma->vm_flags;
> >
> > -	if (newflags == oldflags || vma_is_secretmem(vma) ||
> > -	    !vma_supports_mlock(vma))
> > +	if (vma_flags_same_pair(&old_vma_flags, &new_vma_flags) ||
> > +	    vma_is_secretmem(vma) || !vma_supports_mlock(vma)) {
> >  		/*
> >  		 * Don't set VM_LOCKED or VM_LOCKONFAULT and don't count.
> >  		 * For secretmem, don't allow the memory to be unlocked.
> >  		 */
> >  		goto out;
> > +	}
> >
> > -	vma = vma_modify_flags(vmi, *prev, vma, start, end, &newflags);
> > +	vma = vma_modify_flags(vmi, *prev, vma, start, end, &new_vma_flags);
> >  	if (IS_ERR(vma)) {
> >  		ret = PTR_ERR(vma);
> >  		goto out;
> > @@ -490,9 +493,9 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	 * Keep track of amount of locked VM.
> >  	 */
> >  	nr_pages = (end - start) >> PAGE_SHIFT;
> > -	if (!(newflags & VM_LOCKED))
> > +	if (!vma_flags_test(&new_vma_flags, VMA_LOCKED_BIT))
> >  		nr_pages = -nr_pages;
> > -	else if (oldflags & VM_LOCKED)
> > +	else if (vma_flags_test(&old_vma_flags, VMA_LOCKED_BIT))
> >  		nr_pages = 0;
> >  	mm->locked_vm += nr_pages;
> >
> > @@ -501,12 +504,13 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	 * It's okay if try_to_unmap_one unmaps a page just after we
> >  	 * set VM_LOCKED, populate_vma_page_range will bring it back.
> >  	 */
> > -	if ((newflags & VM_LOCKED) && (oldflags & VM_LOCKED)) {
> > +	if (vma_flags_test(&new_vma_flags, VMA_LOCKED_BIT) &&
> > +	    vma_flags_test(&old_vma_flags, VMA_LOCKED_BIT)) {
> >  		/* No work to do, and mlocking twice would be wrong */
> >  		vma_start_write(vma);
> > -		vm_flags_reset(vma, newflags);
> > +		vma->flags = new_vma_flags;
>
> This also does lot less than vm_flags_reset()?

Well let's look:

	VM_WARN_ON_ONCE(!pgtable_supports_soft_dirty() && (flags & VM_SOFTDIRTY));

Are we really at a point where this is problematic? Do we hit this? Why are
we specifically checking only this case on every single instance of
resetting VMA flags?

	vma_assert_write_locked(vma);

Note the vma_start_write() line above. I want to separate vma_flags_t
helpers from asserts like that, because:

1. We might be operating on a VMA that is not yet added to the tree
2. We might be operating on a VMA that is now detached
3. Really in all but core code, you should be using vma_desc_xxx().
4. Other VMA fields are manipulated with no such checks.
5. It'd be egregious to have to add variants of flag functions just to
   account for cases such as the above, especially when we don't do so for
   other VMA fields. Drivers are the problematic cases and why it was
   especially important (and also for debug as VMA locks were introduced),
   the mmap_prepare work is solving this generally.

	vm_flags_init(vma, flags);

Analysing what's in this function:

	VM_WARN_ON_ONCE(!pgtable_supports_soft_dirty() && (flags & VM_SOFTDIRTY));

Duplicated.

	vma_flags_clear_all(&vma->flags);

Only necessary while you're only setting the first system word of
vma->flags.

	vma_flags_overwrite_word(&vma->flags, flags);

Again only necessary when you're only setting the first system word.

So yeah it's doing the equivalent and (intentionally) eliminating some
noise.

But I'll add the S/D check back I guess.

>
> >  	} else {
> > -		mlock_vma_pages_range(vma, start, end, newflags);
> > +		mlock_vma_pages_range(vma, start, end, &new_vma_flags);
> >  	}
> >  out:
> >  	*prev = vma;
> > diff --git a/mm/mprotect.c b/mm/mprotect.c
> > index eaa724b99908..2b8a85689ab7 100644
> > --- a/mm/mprotect.c
> > +++ b/mm/mprotect.c
> > @@ -756,13 +756,11 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
> >  		vma_flags_clear(&new_vma_flags, VMA_ACCOUNT_BIT);
> >  	}
> >
> > -	newflags = vma_flags_to_legacy(new_vma_flags);
> > -	vma = vma_modify_flags(vmi, *pprev, vma, start, end, &newflags);
> > +	vma = vma_modify_flags(vmi, *pprev, vma, start, end, &new_vma_flags);
> >  	if (IS_ERR(vma)) {
> >  		error = PTR_ERR(vma);
> >  		goto fail;
> >  	}
> > -	new_vma_flags = legacy_to_vma_flags(newflags);
> >
> >  	*pprev = vma;
> >
> > @@ -771,7 +769,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
> >  	 * held in write mode.
> >  	 */
> >  	vma_start_write(vma);
> > -	vm_flags_reset_once(vma, newflags);
> > +	WRITE_ONCE(vma->flags, new_vma_flags);
>
> Ditto.

I mean overall these cases are hacks in my opinion to work around code that
should have solved their problem another way.

But sure, as above I'll add a helper function or such.

>
> >  	if (vma_wants_manual_pte_write_upgrade(vma))
> >  		mm_cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;
> >  	vma_set_page_prot(vma);
> > @@ -796,6 +794,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
> >  	}
> >
> >  	vm_stat_account(mm, vma_flags_to_legacy(old_vma_flags), -nrpages);
> > +	newflags = vma_flags_to_legacy(new_vma_flags);
> >  	vm_stat_account(mm, newflags, nrpages);
> >  	perf_event_mmap(vma);
> >  	return 0;
> > diff --git a/mm/mseal.c b/mm/mseal.c
> > index 316b5e1dec78..603df53ad267 100644


  reply	other threads:[~2026-03-20 11:08 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-18 15:50 [PATCH v3 00/23] mm/vma: convert vm_flags_t to vma_flags_t in vma code Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 01/23] mm/vma: add vma_flags_empty(), vma_flags_and(), vma_flags_diff_pair() Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 02/23] tools/testing/vma: add unit tests flag empty, diff_pair, and[_mask] Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 03/23] mm/vma: add further vma_flags_t unions Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 04/23] tools/testing/vma: convert bulk of test code to vma_flags_t Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 05/23] mm/vma: use new VMA flags for sticky flags logic Lorenzo Stoakes (Oracle)
2026-03-19 14:50   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 06/23] tools/testing/vma: fix VMA flag tests Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 07/23] mm/vma: add append_vma_flags() helper Lorenzo Stoakes (Oracle)
2026-03-19 17:20   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 08/23] tools/testing/vma: add simple test for append_vma_flags() Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 09/23] mm: unexport vm_brk_flags() and eliminate vm_flags parameter Lorenzo Stoakes (Oracle)
2026-03-19 17:27   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 10/23] mm/vma: introduce vma_flags_same[_mask/_pair]() Lorenzo Stoakes (Oracle)
2026-03-19 17:31   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 11/23] mm/vma: introduce [vma_flags,legacy]_to_[legacy,vma_flags]() helpers Lorenzo Stoakes (Oracle)
2026-03-19 17:38   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 12/23] tools/testing/vma: test that legacy flag helpers work correctly Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 13/23] mm/vma: introduce vma_test[_any[_mask]](), and make inlining consistent Lorenzo Stoakes (Oracle)
2026-03-20  8:48   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 14/23] tools/testing/vma: update VMA flag tests to test vma_test[_any_mask]() Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 15/23] mm: introduce vma_flags_count() and vma[_flags]_test_single_mask() Lorenzo Stoakes (Oracle)
2026-03-20  8:59   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 16/23] tools/testing/vma: test vma_flags_count,vma[_flags]_test_single_mask Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 17/23] mm: convert do_brk_flags() to use vma_flags_t Lorenzo Stoakes (Oracle)
2026-03-20  9:57   ` Vlastimil Babka (SUSE)
2026-03-20 13:42     ` Lorenzo Stoakes (Oracle)
2026-03-20 15:06       ` Vlastimil Babka (SUSE)
2026-03-20 16:38         ` Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 18/23] mm: update vma_supports_mlock() to use new VMA flags Lorenzo Stoakes (Oracle)
2026-03-20 10:03   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 19/23] mm/vma: introduce vma_clear_flags[_mask]() Lorenzo Stoakes (Oracle)
2026-03-20 10:04   ` Vlastimil Babka (SUSE)
2026-03-18 15:50 ` [PATCH v3 20/23] tools/testing/vma: update VMA tests to test vma_clear_flags[_mask]() Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 21/23] mm/vma: convert as much as we can in mm/vma.c to vma_flags_t Lorenzo Stoakes (Oracle)
2026-03-20 10:15   ` Vlastimil Babka (SUSE)
2026-03-20 18:28     ` Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 22/23] mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t Lorenzo Stoakes (Oracle)
2026-03-20 10:39   ` Vlastimil Babka (SUSE)
2026-03-20 11:08     ` Lorenzo Stoakes (Oracle) [this message]
2026-03-20 11:56       ` Vlastimil Babka (SUSE)
2026-03-20 12:15         ` Lorenzo Stoakes (Oracle)
2026-03-18 15:50 ` [PATCH v3 23/23] mm/vma: convert __mmap_region() " Lorenzo Stoakes (Oracle)
2026-03-20 10:51   ` Vlastimil Babka (SUSE)
2026-03-18 17:47 ` [PATCH v3 00/23] mm/vma: convert vm_flags_t to vma_flags_t in vma code Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7e22cc48-aa04-406d-b4d0-8ebb182b34b9@lucifer.local \
    --to=ljs@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex@ghiti.fr \
    --cc=anton.ivanov@cambridgegreys.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=bcain@kernel.org \
    --cc=borntraeger@linux.ibm.com \
    --cc=bp@alien8.de \
    --cc=brauner@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=chengming.zhou@linux.dev \
    --cc=chenhuacai@kernel.org \
    --cc=chleroy@kernel.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@kernel.org \
    --cc=dinguyen@kernel.org \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=johannes@sipsolutions.net \
    --cc=kees@kernel.org \
    --cc=kernel@xen0n.name \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-hexagon@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-snps-arc@lists.infradead.org \
    --cc=linux-um@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=loongarch@lists.linux.dev \
    --cc=maddy@linux.ibm.com \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=omosnace@redhat.com \
    --cc=palmer@dabbelt.com \
    --cc=paul@paul-moore.com \
    --cc=pfalcato@suse.de \
    --cc=pjw@kernel.org \
    --cc=richard@nod.at \
    --cc=rppt@kernel.org \
    --cc=selinux@vger.kernel.org \
    --cc=stephen.smalley.work@gmail.com \
    --cc=surenb@google.com \
    --cc=svens@linux.ibm.com \
    --cc=tglx@kernel.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=vbabka@kernel.org \
    --cc=vgupta@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=xu.xin16@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox