From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Paul Moore <paul@paul-moore.com>,
	Stephen Smalley <stephen.smalley.work@gmail.com>,
	Ondrej Mosnacek <omosnace@redhat.com>,
	Suren Baghdasaryan <surenb@google.com>,
	David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, selinux@vger.kernel.org
Subject: [RFC PATCH 0/2] mm: introduce anon_vma flags, reduce kernel allocs
Date: Wed,  5 Mar 2025 14:55:06 +0000
Message-ID: <cover.1741185865.git.lorenzo.stoakes@oracle.com>

VMA resources are scarce. This is a data structure whose weight we wish to
reduce (particularly as slab allocations are unreclaimable and - for now -
unmigratable).

So adding additional fields is generally unviable, and VMA flags are
equally contended. Differing VMA flags also prevent VMA merge, further
impacting overhead.

We can however make use of the time-honoured kernel tradition of grabbing
bits where we can.

Since we can rely upon anon_vma allocations being at least system
word-aligned, we have a handful of bits in the vma->anon_vma available to
use as flags.
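
To illustrate why those bits are free, here is a minimal userspace sketch.
It is not code from this series: struct demo_anon_vma, DEMO_FLAG_MASK and
both helpers are hypothetical names, and malloc() stands in for the slab
allocator. It only shows that a word-aligned object leaves its low address
bits zero, and how many such bits there are.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct anon_vma; only its alignment matters. */
struct demo_anon_vma {
	void *root;
	unsigned long refcount;
};

/* Low pointer bits that word alignment guarantees to be zero. */
#define DEMO_FLAG_MASK ((uintptr_t)(alignof(void *) - 1))

/* Number of low bits usable as flags: log2 of the word alignment. */
static unsigned int demo_free_bits(void)
{
	unsigned int bits = 0;
	size_t align = alignof(void *);

	while (align > 1) {
		align >>= 1;
		bits++;
	}
	return bits;
}

/* Returns 1 if a freshly allocated object has all of its flag bits clear. */
static int demo_low_bits_clear(void)
{
	struct demo_anon_vma *p = malloc(sizeof(*p));
	int clear;

	assert(p != NULL);
	clear = ((uintptr_t)p & DEMO_FLAG_MASK) == 0;
	free(p);
	return clear;
}
```

With 8-byte pointer alignment this yields three spare bits, with 4-byte
alignment two.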

In this series we establish the mechanism for doing so, and immediately use
it to solve a problem encountered as part of the guard region feature
(MADV_GUARD_INSTALL, MADV_GUARD_REMOVE).

We absolutely must preserve guard regions over fork. However, it turns out
the only reasonable means of doing so is to establish an anon_vma even if
the VMA is unfaulted.

This creates unnecessary overhead, a problem exacerbated by the extension of
this functionality to file-backed regions, where such-allocated memory may
never be utilised or freed until the end of the VMA's lifetime.

We can avoid this if we have a means of indicating to fork that we wish to
copy page tables without incurring this overhead.

Having flags available in vma->anon_vma allows us to do so - we can
therefore introduce a flag, ANON_VMA_UNFAULTED, which indicates that this
is the case.

We introduce wrapper functions to mask off these bits, and nearly every
part of the kernel behaves precisely the same as a result, with only the
desired change in behaviour in the forking logic.
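
As a rough sketch of what such wrappers could look like (userspace C, not
the code in the patches): every demo_* name below is hypothetical, and the
bit chosen for the flag is an assumption. Only the ANON_VMA_UNFAULTED name
comes from this series. Ordinary callers go through the masking accessor
and see a plain pointer; only the fork-side check looks at the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct anon_vma. */
struct demo_anon_vma {
	unsigned long refcount;
};

/* Assumed bit assignment; the value used by the series may differ. */
#define DEMO_ANON_VMA_UNFAULTED ((uintptr_t)1)
/* All low bits left spare by word alignment. */
#define DEMO_ANON_VMA_FLAG_MASK ((uintptr_t)(sizeof(void *) - 1))

/* Hypothetical VMA: the anon_vma pointer and its flags share one word. */
struct demo_vma {
	uintptr_t anon_vma_word;
};

/* Accessor for ordinary callers: the pointer with the flag bits masked off. */
static struct demo_anon_vma *demo_vma_anon_vma(const struct demo_vma *vma)
{
	return (struct demo_anon_vma *)
		(vma->anon_vma_word & ~DEMO_ANON_VMA_FLAG_MASK);
}

/* True if the VMA is marked unfaulted. */
static bool demo_vma_unfaulted(const struct demo_vma *vma)
{
	return (vma->anon_vma_word & DEMO_ANON_VMA_UNFAULTED) != 0;
}

/* Mark the VMA as unfaulted without allocating anything. */
static void demo_vma_set_unfaulted(struct demo_vma *vma)
{
	vma->anon_vma_word |= DEMO_ANON_VMA_UNFAULTED;
}

/*
 * Fork-side decision: copy page tables if there is a real anon_vma or
 * the unfaulted flag is set. This is the only place the flag is consulted.
 */
static bool demo_fork_copies_page_tables(const struct demo_vma *vma)
{
	return demo_vma_anon_vma(vma) != NULL || demo_vma_unfaulted(vma);
}
```

In this sketch, a VMA carrying only the flag reports a NULL anon_vma to
every ordinary caller, yet still has its page tables copied on fork.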

On fault, or any operation that actually requires an established anon_vma,
the ANON_VMA_UNFAULTED flag is cleared and replaced by an actual anon_vma.

An additional advantage of having this mechanism is that we can also clear
this flag if no 'real' anon_vma has been established and the user executes
MADV_GUARD_REMOVE on the whole VMA, meaning we can prevent future unneeded
page table operations.
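
The flag's lifecycle described in the last two paragraphs can be sketched
as follows. This is again illustrative userspace C rather than the series'
implementation: all demo_* names are hypothetical, calloc() stands in for
anon_vma allocation, and the flag bit value is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct anon_vma. */
struct demo_anon_vma {
	unsigned long refcount;
};

/* Hypothetical VMA: flag/pointer word plus the address range it covers. */
struct demo_vma {
	uintptr_t anon_vma_word;
	unsigned long start, end;
};

/* Assumed bit assignment; the value used by the series may differ. */
#define DEMO_ANON_VMA_UNFAULTED ((uintptr_t)1)

/* Guard install: if nothing is established yet, set the flag only. */
static void demo_guard_install(struct demo_vma *vma)
{
	if (vma->anon_vma_word == 0)
		vma->anon_vma_word = DEMO_ANON_VMA_UNFAULTED;
}

/*
 * Fault path (or any operation needing a real anon_vma): allocate one and
 * store the bare pointer, which also clears the flag.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int demo_prepare_anon_vma(struct demo_vma *vma)
{
	struct demo_anon_vma *av;

	if (vma->anon_vma_word & ~DEMO_ANON_VMA_UNFAULTED)
		return 0; /* a real anon_vma is already present */

	av = calloc(1, sizeof(*av));
	if (!av)
		return -1;
	av->refcount = 1;
	/* Alignment guarantees the flag bit is clear in the new pointer. */
	assert(((uintptr_t)av & DEMO_ANON_VMA_UNFAULTED) == 0);
	vma->anon_vma_word = (uintptr_t)av;
	return 0;
}

/*
 * Guard remove: drop the flag only when the range spans the whole VMA and
 * no real anon_vma was ever established.
 */
static void demo_guard_remove(struct demo_vma *vma,
			      unsigned long start, unsigned long end)
{
	bool whole_vma = start <= vma->start && end >= vma->end;

	if (whole_vma && vma->anon_vma_word == DEMO_ANON_VMA_UNFAULTED)
		vma->anon_vma_word = 0;
}
```

A partial MADV_GUARD_REMOVE leaves the flag in place, since guard regions
may remain elsewhere in the VMA, and once a real anon_vma exists removal
does not touch it.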

A benefit of this change, aside from saving kernel memory allocations, is
that THP page collapse is no longer impacted if we apply guard regions then
remove them in their entirety from a VMA, as otherwise the immediate
collapse of aligned page tables in retract_page_tables() cannot proceed.

Lorenzo Stoakes (2):
  mm: introduce anon_vma flags and use wrapper functions
  mm/madvise: utilise anon_vma unfaulted flag on guard region install

 fs/coredump.c                    |  2 +-
 include/linux/mm_types.h         | 67 ++++++++++++++++++++-
 include/linux/rmap.h             |  4 +-
 kernel/fork.c                    |  4 +-
 mm/debug.c                       |  6 +-
 mm/huge_memory.c                 |  4 +-
 mm/khugepaged.c                  | 12 ++--
 mm/ksm.c                         | 16 +++---
 mm/madvise.c                     | 49 ++++++++++------
 mm/memory.c                      |  6 +-
 mm/mmap.c                        |  2 +-
 mm/mprotect.c                    |  2 +-
 mm/mremap.c                      |  8 +--
 mm/rmap.c                        | 42 +++++++-------
 mm/swapfile.c                    |  2 +-
 mm/userfaultfd.c                 |  2 +-
 mm/vma.c                         | 99 +++++++++++++++++++++++++-------
 mm/vma.h                         |  6 +-
 security/selinux/hooks.c         |  2 +-
 tools/testing/vma/vma.c          | 95 +++++++++++++++---------------
 tools/testing/vma/vma_internal.h | 78 ++++++++++++++++++++++---
 21 files changed, 358 insertions(+), 150 deletions(-)

--
2.48.1

