From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Pedro Falcato <pfalcato@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>,
David Hildenbrand <david@redhat.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Jann Horn <jannh@google.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-mm@kvack.org,
linux-trace-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, Andrei Vagin <avagin@gmail.com>,
Barry Song <21cnbao@gmail.com>
Subject: Re: [PATCH 1/3] mm: introduce VM_MAYBE_GUARD and make visible for guard regions
Date: Thu, 30 Oct 2025 19:37:55 +0000
Message-ID: <8f4ad5bf-cd5a-4b93-8332-bc8b306d0e77@lucifer.local>
In-Reply-To: <6334993b-ead7-472f-bb82-bb7d6f15f3ef@lucifer.local>
On Thu, Oct 30, 2025 at 07:16:56PM +0000, Lorenzo Stoakes wrote:
> On Thu, Oct 30, 2025 at 07:31:26PM +0100, Vlastimil Babka wrote:
> > On 10/30/25 17:43, Lorenzo Stoakes wrote:
> > > On Thu, Oct 30, 2025 at 04:31:56PM +0000, Pedro Falcato wrote:
> > >> On Thu, Oct 30, 2025 at 04:23:58PM +0000, Lorenzo Stoakes wrote:
> > >> > On Thu, Oct 30, 2025 at 04:16:20PM +0000, Pedro Falcato wrote:
> > >> > > On Wed, Oct 29, 2025 at 04:50:31PM +0000, Lorenzo Stoakes wrote:
> > >> > > > Currently, if a user needs to determine if guard regions are present in a
> > >> > > > range, they have to scan all VMAs (or have knowledge of which ones might
> > >> > > > have guard regions).
> > >> > > >
> > >> > > > Since commit 8e2f2aeb8b48 ("fs/proc/task_mmu: add guard region bit to
> > >> > > > pagemap") and the related commit a516403787e0 ("fs/proc: extend the
> > >> > > > PAGEMAP_SCAN ioctl to report guard regions"), users can use either
> > >> > > > /proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this
> > >> > > > operation at a virtual address level.
> > >> > > >
> > >> > > > This is not ideal, and it gives no visibility at a /proc/$pid/smaps level
> > >> > > > that guard regions exist in ranges.
> > >> > > >
> > >> > > > This patch remedies the situation by establishing a new VMA flag,
> > >> > > > VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is
> > >> > > > uncertain because we cannot reasonably determine whether a
> > >> > > > MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and
> > >> > > > additionally VMAs may change across merge/split).
> > >> > > >
> > >> > > > We utilise 0x800 for this flag which makes it available to 32-bit
> > >> > > > architectures also, a flag that was previously used by VM_DENYWRITE, which
> > >> > > > was removed in commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") and hasn't
> > >> > > > been reused yet.
> > >> > > >
> > >> > > > The MADV_GUARD_INSTALL madvise() operation now must take an mmap write
> > >> > > > lock (and also VMA write lock) whereas previously it did not, but this
> > >> > > > seems a reasonable overhead.
> > >> > >
> > >> > > Do you though? Could it be possible to simply atomically set the flag with
> > >> > > the read lock held? This would make it so we can't split the VMA (and tightly
> > >> >
> > >> > VMA flags are not accessed atomically so no I don't think we can do that in any
> > >> > workable way.
> > >> >
> > >>
> > >> FWIW I think you could work it as an atomic flag and treat those races as benign
> > >> (this one, at least).
> > >
> > > It's not benign as we need to ensure that page tables are correctly propagated
> > > on fork.
> >
> > Could we use MADVISE_VMA_READ_LOCK mode (would be actually an improvement
> > over the current MADVISE_MMAP_READ_LOCK), together with the atomic flag
> > setting? I think the places that could race with us to cause RMW use vma
>
> I mean, I just spoke about why I didn't think introducing an entirely new
> (afaik) one-sided atomic VMA flag write was workable, so maybe deal with that
> first before proposing something new...

On the other hand, it's going to be difficult to get compelling data either
way, as it will always be workload-dependent etc.

So since you and Pedro both bring this up, and it'd be a pity to establish more
stringent locking requirements here, let me look into an atomic write approach.

We'll need to tread carefully here, but if we can achieve that it would obviously
be entirely preferable to requiring the write lock.

I'll dig into it some :)

BTW I do think a VMA read lock is entirely possible here as-is, so we should try
to shift to that if we can make the atomic VMA flag write work here.

Thanks, Lorenzo