linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Pedro Falcato <pfalcato@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Christian Brauner <brauner@kernel.org>,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, SeongJae Park <sj@kernel.org>,
	Usama Arif <usamaarif642@gmail.com>
Subject: Re: [RFC PATCH 4/5] mm/madvise: add PMADV_SET_FORK_EXEC_DEFAULT process_madvise() flag
Date: Tue, 20 May 2025 11:21:33 +0100	[thread overview]
Message-ID: <da1281bb-e49a-40f9-ac11-f976358e618e@lucifer.local> (raw)
In-Reply-To: <czd62f2vzwv6fow4ikvzlkjdj5odhc3nhtc72onwip52baobg5@yc5pjiqoqnh4>

On Tue, May 20, 2025 at 09:38:50AM +0100, Pedro Falcato wrote:
> On Mon, May 19, 2025 at 09:52:41PM +0100, Lorenzo Stoakes wrote:
> > It's useful in certain cases to be able to default-enable an madvise() flag
> > for all newly mapped VMAs, and for that to survive fork/exec.
> >
> > The natural place to specify something like this is in an madvise()
> > invocation, and thus providing this functionality as a flag to
> > process_madvise() makes sense.
> >
> > We intentionally limit this only to flags that we know should function
> > correctly without issue, and to be conservative about this, so we initially
> > limit ourselves only to MADV_HUGEPAGE, MADV_NOHUGEPAGE, that is - setting
> > the VM_HUGEPAGE, VM_NOHUGEPAGE VMA flags.
> >
> > We implement this functionality by using the mm_struct->def_flags field.
>
> This seems super specific. How about this:
>
> - PMADV_FUTURE (mirrors MCL_FUTURE). This only applies the flag to future VMAs in the current process.
> - PMADV_INHERIT_FORK. This makes it so the flag is propagated to child processes (does not imply PMADV_FUTURE)
> - PMADV_INHERIT_EXEC. This makes it so the flag is propagated through the execve boundary
>   (and this is where we'd filter for 'safe' flags, at least through the secureexec boundary). Does not imply
>   FUTURE nor INHERIT_FORK.

I don't know how we could implement separate current process, fork, exec, fork/exec.
mm->def_flags is propagated this way automatically.

And again on the security stuff, I think the correct answer is to require sys
admin capability to be able to use this option _at all_. This simplifies
everything.

To have this kind of thing we'd have to add a whole new mechanism, literally
just for this, and I'd really rather not generate brand new mm_struct flags for
every possible mode (in fact that would probably makes the whole thing
intractible), or add a new field there for this.

The idea is that we get the advantages of an improved madvise interface, while
also providing the interface Usama wants without having to add some hideous
prctl() whose logic is disconnected from the rest of madvise(), while being, in
effect, a 'default madvise() for new mappings'.

So while specific to the case, nothing prevents us in future adding more
functionality if we want.

We could also potentially:

- add PMADV_SET_DEFAULT (I'm iffy about PMADV_FUTURE... but whichever we go with)
- add PMADV_INHERIT_FORK
- add PMADV_INHERIT_EXEC

And only support PMADV_SET_DEFAULT | PMADV_INHERIT_FORK | PMADV_INHERIT_EXEC for
now.

THen we could have the security semantics you specify (require cap sys admin on
PMADV_INHERIT_EXEC) but have that propagate to the only supported case.

What do you think?

>
> and, while we're at it, rename PMADV_ENTIRE_ADDRESS_SPACE to PMADV_CURRENT, to align it with MCL_CURRENT.

I'm not sure making the mlock()/madvise() stuff analagous is a good idea, as
they have different semantics. I'd rather keep these flags descriptive. Though
I'm open to alternative naming of course...

Also keep in mind these flags are for mlockall(), whose name already tells you
you're locking everything :)

>
> Thoughts?
>
> --
> Pedro


  reply	other threads:[~2025-05-20 10:21 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-19 20:52 [RFC PATCH 0/5] add process_madvise() flags to modify behaviour Lorenzo Stoakes
2025-05-19 20:52 ` [RFC PATCH 1/5] mm: madvise: refactor madvise_populate() Lorenzo Stoakes
2025-05-20 10:30   ` David Hildenbrand
2025-05-20 10:36     ` Lorenzo Stoakes
2025-05-20 10:42       ` David Hildenbrand
2025-05-22 12:32         ` Mike Rapoport
2025-05-19 20:52 ` [RFC PATCH 2/5] mm/madvise: add PMADV_SKIP_ERRORS process_madvise() flag Lorenzo Stoakes
2025-05-19 20:52 ` [RFC PATCH 3/5] mm/madvise: add PMADV_NO_ERROR_ON_UNMAPPED " Lorenzo Stoakes
2025-05-19 20:52 ` [RFC PATCH 4/5] mm/madvise: add PMADV_SET_FORK_EXEC_DEFAULT " Lorenzo Stoakes
2025-05-20  8:38   ` Pedro Falcato
2025-05-20 10:21     ` Lorenzo Stoakes [this message]
2025-05-20 11:41       ` Pedro Falcato
2025-05-20 13:39         ` Lorenzo Stoakes
2025-05-20 16:11     ` Jann Horn
2025-05-20 16:19       ` Lorenzo Stoakes
2025-05-20 16:35         ` David Hildenbrand
2025-05-20 22:26   ` Johannes Weiner
2025-05-29 14:46     ` Lorenzo Stoakes
2025-05-19 20:52 ` [RFC PATCH 5/5] mm/madvise: add PMADV_ENTIRE_ADDRESS_SPACE " Lorenzo Stoakes
2025-05-19 21:53 ` [RFC PATCH 0/5] add process_madvise() flags to modify behaviour Jann Horn
2025-05-20  5:35   ` Lorenzo Stoakes
2025-05-20 16:04     ` Jann Horn
2025-05-20 16:14       ` Lorenzo Stoakes
2025-05-20 15:28 ` David Hildenbrand
2025-05-20 17:47   ` Lorenzo Stoakes
2025-05-20 18:24     ` Usama Arif
2025-05-20 19:21       ` Lorenzo Stoakes
2025-05-20 19:42         ` Usama Arif
2025-05-20 20:15           ` Lorenzo Stoakes
2025-05-20 18:25     ` Lorenzo Stoakes
2025-05-20 18:39       ` David Hildenbrand
2025-05-20 18:25 ` Shakeel Butt
2025-05-20 18:45   ` Lorenzo Stoakes
2025-05-20 19:49     ` Shakeel Butt
2025-05-20 20:39       ` Lorenzo Stoakes
2025-05-20 22:02         ` Shakeel Butt
2025-05-21  4:21           ` Lorenzo Stoakes
2025-05-21 16:28             ` Shakeel Butt
2025-05-21 16:49               ` Lorenzo Stoakes
2025-05-21 17:39                 ` Shakeel Butt
2025-05-22 13:05                   ` David Hildenbrand
2025-05-22 13:21                     ` Lorenzo Stoakes
2025-05-22 20:53                     ` Shakeel Butt
2025-05-26 12:57                       ` David Hildenbrand
2025-05-21 16:57               ` Usama Arif
2025-05-21 17:39                 ` Lorenzo Stoakes
2025-05-21 18:25                   ` Usama Arif
2025-05-21 18:40                     ` Lorenzo Stoakes
2025-05-21 18:45                       ` Usama Arif
2025-05-21 17:32             ` Johannes Weiner
2025-05-21 18:11               ` Lorenzo Stoakes
2025-05-22 12:45               ` David Hildenbrand
2025-05-22 13:49                 ` Lorenzo Stoakes
2025-05-22 15:32               ` Mike Rapoport
2025-05-22 15:47                 ` Lorenzo Stoakes
2025-05-21  2:16       ` Liam R. Howlett
2025-05-22 12:12 ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da1281bb-e49a-40f9-ac11-f976358e618e@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=brauner@kernel.org \
    --cc=david@redhat.com \
    --cc=jannh@google.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pfalcato@suse.de \
    --cc=sj@kernel.org \
    --cc=usamaarif642@gmail.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox