From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Jann Horn <jannh@google.com>
Cc: Julian Orth <ju.orth@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Shuah Khan <shuah@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: reinstate ability to map write-sealed memfd mappings read-only
Date: Thu, 28 Nov 2024 18:35:26 +0000 [thread overview]
Message-ID: <7cde8a1c-28a9-41c3-b568-9198bc1ab662@lucifer.local> (raw)
In-Reply-To: <CAG48ez39-GaWbLYX77ZTJFW5oe0V7GS5MEQUaKSjYeCEDM0-vg@mail.gmail.com>
On Thu, Nov 28, 2024 at 07:27:44PM +0100, Jann Horn wrote:
> On Thu, Nov 28, 2024 at 7:21 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > On Thu, Nov 28, 2024 at 06:58:21PM +0100, Julian Orth wrote:
> > > (Re-sending the message below since I forgot to reply-all)
> > >
> > > On Thu, Nov 28, 2024 at 6:46 PM Jann Horn <jannh@google.com> wrote:
> > > >
> > > > On Thu, Nov 28, 2024 at 4:06 PM Lorenzo Stoakes
> > > > <lorenzo.stoakes@oracle.com> wrote:
> > > > > In commit 158978945f31 ("mm: perform the mapping_map_writable() check after
> > > > > call_mmap()") (and preceding changes in the same series) it became possible
> > > > > to mmap() F_SEAL_WRITE sealed memfd mappings read-only.
> > > > >
> > > > > This was previously unnecessarily disallowed, despite the man page
> > > > > documentation indicating that it would be, thereby limiting the usefulness
> > > > > of F_SEAL_WRITE logic.
> > > > >
> > > > > We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE
> > > > > seal (one which disallows future writes to the memfd) to also be used for
> > > > > F_SEAL_WRITE.
> > > > >
> > > > > For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a
> > > > > read-only mapping to disallow mprotect() from overriding the seal - an
> > > > > operation performed by seal_check_write(), invoked from shmem_mmap(), the
> > > > > f_op->mmap() hook used by shmem mappings.
> > > > >
> > > > > By extending this to F_SEAL_WRITE and critically - checking
> > > > > mapping_map_writable() to determine if we may map the memfd AFTER we invoke
> > > > > shmem_mmap() - the desired logic becomes possible. This is because
> > > > > mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will
> > > > > have cleared.
> > > > >
> > > > > Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path
> > > > > behaviour") unintentionally undid this logic by moving the
> > > > > mapping_map_writable() check before the shmem_mmap() hook is invoked,
> > > > > thereby regressing this change.
> > > > >
> > > > > We reinstate this functionality by moving the check out of shmem_mmap() and
> > > > > instead performing it in do_mmap() at the point at which VMA flags are
> > > > > being determined, which seems in any case to be a more appropriate place in
> > > > > which to make this determination.
> > > > >
> > > > > In order to achieve this we rework memfd seal logic to allow us access to
> > > > > this information using existing logic and eliminate the clearing of
> > > > > VM_MAYWRITE from seal_check_write() which we are performing in do_mmap()
> > > > > instead.
> > > >
> > > > If we already check is_readonly_sealed() and strip VM_MAYWRITE in
> > > > do_mmap(), without holding any kind of lock or counter on the file
> > > > yet, then this check is clearly racy somehow, right? I think we have a
> > > > race where we intermittently reject shared-readonly mmap() calls?
> > >
> > > Apropos race, some time ago I reported a way to get a mutable mapping
> > > for a write-sealed memfd via a race:
> > >
> > > https://bugzilla.kernel.org/show_bug.cgi?id=219106
> >
> > Kind of hard to read rust code, but it looks like you're intentionally
> > trying to race sealing on the assumption it's atomic when it's not? That
> > doesn't seem like a bug?
> >
> > The intent of sealing memfds is you establish the memfd buffer, then seal
> > it and _only then_ expose it elsewhere.
> >
> > I may be missing something here, however.
>
> AFAIU memfds are supposed to also guarantee *to the recipient* of the
> shared memfd that the memory inside it won't change anymore, so that
> the recipient can parse data out of this shared memory buffer without
> having to worry about the data concurrently changing. udmabuf_create()
> looks like it indeed breaks that assumption by first calling
> check_memfd_seals() and then doing udmabuf_pin_folios() without any
> lock that prevents a seal being added in between those.
>
> That's also why we have memfd_wait_for_pins(), which ensures that
> folios in the memfd don't have elevated refcounts when a F_SEAL_WRITE
> seal is added.
>
> (I believe that's one of the major differences in usecases of
> F_SEAL_WRITE and F_SEAL_FUTURE_WRITE: F_SEAL_FUTURE_WRITE is enough
> for cases where only the creator of the memfd wants to prevent other
> tasks from writing into it, while F_SEAL_WRITE is suitable for cases
> where the creator and recipient of the memfd want mutual protection.)
Those being stupid can also be in kernel code... :) I mean that just looks
like udmabuf is doing something buggy and require a fix which is out of
scope for mm but perhaps worth reporting direct to udmabuf maintainers.
That makes sense re: F_SEAL_WRITE and further makes the case for making it
possible to read-only mmap() these buffers for convenience.
You'd also want to add other seals to ensure it's truly immutable.
next prev parent reply other threads:[~2024-11-28 18:35 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-28 15:06 [PATCH 0/2] " Lorenzo Stoakes
2024-11-28 15:06 ` [PATCH 1/2] " Lorenzo Stoakes
2024-11-28 17:45 ` Jann Horn
2024-11-28 17:58 ` Julian Orth
2024-11-28 18:20 ` Lorenzo Stoakes
2024-11-28 18:27 ` Julian Orth
2024-11-28 18:27 ` Jann Horn
2024-11-28 18:35 ` Lorenzo Stoakes [this message]
2024-11-28 18:05 ` Lorenzo Stoakes
2024-11-28 18:18 ` Jann Horn
2024-11-28 18:27 ` Lorenzo Stoakes
2024-11-28 15:06 ` [PATCH 2/2] selftests/memfd: add test for mapping write-sealed memfd read-only Lorenzo Stoakes
2024-11-29 10:03 ` [PATCH 0/2] mm: reinstate ability to map write-sealed memfd mappings read-only Lorenzo Stoakes
2024-11-29 10:24 ` Andrew Morton
2024-11-29 11:36 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7cde8a1c-28a9-41c3-b568-9198bc1ab662@lucifer.local \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=jannh@google.com \
--cc=ju.orth@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=shuah@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox