From: Linus Torvalds <torvalds@linux-foundation.org>
To: Simon Ser <contact@emersion.fr>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Peter Xu <peterx@redhat.com>, Will Deacon <will@kernel.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
David Herrmann <dh.herrmann@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Greg Kroah-Hartman <greg@kroah.com>,
"tytso@mit.edu" <tytso@mit.edu>
Subject: Re: Sealed memfd & no-fault mmap
Date: Tue, 27 Apr 2021 09:51:58 -0700
Message-ID: <CAHk-=wgmGv2EGscKSi8SrQWtEVpEQyk-ZN1Xj4EoAB87Dmx1gA@mail.gmail.com>
In-Reply-To: <vs1Us2sm4qmfvLOqNat0-r16GyfmWzqUzQ4KHbXJwEcjhzeoQ4sBTxx7QXDG9B6zk5AeT7FsNb3CSr94LaKy6Novh1fbbw8D_BBxYsbPLms=@emersion.fr>
On Tue, Apr 27, 2021 at 1:25 AM Simon Ser <contact@emersion.fr> wrote:
>
> Rather than requiring changes in all compositors *and* clients, can we
> maybe only require changes in compositors? For instance, OpenBSD has a
> __MAP_NOFAULT flag. When passed to mmap, it means that out-of-bound
> accesses will read as zeroes instead of triggering SIGBUS. Such a flag
> would be very helpful to unblock the annoying SIGBUS situation.
>
> Would something among these lines be welcome in the Linux kernel?

Hmm. It doesn't look too hard to do. The biggest problem is actually
that we've run out of flags in the vma (on 32-bit architectures), but
you could try this UNTESTED patch that just does the MAP_NOFAULT thing
unconditionally.
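
For reference, the SIGBUS behaviour that prompted the question can be reproduced from user space on an unpatched kernel. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name `read_past_eof_raises_sigbus` and the temporary file path are hypothetical and not taken from the thread. It maps two pages of a one-page file and reads from the second page, which lies wholly beyond end-of-file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;

/* Jump back out of the faulting read instead of letting the process die. */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_jmp, 1);
}

/*
 * Map two pages of a one-page temporary file with the given mapping type
 * (MAP_PRIVATE or MAP_SHARED) and read one byte from the second page.
 * Returns 1 if the read raised SIGBUS, 0 if it succeeded, -1 on setup failure.
 */
int read_past_eof_raises_sigbus(int map_flags)
{
	char path[] = "/tmp/nofault-demo-XXXXXX";	/* hypothetical name */
	struct sigaction sa, old_sa;
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *map;
	volatile int result = -1;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* The file is exactly one page long ... */
	if (ftruncate(fd, page) != 0) {
		close(fd);
		return -1;
	}

	/* ... but the mapping covers two pages. */
	map = mmap(NULL, 2 * page, PROT_READ, map_flags, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old_sa);

	if (sigsetjmp(sigbus_jmp, 1) == 0) {
		(void)map[page];	/* volatile read past EOF */
		result = 0;
	} else {
		result = 1;		/* arrived here via siglongjmp() */
	}

	sigaction(SIGBUS, &old_sa, NULL);
	munmap((void *)map, 2 * page);
	close(fd);
	return result;
}
```

Calling it with either `MAP_PRIVATE` or `MAP_SHARED` is expected to report a SIGBUS on a kernel without any no-fault support, which is the situation the proposed flag is meant to avoid.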

NOTE! Not only is it untested, not only is this a "for your testing
only" (because it does it unconditionally rather than only for
__MAP_NOFAULT), but it might be bogus for other reasons. In
particular, this patch depends on "vmf->address" not being changed by
the ->fault() infrastructure, so that we can just re-use the vmf for
the anonymous case if we get a SIGBUS.

I think that's all ok these days, because Kirill and Peter Xu cleaned
up those paths, but I didn't actually check. So I'm cc'ing Kirill,
Peter and Will, who have been working in this area for other reasons
fairly recently.

Side note: this will only ever work for non-shared mappings. That's
fundamental. We won't add an anonymous page to a shared mapping, and
do_anonymous_page() does verify that. So a MAP_SHARED mapping will
still return SIGBUS even with this patch (although it's not obvious
from the patch - the VM_FAULT_SIGBUS will just be re-created by
do_anonymous_page()).

So if you want a _shared_ mapping to honor __MAP_NOFAULT and insert
random anonymous pages into it, I think the answer is "no, that's not
going to be viable".
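
The private-versus-shared outcome described above can be modelled in plain C. This is only a sketch of the control flow, not kernel code: the enum, the struct and both function names are hypothetical, the real `vm_fault_t` is a bitmask rather than an enum, and the `nofault` field stands in for the VM_NOFAULT bit that does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's vm_fault_t bitmask. */
enum fault_result { FAULT_OK, FAULT_RETRY, FAULT_SIGBUS };

/* Hypothetical model of the vma properties that matter here. */
struct model_vma {
	bool anonymous;	/* vma_is_anonymous() */
	bool shared;	/* MAP_SHARED mapping */
	bool nofault;	/* the missing VM_NOFAULT bit */
};

/* Models do_anonymous_page(): it refuses to populate a shared mapping. */
static enum fault_result model_anonymous_page(const struct model_vma *vma)
{
	return vma->shared ? FAULT_SIGBUS : FAULT_OK;
}

/*
 * Models the "no pte yet" branch of handle_pte_fault() with the patch
 * applied. file_fault is what do_fault() would have returned for a
 * file-backed vma; it is ignored for anonymous vmas.
 */
enum fault_result model_missing_pte_fault(const struct model_vma *vma,
					  enum fault_result file_fault)
{
	if (!vma->anonymous) {
		/* Success and retry are passed through untouched. */
		if (file_fault != FAULT_SIGBUS)
			return file_fault;
		/* Without the flag, SIGBUS stays SIGBUS. */
		if (!vma->nofault)
			return FAULT_SIGBUS;
	}
	/* Fall back on an anonymous page instead of SIGBUS. */
	return model_anonymous_page(vma);
}
```

With `nofault` set, a private file mapping whose `do_fault()` result is SIGBUS ends up as `FAULT_OK`, while the same fault on a shared mapping is still `FAULT_SIGBUS` because the anonymous-page stand-in recreates it.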

So _if_ this works for you, and if it's ok that only MAP_PRIVATE can
have __MAP_NOFAULT, and if Kirill/Peter/Will don't say "Oh, Linus,
you're completely off your rocker and clearly need to be taking your
meds", something like this - if we figure out the conditional bit -
might be doable.

That's a fair number of "ifs".
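
To make the requested "reads as zeroes" semantics concrete for a MAP_PRIVATE mapping, they can be approximated in user space today with a SIGBUS handler that overlays the faulting page with an anonymous one. This is a different technique from the kernel patch, sketched here under several assumptions: Linux, a writable /tmp, and a handler that trusts every SIGBUS address (real code would first check that the address belongs to a mapping it owns). All names are hypothetical. Note also that mmap() is not on the POSIX list of async-signal-safe functions, although it is a plain system call on Linux.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;

/* Replace the faulting page with a zero-filled anonymous page. */
static void zero_fill_on_sigbus(int sig, siginfo_t *info, void *ctx)
{
	uintptr_t addr = (uintptr_t)info->si_addr &
			 ~((uintptr_t)page_size - 1);

	(void)sig;
	(void)ctx;
	if (mmap((void *)addr, page_size, PROT_READ,
		 MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) == MAP_FAILED)
		_exit(1);
	/* Returning restarts the faulting access, which now succeeds. */
}

/*
 * Map two pages of a one-page file privately, read one byte from the
 * second page, and return that byte. Returns -1 on setup failure.
 */
int read_past_eof_with_zero_fill(void)
{
	char path[] = "/tmp/nofault-emul-XXXXXX";	/* hypothetical name */
	struct sigaction sa, old_sa;
	volatile unsigned char *map;
	int fd, value;

	page_size = sysconf(_SC_PAGESIZE);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, page_size) != 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, 2 * page_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = zero_fill_on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old_sa);

	value = map[page_size];	/* faults once, then reads the new page */

	sigaction(SIGBUS, &old_sa, NULL);
	munmap((void *)map, 2 * page_size);
	close(fd);
	return value;
}
```

The byte read from beyond end-of-file comes from the freshly mapped anonymous page, so the function is expected to return 0 rather than terminate the process.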

Ok, back to the merge window for me, I'll be throwing away this crazy
untested patch immediately after hitting "send". This is very much a
"throw the idea over to other people" patch, in other words.

                Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 904 bytes --]
mm/memory.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 550405fc3b5e..bbede6b52f7a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4312,10 +4312,21 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 	}
 
 	if (!vmf->pte) {
-		if (vma_is_anonymous(vmf->vma))
-			return do_anonymous_page(vmf);
-		else
-			return do_fault(vmf);
+		if (!vma_is_anonymous(vmf->vma)) {
+			vm_fault_t ret = do_fault(vmf);
+			if (ret & VM_FAULT_RETRY)
+				return ret;
+			if (!(ret & VM_FAULT_SIGBUS))
+				return ret;
+/* FIXME! We don't have a VM_NOFAULT bit */
+#if 0
+			/* See if we should turn a SIGBUS into an anonymous page */
+			if (!(vmf->vma->vm_flags & VM_NOFAULT))
+				return ret;
+#endif
+/* Fall back on do_anonymous_page() instead of SIGBUS */
+		}
+		return do_anonymous_page(vmf);
 	}
 
 	if (!pte_present(vmf->orig_pte))