From: Dmitry Vyukov <dvyukov@google.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>,
fw@deneb.enyo.de, James.Bottomley@hansenpartnership.com,
Liam.Howlett@oracle.com, akpm@linux-foundation.org,
arnd@arndb.de, brauner@kernel.org, chris@zankel.net,
deller@gmx.de, hch@infradead.org, ink@jurassic.park.msu.ru,
jannh@google.com, jcmvbkbc@gmail.com, jeffxu@chromium.org,
jhubbard@nvidia.com, linux-alpha@vger.kernel.org,
linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-mips@vger.kernel.org, linux-mm@kvack.org,
linux-parisc@vger.kernel.org, mattst88@gmail.com,
muchun.song@linux.dev, paulmck@kernel.org,
richard.henderson@linaro.org, shuah@kernel.org,
sidhartha.kumar@oracle.com, surenb@google.com,
tsbogend@alpha.franken.de, vbabka@suse.cz, willy@infradead.org,
elver@google.com, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 0/5] implement lightweight guard pages
Date: Wed, 23 Oct 2024 10:56:33 +0200 [thread overview]
Message-ID: <CACT4Y+ZE9Zco7KaQoT50aooXCHxhz2N_psTAFtT+ZrH14Si7aw@mail.gmail.com> (raw)
In-Reply-To: <5a3d3bc8-60db-46d0-b689-9aeabcdb8eab@lucifer.local>
On Wed, 23 Oct 2024 at 10:12, Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> +cc Linus as reference a commit of his below...
>
> On Wed, Oct 23, 2024 at 09:19:03AM +0200, David Hildenbrand wrote:
> > On 23.10.24 08:24, Dmitry Vyukov wrote:
> > > Hi Florian, Lorenzo,
> > >
> > > This looks great!
>
> Thanks!
>
> > >
> > > What I am VERY interested in is if poisoned pages cause SIGSEGV even when
> > > the access happens in the kernel. Namely, the syscall still returns EFAULT,
> > > but also SIGSEGV is queued on return to user-space.
>
> Yeah we don't in any way.
>
> I think adding something like this would be a bit of its own project.
I can totally understand this.
> The fault andler for this is in handle_pte_marker() in mm/memory.c, where
> we do the following:
>
> /* Hitting a guard page is always a fatal condition. */
> if (marker & PTE_MARKER_GUARD)
> return VM_FAULT_SIGSEGV;
>
> So basically we pass this back to whoever invoked the fault. For uaccess we
> end up in arch-specific code that eventually checks exception tables
> etc. and for x86-64 that's kernelmode_fixup_or_oops().
>
> There used to be a sig_on_uaccess_err in the x86-specific thread_struct
> that let you propagate it but Linus pulled it out in commit 02b670c1f88e
> ("x86/mm: Remove broken vsyscall emulation code from the page fault code")
> where it was presumably used for vsyscall.
>
> Of course we could just get something much higher up the stack to send the
> signal, but we'd need to be careful we weren't breaking anything doing
> it...
Can setting TIF_NOTIFY_RESUME and then doing the rest when returning
to userspace help here?
> I address GUP below.
>
> > >
> > > Catching bad accesses in system calls is currently the weak spot for
> > > all user-space bug detection tools (GWP-ASan, libefence, libefency, etc).
> > > It's almost possible with userfaultfd, but catching faults in the kernel
> > > requires admin capability, so not really an option for generic bug
> > > detection tools (+inconvinience of userfaultfd setup/handler).
> > > Intercepting all EFAULT from syscalls is not generally possible
> > > (w/o ptrace, usually not an option as well), and EFAULT does not always
> > > mean a bug.
> > >
> > > Triggering SIGSEGV even in syscalls would be not just a performance
> > > optimization, but a new useful capability that would allow it to catch
> > > more bugs.
> >
> > Right, we discussed that offline also as a possible extension to the
> > userfaultfd SIGBUS mode.
> >
> > I did not look into that yet, but I was wonder if there could be cases where
> > a different process could trigger that SIGSEGV, and how to (and if to)
> > handle that.
> >
> > For example, ptrace (access_remote_vm()) -> GUP likely can trigger that. I
> > think with userfaultfd() we will currently return -EFAULT, because we call
> > get_user_page_vma_remote() that is not prepared for dropping the mmap lock.
> > Possibly that is the right thing to do, but not sure :)
That's a good corner case.
I guess also process_vm_readv/writev.
Not triggering the signal in these cases looks like the right thing to do.
> > These "remote" faults set FOLL_REMOTE -> FAULT_FLAG_REMOTE, so we might be
> > able to distinguish them and perform different handling.
>
> So all GUP will return -EFAULT when hitting guard pages unless we change
> something.
>
> In GUP we handle this in faultin_page():
>
> if (ret & VM_FAULT_ERROR) {
> int err = vm_fault_to_errno(ret, flags);
>
> if (err)
> return err;
> BUG();
> }
>
> And vm_fault_to_errno() is:
>
> static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
> {
> if (vm_fault & VM_FAULT_OOM)
> return -ENOMEM;
> if (vm_fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE))
> return (foll_flags & FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
> if (vm_fault & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV))
> return -EFAULT;
> return 0;
> }
>
> Again, I think if we wanted special handling here we'd need to probably
> propagate that fault from higher up, but yes we'd need to for one
> definitely not do so if it's remote but I worry about other cases.
>
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >
>
> Overall while I sympathise with this, it feels dangerous and a pretty major
> change, because there'll be something somewhere that will break because it
> expects faults to be swallowed that we no longer do swallow.
>
> So I'd say it'd be something we should defer, but of course it's a highly
> user-facing change so how easy that would be I don't know.
>
> But I definitely don't think a 'introduce the ability to do cheap PROT_NONE
> guards' series is the place to also fundmentally change how user access
> page faults are handled within the kernel :)
Will delivering signals on kernel access be a backwards compatible
change? Or will we need a different API? MADV_GUARD_POISON_KERNEL?
It's just somewhat painful to detect/update all userspace if we add
this feature in future. Can we say signal delivery on kernel accesses
is unspecified?
next prev parent reply other threads:[~2024-10-23 8:56 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-20 16:20 Lorenzo Stoakes
2024-10-20 16:20 ` [PATCH v2 1/5] mm: pagewalk: add the ability to install PTEs Lorenzo Stoakes
2024-10-21 13:27 ` Vlastimil Babka
2024-10-21 13:50 ` Lorenzo Stoakes
2024-10-20 16:20 ` [PATCH v2 2/5] mm: add PTE_MARKER_GUARD PTE marker Lorenzo Stoakes
2024-10-21 13:45 ` Vlastimil Babka
2024-10-21 19:57 ` Lorenzo Stoakes
2024-10-21 20:42 ` Lorenzo Stoakes
2024-10-21 21:13 ` Lorenzo Stoakes
2024-10-21 21:20 ` Dave Hansen
2024-10-21 14:13 ` Vlastimil Babka
2024-10-21 14:33 ` Lorenzo Stoakes
2024-10-21 14:54 ` Vlastimil Babka
2024-10-21 15:33 ` Lorenzo Stoakes
2024-10-21 15:41 ` Lorenzo Stoakes
2024-10-21 16:00 ` David Hildenbrand
2024-10-21 16:23 ` Lorenzo Stoakes
2024-10-21 16:44 ` David Hildenbrand
2024-10-21 16:51 ` Lorenzo Stoakes
2024-10-21 17:00 ` David Hildenbrand
2024-10-21 17:14 ` Lorenzo Stoakes
2024-10-21 17:21 ` David Hildenbrand
2024-10-21 17:26 ` Vlastimil Babka
2024-10-22 19:13 ` David Hildenbrand
2024-10-20 16:20 ` [PATCH v2 3/5] mm: madvise: implement lightweight guard page mechanism Lorenzo Stoakes
2024-10-21 17:05 ` David Hildenbrand
2024-10-21 17:15 ` Lorenzo Stoakes
2024-10-21 17:23 ` David Hildenbrand
2024-10-21 19:25 ` John Hubbard
2024-10-21 19:39 ` Lorenzo Stoakes
2024-10-21 20:18 ` David Hildenbrand
2024-10-21 20:11 ` Vlastimil Babka
2024-10-21 20:17 ` David Hildenbrand
2024-10-21 20:25 ` Vlastimil Babka
2024-10-21 20:30 ` Lorenzo Stoakes
2024-10-21 20:37 ` David Hildenbrand
2024-10-21 20:49 ` Lorenzo Stoakes
2024-10-21 21:20 ` David Hildenbrand
2024-10-21 21:33 ` Lorenzo Stoakes
2024-10-21 21:35 ` Vlastimil Babka
2024-10-21 21:46 ` Lorenzo Stoakes
2024-10-22 19:18 ` David Hildenbrand
2024-10-21 20:27 ` Lorenzo Stoakes
2024-10-21 20:45 ` Vlastimil Babka
2024-10-22 19:08 ` Jann Horn
2024-10-22 19:35 ` Lorenzo Stoakes
2024-10-22 19:57 ` Jann Horn
2024-10-22 20:45 ` Lorenzo Stoakes
2024-10-20 16:20 ` [PATCH v2 4/5] tools: testing: update tools UAPI header for mman-common.h Lorenzo Stoakes
2024-10-20 16:20 ` [PATCH v2 5/5] selftests/mm: add self tests for guard page feature Lorenzo Stoakes
2024-10-21 21:31 ` Shuah Khan
2024-10-22 10:25 ` Lorenzo Stoakes
2024-10-20 17:37 ` [PATCH v2 0/5] implement lightweight guard pages Florian Weimer
2024-10-20 19:45 ` Lorenzo Stoakes
2024-10-23 6:24 ` Dmitry Vyukov
2024-10-23 7:19 ` David Hildenbrand
2024-10-23 8:11 ` Lorenzo Stoakes
2024-10-23 8:56 ` Dmitry Vyukov [this message]
2024-10-23 9:06 ` Vlastimil Babka
2024-10-23 9:13 ` David Hildenbrand
2024-10-23 9:18 ` Lorenzo Stoakes
2024-10-23 9:29 ` David Hildenbrand
2024-10-23 11:31 ` Marco Elver
2024-10-23 11:36 ` David Hildenbrand
2024-10-23 11:40 ` Lorenzo Stoakes
2024-10-23 9:17 ` Dmitry Vyukov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CACT4Y+ZE9Zco7KaQoT50aooXCHxhz2N_psTAFtT+ZrH14Si7aw@mail.gmail.com \
--to=dvyukov@google.com \
--cc=James.Bottomley@hansenpartnership.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=brauner@kernel.org \
--cc=chris@zankel.net \
--cc=david@redhat.com \
--cc=deller@gmx.de \
--cc=elver@google.com \
--cc=fw@deneb.enyo.de \
--cc=hch@infradead.org \
--cc=ink@jurassic.park.msu.ru \
--cc=jannh@google.com \
--cc=jcmvbkbc@gmail.com \
--cc=jeffxu@chromium.org \
--cc=jhubbard@nvidia.com \
--cc=linux-alpha@vger.kernel.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-parisc@vger.kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mattst88@gmail.com \
--cc=muchun.song@linux.dev \
--cc=paulmck@kernel.org \
--cc=richard.henderson@linaro.org \
--cc=shuah@kernel.org \
--cc=sidhartha.kumar@oracle.com \
--cc=surenb@google.com \
--cc=torvalds@linux-foundation.org \
--cc=tsbogend@alpha.franken.de \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox