From: Ingo Molnar <mingo@kernel.org>
To: Dave Hansen <dave@sr71.net>
Cc: Andy Lutomirski <luto@amacapital.net>,
Kees Cook <keescook@google.com>,
"x86@kernel.org" <x86@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH 26/26] x86, pkeys: Documentation
Date: Sat, 3 Oct 2015 09:27:55 +0200
Message-ID: <20151003072755.GA23524@gmail.com>
In-Reply-To: <560EC3EC.2080803@sr71.net>
* Dave Hansen <dave@sr71.net> wrote:
> On 10/01/2015 11:23 PM, Ingo Molnar wrote:
> >> > Also, how do we do mprotect_pkey and say "don't change the key"?
> >
> > So if we start managing keys as a resource (i.e. alloc/free up to 16 of them),
> > and provide APIs for user-space to do all that, then user-space is not
> > supposed to touch keys it has not allocated for itself - just like it's not
> > supposed to write to fds it has not opened.
>
> I like that. It gives us at least a "soft" indicator to userspace about what
> keys it should or shouldn't be using.
Yes. A 16-bit allocation bitmap would solve this nicely.
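To make the bitmap idea concrete, here is a minimal user-space sketch, not kernel code. The names struct pkey_map and pkey_map_*() are hypothetical, as are the -ENOSPC and -EINVAL return values; it also leaves open which keys the kernel would reserve for itself. The is_allocated helper is the kind of check an mprotect_pkey() path could run before accepting a key.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_PKEYS 16

/* Hypothetical per-mm state: bit N set means pkey N is allocated. */
struct pkey_map {
	uint16_t allocated;
};

static_assert(NR_PKEYS <= 8 * sizeof(uint16_t), "bitmap must cover every pkey");

/* Hand out the lowest free key, or -ENOSPC if all 16 are in use. */
int pkey_map_alloc(struct pkey_map *map)
{
	for (int pkey = 0; pkey < NR_PKEYS; pkey++) {
		uint16_t bit = (uint16_t)(1u << pkey);

		if (!(map->allocated & bit)) {
			map->allocated |= bit;
			return pkey;
		}
	}
	return -ENOSPC;
}

/* Nonzero if the key is in range and currently allocated. */
int pkey_map_is_allocated(const struct pkey_map *map, int pkey)
{
	if (pkey < 0 || pkey >= NR_PKEYS)
		return 0;
	return (map->allocated >> pkey) & 1;
}

/* Release a key; -EINVAL (an assumed error code) if it was not allocated. */
int pkey_map_free(struct pkey_map *map, int pkey)
{
	if (!pkey_map_is_allocated(map, pkey))
		return -EINVAL;
	map->allocated &= (uint16_t)~(1u << pkey);
	return 0;
}
```

Lowest-free-first allocation is only one possible policy; the point is that the whole bookkeeping fits in the 16 bits discussed here.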
> > Such an allocation method can still 'mess up', and if the kernel allocates a key
> > for its purposes it should not assume that user-space cannot change it, but at
> > least for non-buggy code there's no interaction and it would work out fine.
>
> Yeah. It also provides a clean interface so that future hardware could
> enforce kernel "ownership" of a key which could protect against
> even buggy code.
>
> So, we add a pair of syscalls,
>
> unsigned long sys_alloc_pkey(unsigned long flags??)
> unsigned long sys_free_pkey(unsigned long pkey)
>
> keep the metadata in the mm, and then make sure that userspace allocated
> it before it is allowed to do an mprotect_pkey() with it.
Yeah, so such an interface would allow the clean, transparent usage of pkeys for
pure PROT_EXEC mappings.
I'd expect the --x/PROT_EXEC mappings to be _by far_ more frequently used than
pure pkeys - but we still need the management interface to keep the kernel's use
of pkeys separate from user-space's use.
If all the necessary tooling changes are propagated through, then in fact I'd
expect every pkeys-capable Linux system to use pkeys, for almost every user-space
task.
To have maximum future flexibility for pkeys I'd suggest the following additional
changes to the syscall ABI:
- Please name them with a pkey_ prefix, along the sys_pkey_* nomenclature, so
  that it becomes an easily identified 'family' of system calls.
- I'd also suggest providing an initial value with the 'alloc' call. It's true
  that user-space can do this itself in assembly, OTOH there's no reason not to
  provide a C interface for this.
- Make the pkey identifier 'int', not 'long', like fds are. There's very little
  expectation to ever have more than 4 billion pkeys per mm, right?
- How far do we want the kernel to manage this? Any reason we don't want a
  'set pkey' operation, if user-space wants to use pure C interfaces? That could
  be vDSO accelerated as well, to use the unprivileged op. An advantage of such
  an interface would be that it would enable the kernel to more actively manage
  the actual mappings as well in the future: for example to automatically not
  allow accidental RWX mappings. Such an interface would also allow the future
  introduction of privileged pkey mappings on the hardware side, without having
  to change user-space, since everything goes via the kernel interface.
- Along similar considerations, also add a sys_pkey_query() system call to query
  the mapping of a specific pkey. (returns -EBADF or so if the key is not mapped
  at the moment.) This too could be vDSO accelerated in the future.
I.e. something like:
    unsigned long sys_pkey_alloc (unsigned long flags, unsigned long init_val)
    unsigned long sys_pkey_set (int pkey, unsigned long new_val)
    unsigned long sys_pkey_get (int pkey)
    unsigned long sys_pkey_free (int pkey)
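As a rough illustration of how these four calls could behave, here is a user-space model, not an implementation. It assumes that init_val and new_val are the per-key rights field: on x86 the PKRU register holds two bits per key, an access-disable bit and a write-disable bit. The model_* names, the simulated pkru word, and every error code other than the -EBADF suggested above are assumptions. The query call is modelled as 'get', matching the prototype list, and the flags argument is omitted. A real implementation would go through the PKRU register rather than a struct field.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_PKEYS        16
#define PKEY_BITS       2       /* PKRU holds two bits per key */
#define MODEL_PKEY_AD   0x1UL   /* access-disable bit of a key's field */
#define MODEL_PKEY_WD   0x2UL   /* write-disable bit of a key's field */
#define MODEL_PKEY_MASK (MODEL_PKEY_AD | MODEL_PKEY_WD)

/* Hypothetical per-task state: allocation bitmap plus a simulated PKRU. */
struct pkey_state {
	uint16_t allocated;
	uint32_t pkru;
};

static int pkey_allocated(const struct pkey_state *st, int pkey)
{
	return pkey >= 0 && pkey < NR_PKEYS && ((st->allocated >> pkey) & 1);
}

/* Overwrite the two-bit field that belongs to @pkey. */
static void pkru_write_field(struct pkey_state *st, int pkey, unsigned long val)
{
	unsigned int shift = (unsigned int)pkey * PKEY_BITS;

	assert(pkey >= 0 && pkey < NR_PKEYS && !(val & ~MODEL_PKEY_MASK));
	st->pkru &= ~((uint32_t)MODEL_PKEY_MASK << shift);
	st->pkru |= (uint32_t)val << shift;
}

/* Allocate the lowest free key and give it its initial rights value. */
int model_pkey_alloc(struct pkey_state *st, unsigned long init_val)
{
	if (init_val & ~MODEL_PKEY_MASK)
		return -EINVAL;
	for (int pkey = 0; pkey < NR_PKEYS; pkey++) {
		if (!pkey_allocated(st, pkey)) {
			st->allocated |= (uint16_t)(1u << pkey);
			pkru_write_field(st, pkey, init_val);
			return pkey;
		}
	}
	return -ENOSPC;
}

/* Change the rights of a key the caller has allocated. */
int model_pkey_set(struct pkey_state *st, int pkey, unsigned long new_val)
{
	if (!pkey_allocated(st, pkey))
		return -EBADF;
	if (new_val & ~MODEL_PKEY_MASK)
		return -EINVAL;
	pkru_write_field(st, pkey, new_val);
	return 0;
}

/* Query a key: its two rights bits, or -EBADF if it is not allocated. */
long model_pkey_get(const struct pkey_state *st, int pkey)
{
	if (!pkey_allocated(st, pkey))
		return -EBADF;
	return (long)((st->pkru >> (pkey * PKEY_BITS)) & MODEL_PKEY_MASK);
}

/* Release a key; later set/get calls on it fail with -EBADF. */
int model_pkey_free(struct pkey_state *st, int pkey)
{
	if (!pkey_allocated(st, pkey))
		return -EBADF;
	st->allocated &= (uint16_t)~(1u << pkey);
	return 0;
}
```

Because callers only ever see the key number and the rights value, the backing store could later become a kernel-managed table without changing this interface.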
Optional suggestion:
- _Maybe_ also allow 'remotely managed' setup of pkeys, i.e. for non-local tasks -
  but I'm not sure about that: it looks expensive and complex, and a TID argument
  can always be added later if there's some real need.
> That should be pretty easy to implement. The only real overhead is the 16 bits
> we need to keep in the mm somewhere.
Yes.
Note that if we use the C syscall interface suggestions I outlined above, we could
in the future also switch to a full table and manage it explicitly - without
user-space changes - if the hardware side is tweaked to allow kernel side pkeys.
Thanks,
Ingo