Re: [LSF/MM TOPIC] Address space isolation inside the kernel

From: Kees Cook <keescook@chromium.org>
To: Mike Rapoport <rppt@linux.ibm.com>
Cc: lsf-pc@lists.linux-foundation.org, Linux-MM <linux-mm@kvack.org>,
	 James Bottomley <James.Bottomley@hansenpartnership.com>
Subject: Re: [LSF/MM TOPIC] Address space isolation inside the kernel
Date: Thu, 14 Feb 2019 11:21:30 -0800
Message-ID: <CAGXu5jKLBYMZ-cHyp4m_9TO1gAF2cQDVKu-XyH4i3aj7MqRCnA@mail.gmail.com>
In-Reply-To: <20190207072421.GA9120@rapoport-lnx>

On Wed, Feb 6, 2019 at 11:24 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> Address space isolation has been used to protect the kernel from the
> userspace and userspace programs from each other since the invention of
> the virtual memory.

Well, traditionally the kernel's protection has been one-sided: we've
left userspace mapped while in the kernel, which has led to countless
exploits. SMEP/SMAP (or similar for other architectures, like ARM's
PXN/PAN) have finally mitigated that, but we're still left with a lot
of older machines (and other architectures) that would benefit from
unmapping the userspace while in the kernel.

> Assuming that kernel bugs and therefore vulnerabilities are inevitable
> it might be worth isolating parts of the kernel to minimize damage
> that these vulnerabilities can cause.

Yes please. :) Two cases jump to mind:

1) Make regions unwritable to avoid write-anywhere data modification
attacks. For code and rodata, this is already done with regular page
table bits making them read-only for the entire lifetime of the
kernel. For areas that need writing but are sensitive (e.g. the page
tables themselves, and generally function pointer tables), there needs
to be a way to keep modifications isolated to given code (to block
write-anywhere attacks), keeping them read-only through all other
accesses. This could be done with per-CPU page tables, a faster
version of the "write rarely" patch set[1], or maybe with the kernel
text poking (mentioned in your email). Attacking the page tables
directly is now the common way to gain execute control on the kernel,
since so much of the rest of memory is locked down[2]. How can we keep
page tables read-only except for when the page table code needs to
write to them?

2) Make a region unreadable to avoid read-anywhere memory disclosure
attacks. This means it's either unmapped (for both data and code cases)
or we gain execute-not-read hardware bits (for code cases). Unmapping
code means a reduction in ROP gadgets, unmapping data means reduction
in memory disclosure surface. Note that while both coarse (CET) and
fine-grain (function-prototype-checking) CFI vastly reduce the
availability of ROP gadgets, the kernel still has a lot of functions
that return void and take a single unsigned long, so anything to
remove more code from visibility is good.

> There is already ongoing work in a similar direction, like XPFO [1] and
> temporary mappings proposed for the kernel text poking [2].
>
> We have several vague ideas how we can take this even further and make
> different parts of kernel run in different address spaces:
> * Remove most of the kernel mappings from the syscall entry and add a
>   trampoline when the syscall processing needs to call the "core
>   kernel".

Defining this boundary may be very tricky, but maybe the same logic
used for CFI and function graph analysis could be used to find the
existing bright lines between code regions...

> * Make the parts of the kernel that execute in a namespace use their
>   own mappings for the namespace private data
> * Extend EXPORT_SYMBOL to include a trampoline so that the code
>   running in modules won't map the entire kernel
> * Execute BPF programs in a dedicated address space

Pushing drivers into isolated regions would be very interesting. If it
needs context-switching, though, we're headed to microkernel fun.

> These are very general possible directions. We are exploring some of
> them now to understand if the security value is worth the complexity
> and the performance impact.
>
> We believe it would be helpful to discuss the general idea of address
> space isolation inside the kernel, both from the technical aspect of
> how it can be achieved simply and efficiently and from the isolation
> aspect of what actual security guarantees it usefully provides.
>
> [1] https://lore.kernel.org/lkml/cover.1547153058.git.khalid.aziz@oracle.com/
> [2] https://lore.kernel.org/lkml/20190129003422.9328-4-rick.p.edgecombe@intel.com/

I won't be able to make it to the conference, but I'm very interested
in finding ways forward on this topic. :)

-Kees

[1] https://patchwork.kernel.org/project/kernel-hardening/list/?series=79855
[2] https://www.blackhat.com/docs/asia-18/asia-18-WANG-KSMA-Breaking-Android-kernel-isolation-and-Rooting-with-ARM-MMU-features.pdf

-- 
Kees Cook

