linux-mm.kvack.org archive mirror
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Balbir Singh <bsingharora@gmail.com>
Cc: linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org,
	Matthew Wilcox <willy@infradead.org>,
	Mike Rapoport <rppt@linux.ibm.com>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Address space isolation inside the kernel
Date: Sun, 17 Feb 2019 14:20:50 -0800
Message-ID: <1550442050.2809.36.camel@HansenPartnership.com>
In-Reply-To: <20190217220150.GI31125@350D>

On Mon, 2019-02-18 at 09:01 +1100, Balbir Singh wrote:
> On Sun, Feb 17, 2019 at 12:09:06PM -0800, James Bottomley wrote:
> > On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> > > On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > > > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > > > For namespaces, does allocating the right memory protection
> > > > > key work? At some point we'll need to recycle the keys
> > > > 
> > > > I don't think anyone mentioned memory keys and namespaces ... I
> > > > take it you're thinking of SEV/MKTME?
> > > 
> > > I thought he meant Protection Keys
> > > https://en.wikipedia.org/wiki/Memory_protection#Protection_keys
> > 
> > Really?  I wasn't really considering that mainly because in parisc
> > we use them to implement no execute, so they'd have to be
> > repurposed.
> > 
> > > > The idea being to shield one container's execution from another
> > > > using memory encryption?  We've speculated it's possible but
> > > > the actual mechanism we were looking at is tagging pages to
> > > > namespaces (essentially using the mount namespace and tags on
> > > > the page cache) so the kernel would refuse to map a page into
> > > > the wrong namespace.  This approach doesn't seem to be as
> > > > promising as the separated address space one because the
> > > > security properties are harder to measure.
> > > 
> > > What do you mean by "tags on the pages cache"?  Is that different
> > > from the radix tree tags (now renamed to XArray marks), which are
> > > search keys.
> > 
> > Tagging the page cache to namespaces means having a set of mount
> > namespaces per page in the page cache and not allowing placing the
> > page into a VMA unless the owning task's nsproxy is one of the
> > tagged mount namespaces.  The idea was to introduce kernel
> > supported fencing between containers, particularly if they were
> > handling sensitive data, so that if a container used an exploit to
> > map another container's page, the mapping would fail.  However,
> > since sensitive data should be on an encrypted filesystem, it looks
> > like SEV/MKTME coupled with file based encryption might provide a
> > better mechanism.
> > 
> 
> Splitting out this point to a different email, I think being able to
> tag page cache is quite interesting and in the long run might help
> us to get things like mincore() right across shared boundaries.
> 
> But any fencing will come in the way of sharing and density of
> containers. I still don't see how a container can map page cache it
> does not have right permissions to/for? In an ideal world any
> writable pages (sensitive) should ideally go to the writable bits of
> the union mount filesystem which is private to the container (but I
> could be making up things without trying them out)

As I said before, it's about reducing the horizontal attack profile
(HAP).  If the kernel were perfectly free from bugs and exploits,
containment would be perfect and the HAP would be zero.  In the real
world, where the kernel is trusted (it's your kernel) but potentially
vulnerable (it's not free from possibly exploitable defects), the HAP
is non-zero and the question becomes how do you prevent one tenant from
exploiting a defect to interfere with or exfiltrate data from another
tenant.

The idea behind page tagging is that modern techniques (like ROP
attacks) use existing code sequences within the kernel to perform the
exploit, so if all code sequences that map pages contain tag guards, the
defences against one container accessing another's pages remain in place
even in the face of exploits.
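
To make the tag guard idea concrete, here is a minimal userspace
sketch in C. It is not kernel code and not a proposed patch: every
type and function name in it (page_model, task_model, page_tag_add,
map_page_guarded, MAX_PAGE_TAGS) is hypothetical, and it only models
the check described above, where a page carries a set of mount
namespaces and a mapping is refused unless the task's mount namespace
is in that set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGE_TAGS 4 /* hypothetical cap on namespaces tagged per page */

/* Toy stand-ins for kernel objects; all names are assumptions. */
struct mnt_ns_model {
	unsigned int id;
};

struct task_model {
	const struct mnt_ns_model *mnt_ns; /* models task->nsproxy->mnt_ns */
};

struct page_model {
	const struct mnt_ns_model *tags[MAX_PAGE_TAGS]; /* namespaces allowed to map */
	size_t ntags;
	bool mapped;
};

/* Tag a page with a mount namespace. Returns 0, or -ENOSPC if the set is full. */
static int page_tag_add(struct page_model *page, const struct mnt_ns_model *ns)
{
	assert(page && ns);
	if (page->ntags == MAX_PAGE_TAGS)
		return -ENOSPC;
	page->tags[page->ntags++] = ns;
	return 0;
}

/* The tag guard: true only if the task's mount namespace is tagged on the page. */
static bool page_tag_allows(const struct page_model *page,
			    const struct task_model *task)
{
	for (size_t i = 0; i < page->ntags; i++)
		if (page->tags[i] == task->mnt_ns)
			return true;
	return false;
}

/*
 * Guarded mapping path. In this model the guard sits inside the only
 * function that sets page->mapped, so reusing this code sequence still
 * runs the check. Returns 0 on success, -EPERM if the guard refuses.
 */
static int map_page_guarded(struct page_model *page,
			    const struct task_model *task)
{
	if (!page_tag_allows(page, task))
		return -EPERM;
	page->mapped = true;
	return 0;
}
```

With a page tagged only for container A's mount namespace, a call to
map_page_guarded() on behalf of a task in container B returns -EPERM
and leaves the page unmapped, while the same call for a task in
container A succeeds.  The sketch says nothing about how hard it
would be to cover every real mapping path, which is the difficult
part.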

James

