linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "Adalbert Lazăr" <alazar@bitdefender.com>,
	Linux-MM <linux-mm@kvack.org>,
	"Linux API" <linux-api@vger.kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Alexander Graf" <graf@amazon.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Jerome Glisse" <jglisse@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Mircea Cirjaliu" <mcirjaliu@bitdefender.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Sargun Dhillon" <sargun@sargun.me>,
	"Aleksa Sarai" <cyphar@cyphar.com>,
	"Oleg Nesterov" <oleg@redhat.com>, "Jann Horn" <jannh@google.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Matthew Wilcox" <willy@infradead.org>
Subject: Re: [RESEND RFC PATCH 0/5] Remote mapping
Date: Mon, 7 Sep 2020 13:43:48 -0700	[thread overview]
Message-ID: <CALCETrUSUp_7svg8EHNTk3nQ0x9sdzMCU=h8G-Sy6=SODq5GHg@mail.gmail.com> (raw)
In-Reply-To: <20200907150547.hst4luvrpntdb3lr@wittgenstein>

On Mon, Sep 7, 2020 at 8:05 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Fri, Sep 04, 2020 at 02:31:11PM +0300, Adalbert Lazăr wrote:
> > This patchset adds support for the remote mapping feature.
> > Remote mapping, as its name suggests, is a means for transparent and
> > zero-copy access of a remote process' address space.
> > access of a remote process' address space.
> >
> > The feature was designed according to a specification suggested by
> > Paolo Bonzini:
> > >> The proposed API is a new pidfd system call, through which the parent
> > >> can map portions of its virtual address space into a file descriptor
> > >> and then pass that file descriptor to a child.
> > >>
> > >> This should be:
> > >>
> > >> - upstreamable, pidfd is the new cool thing and we could sell it as a
> > >> better way to do PTRACE_{PEEK,POKE}DATA
>
> In all honesty, that sentence made me a bit uneasy as it reads like this
> is implemented on top of pidfds because it makes it more likely to go
> upstream not because it is the right design. To be clear, I'm not
> implying any sort of malicious intent on your part but I would suggest
> to phrase this a little better. :)


I thought about this whole thing some more, and here are some thoughts.

First, I was nervous about two things.  One was faulting in pages from
the wrong context.  (When a normal page fault or KVM faults in a page,
the mm is loaded.  (In the KVM case, the mm is sort of not loaded when
the actual fault happens, but the mm is loaded when the fault is
handled, I think.  Maybe there are workqueues involved and I'm wrong.)
 When a remote mapping faults in a page, the mm is *not* loaded.)
This ought not to be a problem, though -- get_user_pages_remote() also
faults in pages from a non-current mm, and that's at least supposed to
work correctly, so maybe this is okay.

Second is recursion.  I think this is a genuine problem.

And I think that tying this to pidfds is the wrong approach.  In fact,
tying it to processes at all seems wrong.  There is a lot of demand
for various forms of memory isolation in which memory is mapped only
by its intended user.  Using something tied to a process mm gets in
the way of this in the same way that KVM's current mapping model gets
in the way.

All that being said, I think the whole idea of making fancy address
spaces composed from other mappable objects is neat and possibly quite
useful.  And, if you squint a bit, this is a lot like what KVM does
today.

So I suggest something that may be more generally useful as an
alternative.  This is a sketch and very subject to bikeshedding:

Create an empty address space:

int address_space_create(int flags, etc);

Map an fd into an address space:

int address_space_mmap(int asfd, int fd_to_map, offset, size, prot,
...);  /* might run out of args here */

Unmap from an address space:

int address_space_munmap(int asfd, unsigned long addr, unsigned long len);

Stick an address space into KVM:

ioctl(vmfd, KVM_MAP_ADDRESS_SPACE, asfd);  /* or similar */

Maybe some day allow mapping an address space into a process.

mmap(..., asfd, ...);


And at least for now, there's a rule that an address space that is
address_space_mmapped into an address space is disallowed.


Maybe some day we also allow mremap(), madvise(), etc.  And maybe some
day we allow creating a special address_space that represents a real
process's address space.


Under the hood, an address_space could own an mm_struct that is not
used by any tasks.  And we could have special memfds that are bound to
a VM such that all you can do with them is stick them into an
address_space and map that address_space into the VM in question.  For
this to work, we would want a special vm_operation for mapping into a
VM.


What do you all think?  Is this useful?  Does it solve your problems?
Is it a good approach going forward?


  reply	other threads:[~2020-09-07 20:44 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-04 11:31 Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 1/5] mm: add atomic capability to zap_details Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 2/5] mm: let the VMA decide how zap_pte_range() acts on mapped pages Adalbert Lazăr
2020-09-04 11:31 ` [RESEND RFC PATCH 3/5] mm/mmu_notifier: remove lockdep map, allow mmu notifier to be used in nested scenarios Adalbert Lazăr
2020-09-04 12:03   ` Jason Gunthorpe
2020-09-04 11:31 ` [RESEND RFC PATCH 4/5] mm/remote_mapping: use a pidfd to access memory belonging to unrelated process Adalbert Lazăr
2020-09-04 17:55   ` Oleg Nesterov
2020-09-07 14:30   ` Oleg Nesterov
2020-09-07 15:16     ` Adalbert Lazăr
2020-09-09  8:32     ` Mircea CIRJALIU - MELIU
2020-09-10 16:43       ` Oleg Nesterov
2020-09-07 15:02   ` Christian Brauner
2020-09-07 16:04     ` Mircea CIRJALIU - MELIU
2020-09-04 11:31 ` [RESEND RFC PATCH 5/5] pidfd_mem: implemented remote memory mapping system call Adalbert Lazăr
2020-09-04 19:18   ` Florian Weimer
2020-09-07 14:55   ` Christian Brauner
2020-09-04 12:11 ` [RESEND RFC PATCH 0/5] Remote mapping Jason Gunthorpe
2020-09-04 13:24   ` Mircea CIRJALIU - MELIU
2020-09-04 13:39     ` Jason Gunthorpe
2020-09-04 14:18       ` Mircea CIRJALIU - MELIU
2020-09-04 14:39         ` Jason Gunthorpe
2020-09-04 15:40           ` Mircea CIRJALIU - MELIU
2020-09-04 16:11             ` Jason Gunthorpe
2020-09-04 19:41   ` Matthew Wilcox
2020-09-04 19:49     ` Jason Gunthorpe
2020-09-04 20:08     ` Paolo Bonzini
2020-12-01 18:01     ` Jason Gunthorpe
2020-09-04 19:19 ` Florian Weimer
2020-09-04 20:18   ` Paolo Bonzini
2020-09-07  8:33     ` Christian Brauner
2020-09-04 19:39 ` Andy Lutomirski
2020-09-04 20:09   ` Paolo Bonzini
2020-09-04 20:34     ` Andy Lutomirski
2020-09-04 21:58       ` Paolo Bonzini
2020-09-04 23:17         ` Andy Lutomirski
2020-09-05 18:27           ` Paolo Bonzini
2020-09-07  8:38             ` Christian Brauner
2020-09-07 12:41           ` Mircea CIRJALIU - MELIU
2020-09-07  7:05         ` Christoph Hellwig
2020-09-07  8:44           ` Paolo Bonzini
2020-09-07 10:25   ` Mircea CIRJALIU - MELIU
2020-09-07 15:05 ` Christian Brauner
2020-09-07 20:43   ` Andy Lutomirski [this message]
2020-09-09 11:38     ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUSUp_7svg8EHNTk3nQ0x9sdzMCU=h8G-Sy6=SODq5GHg@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alazar@bitdefender.com \
    --cc=arnd@arndb.de \
    --cc=christian.brauner@ubuntu.com \
    --cc=cyphar@cyphar.com \
    --cc=graf@amazon.com \
    --cc=jannh@google.com \
    --cc=jglisse@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcirjaliu@bitdefender.com \
    --cc=mdontu@bitdefender.com \
    --cc=oleg@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=sargun@sargun.me \
    --cc=stefanha@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox