From: Andy Lutomirski <luto@kernel.org>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "Adalbert Lazăr" <alazar@bitdefender.com>,
Linux-MM <linux-mm@kvack.org>,
"Linux API" <linux-api@vger.kernel.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Alexander Graf" <graf@amazon.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Jerome Glisse" <jglisse@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Mihai Donțu" <mdontu@bitdefender.com>,
"Mircea Cirjaliu" <mcirjaliu@bitdefender.com>,
"Andy Lutomirski" <luto@kernel.org>,
"Arnd Bergmann" <arnd@arndb.de>,
"Sargun Dhillon" <sargun@sargun.me>,
"Aleksa Sarai" <cyphar@cyphar.com>,
"Oleg Nesterov" <oleg@redhat.com>, "Jann Horn" <jannh@google.com>,
"Kees Cook" <keescook@chromium.org>,
"Matthew Wilcox" <willy@infradead.org>
Subject: Re: [RESEND RFC PATCH 0/5] Remote mapping
Date: Mon, 7 Sep 2020 13:43:48 -0700
Message-ID: <CALCETrUSUp_7svg8EHNTk3nQ0x9sdzMCU=h8G-Sy6=SODq5GHg@mail.gmail.com>
In-Reply-To: <20200907150547.hst4luvrpntdb3lr@wittgenstein>
On Mon, Sep 7, 2020 at 8:05 AM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Fri, Sep 04, 2020 at 02:31:11PM +0300, Adalbert Lazăr wrote:
> > This patchset adds support for the remote mapping feature.
> > Remote mapping, as its name suggests, is a means for transparent and
> > zero-copy access of a remote process' address space.
> >
> > The feature was designed according to a specification suggested by
> > Paolo Bonzini:
> > >> The proposed API is a new pidfd system call, through which the parent
> > >> can map portions of its virtual address space into a file descriptor
> > >> and then pass that file descriptor to a child.
> > >>
> > >> This should be:
> > >>
> > >> - upstreamable, pidfd is the new cool thing and we could sell it as a
> > >> better way to do PTRACE_{PEEK,POKE}DATA
>
> In all honesty, that sentence made me a bit uneasy as it reads like this
> is implemented on top of pidfds because it makes it more likely to go
> upstream not because it is the right design. To be clear, I'm not
> implying any sort of malicious intent on your part but I would suggest
> to phrase this a little better. :)

I thought about this whole thing some more, and here are some thoughts.

First, I was nervous about two things. One was faulting in pages from
the wrong context. When a normal page fault or KVM faults in a page,
the mm is loaded. (In the KVM case, the mm is sort of not loaded when
the actual fault happens, but the mm is loaded when the fault is
handled, I think. Maybe there are workqueues involved and I'm wrong.)
When a remote mapping faults in a page, the mm is *not* loaded.
This ought not to be a problem, though -- get_user_pages_remote() also
faults in pages from a non-current mm, and that's at least supposed to
work correctly, so maybe this is okay.

The second is recursion. I think this is a genuine problem.

And I think that tying this to pidfds is the wrong approach. In fact,
tying it to processes at all seems wrong. There is a lot of demand
for various forms of memory isolation in which memory is mapped only
by its intended user. Using something tied to a process mm gets in
the way of this in the same way that KVM's current mapping model gets
in the way.

All that being said, I think the whole idea of making fancy address
spaces composed from other mappable objects is neat and possibly quite
useful. And, if you squint a bit, this is a lot like what KVM does
today.

So I suggest something that may be more generally useful as an
alternative. This is a sketch and very subject to bikeshedding:

Create an empty address space:

    int address_space_create(int flags, etc);

Map an fd into an address space:

    int address_space_mmap(int asfd, int fd_to_map, offset, size, prot,
                           ...); /* might run out of args here */

Unmap from an address space:

    int address_space_munmap(int asfd, unsigned long addr, unsigned long len);

Stick an address space into KVM:

    ioctl(vmfd, KVM_MAP_ADDRESS_SPACE, asfd); /* or similar */

Maybe some day allow mapping an address space into a process.

    mmap(..., asfd, ...);

And at least for now, there's a rule that an address space cannot
itself be address_space_mmapped into another address space.
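
To make the no-nesting rule concrete, here is a toy user-space model of
the bookkeeping. It is only a sketch: none of these names exist in the
kernel, the as_model_* functions and struct mappable are hypothetical
stand-ins for the proposed calls and for "whatever an fd refers to",
and the error codes (-ELOOP for nesting, -EEXIST for overlap) are
arbitrary choices for illustration.

```c
/* Toy user-space model; every name below is hypothetical, not a kernel API. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define AS_MODEL_MAX_MAPPINGS 8

enum mappable_kind { MAPPABLE_FILE, MAPPABLE_ADDRESS_SPACE };

/* Stand-in for the object behind an fd. */
struct mappable {
	enum mappable_kind kind;
};

struct as_model_mapping {
	const struct mappable *obj;
	unsigned long addr;
	unsigned long len;
};

struct as_model {
	struct mappable base; /* kind is always MAPPABLE_ADDRESS_SPACE */
	struct as_model_mapping maps[AS_MODEL_MAX_MAPPINGS];
	size_t nr_maps;
};

/* Models address_space_create(): start with an empty address space. */
void as_model_init(struct as_model *as)
{
	as->base.kind = MAPPABLE_ADDRESS_SPACE;
	as->nr_maps = 0;
}

/* Models address_space_mmap(): returns 0 or a negative errno value. */
int as_model_mmap(struct as_model *as, const struct mappable *obj,
		  unsigned long addr, unsigned long len)
{
	size_t i;

	assert(as->base.kind == MAPPABLE_ADDRESS_SPACE);

	/* Reject empty ranges and ranges that wrap around. */
	if (len == 0 || addr + len < addr)
		return -EINVAL;

	/* The rule from the text: no address space inside an address space. */
	if (obj->kind == MAPPABLE_ADDRESS_SPACE)
		return -ELOOP;

	/* Refuse to overlap an existing mapping. */
	for (i = 0; i < as->nr_maps; i++) {
		const struct as_model_mapping *m = &as->maps[i];

		if (addr < m->addr + m->len && m->addr < addr + len)
			return -EEXIST;
	}

	if (as->nr_maps == AS_MODEL_MAX_MAPPINGS)
		return -ENOMEM;

	as->maps[as->nr_maps].obj = obj;
	as->maps[as->nr_maps].addr = addr;
	as->maps[as->nr_maps].len = len;
	as->nr_maps++;
	return 0;
}

/* Models address_space_munmap(); simplified to exact-range matches only. */
int as_model_munmap(struct as_model *as, unsigned long addr, unsigned long len)
{
	size_t i;

	for (i = 0; i < as->nr_maps; i++) {
		if (as->maps[i].addr == addr && as->maps[i].len == len) {
			/* Fill the hole with the last entry. */
			as->maps[i] = as->maps[as->nr_maps - 1];
			as->nr_maps--;
			return 0;
		}
	}
	return -ENOENT;
}
```

Because the check only looks at the kind of the object being mapped, an
address space can never contain another one, so a fault handler walking
the mappings would never have to recurse. A real implementation would
have to make that check on the file behind fd_to_map.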

Maybe some day we also allow mremap(), madvise(), etc. And maybe some
day we allow creating a special address_space that represents a real
process's address space.

Under the hood, an address_space could own an mm_struct that is not
used by any tasks. And we could have special memfds that are bound to
a VM such that all you can do with them is stick them into an
address_space and map that address_space into the VM in question. For
this to work, we would want a special vm_operation for mapping into a
VM.

What do you all think? Is this useful? Does it solve your problems?
Is it a good approach going forward?