From: Ming Lei <tom.leiming@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm <linux-mm@kvack.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
linux-block <linux-block@vger.kernel.org>
Subject: Re: [LSF/MM TOPIC] A high-performance userspace block driver
Date: Mon, 22 Jan 2018 20:18:06 +0800 [thread overview]
Message-ID: <CACVXFVOF+vnha=-8-ahhib235iHq1ZCsBeF83Nu9zYWYYrStuA@mail.gmail.com> (raw)
In-Reply-To: <20180117212144.GD25862@bombadil.infradead.org>
On Thu, Jan 18, 2018 at 5:21 AM, Matthew Wilcox <willy@infradead.org> wrote:
> On Wed, Jan 17, 2018 at 10:49:24AM +0800, Ming Lei wrote:
>> Userfaultfd might be another choice:
>>
>> 1) map the block LBA space into a range of process vm space
>
> That would limit the size of a block device to ~200TB (with my laptop's
> CPU). That's probably OK for most users, but I suspect there are some
> who would chafe at such a restriction (before the 57-bit CPUs arrive).
In theory, it won't be a issue, since the LBA space can be partitioned into
more than one process's vm space, so no matter what the size of block device
is, this way should work.
>
>> 2) when READ/WRITE req comes, convert it to page fault on the
>> mapped range, and let userland to take control of it, and meantime
>> kernel req context is slept
>
> You don't want to sleep the request; you want it to be able to submit
> more I/O. But we have infrastructure in place to inform the submitter
> when I/Os have completed.
Yes, the current bio completion(.end_bio) model can be respected, and
this issue(where to sleep) may depend on UFFD's read/POLLIN protocol.
>
>> 3) IO req context in kernel side is waken up after userspace completed
>> the IO request via userfaultfd
>>
>> 4) kernel side continue to complete the IO, such as copying page from
>> storage range to req(bio) pages.
>>
>> Seems READ should be fine since it is very similar with the use case
>> of QEMU postcopy live migration, WRITE can be a bit different, and
>> maybe need some change on userfaultfd.
>
> I like this idea, and maybe extending UFFD is the way to solve this
> problem. Perhaps I should explain a little more what the requirements
> are. At the point the driver gets the I/O, pages to copy data into (for
> a read) or copy data from (for a write) have already been allocated.
> At all costs, we need to avoid playing VM tricks (because TLB flushes
> are expensive). So one copy is probably OK, but we'd like to avoid it
> if reasonable.
I agree, and one time of page copy can be easier to implement.
>
> Let's assume that the userspace program looks at the request metadata and
> decides that it needs to send a network request. Ideally, it would find
> a way to have the data from the response land in the pre-allocated pages
> (for a read) or send the data straight from the pages in the request
> (for a write). I'm not sure UFFD helps us with that part of the problem.
--
Ming Lei
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2018-01-22 12:18 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-16 14:52 Matthew Wilcox
2018-01-16 23:04 ` Viacheslav Dubeyko
2018-01-16 23:23 ` Theodore Ts'o
2018-01-16 23:28 ` [Lsf-pc] " James Bottomley
2018-01-16 23:57 ` Bart Van Assche
2018-01-17 0:41 ` Bart Van Assche
2018-01-17 2:49 ` Ming Lei
2018-01-17 21:21 ` Matthew Wilcox
2018-01-22 12:02 ` Mike Rapoport
2018-01-22 12:18 ` Ming Lei [this message]
2018-01-18 5:27 ` Figo.zhang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CACVXFVOF+vnha=-8-ahhib235iHq1ZCsBeF83Nu9zYWYYrStuA@mail.gmail.com' \
--to=tom.leiming@gmail.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lsf-pc@lists.linux-foundation.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox