From: Jan Kara <jack@suse.cz>
To: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
David Howells <dhowells@redhat.com>,
David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH 4/5] block: Add support for bouncing pinned pages
Date: Thu, 16 Feb 2023 13:33:16 +0100
Message-ID: <20230216123316.vkmtucazg33vidzg@quack3>
In-Reply-To: <Y+x6oQkLex8PbfgL@infradead.org>

On Tue 14-02-23 22:24:33, Christoph Hellwig wrote:
> On Wed, Feb 15, 2023 at 03:59:52PM +1100, Dave Chinner wrote:
> > I don't think this works, especially if the COW mechanism relies on
> > delayed allocation to prevent ENOSPC during writeback. That is, we
> > need a write() or page fault (to run ->page_mkwrite()) after every
> > call to folio_clear_dirty_for_io() in the writeback path to ensure
> > that new space is reserved for the allocation that will occur
> > during a future writeback of that page.
> >
> > Hence we can't just leave the page dirty on COW filesystems - it has
> > to go through a clean state so that the clean->dirty event can be
> > gated on gaining the space reservation that allows it to be written
> > back again.
>
> Exactly. Although if we really want we could do the redirtying without
> formally moving to a clean state, but it certainly would require special
> new code to do the same steps as if we were redirtying.
Yes.
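To make the reservation argument above concrete, here is a toy userspace
model of it. This is only a sketch and not kernel code: every toy_* name is
made up for illustration, the "filesystem" is a single free-block counter,
and freeing of the old block after COW is ignored. It shows that if
writeback leaves the page dirty instead of passing it through a clean
state, the next store does not hit the clean->dirty gate, no space gets
reserved, and the following writeback is the one that runs into ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, nothing is a kernel API. */
struct toy_fs { long free_blocks; };
struct toy_page { bool dirty; bool reserved; };

/* Models write() or ->page_mkwrite(): the clean->dirty event is gated on
 * reserving space for the allocation a later writeback will perform. */
int toy_mkwrite(struct toy_fs *fs, struct toy_page *p)
{
	if (p->dirty)
		return 0;		/* already dirty: no gate, no new reservation */
	if (fs->free_blocks == 0)
		return -ENOSPC;		/* the writer sees the error, page stays clean */
	fs->free_blocks--;
	p->reserved = true;
	p->dirty = true;
	return 0;
}

/* Models COW writeback: each writeback needs a new block. leave_dirty
 * stands for "keep the pinned page dirty without a clean state". */
int toy_writeback(struct toy_fs *fs, struct toy_page *p, bool leave_dirty)
{
	assert(fs->free_blocks >= 0);
	if (!p->dirty)
		return 0;		/* nothing to write */
	if (p->reserved)
		p->reserved = false;	/* consume the reservation */
	else if (fs->free_blocks > 0)
		fs->free_blocks--;	/* unreserved allocation got lucky */
	else
		return -ENOSPC;		/* failure inside writeback: data at risk */
	p->dirty = leave_dirty;
	return 0;
}

/* Dirty a page on a filesystem with one free block, write it back, store
 * to it again, and return the result of the second writeback. */
int toy_scenario(bool leave_dirty)
{
	struct toy_fs fs = { .free_blocks = 1 };
	struct toy_page page = { .dirty = false, .reserved = false };

	if (toy_mkwrite(&fs, &page))
		return -1;
	if (toy_writeback(&fs, &page, leave_dirty))
		return -1;
	/* Second store: gated only if the page went through a clean state. */
	(void)toy_mkwrite(&fs, &page);
	return toy_writeback(&fs, &page, leave_dirty);
}
```

In this model toy_scenario(false) fails the second store with ENOSPC at
toy_mkwrite() time, where the writer can handle it, and the second
writeback has nothing to do. toy_scenario(true) lets the second store
through silently and the second writeback itself returns -ENOSPC.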
> Which is another reason why I'd prefer to avoid all that if we can.
> For that we probably need an inventory of what long term pins we have
> in the kernel tree that can and do operate on shared file mappings,
> and what kind of I/O semantics they expect.

I'm a bit skeptical we can reasonably assess that (as much as I would love
to just not write these pages and be done with it) because a lot of
FOLL_LONGTERM users just pin a passed userspace address range, then allow
userspace to manipulate it with other operations, and finally unpin it with
another call. Who knows whether shared pagecache pages are passed in and
what userspace is doing with them while they are pinned?

We have stuff like io_uring using FOLL_LONGTERM for IO buffers passed from
userspace (e.g. the IORING_REGISTER_BUFFERS operation), we have V4L2 which
similarly pins buffers for video processing (and I vaguely remember one
bug report due to some phone passing shared file pages there to acquire
screenshots from a webcam), and we have various infiniband drivers doing
this (not all of them are using FOLL_LONGTERM but they should AFAICS). We
even have vmsplice(2), which arguably should be using pinning with
FOLL_LONGTERM (at least that's the plan AFAIK), and not writing such pages
would IMO provide an interesting attack vector...

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR