From: Andrew Morton <akpm@linux-foundation.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org, willy@infradead.org, clm@fb.com
Subject: Re: [PATCH 3/5] mm: make buffered writes work with RWF_UNCACHED
Date: Fri, 13 Dec 2019 16:01:09 -0800
Message-ID: <20191213160109.6c680b680e34891a2db387a9@linux-foundation.org>
In-Reply-To: <20191210204304.12266-4-axboe@kernel.dk>
On Tue, 10 Dec 2019 13:43:02 -0700 Jens Axboe <axboe@kernel.dk> wrote:
> If RWF_UNCACHED is set for io_uring (or pwritev2(2)), we'll drop the
> cache instantiated for buffered writes. If new pages aren't
> instantiated, we leave them alone. This provides similar semantics to
> reads with RWF_UNCACHED set.
>
Would be nice to see a description of the proposed userspace API(s)
for exploiting this feature.
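
For illustration only, a caller might reach the flag through pwritev2(2)
roughly as sketched below. This is not the API description asked for
above, just a guess at its shape: the helper name pwrite_rwf() is
hypothetical, and the flag is passed in as a parameter because the value
of RWF_UNCACHED would have to come from the patched kernel's uapi
headers and is not assumed here. The fallback to a plain buffered write
on EOPNOTSUPP/EINVAL is likewise an assumption about how an unpatched
kernel would reject the flag.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: positioned buffered write with an extra RWF_* flag.
 * 'rwf_flags' would be RWF_UNCACHED taken from the patched kernel's headers;
 * no numeric value is hard-coded here.
 */
static ssize_t pwrite_rwf(int fd, const void *buf, size_t len, off_t off,
			  int rwf_flags)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	ret = pwritev2(fd, &iov, 1, off, rwf_flags);

	/*
	 * Assumption: a kernel that does not know the flag rejects the call.
	 * Retry once as an ordinary buffered write in that case.
	 */
	if (ret < 0 && rwf_flags && (errno == EOPNOTSUPP || errno == EINVAL))
		ret = pwritev2(fd, &iov, 1, off, 0);

	return ret;
}
```

With the flag set, the data would still go through the page cache, but
per the changelog the pages instantiated for the write would be dropped
once written back; pages that were already cached would be left alone.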
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -285,6 +285,7 @@ enum positive_aop_returns {
> #define AOP_FLAG_NOFS 0x0002 /* used by filesystem to direct
> * helper code (eg buffer layer)
> * to clear GFP_FS from alloc */
> +#define AOP_FLAG_UNCACHED 0x0004
>
> /*
> * oh the beauties of C type declarations.
> @@ -3106,6 +3107,10 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
> extern ssize_t generic_perform_write(struct file *, struct iov_iter *,
> struct kiocb *);
>
> +struct pagevec;
> +extern void write_drop_cached_pages(struct pagevec *pvec,
> + struct address_space *mapping);
> +
> ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
> rwf_t flags);
> ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
> diff --git a/mm/filemap.c b/mm/filemap.c
> index fe37bd2b2630..2e36129ebe38 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3287,10 +3287,12 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
>  					pgoff_t index, unsigned flags)
>  {
>  	struct page *page;
> -	int fgp_flags = FGP_LOCK|FGP_WRITE|FGP_CREAT;
> +	int fgp_flags = FGP_LOCK|FGP_WRITE;
>
>  	if (flags & AOP_FLAG_NOFS)
>  		fgp_flags |= FGP_NOFS;
> +	if (!(flags & AOP_FLAG_UNCACHED))
> +		fgp_flags |= FGP_CREAT;
>
>  	page = pagecache_get_page(mapping, index, fgp_flags,
>  			mapping_gfp_mask(mapping));
> @@ -3301,21 +3303,65 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
>  }
>  EXPORT_SYMBOL(grab_cache_page_write_begin);
>
> +/*
> + * Start writeback on the pages in pgs[], and then try and remove those pages
> + * from the page cached. Used with RWF_UNCACHED.
> + */
> +void write_drop_cached_pages(struct pagevec *pvec,
> +			     struct address_space *mapping)
> +{
> +	loff_t start, end;
> +	int i;
> +
> +	end = 0;
> +	start = LLONG_MAX;
> +	for (i = 0; i < pagevec_count(pvec); i++) {
> +		loff_t off = page_offset(pvec->pages[i]);
> +		if (off < start)
> +			start = off;
> +		if (off > end)
> +			end = off;
> +	}
> +
> +	__filemap_fdatawrite_range(mapping, start, end, WB_SYNC_NONE);
> +
> +	for (i = 0; i < pagevec_count(pvec); i++) {
> +		struct page *page = pvec->pages[i];
> +
> +		lock_page(page);
> +		if (page->mapping == mapping) {
> +			wait_on_page_writeback(page);
> +			if (!page_has_private(page) ||
> +			    try_to_release_page(page, 0))
> +				remove_mapping(mapping, page);
> +		}
> +		unlock_page(page);
> +	}
This is kinda invalidate_inode_pages2_range(), only much less so? Why
doesn't this code need to do all the things which
invalidate_inode_pages2_range() does? What happens if these pages are
mmapped, faulted in? Not faulted in?
> +	pagevec_release(pvec);
> +}