From: Andreas Dilger <adilger@dilger.ca>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Jan Kara <jack@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Dave Hansen <dave.hansen@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org
Subject: Re: [PATCHv6 11/37] HACK: readahead: alloc huge pages, if allowed
Date: Thu, 9 Feb 2017 17:23:31 -0700
Message-ID: <7D35EB8E-29F8-41DA-BB46-8BCF7B6C5A72@dilger.ca>
In-Reply-To: <20170209233436.GZ2267@bombadil.infradead.org>
On Feb 9, 2017, at 4:34 PM, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Jan 26, 2017 at 02:57:53PM +0300, Kirill A. Shutemov wrote:
>> Most page cache allocation happens via readahead (sync or async), so if
>> we want to have a significant number of huge pages in page cache we need
>> to find a way to allocate them from readahead.
>>
>> Unfortunately, huge pages don't fit into the current readahead design:
>> 128 max readahead window, assumption on page size, PageReadahead() to
>> track hit/miss.
>>
>> I haven't found a way to get it right yet.
>>
>> This patch just allocates a huge page if allowed, but doesn't really
>> provide any readahead if a huge page is allocated. We read out 2M at a
>> time and I would expect spikes in latency without readahead.
>>
>> Therefore HACK.
>>
>> That said, I don't think it should prevent huge page support from
>> being applied. The future will show if lacking readahead is a big deal
>> with huge pages in page cache.
>>
>> Any suggestions are welcome.
>
> Well ... what if we made readahead 2 hugepages in size for inodes which
> are using huge pages? That's only 8x our current readahead window, and
> if you're asking for hugepages, you're accepting that IOs are going to
> be larger, and you probably have the kind of storage system which can
> handle doing larger IOs.

It would be nice if the bdi had a parameter for the maximum readahead size.
Currently, readahead is capped at 2MB chunks by force_page_cache_readahead()
even if bdi->ra_pages and bdi->io_pages are much larger.

It should be up to the filesystem to decide how large the readahead chunks
are, rather than imposing some policy in the MM code. For high-speed (network)
storage access it is better to have at least 4MB read chunks; for RAID storage
it is desirable to have stripe-aligned readahead to avoid read inflation when
verifying the parity. Any fixed size will eventually be inadequate as disks
and filesystems change, so it may as well be a per-bdi tunable that the
filesystem can set as needed, or possibly via a mount option.
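
As a rough illustration of the kind of policy a filesystem might apply, here
is a minimal userspace sketch of stripe-aligned readahead with a per-device
cap. It is not kernel code: struct ra_policy, struct ra_window and ra_align()
are hypothetical names invented for this example, and the rounding rules are
only one plausible choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device readahead policy; not a kernel structure. */
struct ra_policy {
	uint64_t max_bytes;    /* upper bound on one readahead request */
	uint64_t stripe_bytes; /* full stripe width, or 0 if not striped */
};

/* Byte range that would actually be submitted for readahead. */
struct ra_window {
	uint64_t start;
	uint64_t len;
};

/*
 * Grow [offset, offset + want) outward to full-stripe boundaries, then
 * limit it to a whole number of stripes that fits in max_bytes (but never
 * less than one stripe, so the window always covers 'offset').
 * Without striping, simply clamp the length to max_bytes.
 */
static struct ra_window ra_align(const struct ra_policy *p,
				 uint64_t offset, uint64_t want)
{
	struct ra_window w = { offset, want };
	uint64_t s, end, cap;

	assert(p->max_bytes > 0);

	s = p->stripe_bytes;
	if (s == 0) {
		if (w.len > p->max_bytes)
			w.len = p->max_bytes;
		return w;
	}

	w.start = offset - offset % s;         /* round start down */
	end = (offset + want + s - 1) / s * s; /* round end up */
	w.len = end - w.start;

	cap = p->max_bytes / s * s;            /* whole stripes only */
	if (cap == 0)
		cap = s;
	if (w.len > cap)
		w.len = cap;
	return w;
}
```

With a 4MB cap and a 768KB stripe (say, three data disks with 256KB chunks),
a 1MB request at offset 1MB becomes a two-stripe window starting at byte
786432 with length 1572864, so no stripe is read only partially.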
Cheers, Andreas