From: Matthew Wilcox
To: linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)", linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org
Subject: [PATCH v6 11/19] btrfs: Convert from readpages to readahead
Date: Mon, 17 Feb 2020 10:45:59 -0800
Message-Id: <20200217184613.19668-19-willy@infradead.org>
X-Mailer: git-send-email 2.21.1
In-Reply-To: <20200217184613.19668-1-willy@infradead.org>
References: <20200217184613.19668-1-willy@infradead.org>

From: "Matthew Wilcox (Oracle)"

Use the new readahead operation in btrfs.
Add a readahead_for_each_batch() iterator to optimise the loop in the XArray.

Signed-off-by: Matthew Wilcox (Oracle)
---
 fs/btrfs/extent_io.c    | 46 +++++++++++++----------------------------
 fs/btrfs/extent_io.h    |  3 +--
 fs/btrfs/inode.c        | 16 +++++++-------
 include/linux/pagemap.h | 27 ++++++++++++++++++++++++
 4 files changed, 49 insertions(+), 43 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c0f202741e09..e97a6acd6f5d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4278,52 +4278,34 @@ int extent_writepages(struct address_space *mapping,
 	return ret;
 }
 
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages)
+void extent_readahead(struct readahead_control *rac)
 {
 	struct bio *bio = NULL;
 	unsigned long bio_flags = 0;
 	struct page *pagepool[16];
 	struct extent_map *em_cached = NULL;
-	struct extent_io_tree *tree = &BTRFS_I(mapping->host)->io_tree;
-	int nr = 0;
+	struct extent_io_tree *tree = &BTRFS_I(rac->mapping->host)->io_tree;
 	u64 prev_em_start = (u64)-1;
+	int nr;
 
-	while (!list_empty(pages)) {
-		u64 contig_end = 0;
-
-		for (nr = 0; nr < ARRAY_SIZE(pagepool) && !list_empty(pages);) {
-			struct page *page = lru_to_page(pages);
-
-			prefetchw(&page->flags);
-			list_del(&page->lru);
-			if (add_to_page_cache_lru(page, mapping, page->index,
-						readahead_gfp_mask(mapping))) {
-				put_page(page);
-				break;
-			}
-
-			pagepool[nr++] = page;
-			contig_end = page_offset(page) + PAGE_SIZE - 1;
-		}
+	readahead_for_each_batch(rac, pagepool, ARRAY_SIZE(pagepool), nr) {
+		u64 contig_start = page_offset(pagepool[0]);
+		u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1;
 
-		if (nr) {
-			u64 contig_start = page_offset(pagepool[0]);
+		ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
 
-			ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
-
-			contiguous_readpages(tree, pagepool, nr, contig_start,
-				     contig_end, &em_cached, &bio, &bio_flags,
-				     &prev_em_start);
-		}
+		contiguous_readpages(tree, pagepool, nr, contig_start,
+				contig_end, &em_cached, &bio, &bio_flags,
+				&prev_em_start);
 	}
 
 	if (em_cached)
 		free_extent_map(em_cached);
 
-	if (bio)
-		return submit_one_bio(bio, 0, bio_flags);
-	return 0;
+	if (bio) {
+		if (submit_one_bio(bio, 0, bio_flags))
+			return;
+	}
 }
 
 /*
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 5d205bbaafdc..bddac32948c7 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -198,8 +198,7 @@ int extent_writepages(struct address_space *mapping,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc);
-int extent_readpages(struct address_space *mapping, struct list_head *pages,
-		     unsigned nr_pages);
+void extent_readahead(struct readahead_control *rac);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
 void set_page_extent_mapped(struct page *page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7d26b4bfb2c6..61d5137ce4e9 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4802,8 +4802,8 @@ static void evict_inode_truncate_pages(struct inode *inode)
 
 	/*
 	 * Keep looping until we have no more ranges in the io tree.
-	 * We can have ongoing bios started by readpages (called from readahead)
-	 * that have their endio callback (extent_io.c:end_bio_extent_readpage)
+	 * We can have ongoing bios started by readahead that have
+	 * their endio callback (extent_io.c:end_bio_extent_readpage)
 	 * still in progress (unlocked the pages in the bio but did not yet
 	 * unlocked the ranges in the io tree). Therefore this means some
 	 * ranges can still be locked and eviction started because before
@@ -7004,11 +7004,11 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend,
 			 * for it to complete) and then invalidate the pages for
 			 * this range (through invalidate_inode_pages2_range()),
 			 * but that can lead us to a deadlock with a concurrent
-			 * call to readpages() (a buffered read or a defrag call
+			 * call to readahead (a buffered read or a defrag call
 			 * triggered a readahead) on a page lock due to an
 			 * ordered dio extent we created before but did not have
 			 * yet a corresponding bio submitted (whence it can not
-			 * complete), which makes readpages() wait for that
+			 * complete), which makes readahead wait for that
 			 * ordered extent to complete while holding a lock on
 			 * that page.
 			 */
@@ -8247,11 +8247,9 @@ static int btrfs_writepages(struct address_space *mapping,
 	return extent_writepages(mapping, wbc);
 }
 
-static int
-btrfs_readpages(struct file *file, struct address_space *mapping,
-		struct list_head *pages, unsigned nr_pages)
+static void btrfs_readahead(struct readahead_control *rac)
 {
-	return extent_readpages(mapping, pages, nr_pages);
+	extent_readahead(rac);
 }
 
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -10456,7 +10454,7 @@ static const struct address_space_operations btrfs_aops = {
 	.readpage	= btrfs_readpage,
 	.writepage	= btrfs_writepage,
 	.writepages	= btrfs_writepages,
-	.readpages	= btrfs_readpages,
+	.readahead	= btrfs_readahead,
 	.direct_IO	= btrfs_direct_IO,
 	.invalidatepage = btrfs_invalidatepage,
 	.releasepage	= btrfs_releasepage,
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 4f36c06d064d..1bbb60a0bf16 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -669,6 +669,33 @@ static inline void readahead_next(struct readahead_control *rac)
 #define readahead_for_each(rac, page)					\
 	for (; (page = readahead_page(rac)); readahead_next(rac))
 
+static inline unsigned int readahead_page_batch(struct readahead_control *rac,
+		struct page **array, unsigned int size)
+{
+	unsigned int batch = 0;
+	XA_STATE(xas, &rac->mapping->i_pages, rac->_start);
+	struct page *page;
+
+	rac->_batch_count = 0;
+	xas_for_each(&xas, page, rac->_start + rac->_nr_pages - 1) {
+		VM_BUG_ON_PAGE(!PageLocked(page), page);
+		VM_BUG_ON_PAGE(PageTail(page), page);
+		array[batch++] = page;
+		rac->_batch_count += hpage_nr_pages(page);
+		if (PageHead(page))
+			xas_set(&xas, rac->_start + rac->_batch_count);
+
+		if (batch == size)
+			break;
+	}
+
+	return batch;
+}
+
+#define readahead_for_each_batch(rac, array, size, nr)			\
+	for (; (nr = readahead_page_batch(rac, array, size));		\
+	     readahead_next(rac))
+
 /* The byte offset into the file of this readahead block */
 static inline loff_t readahead_offset(struct readahead_control *rac)
 {
-- 
2.25.0
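
The batching pattern behind readahead_for_each_batch() can be tried outside
the kernel. The sketch below is a user-space illustration only, not the
kernel implementation: struct ra_model and every ra_model_*() name are
hypothetical, plain page indices stand in for the struct page pointers that
the real iterator pulls from the XArray, and compound pages are not modelled.
It assumes, as the patch does, that the "next" step advances the window by
the number of pages handed out in the previous batch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct readahead_control. */
struct ra_model {
	unsigned long start;		/* index of the next unprocessed page */
	unsigned int nr_pages;		/* pages still to be processed */
	unsigned int batch_count;	/* pages handed out by the last batch */
};

/* Fill 'array' with up to 'size' consecutive page indices; return the count. */
static unsigned int ra_model_batch(struct ra_model *ra, unsigned long *array,
				   unsigned int size)
{
	unsigned int batch = 0;

	assert(array != NULL && size > 0);
	while (batch < size && batch < ra->nr_pages) {
		array[batch] = ra->start + batch;
		batch++;
	}
	ra->batch_count = batch;
	return batch;
}

/* Advance the window past the pages returned by the previous batch. */
static void ra_model_next(struct ra_model *ra)
{
	ra->nr_pages -= ra->batch_count;
	ra->start += ra->batch_count;
}

/* Same shape as the macro in the patch: loop until a batch comes back empty. */
#define ra_model_for_each_batch(ra, array, size, nr)			\
	for (; ((nr) = ra_model_batch(ra, array, size));		\
	     ra_model_next(ra))

/* Walk 'nr_pages' pages from 'start' in batches of up to 16; count batches. */
static unsigned int ra_model_count_batches(unsigned long start,
					   unsigned int nr_pages)
{
	struct ra_model ra = { .start = start, .nr_pages = nr_pages };
	unsigned long pool[16];
	unsigned int nr, batches = 0;

	ra_model_for_each_batch(&ra, pool, 16, nr) {
		/* Each batch is contiguous, mirroring the ASSERT in btrfs. */
		assert(pool[nr - 1] - pool[0] + 1 == nr);
		batches++;
	}
	return batches;
}
```

In this model a 40-page window is consumed as batches of 16, 16 and 8 pages,
and the loop ends when the batch function returns 0. That zero return is what
lets the caller drop the old "if (nr)" guard around contiguous_readpages().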