From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: [PATCH 15/19] mm/readahead: Add THP readahead
Date: Thu, 29 Oct 2020 19:34:01 +0000
Message-Id: <20201029193405.29125-16-willy@infradead.org>
In-Reply-To: <20201029193405.29125-1-willy@infradead.org>
References: <20201029193405.29125-1-willy@infradead.org>

If the filesystem supports THPs, allocate larger pages in the
readahead code when it seems worth doing.  The heuristic for choosing
larger page sizes will surely need some tuning, but this aggressive
ramp-up seems good for testing.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
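(Note below the fold, so git-am drops it: here is a standalone userspace
sketch of the "grow page size up to PMD size" arithmetic in
page_cache_ra_order(), for anyone who wants to watch the progression
without applying the patch.  It assumes HPAGE_PMD_ORDER == 9, i.e.
x86-64 with 4KiB base pages; ramp_up(), ra_size and prev_order are
illustrative stand-ins, not names from this patch.)

/*
 * Standalone model of the order ramp-up in page_cache_ra_order().
 * Assumes HPAGE_PMD_ORDER == 9 (x86-64, 4KiB base pages); ra_size
 * stands in for ra->size.
 */
#include <stdio.h>

#define HPAGE_PMD_ORDER	9

static unsigned int ramp_up(unsigned int new_order, unsigned long ra_size)
{
	/* Same arithmetic as the patch: grow by two orders (4x the page
	 * size) per round, capped at PMD size and at the current window. */
	if (new_order < HPAGE_PMD_ORDER) {
		new_order += 2;
		if (new_order > HPAGE_PMD_ORDER)
			new_order = HPAGE_PMD_ORDER;
		while ((1 << new_order) > ra_size)
			new_order--;
	}
	return new_order;
}

int main(void)
{
	/* A sequential reader whose window has grown to 512 pages ramps
	 * 0 -> 2 -> 4 -> 6 -> 8 -> 9 over successive readahead rounds. */
	unsigned int order = 0;

	for (int round = 0; round < 6; round++) {
		printf("round %d: order %u (%u pages per allocation)\n",
		       round, order, 1u << order);
		order = ramp_up(order, 512);
	}
	return 0;
}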
 mm/readahead.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 94 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index c5b0457415be..dc9876104ee8 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -149,7 +149,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages,
 
 	blk_finish_plug(&plug);
 
-	BUG_ON(!list_empty(pages));
+	BUG_ON(pages && !list_empty(pages));
 	BUG_ON(readahead_count(rac));
 
 out:
@@ -429,11 +429,99 @@ static int try_context_readahead(struct address_space *mapping,
 	return 1;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static inline int ra_alloc_page(struct readahead_control *ractl, pgoff_t index,
+		pgoff_t mark, unsigned int order, gfp_t gfp)
+{
+	int err;
+	struct page *page = __page_cache_alloc_order(gfp, order);
+
+	if (!page)
+		return -ENOMEM;
+	if (mark - index < (1UL << order))
+		SetPageReadahead(page);
+	err = add_to_page_cache_lru(page, ractl->mapping, index, gfp);
+	if (err)
+		put_page(page);
+	else
+		ractl->_nr_pages += 1UL << order;
+	return err;
+}
+
+static void page_cache_ra_order(struct readahead_control *ractl,
+		struct file_ra_state *ra, unsigned int new_order)
+{
+	struct address_space *mapping = ractl->mapping;
+	pgoff_t index = readahead_index(ractl);
+	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
+	pgoff_t mark = index + ra->size - ra->async_size;
+	int err = 0;
+	gfp_t gfp = readahead_gfp_mask(mapping);
+
+	if (!mapping_thp_support(mapping) || ra->size < 4)
+		goto fallback;
+
+	limit = min(limit, index + ra->size - 1);
+
+	/* Grow page size up to PMD size */
+	if (new_order < HPAGE_PMD_ORDER) {
+		new_order += 2;
+		if (new_order > HPAGE_PMD_ORDER)
+			new_order = HPAGE_PMD_ORDER;
+		while ((1 << new_order) > ra->size)
+			new_order--;
+	}
+
+	while (index <= limit) {
+		unsigned int order = new_order;
+
+		/* Align with smaller pages if needed */
+		if (index & ((1UL << order) - 1)) {
+			order = __ffs(index);
+			if (order == 1)
+				order = 0;
+		}
+		/* Don't allocate pages past EOF */
+		while (index + (1UL << order) - 1 > limit) {
+			if (--order == 1)
+				order = 0;
+		}
+		err = ra_alloc_page(ractl, index, mark, order, gfp);
+		if (err)
+			break;
+		index += 1UL << order;
+	}
+
+	if (index > limit) {
+		ra->size += index - limit - 1;
+		ra->async_size += index - limit - 1;
+	}
+
+	read_pages(ractl, NULL, false);
+
+	/*
+	 * If there were already pages in the page cache, then we may have
+	 * left some gaps.  Let the regular readahead code take care of this
+	 * situation.
+	 */
+	if (!err)
+		return;
+fallback:
+	do_page_cache_ra(ractl, ra->size, ra->async_size);
+}
+#else
+static void page_cache_ra_order(struct readahead_control *ractl,
+		struct file_ra_state *ra, unsigned int order)
+{
+	do_page_cache_ra(ractl, ra->size, ra->async_size);
+}
+#endif
+
 /*
  * A minimal readahead algorithm for trivial sequential/random reads.
  */
 static void ondemand_readahead(struct readahead_control *ractl,
-		struct file_ra_state *ra, bool hit_readahead_marker,
+		struct file_ra_state *ra, struct page *page,
 		unsigned long req_size)
 {
 	struct backing_dev_info *bdi = inode_to_bdi(ractl->mapping->host);
@@ -473,7 +561,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * Query the pagecache for async_size, which normally equals to
 	 * readahead size. Ramp it up and use it as the new readahead size.
 	 */
-	if (hit_readahead_marker) {
+	if (page) {
 		pgoff_t start;
 
 		rcu_read_lock();
@@ -546,7 +634,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	ractl->_index = ra->start;
-	do_page_cache_ra(ractl, ra->size, ra->async_size);
+	page_cache_ra_order(ractl, ra, page ? thp_order(page) : 0);
 }
 
 void page_cache_sync_ra(struct readahead_control *ractl,
@@ -574,7 +662,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	}
 
 	/* do read-ahead */
-	ondemand_readahead(ractl, ra, false, req_count);
+	ondemand_readahead(ractl, ra, NULL, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_sync_ra);
 
@@ -604,7 +692,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
 		return;
 
 	/* do read-ahead */
-	ondemand_readahead(ractl, ra, true, req_count);
+	ondemand_readahead(ractl, ra, page, req_count);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_ra);
 
-- 
2.28.0
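
(Postscript, not part of the patch: a userspace walk-through of the
allocation loop's alignment and EOF-clamping rules, using made-up index
and limit values.  ffs() - 1 from <strings.h> stands in for the kernel's
__ffs(); the control flow mirrors the while loop in
page_cache_ra_order() above.)

/*
 * Walk-through of the allocation loop's alignment rules: each page is
 * naturally aligned to its order, order 1 is folded down to order 0
 * (matching the patch), and no page may extend past the EOF limit.
 */
#include <stdio.h>
#include <strings.h>

int main(void)
{
	unsigned long index = 2;	/* first index to allocate */
	unsigned long limit = 21;	/* last valid page index before EOF */
	unsigned int new_order = 4;	/* what the ramp-up asked for */

	while (index <= limit) {
		unsigned int order = new_order;

		/* Align with smaller pages if needed: an order-N page
		 * must start at an index that is a multiple of 1 << N. */
		if (index & ((1UL << order) - 1)) {
			order = ffs((int)index) - 1;
			if (order == 1)
				order = 0;
		}
		/* Don't allocate pages past EOF. */
		while (index + (1UL << order) - 1 > limit) {
			if (--order == 1)
				order = 0;
		}
		printf("index %2lu: order %u (covers pages %lu-%lu)\n",
		       index, order, index, index + (1UL << order) - 1);
		index += 1UL << order;
	}
	return 0;
}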