From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)"
Subject: [PATCH 13/19] mm/filemap: Support readpage splitting a page
Date: Thu, 29 Oct 2020 19:33:59 +0000
Message-Id: <20201029193405.29125-14-willy@infradead.org>
In-Reply-To: <20201029193405.29125-1-willy@infradead.org>
References: <20201029193405.29125-1-willy@infradead.org>

We need to tell readpage which subpage we're actually interested in
(by passing the subpage to
gfbr_read_page()), and if it does split the THP, we need to update the
page in the page array to be the subpage.

For page splitting to succeed, the thread asking to split the page has to
be the only one with a reference to the page.  Calling
wait_on_page_locked() while holding a reference to the page can
effectively prevent the split from ever succeeding when enough threads
are waiting on the same page.  Use put_and_wait_on_page_locked() to sleep
without holding a reference to the page, then retry the page lookup after
the page is unlocked.

Since we now get the page lock a little earlier in gfbr_update_page(),
we can eliminate a number of duplicate checks.  The original intent
(commit ebded02788b5 ("avoid unnecessary calls to lock_page when waiting
for IO to complete during a read")) behind getting the page lock later
was to avoid re-locking the page after it has been brought uptodate by
another thread.  We will still avoid that because we go through the normal
lookup path again after the winning thread has brought the page uptodate.

Signed-off-by: Matthew Wilcox (Oracle)
---
 mm/filemap.c | 76 +++++++++++++++++-----------------------------------
 1 file changed, 24 insertions(+), 52 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 215729048cbd..87f89e5dd64e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1358,14 +1358,6 @@ static int __wait_on_page_locked_async(struct page *page,
 	return ret;
 }
 
-static int wait_on_page_locked_async(struct page *page,
-				     struct wait_page_queue *wait)
-{
-	if (!PageLocked(page))
-		return 0;
-	return __wait_on_page_locked_async(compound_head(page), wait, false);
-}
-
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
  * @page: The page to wait for.
@@ -2259,6 +2251,7 @@ static struct page *gfbr_read_page(struct kiocb *iocb,
 		return error != AOP_TRUNCATED_PAGE ? ERR_PTR(error) : NULL;
 	}
 
+	page = thp_head(page);
 	if (!PageUptodate(page)) {
 		error = lock_page_for_iocb(iocb, page);
 		if (unlikely(error)) {
@@ -2292,64 +2285,42 @@ static struct page *gfbr_update_page(struct kiocb *iocb,
 	struct inode *inode = mapping->host;
 	int error;
 
-	/*
-	 * See comment in do_read_cache_page on why
-	 * wait_on_page_locked is used to avoid unnecessarily
-	 * serialisations and why it's safe.
-	 */
 	if (iocb->ki_flags & IOCB_WAITQ) {
-		error = wait_on_page_locked_async(page,
-						iocb->ki_waitq);
-	} else {
-		error = wait_on_page_locked_killable(page);
-	}
-	if (unlikely(error)) {
-		put_page(page);
-		return ERR_PTR(error);
+		error = lock_page_async(page, iocb->ki_waitq);
+		if (error) {
+			put_page(page);
+			return ERR_PTR(error);
+		}
+	} else if (!trylock_page(page)) {
+		put_and_wait_on_page_locked(page, TASK_KILLABLE);
+		return NULL;
 	}
+
 	if (PageUptodate(page))
-		return page;
+		goto uptodate;
 
 	if (inode->i_blkbits == PAGE_SHIFT ||
 			!mapping->a_ops->is_partially_uptodate)
-		goto page_not_up_to_date;
+		goto readpage;
 	/* pipes can't handle partially uptodate pages */
 	if (unlikely(iov_iter_is_pipe(iter)))
-		goto page_not_up_to_date;
-	if (!trylock_page(page))
-		goto page_not_up_to_date;
-	/* Did it get truncated before we got the lock? */
+		goto readpage;
 	if (!page->mapping)
-		goto page_not_up_to_date_locked;
+		goto truncated;
 	if (!mapping->a_ops->is_partially_uptodate(page,
-					pos & ~PAGE_MASK, count))
-		goto page_not_up_to_date_locked;
+					pos & (thp_size(page) - 1), count))
+		goto readpage;
+uptodate:
 	unlock_page(page);
 	return page;
 
-page_not_up_to_date:
-	/* Get exclusive access to the page ... */
-	error = lock_page_for_iocb(iocb, page);
-	if (unlikely(error)) {
-		put_page(page);
-		return ERR_PTR(error);
-	}
-
-page_not_up_to_date_locked:
-	/* Did it get truncated before we got the lock? */
-	if (!page->mapping) {
-		unlock_page(page);
-		put_page(page);
-		return NULL;
-	}
-
-	/* Did somebody else fill it already? */
-	if (PageUptodate(page)) {
-		unlock_page(page);
-		return page;
-	}
-
+readpage:
+	page += (pos / PAGE_SIZE) - page->index;
 	return gfbr_read_page(iocb, mapping, page);
+truncated:
+	unlock_page(page);
+	put_page(page);
+	return NULL;
 }
 
 static struct page *gfbr_create_page(struct kiocb *iocb,
@@ -2443,6 +2414,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 				err = PTR_ERR_OR_ZERO(page);
 				break;
 			}
+			pages[i] = page;
 		}
 	}
 
-- 
2.28.0
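
Note on the offset arithmetic: the patch introduces two expressions in gfbr_update_page(), `page += (pos / PAGE_SIZE) - page->index` to pick the subpage handed to gfbr_read_page(), and `pos & (thp_size(page) - 1)` as the offset passed to ->is_partially_uptodate(). The following is a minimal userspace sketch of that arithmetic only, not kernel code. The `sketch_` names, the 4 KiB base page size, and the assumption that the THP is naturally aligned in the file (so that the mask yields the byte offset from the start of the THP) are illustrative assumptions and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the head page of a THP in the page cache. */
struct sketch_thp {
	uint64_t head_index;	/* file index of the head page, in base pages */
	unsigned int order;	/* the THP spans (1 << order) base pages */
};

/* Size of the whole THP in bytes; a power of two by construction. */
static uint64_t sketch_thp_size(const struct sketch_thp *thp)
{
	return SKETCH_PAGE_SIZE << thp->order;
}

/*
 * Models "page += (pos / PAGE_SIZE) - page->index": how many base pages
 * past the head page the subpage containing file position pos lies.
 */
static uint64_t sketch_subpage_nr(const struct sketch_thp *thp, uint64_t pos)
{
	uint64_t nr = (pos / SKETCH_PAGE_SIZE) - thp->head_index;

	/* pos must fall inside the range covered by this THP. */
	assert(nr < ((uint64_t)1 << thp->order));
	return nr;
}

/*
 * Models "pos & (thp_size(page) - 1)": the byte offset of pos within the
 * THP, assuming the THP starts at a file offset aligned to its size.
 */
static uint64_t sketch_offset_in_thp(const struct sketch_thp *thp, uint64_t pos)
{
	return pos & (sketch_thp_size(thp) - 1);
}
```

For an order-9 THP whose head sits at file index 512, a position 100 bytes into the sixth base page gives subpage number 5 and an in-THP offset of 5 * 4096 + 100 bytes. The mask the patch replaces, `pos & ~PAGE_MASK`, would give only 100, the offset within a single base page.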