Date: Sun, 19 Mar 2023 22:19:21 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
To: Christoph Hellwig
Cc: Andrew Morton, Matthew Wilcox, Hugh Dickins,
    linux-afs@lists.infradead.org, linux-btrfs@vger.kernel.org,
    linux-ext4@vger.kernel.org, cluster-devel@redhat.com,
    linux-mm@kvack.org, linux-xfs@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-nilfs@vger.kernel.org
Subject: Re: [PATCH 4/7] shmem: remove shmem_get_partial_folio
In-Reply-To: <20230307143410.28031-5-hch@lst.de>
Message-ID: <9d1aaa4-1337-fb81-6f37-74ebc96f9ef@google.com>
References: <20230307143410.28031-1-hch@lst.de> <20230307143410.28031-5-hch@lst.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
On Tue, 7 Mar 2023, Christoph Hellwig wrote:

> Add a new SGP_FIND mode for shmem_get_folio that works like
> SGP_READ, but does not check i_size. Use that instead of open coding
> the page cache lookup in shmem_get_partial_folio.
> Note that this is a behavior change in that it reads in swap cache
> entries for offsets outside i_size, possibly causing a little bit of
> extra work.
>
> Signed-off-by: Christoph Hellwig
> ---
>  include/linux/shmem_fs.h |  1 +
>  mm/shmem.c               | 46 ++++++++++++----------------------------
>  2 files changed, 15 insertions(+), 32 deletions(-)

I thought this was fine at first, and of course it's good for all the
usual cases; but not for shmem_get_partial_folio()'s awkward cases.
Two issues with it.

One, as you highlight above, the possibility of reading more swap
unnecessarily.  I do not mind if partial truncation entails reading a
little unnecessary swap; but I don't like the common case of truncation
to 0 entailing that; even less eviction; even less unmounting, when
evicting everything risks reading lots of swap.  The old code behaved
well at i_size 0, the new code not so much.

Two, truncating a large folio is expected to trim that folio down to
the smaller size (if splitting allows); but SGP_FIND was coded too much
like SGP_READ, in reporting fallocated (!uptodate) folios as NULL,
unlike before.  Then the following loop of shmem_undo_range() removed
that whole folio - removing pages "promised" to the file by the earlier
fallocate.  Not as seriously bad as deleting data would be, and not
very likely to be noticed, but still not right.

Replacing shmem_get_partial_folio() by SGP_FIND was a good direction
to try, but it hasn't worked out.  I tried to get SGPs to work right
for it before, when shmem_get_partial_page() was introduced; but I did
not manage to do so.  I think we have to go back to how this was.

Andrew, please replace Christoph's "shmem: remove shmem_get_partial_folio"
in mm-unstable by this patch below, which achieves the same objective
(eliminating FGP_ENTRY) while keeping closer to the old mechanism.
[PATCH] shmem: shmem_get_partial_folio use filemap_get_entry

To avoid use of the FGP_ENTRY flag, adapt shmem_get_partial_folio()
to use filemap_get_entry() and folio_lock() instead of
__filemap_get_folio().  Update "page" in the comments there to "folio".

Signed-off-by: Hugh Dickins
---
 mm/shmem.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -886,14 +886,21 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index)

 	/*
 	 * At first avoid shmem_get_folio(,,,SGP_READ): that fails
-	 * beyond i_size, and reports fallocated pages as holes.
+	 * beyond i_size, and reports fallocated folios as holes.
 	 */
-	folio = __filemap_get_folio(inode->i_mapping, index,
-					FGP_ENTRY | FGP_LOCK, 0);
-	if (!xa_is_value(folio))
+	folio = filemap_get_entry(inode->i_mapping, index);
+	if (!folio)
 		return folio;
+	if (!xa_is_value(folio)) {
+		folio_lock(folio);
+		if (folio->mapping == inode->i_mapping)
+			return folio;
+		/* The folio has been swapped out */
+		folio_unlock(folio);
+		folio_put(folio);
+	}
 	/*
-	 * But read a page back from swap if any of it is within i_size
+	 * But read a folio back from swap if any of it is within i_size
 	 * (although in some cases this is just a waste of time).
 	 */
 	folio = NULL;