Message-ID: <0224ed0f-d207-4c79-8c9d-f4915a91c11d@linux.alibaba.com>
Date: Fri, 11 Jul 2025 11:50:05 +0800
Subject: Re: [PATCH] tmpfs: zero post-eof folio range on file extension
To: Hugh Dickins
Cc: Brian Foster, linux-mm@kvack.org, Matthew Wilcox, Usama Arif
References: <20250625184930.269727-1-bfoster@redhat.com> <297e44e9-1b58-d7c4-192c-9408204ab1e3@google.com> <67f0461b-3359-41e7-a7cd-b059cbef4154@linux.alibaba.com> <097c0b07-1f43-51c3-3591-aaa2015226c2@google.com>
From: Baolin Wang
In-Reply-To: <097c0b07-1f43-51c3-3591-aaa2015226c2@google.com>

On 2025/7/11 06:20, Hugh Dickins wrote:
> On Thu, 10 Jul 2025, Baolin Wang wrote:
>> On 2025/7/9 15:57, Hugh Dickins wrote:
> ...
>>>
>>> The problem is with huge pages (or large folios) in shmem_writeout():
>>> what goes in as a large folio may there have to be split into small
>>> pages; or it may be swapped out as one large folio, but fragmentation
>>> at swapin time demands that it be split into small pages when swapped in.
>>
>> Good point.
>>
>>> So, if there has been swapout since the large folio was modified beyond
>>> EOF, the folio that shmem_zero_eof() brings in does not guarantee what
>>> length needs to be zeroed.
>>>
>>> We could set that aside as a deficiency to be fixed later on: that
>>> would not be unreasonable, but I'm guessing that won't satisfy you.
>>>
>>> We could zero the maximum (the remainder of PMD size I believe) in
>>> shmem_zero_eof(): looping over small folios within the range, skipping
>>> !uptodate ones (but we do force them uptodate when swapping out, in
>>> order to keep the space reservation). TBH I've ignored that as a bad
>>> option, but it doesn't seem so bad to me now: ugly, but maybe not bad.
>>
>> However, IIUC, if the large folios are split in shmem_writeout(), then
>> those small folios which are beyond EOF will be dropped and freed in
>> __split_unmapped_folio(); should we still consider them?
>
> You're absolutely right about the normal case, and thank you for making
> that point. Had I forgotten that when writing? Or was I already
> jumping ahead to the problem case? I don't recall, but was certainly
> wrong for not mentioning it.
>
> The abnormal case is when there's a "fallocend" beyond i_size (or beyond
> the small page extent spanning i_size) i.e. fallocate() has promised to
> keep pages allocated beyond EOF. In that case, __split_unmapped_folio()
> is keeping those pages.

Ah, yes, you are right.

> There could well be some optimization, involving fallocend, to avoid
> zeroing more than necessary; but I wouldn't want to say what in a hurry,
> it's quite confusing!

As you said, a large folio split can occur not only during swapout but
also during a punch hole operation. Moreover, considering the abnormal
fallocate() case you mentioned, we should find a more general approach
to mitigate the impact of fallocate().

For instance, when splitting, we could clear the 'uptodate' flag for the
small folios that lie beyond 'i_size' but below 'fallocend', so that
these folios will be re-initialized if they are used again.

What do you think?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ce130225a8e5..2ccb442525d1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3546,6 +3546,18 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 
 			lru_add_split_folio(origin_folio, release, lruvec, list);
 
+			/*
+			 * fallocate() will keep folios allocated beyond EOF, we should
+			 * clear the uptodate flag for these folios to re-zero them
+			 * if necessary.
+			 */
+			if (shmem_mapping(mapping)) {
+				loff_t i_size = i_size_read(mapping->host);
+
+				if (i_size < end && release->index >= i_size)
+					folio_clear_uptodate(release);
+			}
+
 			/* Some pages can be beyond EOF: drop them from cache */
 			if (release->index >= end) {
 				if (shmem_mapping(mapping))
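
To make the proposed rule easier to reason about, here is a minimal
userspace sketch of which small folios would be kept, kept with
'uptodate' cleared, or dropped after a split. It is only a model, not
kernel code: pages_spanned(), classify_after_split() and the enum are
hypothetical names. It assumes that 'end' is the larger of the EOF page
count and 'fallocend', and it converts i_size from bytes to a page count
before comparing it with a folio index, which is an assumption about the
intended units (the draft diff compares the i_size_read() value
directly).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

/* Number of pages needed to cover `bytes`, rounding up. */
static uint64_t pages_spanned(uint64_t bytes, uint64_t page_size)
{
	assert(page_size > 0);
	return bytes / page_size + (bytes % page_size != 0);
}

enum split_action {
	KEEP_UPTODATE,		/* at or below EOF: contents stay valid */
	KEEP_CLEAR_UPTODATE,	/* beyond EOF, inside the fallocate range */
	DROP,			/* beyond both EOF and the fallocate range */
};

/*
 * Classify the small folio at page `index` produced by a split.
 * `fallocend_pages` is 0 when no fallocate() range extends past EOF.
 */
static enum split_action classify_after_split(uint64_t index,
					      uint64_t i_size_bytes,
					      uint64_t fallocend_pages,
					      uint64_t page_size)
{
	uint64_t eof_pages = pages_spanned(i_size_bytes, page_size);
	/* Assumption: `end` is the larger of the EOF page count and fallocend. */
	uint64_t end = fallocend_pages > eof_pages ? fallocend_pages : eof_pages;

	if (index >= end)
		return DROP;
	if (index >= eof_pages)
		return KEEP_CLEAR_UPTODATE;
	return KEEP_UPTODATE;
}
```

For example, with 4096-byte pages, i_size = 10000 and fallocend = 8, the
folio at index 2 (which spans EOF) stays uptodate, indexes 3 to 7 are
kept with 'uptodate' cleared, and index 8 onwards is dropped. Without a
fallocate() range, index 3 is already dropped, which matches the normal
case discussed above.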