From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 27 Jun 2025 11:21:56 +0800
From: Baolin Wang
To: Brian Foster, Hugh Dickins
Cc: Matthew Wilcox, linux-mm@kvack.org
Subject: Re: [PATCH] tmpfs: zero post-eof folio range on file extension
References: <20250625184930.269727-1-bfoster@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 2025/6/26 20:55, Brian Foster wrote:
> On Wed, Jun 25, 2025 at 10:35:44PM -0700, Hugh Dickins wrote:
>> On Wed, 25 Jun 2025, Matthew Wilcox wrote:
>>> On Wed, Jun 25, 2025 at 02:49:30PM -0400, Brian Foster wrote:
>>>> Most traditional filesystems zero the post-eof portion of the eof
>>>> folio at writeback time, or when the file size is extended by
>>>> truncate or extending writes.
>>>> This ensures that the previously
>>>> post-eof range of the folio is zeroed before it is exposed to the
>>>> file.
>>>>
>>>> tmpfs doesn't implement the writeback path the way a traditional
>>>> filesystem does, so zeroing behavior won't be exactly the same.
>>>> However, it can still perform explicit zeroing from the various
>>>> operations that extend a file and expose a post-eof portion of the
>>>> eof folio. The current lack of zeroing is observed via failure of
>>>> fstests test generic/363 on tmpfs. This test injects post-eof mapped
>>>> writes in certain situations to detect gaps in zeroing behavior.
>>>>
>>>> Add a new eof zeroing helper for file extending operations. Look up
>>>> the current eof folio, and if one exists, zero the range about to be
>>>> exposed. This allows generic/363 to pass on tmpfs.
>>>
>>> This seems like what I did here?
>>>
>>> https://lore.kernel.org/linux-fsdevel/20230202204428.3267832-4-willy@infradead.org/
>>>
>>> Which fix should we prefer?
>>
>
> Quite similar, indeed. This is actually about the same as my initial
> prototype when I was just trying to confirm the problem for truncate. As
> Hugh notes, we still need to cover other size extension paths
> (fallocate, buffered write).
>
>> Thank you Brian, thank you Matthew.
>>
>> Of course my immediate reaction is to prefer the smaller patch,
>> Matthew's, but generic/363 still fails with that one: I need to look
>> into what each of you is doing (and that mail thread from 2023) and
>> make up my own mind.
>>
>
> FWIW, I started off with something trying to use shmem_undo_range() and
> eventually moved toward the current patch because I couldn't get it
> working quite right. Explicit zeroing seemed like less complexity than
> calling through undo_range() on various operations, primarily due to
> less risk of unintentional side effect.
> It's possible (likely :P) that
> was just user error on my part, so now that I have a functional patch I
> can revisit that approach if it is preferred.

I also tried using shmem_truncate_range() to fix this issue, but I
couldn't get it to work. In the end, at least to me, the current fix
is simple and it works.

> However one thing I wasn't aware of at the time was the additional
> zeroing needed into the target range on fallocate, so that might require
> care to avoid immediately punching out the folios that were
> fallocated just prior. I suspect this would mean we'd need a helper or
> something to restrict the range to undo to just the eof folio. That
> seems like a plausible approach, I'm just not so sure the final result
> will end up much smaller or simpler than this patch.
>
>> (I'm worried why I had no copy of Matthew's 2023 patch: it's sadly
>> common for me to bury patches for long periods of time, but not
>> usually to lose them completely. But that is my worry, not yours.)
>>
>> I haven't been much concerned by generic/363 failing on tmpfs:
>> but promise to respect your efforts in due course.
>>
>
> It's certainly not the bug of the century. ;) I added the test somewhat
> recently because we had bigger zeroing issues on other filesystems and I
> noticed we had no decent test coverage.
>
>> I procrastinate "in due course" because (a) the full correct answer
>> will depend on what happens with large folios, and (b) large folio
>> work in 6.14 changed (I'll say broke) the behaviour of huge=always
>> at eof (I expected a danger of that, and thought I checked before
>> 6.14-rc settled, but must have messed up my checking somehow).
>>
>
> Interesting.. I assume huge=always refers to a mount option. I need to
> give that a test.

I tested your patch by adding the 'transparent_hugepage_tmpfs=always'
kernel command line parameter, which changes the default huge policy to
'huge=always' for tmpfs mounts.
Your patch also passed the generic/363 test with the tmpfs 'huge=always'
mount option.

> I'm also curious if either of you have any thoughts on the uptodate
> question. Does filtering zeroing on uptodate folios ensure we'd zero
> only folios that were previously written to?

Yes, I think so. The caller will handle the !uptodate folios. So I
changed your patch to:

static void shmem_zero_eof(struct inode *inode, loff_t pos)
{
	struct folio *folio;
	loff_t i_size = i_size_read(inode);
	size_t from, len;

	folio = shmem_get_partial_folio(inode, i_size >> PAGE_SHIFT);
	if (!folio)
		return;

	if (folio_test_uptodate(folio)) {
		/* zero to the end of the folio or start of extending operation */
		from = offset_in_folio(folio, i_size);
		len = min_t(loff_t, folio_size(folio) - from, pos - i_size);
		folio_zero_range(folio, from, len);
	}

	folio_unlock(folio);
	folio_put(folio);
}

>> So there's more (and more urgent) to sort out before settling on
>> the right generic/363 fix.

However, let's still wait to see if Hugh has any better ideas.
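
For anyone who wants to sanity-check the from/len arithmetic in
shmem_zero_eof() outside the kernel, here is a minimal userspace sketch.
It is only an illustration under assumptions: zero_range_for_extend() and
struct zero_range are hypothetical names, the folio is assumed to be
naturally aligned with a power-of-two size, and the caller is assumed to
pass pos > i_size (an extending operation).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: byte range to zero inside the eof folio. */
struct zero_range {
	size_t from;	/* offset of the old EOF within the folio */
	size_t len;	/* number of bytes to zero starting at 'from' */
};

/*
 * Userspace stand-in for the range computation in shmem_zero_eof().
 * Assumes folio_size is a power of two and the folio is naturally
 * aligned, so the in-folio offset is i_size masked by (folio_size - 1).
 */
static struct zero_range zero_range_for_extend(long long i_size,
					       long long pos,
					       size_t folio_size)
{
	struct zero_range r;
	long long to_folio_end, to_pos;

	/* Only meaningful for operations that extend the file. */
	assert(pos > i_size);
	assert(folio_size && (folio_size & (folio_size - 1)) == 0);

	r.from = (size_t)(i_size & (long long)(folio_size - 1));

	/* Zero up to the folio end or the start of the extending op. */
	to_folio_end = (long long)(folio_size - r.from);
	to_pos = pos - i_size;
	r.len = (size_t)(to_folio_end < to_pos ? to_folio_end : to_pos);

	return r;
}
```

With a 4096-byte folio, extending from i_size 100 to pos 5000 zeroes to
the end of the folio (from 100, len 3996), while extending from 100 to
300 zeroes only the gap in front of the new data (from 100, len 200).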