Date: Tue, 11 Jun 2024 13:29:51 -0700
From: Luis Chamberlain
To: "Darrick J. Wong", hughd@google.com
Cc: patches@lists.linux.dev, fstests@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
	ziy@nvidia.com, vbabka@suse.cz, seanjc@google.com,
	willy@infradead.org, david@redhat.com, hughd@google.com,
	linmiaohe@huawei.com, muchun.song@linux.dev, osalvador@suse.de,
	p.raghav@samsung.com, da.gomez@samsung.com, hare@suse.de,
	john.g.garry@oracle.com
Subject: Re: [PATCH 2/5] fstests: add mmap page boundary tests
References: <20240611030203.1719072-1-mcgrof@kernel.org>
	<20240611030203.1719072-3-mcgrof@kernel.org>
	<20240611164811.GL52977@frogsfrogsfrogs>
	<20240611184603.GA52987@frogsfrogsfrogs>
In-Reply-To: <20240611184603.GA52987@frogsfrogsfrogs>

On Tue, Jun 11, 2024 at 11:46:03AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 11, 2024 at 11:10:13AM -0700, Luis Chamberlain wrote:
> > On Tue, Jun 11, 2024 at 09:48:11AM -0700, Darrick J. Wong wrote:
> > > On Mon, Jun 10, 2024 at 08:01:59PM -0700, Luis Chamberlain wrote:
> > > > +# As per POSIX NOTES mmap(2) maps multiples of the system page size, but if the
> > > > +# data mapped is not multiples of the page size the remaining bytes are zeroed
> > > > +# out when mapped and modifications to that region are not written to the file.
> > > > +# On Linux when you write data to such partial page after the end of the
> > > > +# object, the data stays in the page cache even after the file is closed and
> > > > +# unmapped and even though the data is never written to the file itself,
> > > > +# subsequent mappings may see the modified content. If you go *beyond* this
> > >
> > > Does this happen (mwrite data beyond eof sticks around) with large
> > > folios as well?
> >
> > That corner case of checking to see if it stays is not tested by this
> > test, but we could / should extend this test later for that. But then
> > the question becomes, what is right, given we are in grey area, if we
> > don't have any defined standard for it, it seems odd to test for it.
> >
> > So the test currently only tests for correctness of what we expect for
> > POSIX and what we all have agreed for Linux.
> >
> > Herding everyone to follow suit for the other corner cases is something
> > perhaps we should do. Do we have a "strict fail" ? So that perhaps we can
> > later add a test case for it and so that once and if we get consensus
> > on what we do we can enable say a "strict-Linux" mode where we are
> > pedantic about a new world order?
>
> I doubt there's an easy way to guarantee more than "initialized to zero,
> contents may stay around in memory but will not be written to disk".
> You could do asinine things like fault on every access and manually
> inject zero bytes, but ... yuck.

Sure, but I suspect the real issue is whether it does something like leak
data which it should not. The test as-is does check that the data is
zeroed. If we want to add a test that closes the mmap and ensures the data
beyond the file content up to PAGE_SIZE is still zeroed out, that's easy
to do; it just seems that it *could* end up with different results
depending on the filesystem.

> That said -- let's say you have a 33k file, and a space mapping for
> 0-63k

What block size filesystem in this example? If the length is 33k, whether
or not it was truncated does not matter, the file size is what matters.
The block size is what we use for the minimum order folio, and since we
start at offset 0, a 33k sized file on a 64k block size filesystem will
get a 64k folio. On a 32k block size filesystem, it will get two 32k
folios.

> (e.g. it was preallocated).

Do you mean sparse or what? Because if it's a sparse file it will still
change the size of the file, so just wanted to check.

> Can the pagecache grab (say) a 64k folio for the EOF part of the pagecache?

It depends on the block size of the filesystem. If 4k, we'd go up to
36k, and 33k-36k would be zeroed. With min order, we'd have a folio
of 8k, 32k, or 64k. For 8k we'd have 5 folios of 8k size each, the last
one having only 1k of data, and 3k zeroed out.
No PTEs would be assigned for that folio beyond the 36k boundary and so
we'd SIGBUS on access beyond it. We test for this in this test.

> And can you mmap that whole region?

No, we test for this too here. You can only mmap up to the file size
rounded up to PAGE_SIZE.

> And see even more grey area mmapping?

No, we limit up to PAGE_SIZE alignment.

> Or does mmap always cut
> off the mapping at roundup(i_size_read(), PAGE_SIZE) ?

That's right, we do this. Without LBS this was implied, but with LBS we
have to be explicit about using the PAGE_SIZE alignment restriction.

This test checks for all that, and checks for both integrity of the
contents and file size even if you muck with the extra fluff allowed by
mmap().

> > > What other data?
> >
> > Beats me, got that from the man page bible on mmap. I think it's homework
> > for us to find out who is spewing that out, which gives a bit more value
> > to the idea of that strict-linux thing. How else will we find out?
>
> Oh, ok.  I couldn't tell if *you* had seen "other" data emerging from
> the murk, or if that was merely what a spec says.  Please cite the
> particular bible you were reading. ;)

From the mmap(2) man page:

"subsequent mappings may see the modified content."

so I extended this with the implications of it using *may*.

Speaking of the man page, I see also that huge pages are addressed
there and when a huge page is used it says:

"The system automatically aligns length to be a multiple of the
underlying huge page size"

And so I believe that means we need to check for the huge page in
filemap_map_pages(), and also adjust the test to align to the specific
huge page size if used... Or just skip tmpfs / hugetlbfs for now...

  Luis
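
For illustration, the 33k example discussed above can be worked through
with a few lines of Python. This is only a sketch of the arithmetic, not
kernel code and not part of the test: it assumes a 4k PAGE_SIZE, assumes
every folio is exactly the filesystem block size (the minimum order), and
the helper names round_up() and tail_layout() are made up for this example.

```python
KIB = 1024


def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def tail_layout(file_size, page_size, folio_size):
    """Describe the region around EOF for a file mapped from offset 0.

    Assumption (hypothetical model): every folio is exactly folio_size
    bytes and folio_size is a multiple of page_size.
    """
    # The mapping is usable up to roundup(i_size, PAGE_SIZE).
    mappable_end = round_up(file_size, page_size)
    # The page cache covers the file with whole folios.
    folio_end = round_up(file_size, folio_size)
    return {
        "folio_count": folio_end // folio_size,
        # Bytes past EOF that are mappable and expected to read as zero.
        "zeroed_tail": mappable_end - file_size,
        "mappable_end": mappable_end,
        # Bytes of the last folio with no PTEs; access there would SIGBUS.
        "unmapped_folio_tail": folio_end - mappable_end,
    }


if __name__ == "__main__":
    for block_size in (4 * KIB, 8 * KIB, 32 * KIB, 64 * KIB):
        layout = tail_layout(33 * KIB, 4 * KIB, block_size)
        print(f"{block_size // KIB}k block size: {layout}")
```

Under these assumptions the mappable end stays at 36k for every block
size; only the folio count and the amount of folio space left without
PTEs change.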