Message-ID: <03b9ea6c-a3c4-41e8-ad47-4e82344da419@arm.com>
Date: Fri, 5 Jul 2024 10:03:31 +0100
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes
From: Ryan Roberts
To: Dave Chinner, Matthew Wilcox
Cc: "Pankaj Raghav (Samsung)", chandan.babu@oracle.com, djwong@kernel.org,
 brauner@kernel.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
 yang@os.amperecomputing.com, linux-mm@kvack.org, john.g.garry@oracle.com,
 linux-fsdevel@vger.kernel.org, hare@suse.de, p.raghav@samsung.com,
 mcgrof@kernel.org, gost.dev@samsung.com, cl@os.amperecomputing.com,
 linux-xfs@vger.kernel.org, hch@lst.de, Zi Yan
References: <20240625114420.719014-1-kernel@pankajraghav.com>
 <20240625114420.719014-2-kernel@pankajraghav.com>

On 05/07/2024 05:32, Dave Chinner wrote:
> On Fri, Jul 05, 2024 at 12:56:28AM +0100, Matthew Wilcox wrote:
>> On Fri, Jul 05, 2024 at 08:06:51AM +1000, Dave Chinner wrote:
>>>>> It seems strange to silently clamp these? Presumably for the bs>ps usecase,
>>>>> whatever values are passed in are a hard requirement? So wouldn't want them to
>>>>> be silently reduced. (Especially given the recent change to reduce the size of
>>>>> MAX_PAGECACHE_ORDER to less then PMD size in some cases).
>>>>
>>>> Hm, yes. We should probably make this return an errno.
>>>> Including returning an errno for !IS_ENABLED() and min > 0.
>>>
>>> What are callers supposed to do with an error? In the case of
>>> setting up a newly allocated inode in XFS, the error would be
>>> returned in the middle of a transaction and so this failure would
>>> result in a filesystem shutdown.
>>
>> I suggest you handle it better than this. If the device is asking for a
>> blocksize > PMD_SIZE, you should fail to mount it.

A detail, but MAX_PAGECACHE_ORDER may be smaller than the PMD order
(HPAGE_PMD_ORDER) even on systems with CONFIG_TRANSPARENT_HUGEPAGE, as of
a fix that is currently in mm-unstable:

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define PREFERRED_MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
#else
#define PREFERRED_MAX_PAGECACHE_ORDER	8
#endif

/*
 * xas_split_alloc() does not support arbitrary orders. This implies no
 * 512MB THP on ARM64 with 64KB base page size.
 */
#define MAX_XAS_ORDER		(XA_CHUNK_SHIFT * 2 - 1)
#define MAX_PAGECACHE_ORDER	min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER)

But that also implies that the page cache can handle up to order-8 without
CONFIG_TRANSPARENT_HUGEPAGE, so it sounds like there isn't a dependency on
CONFIG_TRANSPARENT_HUGEPAGE in this respect?

>
> That's my point: we already do that.
>
> The largest block size we support is 64kB and that's way smaller
> than PMD_SIZE on all platforms and we always check for bs > ps
> support at mount time when the filesystem bs > ps.
>
> Hence we're never going to set the min value to anything unsupported
> unless someone makes a massive programming mistake. At which point,
> we want a *hard, immediate fail* so the developer notices their
> mistake immediately. All filesystems and block devices need to
> behave this way so the limits should be encoded as asserts in the
> function to trigger such behaviour.
>
>> If the device is
>> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
>> not set, you should also decline to mount the filesystem.
>
> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
> being able to use large folios?
>
> If that's an actual dependency of using large folios, then we're at
> the point where the mm side of large folios needs to be divorced
> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
> block layer and also every filesystem that wants to support
> sector/blocks sizes larger than PAGE_SIZE. IOWs, large folio
> support needs to *always* be enabled on systems that say
> CONFIG_BLOCK=y.
>
> I'd much prefer the former occurs, because making the block layer
> and filesystems dependent on an mm feature they don't actually use
> is kinda weird...
>
> -Dave.
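
To make the arithmetic behind the MAX_PAGECACHE_ORDER macros quoted above
concrete, here is a minimal user-space sketch, not kernel code. It assumes
XA_CHUNK_SHIFT is 6 (the value when CONFIG_BASE_SMALL is not set), and the
helper names max_pagecache_order() and min_order_for_block() are
hypothetical, introduced only for illustration. The PMD and page shifts are
passed in by the caller rather than taken from any real configuration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: XA_CHUNK_SHIFT is 6 unless CONFIG_BASE_SMALL is set. */
#define XA_CHUNK_SHIFT 6
#define MAX_XAS_ORDER  (XA_CHUNK_SHIFT * 2 - 1)

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper mirroring the quoted macros: with THP the preferred
 * maximum is the PMD order (pmd_shift - page_shift), without THP it is 8,
 * and either way the result is capped by MAX_XAS_ORDER.
 */
static unsigned int max_pagecache_order(int thp_enabled,
					unsigned int pmd_shift,
					unsigned int page_shift)
{
	unsigned int preferred = thp_enabled ? pmd_shift - page_shift : 8;

	return min_uint(MAX_XAS_ORDER, preferred);
}

/*
 * Hypothetical helper: minimum folio order needed so that one folio
 * covers a filesystem block of 2^block_shift bytes.
 */
static unsigned int min_order_for_block(unsigned int block_shift,
					unsigned int page_shift)
{
	return block_shift > page_shift ? block_shift - page_shift : 0;
}
```

Under these assumptions, a 4KB-page system with a 2MB PMD gives order 9 with
THP and order 8 without it, while a 64KB-page arm64 system with a 512MB PMD
is capped at order 11 by MAX_XAS_ORDER. A 64kB block size on 4KB pages needs
only order 4, comfortably below all of these limits.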