From: "Darrick J. Wong"
To: "Pankaj Raghav (Samsung)"
Cc: david@fromorbit.com, willy@infradead.org, ryan.roberts@arm.com,
	linux-kernel@vger.kernel.org, yang@os.amperecomputing.com,
	linux-mm@kvack.org, john.g.garry@oracle.com,
	linux-fsdevel@vger.kernel.org, hare@suse.de, p.raghav@samsung.com,
	mcgrof@kernel.org, gost.dev@samsung.com, cl@os.amperecomputing.com,
	linux-xfs@vger.kernel.org, hch@lst.de, Zi Yan,
	akpm@linux-foundation.org, chandan.babu@oracle.com
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes
Date: Tue, 9 Jul 2024 09:50:47 -0700
Message-ID: <20240709165047.GS1998502@frogsfrogsfrogs>
References: <20240625114420.719014-1-kernel@pankajraghav.com>
	<20240625114420.719014-2-kernel@pankajraghav.com>
	<20240709162907.gsd5nf33teoss5ir@quentin>
In-Reply-To: <20240709162907.gsd5nf33teoss5ir@quentin>

On Tue, Jul 09, 2024 at 04:29:07PM +0000, Pankaj Raghav (Samsung) wrote:
> For now, this is the only patch that is blocking for the next version.
>
> Based on the discussion, is the following logical @ryan, @dave and
> @willy?
>
> - We give explicit VM_WARN_ONCE if we try to set folio order range if
>   the THP is disabled, min and max is greater than MAX_PAGECACHE_ORDER.
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 14e1415f7dcf4..313c9fad61859 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -394,13 +394,24 @@ static inline void mapping_set_folio_order_range(struct address_space *mapping,
>  						 unsigned int min,
>  						 unsigned int max)
>  {
> -	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> +	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +		VM_WARN_ONCE(1,
> +	"THP needs to be enabled to support mapping folio order range");
>  		return;
> +	}
>
> -	if (min > MAX_PAGECACHE_ORDER)
> +	if (min > MAX_PAGECACHE_ORDER) {
> +		VM_WARN_ONCE(1,
> +	"min order > MAX_PAGECACHE_ORDER. Setting min_order to MAX_PAGECACHE_ORDER");
>  		min = MAX_PAGECACHE_ORDER;
> -	if (max > MAX_PAGECACHE_ORDER)
> +	}
> +
> +	if (max > MAX_PAGECACHE_ORDER) {
> +		VM_WARN_ONCE(1,
> +	"max order > MAX_PAGECACHE_ORDER. Setting max_order to MAX_PAGECACHE_ORDER");
>  		max = MAX_PAGECACHE_ORDER;
> +	}
> +
>  	if (max < min)
>  		max = min;
>
> - We make THP an explicit dependency for XFS:
>
> diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
> index d41edd30388b7..be2c1c0e9fe8b 100644
> --- a/fs/xfs/Kconfig
> +++ b/fs/xfs/Kconfig
> @@ -5,6 +5,7 @@ config XFS_FS
>  	select EXPORTFS
>  	select LIBCRC32C
>  	select FS_IOMAP
> +	select TRANSPARENT_HUGEPAGE
>  	help
>  	  XFS is a high performance journaling filesystem which originated
>  	  on the SGI IRIX platform.  It is completely multi-threaded, can
>
> OR
>
> We create a helper in page cache that FSs can use to check if a specific
> order can be supported at mount time:

I like this solution better; if XFS is going to drop support for o[ld]d
architectures I think we need /some/ sort of notice period.  Or at least
a better story than "we want to support 64k fsblocks on x64 so we're
withdrawing support even for 4k fsblocks and smallish filesystems on
m68k".

You probably don't want bs>ps support to block on some arcane discussion
about 32-bit, right?
;)

> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 14e1415f7dcf..9be775ef11a5 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -374,6 +374,14 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
>  #define MAX_XAS_ORDER		(XA_CHUNK_SHIFT * 2 - 1)
>  #define MAX_PAGECACHE_ORDER	min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER)
>
> +
> +static inline unsigned int mapping_max_folio_order_supported()
> +{
> +	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> +	      return 0;

Shouldn't this line be indented by two tabs, not six spaces?

> +	return MAX_PAGECACHE_ORDER;
> +}

Alternately, should this return the max folio size in bytes?

static inline size_t mapping_max_folio_size(void)
{
	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
		return 1U << (PAGE_SHIFT + MAX_PAGECACHE_ORDER);
	return PAGE_SIZE;
}

Then the validation looks like:

	const size_t	max_folio_size = mapping_max_folio_size();

	if (mp->m_sb.sb_blocksize > max_folio_size) {
		xfs_warn(mp,
 "block size (%u bytes) not supported; maximum folio size is %zu.",
				mp->m_sb.sb_blocksize, max_folio_size);
		error = -ENOSYS;
		goto out_free_sb;
	}

(Don't mind me bikeshedding here.)

> +
>
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b8a93a8f35cac..e2be8743c2c20 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1647,6 +1647,15 @@ xfs_fs_fill_super(
>  		goto out_free_sb;
>  	}
>
> +	if (mp->m_sb.sb_blocklog - PAGE_SHIFT >
> +	    mapping_max_folio_order_supported()) {
> +		xfs_warn(mp,
> +"Block Size (%d bytes) is not supported. Check MAX_PAGECACHE_ORDER",
> +			mp->m_sb.sb_blocksize);

You might as well print MAX_PAGECACHE_ORDER here to make analysis
easier on less-familiar architectures:

		xfs_warn(mp,
 "block size (%u bytes) is not supported; max folio size is %lu bytes.",
				mp->m_sb.sb_blocksize,
				PAGE_SIZE << mapping_max_folio_order_supported());

(I wrote this comment first.)

--D

> +		error = -ENOSYS;
> +		goto out_free_sb;
> +	}
> +
>  	xfs_warn(mp,
>  "EXPERIMENTAL: V5 Filesystem with Large Block Size (%d bytes) enabled.",
>  		mp->m_sb.sb_blocksize);
>
>
> --
> Pankaj
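
For anyone who wants to try the size arithmetic discussed above outside the kernel, here is a rough userspace sketch. It is not the proposed helper: the sketch_* names are made up for illustration, and the page shift, maximum order and THP setting are passed as plain parameters instead of coming from PAGE_SHIFT, MAX_PAGECACHE_ORDER and IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a "max folio size in bytes" helper.
 * Assumption taken from the thread: without THP, only order-0 folios
 * (a single page) are available to the page cache.
 */
static size_t sketch_max_folio_size(unsigned int page_shift,
				    unsigned int max_order, bool thp_enabled)
{
	unsigned int order = thp_enabled ? max_order : 0;

	/* Shifting by the type width or more is undefined, so guard it. */
	assert(page_shift + order < sizeof(size_t) * CHAR_BIT);

	return (size_t)1 << (page_shift + order);
}

/*
 * Mount-time style check: does a filesystem block of 2^blocklog bytes
 * fit inside the largest folio the page cache could hand out?
 */
static bool sketch_blocksize_supported(unsigned int blocklog,
				       unsigned int page_shift,
				       unsigned int max_order,
				       bool thp_enabled)
{
	assert(blocklog < sizeof(size_t) * CHAR_BIT);

	return ((size_t)1 << blocklog) <=
	       sketch_max_folio_size(page_shift, max_order, thp_enabled);
}
```

With a page shift of 12 and a maximum order of 9 (values picked only for illustration), the limit works out to 2097152 bytes with THP and 4096 bytes without it, so a 64k block size (blocklog 16) would pass the check in the first case and be rejected in the second.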