Date: Tue, 9 Jul 2024 09:01:33 +1000
From: Dave Chinner
To: Ryan Roberts
Cc: "Pankaj Raghav (Samsung)", Matthew Wilcox, chandan.babu@oracle.com,
 djwong@kernel.org, brauner@kernel.org, akpm@linux-foundation.org,
 linux-kernel@vger.kernel.org, yang@os.amperecomputing.com,
 linux-mm@kvack.org, john.g.garry@oracle.com,
 linux-fsdevel@vger.kernel.org, hare@suse.de, p.raghav@samsung.com,
 mcgrof@kernel.org, gost.dev@samsung.com, cl@os.amperecomputing.com,
 linux-xfs@vger.kernel.org, hch@lst.de, Zi Yan
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes
References: <20240625114420.719014-1-kernel@pankajraghav.com>
 <20240625114420.719014-2-kernel@pankajraghav.com>
 <20240705132418.gk7oeucdisat3sq5@quentin>
 <1e0e89ea-3130-42b0-810d-f52da2affe51@arm.com>
In-Reply-To: <1e0e89ea-3130-42b0-810d-f52da2affe51@arm.com>

On Fri, Jul 05, 2024 at 02:31:08PM +0100, Ryan Roberts wrote:
> On 05/07/2024 14:24, Pankaj Raghav (Samsung) wrote:
> >>> I suggest you handle it better than this. If the device is asking for a
> >>> blocksize > PMD_SIZE, you should fail to mount it.
> >>
> >> That's my point: we already do that.
> >>
> >> The largest block size we support is 64kB and that's way smaller
> >> than PMD_SIZE on all platforms and we always check for bs > ps
> >> support at mount time when the filesystem bs > ps.
> >>
> >> Hence we're never going to set the min value to anything unsupported
> >> unless someone makes a massive programming mistake. At which point,
> >> we want a *hard, immediate fail* so the developer notices their
> >> mistake immediately. All filesystems and block devices need to
> >> behave this way so the limits should be encoded as asserts in the
> >> function to trigger such behaviour.
> >
> > I agree, this kind of bug will be encountered only during developement
> > and not during actual production due to the limit we have fs block size
> > in XFS.
> >
> >>
> >>> If the device is
> >>> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
> >>> not set, you should also decline to mount the filesystem.
> >>
> >> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
> >> being able to use large folios?
> >>
> >> If that's an actual dependency of using large folios, then we're at
> >> the point where the mm side of large folios needs to be divorced
> >> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
> >> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
> >> block layer and also every filesystem that wants to support
> >> sector/blocks sizes larger than PAGE_SIZE. IOWs, large folio
> >> support needs to *always* be enabled on systems that say
> >> CONFIG_BLOCK=y.
> >
> > Why CONFIG_BLOCK? I think it is enough if it comes from the FS side
> > right?
> > And for now, the only FS that needs that sort of bs > ps
> > guarantee is XFS with this series. Other filesystems such as bcachefs
> > that call mapping_set_large_folios() only enable it as an optimization
> > and it is not needed for the filesystem to function.
> >
> > So this is my conclusion from the conversation:
> > - Add a dependency in Kconfig on THP for XFS until we fix the dependency
> >   of large folios on THP
>
> THP isn't supported on some arches, so isn't this effectively saying XFS can no
> longer be used with those arches, even if the bs <= ps?

I'm good with that - we're already long past the point where we try
to support XFS on every Linux platform. Indeed, we've recently been
musing about making XFS depend on 64 bit only - 32 bit systems don't
have the memory capacity to run the full xfs tool chain (e.g.
xfs_repair) on filesystems over about a TB in size, and they are
greatly limited in kernel memory and vmap areas, both of which XFS
makes heavy use of. Basically, friends don't let friends use XFS on
32 bit systems, and that's been true for about 20 years now.

Our problem is the test matrix - if we now have to explicitly test
XFS both with and without large folios enabled to support these
platforms, we've just doubled our test matrix. The test matrix is
already far too large to robustly cover, so anything that requires
doubling the number of kernel configs we have to test is, IMO, a
non-starter.

That's why we really don't support XFS on 32 bit systems anymore and
why we're talking about making that official with a config option.
If we're at the point where XFS will now depend on large folios
(i.e. THP), then we need to seriously consider reducing the
supported arches to just those that support both 64 bit and THP.

If niche arches want to support THP, or enable large folios without
the need for THP, then they can do that work and then they get XFS
for free.

Just because an arch might run a Linux kernel, it doesn't mean we
have to support XFS on it....
-Dave.
-- 
Dave Chinner
david@fromorbit.com
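
For illustration only, the two restrictions floated in this message (64 bit only, and THP as the stand-in for large folio support) could be written as Kconfig dependencies roughly as below. This is a sketch, not a posted patch: the XFS_FS entry is abridged, and both added "depends on" lines are assumptions about how such a restriction might be spelled, using the existing 64BIT and TRANSPARENT_HUGEPAGE symbols.

```
# Abridged sketch of fs/xfs/Kconfig; the real entry carries more
# select lines and help text than shown here.
config XFS_FS
	tristate "XFS filesystem support"
	depends on BLOCK
	# Hypothetical: make "no XFS on 32 bit" official.
	depends on 64BIT
	# Hypothetical: require THP for as long as large folios depend on it.
	depends on TRANSPARENT_HUGEPAGE
```

With both lines in place XFS simply cannot be selected on arches that lack either feature, so there is no "XFS without large folios" configuration to add to the test matrix.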