Date: Mon, 17 Jun 2024 16:31:36 +0000
From: "Pankaj Raghav (Samsung)"
To: Christoph Hellwig
Cc: Dave Chinner, djwong@kernel.org, chandan.babu@oracle.com,
	brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org,
	mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
	linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan,
	linux-xfs@vger.kernel.org, p.raghav@samsung.com,
	linux-fsdevel@vger.kernel.org, gost.dev@samsung.com,
	cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: Re: [PATCH v7 11/11] xfs: enable block size larger than page size support
Message-ID: <20240617163136.ozxrlxljmblcgny3@quentin>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
	<20240607145902.1137853-12-kernel@pankajraghav.com>
	<20240613084725.GC23371@lst.de>
	<20240617065104.GA18547@lst.de>
In-Reply-To: <20240617065104.GA18547@lst.de>

On Mon, Jun 17, 2024 at 08:51:04AM +0200, Christoph Hellwig wrote:
> On Mon, Jun 17, 2024 at 11:29:42AM +1000, Dave Chinner wrote:
> > > > +	if (mp->m_sb.sb_blocksize > PAGE_SIZE)
> > > > +		igeo->min_folio_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;
> > > > +	else
> > > > +		igeo->min_folio_order = 0;
> > > >  }
> > >
> > > The minimum folio order isn't really part of the inode (allocation)
> > > geometry, is it?
> >
> > I suggested it last time around instead of calculating the same
> > constant on every inode allocation. We're already storing in-memory
> > struct xfs_inode allocation init values in this structure. e.g. in
> > xfs_inode_alloc() we see things like this:
>
> While new_diflags2 isn't exactly inode geometry, it at least is part
> of the inode allocation.  Folio min order for file data has nothing
> to do with this at all.
>
> > The only other place we might store it is the struct xfs_mount, but
> > given all the inode allocation constants are already in the embedded
> > mp->m_ino_geo structure, it just seems like a much better idea to
> > put it with all the other inode allocation constants than dump it
> > randomly into the struct xfs_mount....
>
> Well, it is very closely related to say the m_blockmask field in
> struct xfs_mount.  Then again modern CPUs tend to get you a simple
> subtraction for free in most pipelines doing other things, so I'm
> not really sure it's worth caching for use in inode allocation to
> start with, but I don't care strongly about that.

But there will also be an extra conditional apart from the subtraction,
right? Initially it was something like this:

@@ -73,6 +73,7 @@ xfs_inode_alloc(
 	xfs_ino_t		ino)
 {
 	struct xfs_inode	*ip;
+	int			min_order = 0;
 
 	/*
 	 * XXX: If this didn't occur in transactions, we could drop GFP_NOFAIL
@@ -88,7 +89,8 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
+	min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
+	mapping_set_folio_orders(VFS_I(ip)->i_mapping, min_order, MAX_PAGECACHE_ORDER);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -313,6 +315,7 @@ xfs_reinit_inode(
 	dev_t			dev = inode->i_rdev;
 	kuid_t			uid = inode->i_uid;
 	kgid_t			gid = inode->i_gid;
+	int			min_order = 0;
 
 	error = inode_init_always(mp->m_super, inode);
 
@@ -323,7 +326,8 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
+	min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
+	mapping_set_folio_orders(inode->i_mapping, min_order, MAX_PAGECACHE_ORDER);
 	return error;
 }

That version introduces a conditional (the max()) in the inode
allocation hot path, so I went with what Dave Chinner proposed: compute
the value once and cache it, since it is something we use every time we
initialize an inode.
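
To make the trade-off concrete, here is a minimal userspace sketch of
the two variants, not the actual patch. Everything prefixed with sketch_
or SKETCH_ is a hypothetical stand-in for the kernel definitions, and
the page shift of 12 assumes 4k pages. The first helper recomputes the
clamped order on every call, which is where the conditional comes from.
The second does the same computation once at "mount" time, so the
allocation path only has to load a cached value.

```c
#include <assert.h>

/* Hypothetical stand-in for PAGE_SHIFT; assumes 4k pages. */
#define SKETCH_PAGE_SHIFT	12

/* Hypothetical, trimmed-down mount structure: only what the sketch needs. */
struct sketch_mount {
	unsigned int	blocklog;		/* log2 of the fs block size */
	unsigned int	min_folio_order;	/* cached once at mount time */
};

/*
 * Variant 1: recompute on every call. The clamp to zero is the extra
 * conditional on top of the subtraction.
 */
static unsigned int sketch_min_order_per_alloc(const struct sketch_mount *mp)
{
	int order = (int)mp->blocklog - SKETCH_PAGE_SHIFT;

	return order > 0 ? (unsigned int)order : 0;
}

/* Variant 2: run the same computation once and cache the result. */
static void sketch_mount_init(struct sketch_mount *mp, unsigned int blocklog)
{
	/* Sanity bound chosen for this sketch: 512 byte to 64k blocks. */
	assert(blocklog >= 9 && blocklog <= 16);

	mp->blocklog = blocklog;
	mp->min_folio_order = sketch_min_order_per_alloc(mp);
}

/* The allocation path then only reads the cached value. */
static unsigned int sketch_min_order_cached(const struct sketch_mount *mp)
{
	return mp->min_folio_order;
}
```

With 4k pages, a 16k block size (blocklog 14) gives a minimum order of
2, while 4k and smaller block sizes give 0. Both variants return the
same value; the only difference is where the clamp runs.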