Date: Mon, 3 Jun 2024 09:22:34 +1000
From: Dave Chinner
To: "Pankaj Raghav (Samsung)"
Cc: chandan.babu@oracle.com, akpm@linux-foundation.org, brauner@kernel.org,
	willy@infradead.org, djwong@kernel.org, linux-kernel@vger.kernel.org,
	hare@suse.de, john.g.garry@oracle.com, gost.dev@samsung.com,
	yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v6 07/11] iomap: fix iomap_dio_zero() for fs bs > system page size
References: <20240529134509.120826-1-kernel@pankajraghav.com>
	<20240529134509.120826-8-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-8-kernel@pankajraghav.com>

On Wed, May 29, 2024 at 03:45:05PM +0200, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav
>
> iomap_dio_zero() will pad a fs block with zeroes if the direct IO size
> < fs block size. iomap_dio_zero() has an implicit assumption that fs block
> size < page_size. This is true for most filesystems at the moment.
>
> If the block size > page size, this will send the contents of the page
> next to zero page (as len > PAGE_SIZE) to the underlying block device,
> causing FS corruption.
>
> iomap is a generic infrastructure and it should not make any assumptions
> about the fs block size and the page size of the system.
>
> Signed-off-by: Pankaj Raghav
> ---
>
> After discussing a bit in LSFMM about this, it was clear that using a
> PMD sized zero folio might not be a good idea[0], especially in platforms
> with 64k base page size, the huge zero folio can be as high as
> 512M just for zeroing small block sizes in the direct IO path.
>
> The idea to use iomap_init to allocate 64k zero buffer was suggested by
> Dave Chinner as it gives decent tradeoff between memory usage and efficiency.
>
> This is a good enough solution for now as moving beyond 64k block size
> in XFS might take a while. We can work on a more generic solution in the
> future to offer different sized zero folio that can go beyond 64k.
>
> [0] https://lore.kernel.org/linux-fsdevel/ZkdcAsENj2mBHh91@casper.infradead.org/
>
>  fs/internal.h          | 8 ++++++++
>  fs/iomap/buffered-io.c | 5 +++++
>  fs/iomap/direct-io.c   | 9 +++++++--
>  3 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/fs/internal.h b/fs/internal.h
> index 84f371193f74..18eedbb82c50 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -35,6 +35,14 @@ static inline void bdev_cache_init(void)
>  int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
>  		get_block_t *get_block, const struct iomap *iomap);
>
> +/*
> + * iomap/buffered-io.c
> + */
> +
> +#define ZERO_FSB_SIZE (65536)
> +#define ZERO_FSB_ORDER (get_order(ZERO_FSB_SIZE))
> +extern struct page *zero_fs_block;

This is really iomap direct IO private stuff. It shouldn't be
visible anywhere else...

> +
>  /*
>   * char_dev.c
>   */
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index c5802a459334..2c0149c827cd 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -42,6 +42,7 @@ struct iomap_folio_state {
>  };
>
>  static struct bio_set iomap_ioend_bioset;
> +struct page *zero_fs_block;
>
>  static inline bool ifs_is_fully_uptodate(struct folio *folio,
>  		struct iomap_folio_state *ifs)
> @@ -1998,6 +1999,10 @@ EXPORT_SYMBOL_GPL(iomap_writepages);
>
>  static int __init iomap_init(void)
>  {
> +	zero_fs_block = alloc_pages(GFP_KERNEL | __GFP_ZERO, ZERO_FSB_ORDER);
> +	if (!zero_fs_block)
> +		return -ENOMEM;
> +
>  	return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
>  			   offsetof(struct iomap_ioend, io_bio),
>  			   BIOSET_NEED_BVECS);

Just create an iomap_dio_init() function in iomap/direct-io.c
and call that from here. Then everything can be private to
iomap/direct-io.c...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
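
To illustrate the shape of the suggestion above, here is a rough userspace
model, not kernel code and not what the patch does. It assumes a 64k zero
buffer that is file-private (static) and set up by one init function, and
it copies from that buffer in chunks so that no single read goes past its
end. The names dio_zero_init() and dio_zero_pad() are hypothetical
stand-ins; calloc() and memcpy() stand in for the page allocation and the
bio submission.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ZERO_FSB_SIZE 65536u	/* same 64k value as in the quoted patch */

/* Private to this file, mirroring "everything private to direct-io.c". */
static unsigned char *zero_fs_block;

/* Hypothetical analogue of an iomap_dio_init(): allocate the zero buffer once. */
static int dio_zero_init(void)
{
	zero_fs_block = calloc(1, ZERO_FSB_SIZE);
	return zero_fs_block ? 0 : -1;
}

/*
 * Zero dst[0..len) using the shared zero buffer as the source.
 * Each copy is capped at ZERO_FSB_SIZE, so the source is never read
 * beyond its end, whatever len is. Returns the number of bytes written.
 */
static size_t dio_zero_pad(unsigned char *dst, size_t len)
{
	size_t done = 0;

	assert(zero_fs_block != NULL);	/* dio_zero_init() must have run */
	while (done < len) {
		size_t chunk = len - done;

		if (chunk > ZERO_FSB_SIZE)
			chunk = ZERO_FSB_SIZE;
		memcpy(dst + done, zero_fs_block, chunk);
		done += chunk;
	}
	return done;
}
```

The only caller-visible entry points are the init function and the padding
helper; the buffer itself has no extern declaration, which is the point of
keeping it out of a shared header.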