From: Pankaj Raghav <p.raghav@samsung.com>
To: "Darrick J . Wong" <djwong@kernel.org>, hch@lst.de, willy@infradead.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	David Hildenbrand <david@redhat.com>,
	linux-fsdevel@vger.kernel.org, mcgrof@kernel.org,
	gost.dev@samsung.com, Andrew Morton <akpm@linux-foundation.org>,
	kernel@pankajraghav.com, Pankaj Raghav <p.raghav@samsung.com>
Subject: [RFC 0/3] add large zero page for zeroing out larger segments
Date: Fri, 16 May 2025 12:10:51 +0200
Message-ID: <20250516101054.676046-1-p.raghav@samsung.com>

Introduce LARGE_ZERO_PAGE of size 2M as an alternative to ZERO_PAGE.
Similar to ZERO_PAGE, LARGE_ZERO_PAGE is also a global shared page.
2M seems to be a decent compromise between memory usage and performance.

This idea (but not the implementation) was suggested during the review of
adding LBS support to XFS[1][2].

NOTE:
===
This implementation probably has a lot of holes, and it is not complete.
For example, this implementation only works on x86.

The intent of the RFC is to understand:
- Whether this is something we still need in the kernel.
- Whether this is the approach we want to take to implement a feature
  like this, or whether we should explore other alternatives.

I have left out many maintainers and mailing lists and included only the
relevant folks in this RFC, to settle the direction we want to take if
this feature is needed.
===

There are many places in the kernel where we need to zero out larger
chunks, but the maximum segment we can zero out at a time is limited by
PAGE_SIZE.

This is especially annoying in block devices and filesystems where we
attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
bvec support in block layer, it is much more efficient to send out
larger zero pages as a part of a single bvec.
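
To make the segment-count argument concrete, here is a minimal userspace
sketch (not kernel code, and not part of this series). It only counts how
many segments are needed to cover a zeroed range when each segment can
reference at most one shared zero buffer. The names SKETCH_PAGE_SIZE,
SKETCH_LARGE_ZERO_SIZE and zero_segments_needed() are hypothetical, and a
4 KiB base page is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base page; the real PAGE_SIZE depends on the arch. */
#define SKETCH_PAGE_SIZE       (4UL * 1024)
/* 2M, the size proposed for LARGE_ZERO_PAGE in this cover letter. */
#define SKETCH_LARGE_ZERO_SIZE (2UL * 1024 * 1024)

/*
 * Hypothetical helper: number of segments needed to cover `len` bytes of
 * zeroes when a single segment can reference at most `chunk` bytes of a
 * shared zero buffer. This is a ceiling division written to avoid
 * overflow for large `len`.
 */
static size_t zero_segments_needed(size_t len, size_t chunk)
{
	assert(chunk > 0);
	return len / chunk + (len % chunk != 0);
}
```

Under these assumptions, zeroing 8 MiB takes
zero_segments_needed(8UL << 20, SKETCH_PAGE_SIZE) = 2048 segments with a
PAGE_SIZE zero page, and
zero_segments_needed(8UL << 20, SKETCH_LARGE_ZERO_SIZE) = 4 segments with
a 2M zero page.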

Some examples of places in the kernel where this could be useful:
- blkdev_issue_zero_pages()
- iomap_dio_zero()
- vmalloc.c:zero_iter()
- rxperf_process_call()
- fscrypt_zeroout_range_inline_crypt()
- bch2_checksum_update()
...

I have converted blkdev_issue_zero_pages() and iomap_dio_zero() as
examples in this series.

While there are other options such as huge_zero_page, they can fail
depending on system conditions, requiring a fallback to ZERO_PAGE[3].

LARGE_ZERO_PAGE is added behind a config option so that systems that are
constrained by memory are not forced to use it.

Looking forward to some feedback.

[1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@lst.de/
[2] https://lore.kernel.org/linux-xfs/ZitIK5OnR7ZNY0IG@infradead.org/

Pankaj Raghav (3):
  mm: add large zero page for efficient zeroing of larger segments
  block: use LARGE_ZERO_PAGE in __blkdev_issue_zero_pages()
  iomap: use LARGE_ZERO_PAGE in iomap_dio_zero()

 arch/Kconfig                   |  8 ++++++++
 arch/x86/include/asm/pgtable.h | 20 +++++++++++++++++++-
 arch/x86/kernel/head_64.S      |  9 ++++++++-
 block/blk-lib.c                |  4 ++--
 fs/iomap/direct-io.c           | 31 +++++++++----------------------
 5 files changed, 46 insertions(+), 26 deletions(-)


base-commit: 9e619cd4fefd19cdce16e169d5827bc64ae01aa1
-- 
2.47.2


