From: Pankaj Raghav
To: "Darrick J . Wong", hch@lst.de, willy@infradead.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, David Hildenbrand,
    linux-fsdevel@vger.kernel.org, mcgrof@kernel.org, gost.dev@samsung.com,
    Andrew Morton, kernel@pankajraghav.com, Pankaj Raghav
Subject: [RFC 0/3] add large zero page for zeroing out larger segments
Date: Fri, 16 May 2025 12:10:51 +0200
Message-ID: <20250516101054.676046-1-p.raghav@samsung.com>

Introduce LARGE_ZERO_PAGE of size 2M as an alternative to ZERO_PAGE.
Similar to ZERO_PAGE, LARGE_ZERO_PAGE is also a global shared page. 2M
seems to be a decent compromise between memory usage and performance.

This idea (but not the implementation) was suggested during the review
of adding LBS support to XFS[1][2].

NOTE:
===
This implementation probably has a lot of holes, and it is not complete.
For example, this implementation only works on x86.

The intent of the RFC is:
- To understand if this is something we still need in the kernel.
- To understand if this is the approach we want to take to implement a
  feature like this, or whether we should explore other alternatives.

I have excluded a lot of maintainers and mailing lists and only included
relevant folks in this RFC to understand the direction we want to take
if this feature is needed.
===

There are many places in the kernel where we need to zero out larger
chunks, but the maximum segment we can zero out at a time is limited by
PAGE_SIZE.

This is especially annoying in block devices and filesystems, where we
attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
bvec support in the block layer, it is much more efficient to send out
larger zero pages as a part of a single bvec.
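
To put rough numbers on the bvec argument above, here is a minimal
userspace sketch (not kernel code) that counts how many segments are
needed to cover a zeroed range. The helper name and the SKETCH_* macros
are hypothetical and exist only for this illustration, and the 4 KiB
value for PAGE_SIZE is an assumption; only the 2M size comes from this
RFC.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; the real values depend on arch and config. */
#define SKETCH_PAGE_SIZE        (4UL * 1024)         /* assumed 4 KiB PAGE_SIZE */
#define SKETCH_LARGE_ZERO_SIZE  (2UL * 1024 * 1024)  /* 2M, as proposed here */

/*
 * Number of segments (bvecs) needed to cover `len` bytes of zeroes when
 * each segment can reference at most `zero_buf_size` bytes of the shared
 * zero buffer. Hypothetical helper, not a kernel function.
 */
static size_t zero_segments_needed(size_t len, size_t zero_buf_size)
{
	assert(zero_buf_size > 0);
	/* Round up without overflowing when len is close to SIZE_MAX. */
	return len / zero_buf_size + (len % zero_buf_size != 0);
}
```

Under these assumptions, zeroing 1 MiB takes
zero_segments_needed(1UL << 20, SKETCH_PAGE_SIZE) == 256 segments when
each one points at ZERO_PAGE, but a single segment when it can point into
a 2M zero buffer.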

Some examples of places in the kernel where this could be useful:
- blkdev_issue_zero_pages()
- iomap_dio_zero()
- vmalloc.c:zero_iter()
- rxperf_process_call()
- fscrypt_zeroout_range_inline_crypt()
- bch2_checksum_update()
...

I have converted blkdev_issue_zero_pages() and iomap_dio_zero() as
examples as a part of this series.

While there are other options such as huge_zero_page, they can fail
depending on system conditions, requiring a fallback to ZERO_PAGE[3].

LARGE_ZERO_PAGE is added behind a config option so that systems that are
constrained by memory are not forced to use it.

Looking forward to some feedback.

[1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@lst.de/
[2] https://lore.kernel.org/linux-xfs/ZitIK5OnR7ZNY0IG@infradead.org/

Pankaj Raghav (3):
  mm: add large zero page for efficient zeroing of larger segments
  block: use LARGE_ZERO_PAGE in __blkdev_issue_zero_pages()
  iomap: use LARGE_ZERO_PAGE in iomap_dio_zero()

 arch/Kconfig                   |  8 ++++++++
 arch/x86/include/asm/pgtable.h | 20 +++++++++++++++++++-
 arch/x86/kernel/head_64.S      |  9 ++++++++-
 block/blk-lib.c                |  4 ++--
 fs/iomap/direct-io.c           | 31 +++++++++----------------------
 5 files changed, 46 insertions(+), 26 deletions(-)

base-commit: 9e619cd4fefd19cdce16e169d5827bc64ae01aa1
--
2.47.2