From: Pankaj Raghav
Date: Wed, 9 Jul 2025 11:59:00 +0200
Subject: Re: [PATCH v2 0/5] add static PMD zero page support
To: Andrew Morton, David Hildenbrand
Cc: Suren Baghdasaryan, Ryan Roberts, Baolin Wang, Borislav Petkov,
 Ingo Molnar, "H . Peter Anvin", Vlastimil Babka, Zi Yan, Mike Rapoport,
 Dave Hansen, Michal Hocko, Lorenzo Stoakes, Thomas Gleixner, Nico Pache,
 Dev Jain, "Liam R . Howlett", Jens Axboe, linux-kernel@vger.kernel.org,
 willy@infradead.org, linux-mm@kvack.org, x86@kernel.org,
 linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 "Darrick J . Wong", mcgrof@kernel.org, gost.dev@samsung.com, hch@lst.de,
 Pankaj Raghav
Message-ID: <3ed1e744-5536-4b47-a5ab-66cd300ded67@pankajraghav.com>
In-Reply-To: <20250707153844.d868f7cfe16830cce66f3929@linux-foundation.org>
References: <20250707142319.319642-1-kernel@pankajraghav.com>
 <20250707153844.d868f7cfe16830cce66f3929@linux-foundation.org>

Hi Andrew,

>> We already have huge_zero_folio that is allocated on demand, and it will be
>> deallocated by the shrinker if there are no users of it left.
>>
>> At moment, huge_zero_folio infrastructure refcount is tied to the process
>> lifetime that created it. This might not work for bio layer as the completions
>> can be async and the process that created the huge_zero_folio might no
>> longer be alive.
>
> Can we change that?
> Alter the refcounting model so that dropping the
> final reference at interrupt time works as expected?
>

That is an interesting point. I did not try it. At the moment, we always
drop the reference in __mmput().

Going back to the discussion before this work started, one of the main
things that people wanted was some sort of **drop-in replacement** for
ZERO_PAGE that can be bigger than PAGE_SIZE[1].

And, during the RFCs of these patches, one piece of feedback I got from
David was that on big server systems, 2M (in the case of 4k page size)
should not be a problem, and we don't need any unnecessary refcounting
for them.

Also, when I had a chat with David, he said that he wants to make changes
to the existing mm_huge_zero_folio infrastructure to get rid of the
shrinker if possible. So we decided that it is better to have an opt-in
static allocation and keep the existing dynamic allocation path.

That is why I went with this approach of having a static PMD allocation.

I hope this clarifies the motivation a bit.

Let me know if you have more questions.

> And if we were to do this, what sort of benefit might it produce?
>
>> Add a config option STATIC_PMD_ZERO_PAGE that will always allocate
>> the huge_zero_folio via memblock, and it will never be freed.

[1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@lst.de/