From: Pankaj Raghav
To: Suren Baghdasaryan, Ryan Roberts, Vlastimil Babka, Baolin Wang,
	Borislav Petkov, Ingo Molnar, "H. Peter Anvin", Zi Yan,
	Mike Rapoport, Dave Hansen, Michal Hocko, David Hildenbrand,
	Lorenzo Stoakes, Andrew Morton, Thomas Gleixner, Nico Pache,
	Dev Jain,
Howlett" , Jens Axboe Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, willy@infradead.org, x86@kernel.org, linux-fsdevel@vger.kernel.org, "Darrick J . Wong" , mcgrof@kernel.org, gost.dev@samsung.com, kernel@pankajraghav.com, hch@lst.de, Pankaj Raghav Subject: [RFC 1/3] mm: move huge_zero_folio from huge_memory.c to memory.c Date: Tue, 27 May 2025 07:04:50 +0200 Message-ID: <20250527050452.817674-2-p.raghav@samsung.com> In-Reply-To: <20250527050452.817674-1-p.raghav@samsung.com> References: <20250527050452.817674-1-p.raghav@samsung.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 701BC4000A X-Stat-Signature: bzsjwdr8k8wjcagp59f3hwedj4tgtd6d X-Rspam-User: X-HE-Tag: 1748322315-122359 X-HE-Meta: U2FsdGVkX1/H0hc1/JaA4F3/bBtWSWOPkVR14OdruwCjeT6kB6aSNdRyjx6G8lISkjoBT9sUVzYRn+ZlbIpWC/41MLEJLsIcrA6fzGqtFuZGr6WA8XPtGh2wIvWQ/xXkkN0Q+Cyjs9EOddSAuiMj4qFPFMG6S7H1H428je4sIP0l0+mbu7nE+P0DUkdvFRQ/0ntjnzpNNMnGS3h8J9LEupIEkLDWA0OrIvJdBACjKd+etZyiymQ8LQsHooBk71BjHdf30fiMEH0fNCKGF27GhAu2t5NoCj8aNV2P/Yll+nurMkmmAml1PDLfQmItQvFAPfLy9o+aOeyfIuEXzp/XZPvHBiXRHRtl+7GwV9ZUV5KmcJMDjjZaLd5ZKEfO3cnkHIPmL4S/maoacv4wDM8q9lF7w2TIthil1BcfVTMMROU6X+lrqCj6mQ2KVvYwVZDgEWLxOPnLRM+onehr5ORg+XiUa4nr+JgxKLr3i8EKt+UAaG1YShg/pPW7jGW4a+0nOJLh14Igz7tyhr+QSUqX60PlHKLreTDdGi3FJ18+/P0f6bpVjgdsCZU5upVBX89bNs8MMAX3ozXyfRwB6D/CdytkxoW6UC+Czwc5qWxn27edxGIoN3n2aSOgM4UXq6acz/tAuFawExPKioTLzkzoKEetdbuHcetPc4neQmHrsgyb7fQrjbkauFrnQyWIMJ+GHc1b3jxvdl7JGGxitbuOGF449Fkhs0xaVo/1zDRSxaHvGCdE4Fqu0uU0L/c05xb3OJERt862Gggs9v/kl09a//kEqJ6GoaV5hd5olGPQ4vyEExJwLctgqiUOmRcZLPvTW7fbt/s3iCi/xePHg+QFulHstl4bnvGquZVxxaKHkn92a2pCdoeF3ZqWrdvROfuAny28WsBUIhNipIHdG7boZ0B+ZR39cJzGSdW34UZaTo/zevCZ4V2nyph+78840dAcf3Z4GD8tdPP0Ujz6SXX 9hJLqPEE B8Y+3bvN59/jUxg6NrhhHdYfbPoZgHFrJsVcUxLUyeVOlduMlYBZDF6orKPVfg8iHAf8YTySGxXH/rSPyNWbyslW5iLWcwo42q6bAaAjgA7US0NVSZ3nDd7PDhgfftNr3nCv9TK520zVns6XXj6IdKO1Ec8b8DDTQlq/rzz6ZidzY3sYu/9wF0j+Ec5DnaMjZEFMqATkBAGH2jmLdg4WTEq4PH8CAHdBZmxX3KteIMRhbYne1QO1GT6co3wCT1dvYdsvtCWcDEaLixakTNtuO3pAMAOOSk1kyCtO/zNrfzsyquG8uIq0f0eZ3VHZE7DupL5TQsfbuLlUQXSDXSInVLiUq3wqq8zkOQUJSFeppWhmWdWbWGto7x+bydB6Cjdppvlf7OijPPe0OzcnMCnYbyIzkZAkz9PZoetnLcqYjgjGAWi5f1F2AcXgBq88eB7GT2epq0wInz51Tc+O+mGrZUzsWvuD4Bevi3NdX X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: The huge_zero_folio was initially placed in huge_memory.c as most of the users were in that file. But it does not depend on THP, so it could very well be a part of memory.c file. As huge_zero_folio is going to be exposed to more users outside of mm, let's move it to memory.c file. This is a prep patch to add CONFIG_STATIC_PMD_ZERO_PAGE. No functional changes. 
Suggested-by: David Hildenbrand
Signed-off-by: Pankaj Raghav
---
 include/linux/huge_mm.h |  16 ------
 include/linux/mm.h      |  16 ++++++
 mm/huge_memory.c        | 105 +---------------------------------------
 mm/memory.c             |  99 +++++++++++++++++++++++++++++++++++++
 4 files changed, 117 insertions(+), 119 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2f190c90192d..d48973a6bd0f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -478,22 +478,6 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf);
 
-extern struct folio *huge_zero_folio;
-extern unsigned long huge_zero_pfn;
-
-static inline bool is_huge_zero_folio(const struct folio *folio)
-{
-	return READ_ONCE(huge_zero_folio) == folio;
-}
-
-static inline bool is_huge_zero_pmd(pmd_t pmd)
-{
-	return pmd_present(pmd) && READ_ONCE(huge_zero_pfn) == pmd_pfn(pmd);
-}
-
-struct folio *mm_get_huge_zero_folio(struct mm_struct *mm);
-void mm_put_huge_zero_folio(struct mm_struct *mm);
-
 static inline bool thp_migration_supported(void)
 {
 	return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index cd2e513189d6..58d150dfc2da 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -69,6 +69,22 @@ static inline void totalram_pages_add(long count)
 
 extern void * high_memory;
 
+extern struct folio *huge_zero_folio;
+extern unsigned long huge_zero_pfn;
+
+static inline bool is_huge_zero_folio(const struct folio *folio)
+{
+	return READ_ONCE(huge_zero_folio) == folio;
+}
+
+static inline bool is_huge_zero_pmd(pmd_t pmd)
+{
+	return pmd_present(pmd) && READ_ONCE(huge_zero_pfn) == pmd_pfn(pmd);
+}
+
+struct folio *mm_get_huge_zero_folio(struct mm_struct *mm);
+void mm_put_huge_zero_folio(struct mm_struct *mm);
+
 #ifdef CONFIG_SYSCTL
 extern int sysctl_legacy_va_layout;
 #else
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d3e66136e41a..c6e203abb2de 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -75,9 +75,6 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc);
 static bool split_underused_thp = true;
 
-static atomic_t huge_zero_refcount;
-struct folio *huge_zero_folio __read_mostly;
-unsigned long huge_zero_pfn __read_mostly = ~0UL;
 unsigned long huge_anon_orders_always __read_mostly;
 unsigned long huge_anon_orders_madvise __read_mostly;
 unsigned long huge_anon_orders_inherit __read_mostly;
@@ -208,88 +205,6 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 	return orders;
 }
 
-static bool get_huge_zero_page(void)
-{
-	struct folio *zero_folio;
-retry:
-	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
-		return true;
-
-	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
-			HPAGE_PMD_ORDER);
-	if (!zero_folio) {
-		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
-		return false;
-	}
-	/* Ensure zero folio won't have large_rmappable flag set. */
-	folio_clear_large_rmappable(zero_folio);
-	preempt_disable();
-	if (cmpxchg(&huge_zero_folio, NULL, zero_folio)) {
-		preempt_enable();
-		folio_put(zero_folio);
-		goto retry;
-	}
-	WRITE_ONCE(huge_zero_pfn, folio_pfn(zero_folio));
-
-	/* We take additional reference here. It will be put back by shrinker */
-	atomic_set(&huge_zero_refcount, 2);
-	preempt_enable();
-	count_vm_event(THP_ZERO_PAGE_ALLOC);
-	return true;
-}
-
-static void put_huge_zero_page(void)
-{
-	/*
-	 * Counter should never go to zero here. Only shrinker can put
-	 * last reference.
-	 */
-	BUG_ON(atomic_dec_and_test(&huge_zero_refcount));
-}
-
-struct folio *mm_get_huge_zero_folio(struct mm_struct *mm)
-{
-	if (test_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
-		return READ_ONCE(huge_zero_folio);
-
-	if (!get_huge_zero_page())
-		return NULL;
-
-	if (test_and_set_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
-		put_huge_zero_page();
-
-	return READ_ONCE(huge_zero_folio);
-}
-
-void mm_put_huge_zero_folio(struct mm_struct *mm)
-{
-	if (test_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
-		put_huge_zero_page();
-}
-
-static unsigned long shrink_huge_zero_page_count(struct shrinker *shrink,
-					struct shrink_control *sc)
-{
-	/* we can free zero page only if last reference remains */
-	return atomic_read(&huge_zero_refcount) == 1 ? HPAGE_PMD_NR : 0;
-}
-
-static unsigned long shrink_huge_zero_page_scan(struct shrinker *shrink,
-				       struct shrink_control *sc)
-{
-	if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
-		struct folio *zero_folio = xchg(&huge_zero_folio, NULL);
-		BUG_ON(zero_folio == NULL);
-		WRITE_ONCE(huge_zero_pfn, ~0UL);
-		folio_put(zero_folio);
-		return HPAGE_PMD_NR;
-	}
-
-	return 0;
-}
-
-static struct shrinker *huge_zero_page_shrinker;
-
 #ifdef CONFIG_SYSFS
 static ssize_t enabled_show(struct kobject *kobj, struct kobj_attribute *attr,
 			    char *buf)
@@ -850,22 +765,12 @@ static inline void hugepage_exit_sysfs(struct kobject *hugepage_kobj)
 
 static int __init thp_shrinker_init(void)
 {
-	huge_zero_page_shrinker = shrinker_alloc(0, "thp-zero");
-	if (!huge_zero_page_shrinker)
-		return -ENOMEM;
-
 	deferred_split_shrinker = shrinker_alloc(SHRINKER_NUMA_AWARE |
 						 SHRINKER_MEMCG_AWARE |
 						 SHRINKER_NONSLAB,
 						 "thp-deferred_split");
-	if (!deferred_split_shrinker) {
-		shrinker_free(huge_zero_page_shrinker);
+	if (!deferred_split_shrinker)
 		return -ENOMEM;
-	}
-
-	huge_zero_page_shrinker->count_objects = shrink_huge_zero_page_count;
-	huge_zero_page_shrinker->scan_objects = shrink_huge_zero_page_scan;
-	shrinker_register(huge_zero_page_shrinker);
 
 	deferred_split_shrinker->count_objects = deferred_split_count;
 	deferred_split_shrinker->scan_objects = deferred_split_scan;
@@ -874,12 +779,6 @@ static int __init thp_shrinker_init(void)
 	return 0;
 }
 
-static void __init thp_shrinker_exit(void)
-{
-	shrinker_free(huge_zero_page_shrinker);
-	shrinker_free(deferred_split_shrinker);
-}
-
 static int __init hugepage_init(void)
 {
 	int err;
@@ -923,7 +822,7 @@ static int __init hugepage_init(void)
 	return 0;
 
 err_khugepaged:
-	thp_shrinker_exit();
+	shrinker_free(deferred_split_shrinker);
 err_shrinker:
 	khugepaged_destroy();
 err_slab:
diff --git a/mm/memory.c b/mm/memory.c
index 5cb48f262ab0..11edc4d66e74 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -159,6 +159,105 @@ static int __init init_zero_pfn(void)
 }
 early_initcall(init_zero_pfn);
 
+static atomic_t huge_zero_refcount;
+struct folio *huge_zero_folio __read_mostly;
+unsigned long huge_zero_pfn __read_mostly = ~0UL;
+static struct shrinker *huge_zero_page_shrinker;
+
+static bool get_huge_zero_page(void)
+{
+	struct folio *zero_folio;
+retry:
+	if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
+		return true;
+
+	zero_folio = folio_alloc((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
+			HPAGE_PMD_ORDER);
+	if (!zero_folio) {
+		count_vm_event(THP_ZERO_PAGE_ALLOC_FAILED);
+		return false;
+	}
+	/* Ensure zero folio won't have large_rmappable flag set. */
+	folio_clear_large_rmappable(zero_folio);
+	preempt_disable();
+	if (cmpxchg(&huge_zero_folio, NULL, zero_folio)) {
+		preempt_enable();
+		folio_put(zero_folio);
+		goto retry;
+	}
+	WRITE_ONCE(huge_zero_pfn, folio_pfn(zero_folio));
+
+	/* We take additional reference here. It will be put back by shrinker */
+	atomic_set(&huge_zero_refcount, 2);
+	preempt_enable();
+	count_vm_event(THP_ZERO_PAGE_ALLOC);
+	return true;
+}
+
+static void put_huge_zero_page(void)
+{
+	/*
+	 * Counter should never go to zero here. Only shrinker can put
+	 * last reference.
+	 */
+	BUG_ON(atomic_dec_and_test(&huge_zero_refcount));
+}
+
+struct folio *mm_get_huge_zero_folio(struct mm_struct *mm)
+{
+	if (test_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
+		return READ_ONCE(huge_zero_folio);
+
+	if (!get_huge_zero_page())
+		return NULL;
+
+	if (test_and_set_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
+		put_huge_zero_page();
+
+	return READ_ONCE(huge_zero_folio);
+}
+
+void mm_put_huge_zero_folio(struct mm_struct *mm)
+{
+	if (test_bit(MMF_HUGE_ZERO_PAGE, &mm->flags))
+		put_huge_zero_page();
+}
+
+static unsigned long shrink_huge_zero_page_count(struct shrinker *shrink,
+					struct shrink_control *sc)
+{
+	/* we can free zero page only if last reference remains */
+	return atomic_read(&huge_zero_refcount) == 1 ? HPAGE_PMD_NR : 0;
+}
+
+static unsigned long shrink_huge_zero_page_scan(struct shrinker *shrink,
+				       struct shrink_control *sc)
+{
+	if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
+		struct folio *zero_folio = xchg(&huge_zero_folio, NULL);
+		BUG_ON(zero_folio == NULL);
+		WRITE_ONCE(huge_zero_pfn, ~0UL);
+		folio_put(zero_folio);
+		return HPAGE_PMD_NR;
+	}
+
+	return 0;
+}
+
+static int __init init_huge_zero_page(void)
+{
+	huge_zero_page_shrinker = shrinker_alloc(0, "thp-zero");
+	if (!huge_zero_page_shrinker)
+		return -ENOMEM;
+
+	huge_zero_page_shrinker->count_objects = shrink_huge_zero_page_count;
+	huge_zero_page_shrinker->scan_objects = shrink_huge_zero_page_scan;
+	shrinker_register(huge_zero_page_shrinker);
+
+	return 0;
+}
+early_initcall(init_huge_zero_page);
+
 void mm_trace_rss_stat(struct mm_struct *mm, int member)
 {
 	trace_rss_stat(mm, member);
-- 
2.47.2