From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, yuzhao@google.com
Cc: James Houghton, David Stevens, Axel Rasmussen, David Matlack,
	David Rientjes, Oliver Upton, Paolo Bonzini, Sean Christopherson,
	Wei Xu, kernel test robot, Andrew Morton, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: FAILED: Patch "mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()" failed to apply to v6.1-stable tree
Date: Tue, 5 Nov 2024 21:11:13 -0500
Message-ID: <20241106021114.182124-1-sashal@kernel.org>
X-Mailer: git-send-email 2.43.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The patch below does not apply to the v6.1-stable tree. If someone wants
it applied there, or to any other stable or longterm tree, then please
email the backport, including the original git commit id, to
<stable@vger.kernel.org>.

Thanks,
Sasha

------------------ original commit in Linus's tree ------------------

>From 1d4832becdc2cdb2cffe2a6050c9d9fd8ff1c58c Mon Sep 17 00:00:00 2001
From: Yu Zhao <yuzhao@google.com>
Date: Sat, 19 Oct 2024 01:29:39 +0000
Subject: [PATCH] mm: multi-gen LRU: use {ptep,pmdp}_clear_young_notify()

When the MM_WALK capability is enabled, memory that is mostly accessed
by a VM appears younger than it really is, and is therefore less likely
to be evicted. As a result, the presence of a running VM can
significantly increase swap-outs for non-VM memory, regressing
performance for the rest of the system.

Fix this regression by always calling {ptep,pmdp}_clear_young_notify()
whenever we clear the young bits on PMDs/PTEs.
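[ The crux of the fix is the helper switch: ptep_test_and_clear_young()
  clears and reports only the primary MMU's accessed bit, while
  ptep_clear_young_notify() additionally calls mmu_notifier_clear_young(),
  so accesses made through a secondary MMU (e.g. KVM's stage-2 page
  tables) are harvested and cleared during aging, not only at eviction.
  A minimal userspace model of that difference -- an illustration, not
  kernel code; the two booleans stand in for the primary and secondary
  accessed bits: ]

#include <stdbool.h>
#include <stdio.h>

static bool primary_young;		/* accessed bit in the host PTE */
static bool secondary_young = true;	/* accessed state held by a VM's MMU */

static bool test_and_clear(bool *bit)
{
	bool old = *bit;

	*bit = false;
	return old;
}

/* Models ptep_test_and_clear_young(): primary MMU only. */
static bool clear_young(void)
{
	return test_and_clear(&primary_young);
}

/* Models ptep_clear_young_notify(): primary MMU, then the notifier. */
static bool clear_young_notify(void)
{
	bool young = test_and_clear(&primary_young);

	/* stands in for mmu_notifier_clear_young() */
	young |= test_and_clear(&secondary_young);
	return young;
}

int main(void)
{
	/* The VM touched the page; the host did not. */
	printf("without notify: %d\n", clear_young());		/* 0: access missed */
	printf("with notify:    %d\n", clear_young_notify());	/* 1: access seen  */
	return 0;
}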
[jthoughton@google.com: fix link-time error]
Link: https://lkml.kernel.org/r/20241019012940.3656292-3-jthoughton@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Reported-by: David Stevens
Cc: Axel Rasmussen
Cc: David Matlack
Cc: David Rientjes
Cc: Oliver Upton
Cc: Paolo Bonzini
Cc: Sean Christopherson
Cc: Wei Xu
Cc: <stable@vger.kernel.org>
Cc: kernel test robot
Signed-off-by: Andrew Morton
---
 include/linux/mmzone.h |  5 ++-
 mm/rmap.c              |  9 ++---
 mm/vmscan.c            | 88 +++++++++++++++++++++++-------------------
 3 files changed, 55 insertions(+), 47 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9342e5692dab6..5b1c984daf454 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -555,7 +555,7 @@ struct lru_gen_memcg {
 
 void lru_gen_init_pgdat(struct pglist_data *pgdat);
 void lru_gen_init_lruvec(struct lruvec *lruvec);
-void lru_gen_look_around(struct page_vma_mapped_walk *pvmw);
+bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw);
 
 void lru_gen_init_memcg(struct mem_cgroup *memcg);
 void lru_gen_exit_memcg(struct mem_cgroup *memcg);
@@ -574,8 +574,9 @@ static inline void lru_gen_init_lruvec(struct lruvec *lruvec)
 {
 }
 
-static inline void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
+static inline bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 {
+	return false;
 }
 
 static inline void lru_gen_init_memcg(struct mem_cgroup *memcg)
diff --git a/mm/rmap.c b/mm/rmap.c
index a8797d1b3d496..73d5998677d40 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -885,13 +885,10 @@ static bool folio_referenced_one(struct folio *folio,
 			return false;
 		}
 
-		if (pvmw.pte) {
-			if (lru_gen_enabled() &&
-			    pte_young(ptep_get(pvmw.pte))) {
-				lru_gen_look_around(&pvmw);
+		if (lru_gen_enabled() && pvmw.pte) {
+			if (lru_gen_look_around(&pvmw))
 				referenced++;
-			}
-
+		} else if (pvmw.pte) {
 			if (ptep_clear_flush_young_notify(vma, address,
 						pvmw.pte))
 				referenced++;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4f1d33e4b3601..ddaaff67642e1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -56,6 +56,7 @@
 #include <linux/khugepaged.h>
 #include <linux/rculist_nulls.h>
 #include <linux/random.h>
+#include <linux/mmu_notifier.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -3294,7 +3295,8 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
 	return false;
 }
 
-static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned long addr)
+static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned long addr,
+				 struct pglist_data *pgdat)
 {
 	unsigned long pfn = pte_pfn(pte);
@@ -3306,13 +3308,20 @@ static unsigned long get_pte_pfn(pte_t pte, struct vm_area_struct *vma, unsigned
 	if (WARN_ON_ONCE(pte_devmap(pte) || pte_special(pte)))
 		return -1;
 
+	if (!pte_young(pte) && !mm_has_notifiers(vma->vm_mm))
+		return -1;
+
 	if (WARN_ON_ONCE(!pfn_valid(pfn)))
 		return -1;
 
+	if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat))
+		return -1;
+
 	return pfn;
 }
 
-static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned long addr)
+static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned long addr,
+				 struct pglist_data *pgdat)
 {
 	unsigned long pfn = pmd_pfn(pmd);
@@ -3324,9 +3333,15 @@ static unsigned long get_pmd_pfn(pmd_t pmd, struct vm_area_struct *vma, unsigned
 	if (WARN_ON_ONCE(pmd_devmap(pmd)))
 		return -1;
 
+	if (!pmd_young(pmd) && !mm_has_notifiers(vma->vm_mm))
+		return -1;
+
 	if (WARN_ON_ONCE(!pfn_valid(pfn)))
 		return -1;
 
+	if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat))
+		return -1;
+
 	return pfn;
}
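[ With the reworked helpers above, both the young/notifier test and the
  node bounds check now happen before pfn_folio() is ever called, so cold
  entries no longer cost a struct page load. A userspace model of the
  combined filter -- illustration only; struct pglist_data is reduced to
  a stub and pgdat_end_pfn() to a field: ]

#include <stdbool.h>
#include <stdio.h>

struct pgdat_stub {
	unsigned long node_start_pfn;
	unsigned long node_end_pfn;	/* pgdat_end_pfn() in the kernel */
};

static long filter_pfn(unsigned long pfn, bool young, bool has_notifiers,
		       const struct pgdat_stub *pgdat)
{
	/*
	 * Not young in the primary MMU, and no secondary MMU that could
	 * report otherwise: skip before ever touching struct page.
	 */
	if (!young && !has_notifiers)
		return -1;

	/* Node bounds check, moved here from get_pfn_folio(). */
	if (pfn < pgdat->node_start_pfn || pfn >= pgdat->node_end_pfn)
		return -1;

	return (long)pfn;
}

int main(void)
{
	const struct pgdat_stub node0 = { 0x1000, 0x2000 };

	printf("%ld\n", filter_pfn(0x1800, false, false, &node0)); /* -1 */
	printf("%ld\n", filter_pfn(0x1800, false, true, &node0));  /* 6144: ask the notifier */
	printf("%ld\n", filter_pfn(0x3000, true, true, &node0));   /* -1: foreign node */
	return 0;
}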
@@ -3335,10 +3350,6 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
 {
 	struct folio *folio;
 
-	/* try to avoid unnecessary memory loads */
-	if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat))
-		return NULL;
-
 	folio = pfn_folio(pfn);
 	if (folio_nid(folio) != pgdat->node_id)
 		return NULL;
@@ -3394,20 +3405,16 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
 		total++;
 		walk->mm_stats[MM_LEAF_TOTAL]++;
 
-		pfn = get_pte_pfn(ptent, args->vma, addr);
+		pfn = get_pte_pfn(ptent, args->vma, addr, pgdat);
 		if (pfn == -1)
 			continue;
 
-		if (!pte_young(ptent)) {
-			continue;
-		}
-
 		folio = get_pfn_folio(pfn, memcg, pgdat, walk->can_swap);
 		if (!folio)
 			continue;
 
-		if (!ptep_test_and_clear_young(args->vma, addr, pte + i))
-			VM_WARN_ON_ONCE(true);
+		if (!ptep_clear_young_notify(args->vma, addr, pte + i))
+			continue;
 
 		young++;
 		walk->mm_stats[MM_LEAF_YOUNG]++;
@@ -3473,21 +3480,25 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area
 		/* don't round down the first address */
 		addr = i ? (*first & PMD_MASK) + i * PMD_SIZE : *first;
 
-		pfn = get_pmd_pfn(pmd[i], vma, addr);
-		if (pfn == -1)
+		if (!pmd_present(pmd[i]))
 			goto next;
 
 		if (!pmd_trans_huge(pmd[i])) {
-			if (!walk->force_scan && should_clear_pmd_young())
+			if (!walk->force_scan && should_clear_pmd_young() &&
+			    !mm_has_notifiers(args->mm))
 				pmdp_test_and_clear_young(vma, addr, pmd + i);
 			goto next;
 		}
 
+		pfn = get_pmd_pfn(pmd[i], vma, addr, pgdat);
+		if (pfn == -1)
+			goto next;
+
 		folio = get_pfn_folio(pfn, memcg, pgdat, walk->can_swap);
 		if (!folio)
 			goto next;
 
-		if (!pmdp_test_and_clear_young(vma, addr, pmd + i))
+		if (!pmdp_clear_young_notify(vma, addr, pmd + i))
 			goto next;
 
 		walk->mm_stats[MM_LEAF_YOUNG]++;
@@ -3545,24 +3556,18 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
 		}
 
 		if (pmd_trans_huge(val)) {
-			unsigned long pfn = pmd_pfn(val);
 			struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec);
+			unsigned long pfn = get_pmd_pfn(val, vma, addr, pgdat);
 
 			walk->mm_stats[MM_LEAF_TOTAL]++;
 
-			if (!pmd_young(val)) {
-				continue;
-			}
-
-			/* try to avoid unnecessary memory loads */
-			if (pfn < pgdat->node_start_pfn || pfn >= pgdat_end_pfn(pgdat))
-				continue;
-
-			walk_pmd_range_locked(pud, addr, vma, args, bitmap, &first);
+			if (pfn != -1)
+				walk_pmd_range_locked(pud, addr, vma, args, bitmap, &first);
 			continue;
 		}
 
-		if (!walk->force_scan && should_clear_pmd_young()) {
+		if (!walk->force_scan && should_clear_pmd_young() &&
+		    !mm_has_notifiers(args->mm)) {
 			if (!pmd_young(val))
 				continue;
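[ The remaining hunks below rewrite lru_gen_look_around(): it now returns
  bool, clears and counts the faulting PTE up front (hence young starting
  at 1), and skips the batched scan when the window clamped to the VMA
  and the enclosing PMD is a single page. The window arithmetic from
  those hunks, replayed in userspace -- illustration only, assuming
  4 KiB pages, 2 MiB PMDs and MIN_LRU_BATCH == BITS_PER_LONG == 64: ]

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PMD_MASK	(~(2UL * 1024 * 1024 - 1))	/* 2 MiB PMD */
#define MIN_LRU_BATCH	64UL				/* BITS_PER_LONG */

static unsigned long max_ul(unsigned long a, unsigned long b) { return a > b ? a : b; }
static unsigned long min_ul(unsigned long a, unsigned long b) { return a < b ? a : b; }

int main(void)
{
	unsigned long addr = 0x7f0000123000UL;		/* the faulting PTE */
	unsigned long vma_start = 0x7f0000000000UL;	/* made-up VMA bounds */
	unsigned long vma_end = 0x7f0000400000UL;

	/* Clamp to the enclosing PMD and to the VMA, as the patch does. */
	unsigned long start = max_ul(addr & PMD_MASK, vma_start);
	unsigned long end = min_ul(addr | ~PMD_MASK, vma_end - 1) + 1;

	/* New in this patch: a one-page window means nothing to look at. */
	if (end - start == PAGE_SIZE) {
		puts("single page: skip look-around");
		return 0;
	}

	/* Pre-existing: cap the batch at MIN_LRU_BATCH pages around addr. */
	if (end - start > MIN_LRU_BATCH * PAGE_SIZE) {
		if (addr - start < MIN_LRU_BATCH * PAGE_SIZE / 2)
			end = start + MIN_LRU_BATCH * PAGE_SIZE;
		else if (end - addr < MIN_LRU_BATCH * PAGE_SIZE / 2)
			start = end - MIN_LRU_BATCH * PAGE_SIZE;
		else {
			start = addr - MIN_LRU_BATCH * PAGE_SIZE / 2;
			end = addr + MIN_LRU_BATCH * PAGE_SIZE / 2;
		}
	}

	/* Prints: scan [0x7f0000103000, 0x7f0000143000): 64 pages */
	printf("scan [%#lx, %#lx): %lu pages\n",
	       start, end, (end - start) / PAGE_SIZE);
	return 0;
}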
@@ -4036,13 +4041,13 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
  * the PTE table to the Bloom filter. This forms a feedback loop between the
  * eviction and the aging.
  */
-void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
+bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 {
 	int i;
 	unsigned long start;
 	unsigned long end;
 	struct lru_gen_mm_walk *walk;
-	int young = 0;
+	int young = 1;
 	pte_t *pte = pvmw->pte;
 	unsigned long addr = pvmw->address;
 	struct vm_area_struct *vma = pvmw->vma;
@@ -4058,12 +4063,15 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	lockdep_assert_held(pvmw->ptl);
 	VM_WARN_ON_ONCE_FOLIO(folio_test_lru(folio), folio);
 
+	if (!ptep_clear_young_notify(vma, addr, pte))
+		return false;
+
 	if (spin_is_contended(pvmw->ptl))
-		return;
+		return true;
 
 	/* exclude special VMAs containing anon pages from COW */
 	if (vma->vm_flags & VM_SPECIAL)
-		return;
+		return true;
 
 	/* avoid taking the LRU lock under the PTL when possible */
 	walk = current->reclaim_state ? current->reclaim_state->mm_walk : NULL;
@@ -4071,6 +4079,9 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	start = max(addr & PMD_MASK, vma->vm_start);
 	end = min(addr | ~PMD_MASK, vma->vm_end - 1) + 1;
 
+	if (end - start == PAGE_SIZE)
+		return true;
+
 	if (end - start > MIN_LRU_BATCH * PAGE_SIZE) {
 		if (addr - start < MIN_LRU_BATCH * PAGE_SIZE / 2)
 			end = start + MIN_LRU_BATCH * PAGE_SIZE;
@@ -4084,7 +4095,7 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 
 	/* folio_update_gen() requires stable folio_memcg() */
 	if (!mem_cgroup_trylock_pages(memcg))
-		return;
+		return true;
 
 	arch_enter_lazy_mmu_mode();
@@ -4094,19 +4105,16 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 		unsigned long pfn;
 		pte_t ptent = ptep_get(pte + i);
 
-		pfn = get_pte_pfn(ptent, vma, addr);
+		pfn = get_pte_pfn(ptent, vma, addr, pgdat);
 		if (pfn == -1)
 			continue;
 
-		if (!pte_young(ptent))
-			continue;
-
 		folio = get_pfn_folio(pfn, memcg, pgdat, can_swap);
 		if (!folio)
 			continue;
 
-		if (!ptep_test_and_clear_young(vma, addr, pte + i))
-			VM_WARN_ON_ONCE(true);
+		if (!ptep_clear_young_notify(vma, addr, pte + i))
+			continue;
 
 		young++;
@@ -4136,6 +4144,8 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	/* feedback from rmap walkers to page table walkers */
 	if (mm_state && suitable_to_scan(i, young))
 		update_bloom_filter(mm_state, max_seq, pvmw->pmd);
+
+	return true;
 }
 
 /******************************************************************************
-- 
2.43.0
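[ For completeness: the feedback step at the end of lru_gen_look_around()
  gates on suitable_to_scan(), which deems a batch worth feeding back to
  the page-table walker when, on average, at least one PTE per cacheline
  was young. A userspace rendering -- illustration only; the cacheline
  size is a hard-coded assumption in place of the kernel's
  cache_line_size(): ]

#include <stdbool.h>
#include <stdio.h>

#define CACHE_LINE_SIZE	64
#define PTE_SIZE	8

static bool suitable_to_scan(int total, int young)
{
	int n = CACHE_LINE_SIZE / PTE_SIZE;	/* PTEs per cacheline */

	if (n < 2)
		n = 2;
	else if (n > 8)
		n = 8;

	/* Suitable if, on average, each cacheline had a young PTE. */
	return young * n >= total;
}

int main(void)
{
	/* With this patch, young starts at 1: the faulting PTE itself. */
	printf("64 scanned, 7 young: %d\n", suitable_to_scan(64, 7));	/* 0 */
	printf("64 scanned, 8 young: %d\n", suitable_to_scan(64, 8));	/* 1 */
	return 0;
}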