From: Yu Zhao <yuzhao@google.com>
To: James Houghton <jthoughton@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Ankit Agrawal <ankita@nvidia.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Catalin Marinas <catalin.marinas@arm.com>,
David Matlack <dmatlack@google.com>,
David Rientjes <rientjes@google.com>,
James Morse <james.morse@arm.com>,
Jonathan Corbet <corbet@lwn.net>, Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Raghavendra Rao Ananta <rananta@google.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Sean Christopherson <seanjc@google.com>,
Shaoqin Huang <shahuang@redhat.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Wei Xu <weixugc@google.com>, Will Deacon <will@kernel.org>,
Zenghui Yu <yuzenghui@huawei.com>,
kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v5 8/9] mm: multi-gen LRU: Have secondary MMUs participate in aging
Date: Fri, 5 Jul 2024 12:35:29 -0600
Message-ID: <CAOUHufb2f_EwHY5LQ59k7Nh7aS1-ZbOKtkoysb8BtxRNRFMypQ@mail.gmail.com>
In-Reply-To: <20240611002145.2078921-9-jthoughton@google.com>

On Mon, Jun 10, 2024 at 6:22 PM James Houghton <jthoughton@google.com> wrote:
>
> Secondary MMUs are currently consulted for access/age information at
> eviction time, but before then, we don't get accurate age information.
> That is, pages that are mostly accessed through a secondary MMU (like
> guest memory, used by KVM) will always just proceed down to the oldest
> generation, and then at eviction time, if KVM reports the page to be
> young, the page will be activated/promoted back to the youngest
> generation.
>
> The added feature bit (0x8), if disabled, will make MGLRU behave as if
> there are no secondary MMUs subscribed to MMU notifiers except at
> eviction time.
>
> Implement aging with the new mmu_notifier_test_clear_young_fast_only()
> notifier. For architectures that do not support this notifier, this
> becomes a no-op. For architectures that do implement it, it should be
> fast enough to make aging worth it.
>
> Suggested-by: Yu Zhao <yuzhao@google.com>
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
>
> Notes:
> should_look_around() can sometimes use two notifiers now instead of one.
>
> This simply comes from restricting myself from not changing
> mmu_notifier_clear_young() to return more than just "young or not".
>
> I could change mmu_notifier_clear_young() (and
> mmu_notifier_test_young()) to return if it was fast or not. At that
> point, I could just as well combine all the notifiers into one notifier,
> like what was in v2 and v3.
>
>  Documentation/admin-guide/mm/multigen_lru.rst |   6 +-
>  include/linux/mmzone.h                        |   6 +-
>  mm/rmap.c                                     |   9 +-
>  mm/vmscan.c                                   | 185 ++++++++++++++----
>  4 files changed, 164 insertions(+), 42 deletions(-)
...
>  static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
>  			   struct mm_walk *args)
>  {
> @@ -3357,8 +3416,9 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
>  	struct pglist_data *pgdat = lruvec_pgdat(walk->lruvec);
>  	DEFINE_MAX_SEQ(walk->lruvec);
>  	int old_gen, new_gen = lru_gen_from_seq(max_seq);
> +	struct mm_struct *mm = args->mm;
>
> -	pte = pte_offset_map_nolock(args->mm, pmd, start & PMD_MASK, &ptl);
> +	pte = pte_offset_map_nolock(mm, pmd, start & PMD_MASK, &ptl);
>  	if (!pte)
>  		return false;
>  	if (!spin_trylock(ptl)) {
> @@ -3376,11 +3436,12 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
>  		total++;
>  		walk->mm_stats[MM_LEAF_TOTAL]++;
>
> -		pfn = get_pte_pfn(ptent, args->vma, addr);
> +		pfn = get_pte_pfn(ptent, args->vma, addr, pgdat);
>  		if (pfn == -1)
>  			continue;
>
> -		if (!pte_young(ptent)) {
> +		if (!pte_young(ptent) &&
> +		    !lru_gen_notifier_test_young(mm, addr)) {
>  			walk->mm_stats[MM_LEAF_OLD]++;
>  			continue;
>  		}
> @@ -3389,8 +3450,9 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end,
>  		if (!folio)
>  			continue;
>
> -		if (!ptep_test_and_clear_young(args->vma, addr, pte + i))
> -			VM_WARN_ON_ONCE(true);
> +		lru_gen_notifier_clear_young(mm, addr, addr + PAGE_SIZE);
> +		if (pte_young(ptent))
> +			ptep_test_and_clear_young(args->vma, addr, pte + i);
>
>  		young++;
>  		walk->mm_stats[MM_LEAF_YOUNG]++;

There are two ways to structure the test conditions in walk_pte_range():

1. A single pass into the MMU notifier (combined test/clear), which
   causes a cache miss from get_pfn_folio() if the page is NOT young.
2. Two passes into the MMU notifier (separate test/clear) if the page
   is young, which does NOT cause a cache miss if the page is NOT young.

v2 can batch up to 64 PTEs, i.e., it only goes into the MMU notifier
twice every 64 PTEs, and therefore the second option is a clear win.

But you are doing it twice per PTE. So what's the rationale behind going
with the second option? Was the first option considered?

In addition, what about the non-lockless cases? Would this change make
them worse by grabbing the MMU lock twice per PTE?
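
To make the trade-off above concrete, here is a toy cost model in plain
userspace C. It is only a sketch, not kernel code: struct walk_cost and
the three cost_*() helpers are hypothetical names, and it assumes that the
combined test/clear form has to look up the folio before every notifier
call (the source of the cache miss for pages that turn out not to be
young), while the separate form looks up the folio only after a positive
test. The young[] array stands in for what the secondary MMU would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters for one walk over a range of PTEs. */
struct walk_cost {
	size_t notifier_calls;	/* trips into the secondary MMU */
	size_t folio_lookups;	/* folio touches, i.e. potential cache misses */
	size_t young_found;	/* pages reported young */
};

/* Option 1: one combined test-and-clear per PTE, folio looked up first. */
struct walk_cost cost_combined(const bool *young, size_t n)
{
	struct walk_cost c = {0, 0, 0};

	for (size_t i = 0; i < n; i++) {
		c.folio_lookups++;	/* paid even when the page is not young */
		c.notifier_calls++;	/* single test-and-clear call */
		if (young[i])
			c.young_found++;
	}
	return c;
}

/* Option 2, per PTE: test first, then look up and clear only if young. */
struct walk_cost cost_separate(const bool *young, size_t n)
{
	struct walk_cost c = {0, 0, 0};

	for (size_t i = 0; i < n; i++) {
		c.notifier_calls++;	/* test */
		if (!young[i])
			continue;	/* no folio lookup for old pages */
		c.folio_lookups++;
		c.notifier_calls++;	/* clear */
		c.young_found++;
	}
	return c;
}

/* Option 2, batched: at most two notifier calls per batch of PTEs. */
struct walk_cost cost_separate_batched(const bool *young, size_t n,
				       size_t batch)
{
	struct walk_cost c = {0, 0, 0};

	assert(batch > 0);
	for (size_t start = 0; start < n; start += batch) {
		size_t end = start + batch < n ? start + batch : n;
		size_t found = 0;

		c.notifier_calls++;	/* one test call covers the batch */
		for (size_t i = start; i < end; i++)
			if (young[i])
				found++;
		if (!found)
			continue;
		c.folio_lookups += found;
		c.notifier_calls++;	/* one clear call covers the batch */
		c.young_found += found;
	}
	return c;
}
```

In this model, a range of 128 PTEs with two young pages costs 128
notifier calls and 128 folio lookups for the combined form, 130 notifier
calls and 2 folio lookups for the per-PTE separate form, and 4 notifier
calls and 2 folio lookups when batching by 64 (with one young page in
each batch). If each notifier call also takes the MMU lock, as in the
non-lockless cases, notifier_calls doubles as a rough count of lock
acquisitions.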