From: Yu Zhao <yuzhao@google.com>
To: James Houghton <jthoughton@google.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	David Matlack <dmatlack@google.com>,
	 David Rientjes <rientjes@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jonathan Corbet <corbet@lwn.net>,
	 Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	Wei Xu <weixugc@google.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 David Stevens <stevensd@google.com>
Subject: Re: [PATCH v7 00/18] mm: multi-gen LRU: Walk secondary MMU page tables while aging
Date: Tue, 15 Oct 2024 16:47:39 -0600
Message-ID: <CAOUHufZU8C-48H0n2v02D52PoC8b0mYUJJS=C-dz+bruruOfdg@mail.gmail.com>
In-Reply-To: <CADrL8HUP1=eXE5QpVrKjgQGpusr_Raejr1sY2LLW1uSigpptOw@mail.gmail.com>

On Mon, Oct 14, 2024 at 6:07 PM James Houghton <jthoughton@google.com> wrote:
>
> On Mon, Oct 14, 2024 at 4:22 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Sep 26, 2024, James Houghton wrote:
> > > This patchset makes it possible for MGLRU to consult secondary MMUs
> > > while doing aging, not just during eviction. This allows for more
> > > accurate reclaim decisions, which is especially important for proactive
> > > reclaim.
> >
> > ...
> >
> > > James Houghton (14):
> > >   KVM: Remove kvm_handle_hva_range helper functions
> > >   KVM: Add lockless memslot walk to KVM
> > >   KVM: x86/mmu: Factor out spte atomic bit clearing routine
> > >   KVM: x86/mmu: Relax locking for kvm_test_age_gfn and kvm_age_gfn
> > >   KVM: x86/mmu: Rearrange kvm_{test_,}age_gfn
> > >   KVM: x86/mmu: Only check gfn age in shadow MMU if
> > >     indirect_shadow_pages > 0
> > >   mm: Add missing mmu_notifier_clear_young for !MMU_NOTIFIER
> > >   mm: Add has_fast_aging to struct mmu_notifier
> > >   mm: Add fast_only bool to test_young and clear_young MMU notifiers
> >
> > Per offline discussions, there's a non-zero chance that fast_only won't be needed,
> > because it may be preferable to incorporate secondary MMUs into MGLRU, even if
> > they don't support "fast" aging.
> >
> > What's the status on that front?  Even if the status is "TBD", it'd be very helpful
> > to let others know, so that they don't spend time reviewing code that might be
> > completely thrown away.
>
> The fast_only MMU notifier changes will probably be removed in v8.
>
> ChromeOS folks found that the way MGLRU *currently* interacts with KVM
> is problematic. That is, today, with the MM_WALK MGLRU capability
> enabled, normal PTEs have their Accessed bits cleared via a page table
> scan and then during an rmap walk upon attempted eviction, whereas,
> KVM SPTEs only have their Accessed bits cleared via the rmap walk at
> eviction time. So KVM SPTEs have their Accessed bits cleared less
> frequently than normal PTEs, and therefore they appear younger than
> they should.
>
> It turns out that this causes tab open latency regressions on ChromeOS
> where a significant amount of memory is being used by a VM. IIUC, the
> fix for this is to have MGLRU age SPTEs as often as it ages normal
> PTEs; i.e., it should call the correct MMU notifiers each time it
> clears A bits on PTEs. The final patch in this series sort of does
> this, but instead of calling the new fast_only notifier, we need to
> call the normal test/clear_young() notifiers regardless of how fast
> they are.
>
> This also means that the MGLRU changes no longer depend on the KVM
> optimizations, as they can be motivated independently.
>
> Yu, have I gotten anything wrong here? Do you have any more details to share?

Yes, that's precisely the problem. My original justification [1] for
not scanning the KVM MMU when lockless walks are not supported turned
out to be harmful to some workloads too.

On one hand, scanning the KVM MMU when it is not lockless can cause KVM
MMU lock contention; on the other hand, not scanning the KVM MMU can skew
anon/file LRU aging and thrash the page cache. Given that the lock
contention is being tackled, the former seems to be the lesser of two
evils.

[1] https://lore.kernel.org/linux-mm/CAOUHufYFHKLwt1PWp2uS6g174GZYRZURWJAmdUWs5eaKmhEeyQ@mail.gmail.com/
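
To make the skew described above concrete, here is a toy model in Python.
It is only a sketch and not kernel code: Page, aging_pass,
looks_young_at_eviction and simulate are hypothetical names invented for
this illustration, and real MGLRU aging tracks generations rather than a
single bit. The model assumes two pages that are each touched once and
then left idle, one mapped by a normal PTE and one mapped only through a
secondary MMU.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    secondary: bool          # True: mapped only via a secondary MMU (e.g. a KVM SPTE)
    accessed: bool = False   # stand-in for the Accessed bit


def aging_pass(pages, scan_secondary):
    """Model one aging scan: clear Accessed bits on the pages it visits."""
    for page in pages:
        if page.secondary and not scan_secondary:
            continue         # the scan never consults the secondary MMU
        page.accessed = False


def looks_young_at_eviction(page):
    """Model the rmap walk at eviction: test and clear the Accessed bit."""
    young = page.accessed
    page.accessed = False
    return young


def simulate(scan_secondary, aging_passes=3):
    host = Page("host", secondary=False)
    guest = Page("guest", secondary=True)
    # Both pages are touched once, then stay idle.
    host.accessed = guest.accessed = True
    for _ in range(aging_passes):
        aging_pass([host, guest], scan_secondary)
    return {p.name: looks_young_at_eviction(p) for p in (host, guest)}


if __name__ == "__main__":
    print("aging skips secondary MMU:", simulate(scan_secondary=False))
    print("aging scans secondary MMU:", simulate(scan_secondary=True))
```

In this model, when the aging pass skips the secondary MMU, the idle guest
page still has its Accessed bit set at eviction time and looks young, while
the equally idle host page does not. When the aging pass visits both, the
two pages are treated the same.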
