From: Yosry Ahmed <yosryahmed@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>,
Huacai Chen <chenhuacai@kernel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
Anup Patel <anup@brainfault.org>,
Atish Patra <atishp@atishpatra.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>,
Joerg Roedel <joro@8bytes.org>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeelb@google.com>,
James Morse <james.morse@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
linux-mips@vger.kernel.org, kvm@vger.kernel.org,
kvm-riscv@lists.infradead.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, Linux-MM <linux-mm@kvack.org>,
cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v3 4/6] KVM: arm64/mmu: count KVM page table pages in pagetable stats
Date: Tue, 26 Apr 2022 12:33:28 -0700 [thread overview]
Message-ID: <CAJD7tkZ-mrhCXoR0ZONjJ7kuAuzWED8Dcjkw6mv812-3uSo_Cg@mail.gmail.com> (raw)
In-Reply-To: <874k2falbk.wl-maz@kernel.org>
Thanks a lot for taking the time to look at this!
On Tue, Apr 26, 2022 at 8:58 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 26 Apr 2022 06:39:02 +0100,
> Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > Count the pages used by KVM in arm64 for page tables in pagetable stats.
> >
> > Account pages allocated for PTEs in pgtable init functions and
> > kvm_set_table_pte().
> >
> > Since most page table pages are freed using put_page(), add a helper
> > function put_pte_page() that checks if this is the last ref for a pte
> > page before putting it, and unaccounts stats accordingly.
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> > ---
> > arch/arm64/kernel/image-vars.h | 3 ++
> > arch/arm64/kvm/hyp/pgtable.c | 50 +++++++++++++++++++++-------------
> > 2 files changed, 34 insertions(+), 19 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> > index 241c86b67d01..25bf058714f6 100644
> > --- a/arch/arm64/kernel/image-vars.h
> > +++ b/arch/arm64/kernel/image-vars.h
> > @@ -143,6 +143,9 @@ KVM_NVHE_ALIAS(__hyp_rodata_end);
> > /* pKVM static key */
> > KVM_NVHE_ALIAS(kvm_protected_mode_initialized);
> >
> > +/* Called by kvm_account_pgtable_pages() to update pagetable stats */
> > +KVM_NVHE_ALIAS(__mod_lruvec_page_state);
>
> This cannot be right. It means that this function will be called
> directly from the EL2 code when in protected mode, and will result in
> extreme fireworks. There is no way you can call core kernel stuff
> like this from this context.
>
> Please do not add random symbols to this list just for the sake of
> being able to link the kernel.
Excuse my ignorance, this is my first time touching kvm code. Thanks a
lot for pointing this out.
>
> > +
> > #endif /* CONFIG_KVM */
> >
> > #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index 2cb3867eb7c2..53e13c3313e9 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -152,6 +152,7 @@ static void kvm_set_table_pte(kvm_pte_t *ptep, kvm_pte_t *childp,
> >
> > WARN_ON(kvm_pte_valid(old));
> > smp_store_release(ptep, pte);
> > + kvm_account_pgtable_pages((void *)childp, +1);
>
> Why the + sign?
I am following conventions in other existing stat accounting hooks
(e.g. kvm_mod_used_mmu_pages(vcpu->kvm, +1) call in
arch/x86/kvm/mmu/mmu.c), but I can certainly remove it if you think
this is better.
>
> > }
> >
> > static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
> > @@ -326,6 +327,14 @@ int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
> > return ret;
> > }
> >
> > +static void put_pte_page(kvm_pte_t *ptep, struct kvm_pgtable_mm_ops *mm_ops)
> > +{
> > + /* If this is the last page ref, decrement pagetable stats first. */
> > + if (!mm_ops->page_count || mm_ops->page_count(ptep) == 1)
> > + kvm_account_pgtable_pages((void *)ptep, -1);
> > + mm_ops->put_page(ptep);
> > +}
> > +
> > struct hyp_map_data {
> > u64 phys;
> > kvm_pte_t attr;
> > @@ -488,10 +497,10 @@ static int hyp_unmap_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> >
> > dsb(ish);
> > isb();
> > - mm_ops->put_page(ptep);
> > + put_pte_page(ptep, mm_ops);
> >
> > if (childp)
> > - mm_ops->put_page(childp);
> > + put_pte_page(childp, mm_ops);
> >
> > return 0;
> > }
> > @@ -522,6 +531,7 @@ int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
> > pgt->pgd = (kvm_pte_t *)mm_ops->zalloc_page(NULL);
> > if (!pgt->pgd)
> > return -ENOMEM;
> > + kvm_account_pgtable_pages((void *)pgt->pgd, +1);
> >
> > pgt->ia_bits = va_bits;
> > pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels;
> > @@ -541,10 +551,10 @@ static int hyp_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> > if (!kvm_pte_valid(pte))
> > return 0;
> >
> > - mm_ops->put_page(ptep);
> > + put_pte_page(ptep, mm_ops);
> >
> > if (kvm_pte_table(pte, level))
> > - mm_ops->put_page(kvm_pte_follow(pte, mm_ops));
> > + put_pte_page(kvm_pte_follow(pte, mm_ops), mm_ops);
>
> OK, I see the pattern. I don't think this workable as such. I'd rather
> the callbacks themselves (put_page, zalloc_page*) call into the
> accounting code when it makes sense, rather than spreading the
> complexity and having to special case the protected case.
>
This makes sense. I am working on moving calls to
kvm_account_pgtable_pages to callbacks in mmu.c in the next version
(stage2_memcache_zalloc_page, kvm_host_put_page, etc).
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2022-04-26 19:34 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-26 5:38 [PATCH v3 0/6] KVM: mm: count KVM page table pages in memory stats Yosry Ahmed
2022-04-26 5:38 ` [PATCH v3 1/6] mm: add NR_SECONDARY_PAGETABLE stat Yosry Ahmed
2022-04-26 5:39 ` [PATCH v3 2/6] KVM: mmu: add a helper to account page table pages used by KVM Yosry Ahmed
2022-04-26 5:39 ` [PATCH v3 3/6] KVM: x86/mmu: count KVM page table pages in pagetable stats Yosry Ahmed
2022-04-26 5:39 ` [PATCH v3 4/6] KVM: arm64/mmu: " Yosry Ahmed
2022-04-26 7:34 ` Oliver Upton
2022-04-26 19:27 ` Yosry Ahmed
2022-04-28 17:45 ` Oliver Upton
2022-04-28 23:49 ` Yosry Ahmed
2022-04-26 15:58 ` Marc Zyngier
2022-04-26 19:33 ` Yosry Ahmed [this message]
2022-04-26 5:39 ` [PATCH v3 5/6] KVM: riscv/mmu: " Yosry Ahmed
2022-04-26 5:39 ` [PATCH v3 6/6] KVM: mips/mmu: " Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJD7tkZ-mrhCXoR0ZONjJ7kuAuzWED8Dcjkw6mv812-3uSo_Cg@mail.gmail.com \
--to=yosryahmed@google.com \
--cc=akpm@linux-foundation.org \
--cc=aleksandar.qemu.devel@gmail.com \
--cc=alexandru.elisei@arm.com \
--cc=anup@brainfault.org \
--cc=atishp@atishpatra.org \
--cc=catalin.marinas@arm.com \
--cc=cgroups@vger.kernel.org \
--cc=chenhuacai@kernel.org \
--cc=hannes@cmpxchg.org \
--cc=james.morse@arm.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kvm-riscv@lists.infradead.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=maz@kernel.org \
--cc=mhocko@kernel.org \
--cc=pbonzini@redhat.com \
--cc=roman.gushchin@linux.dev \
--cc=seanjc@google.com \
--cc=shakeelb@google.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=suzuki.poulose@arm.com \
--cc=vkuznets@redhat.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox