From: Fuad Tabba <tabba@google.com>
Date: Mon, 9 Jun 2025 08:01:17 +0100
Subject: Re: [PATCH v11 13/18] KVM: arm64: Refactor user_mem_abort()
To: Gavin Shan
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
    kvmarm@lists.linux.dev, pbonzini@redhat.com, chenhuacai@kernel.org,
    mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com,
    palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com,
    viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
    akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com,
    chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com,
    dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net,
    vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
    mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
    wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
    kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
    steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
    quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
    quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
    quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
    yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
    will@kernel.org, qperret@google.com, keirf@google.com,
    roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
    rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
    hughd@google.com, jthoughton@google.com, peterx@redhat.com,
    pankaj.gupta@amd.com, ira.weiny@intel.com
References: <20250605153800.557144-1-tabba@google.com> <20250605153800.557144-14-tabba@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Hi Gavin,

On Mon, 9 Jun 2025 at 01:27, Gavin Shan wrote:
>
> Hi Fuad,
>
> On 6/6/25 1:37 AM, Fuad Tabba wrote:
> > To simplify the code and to make the assumptions clearer,
> > refactor user_mem_abort() by immediately setting force_pte to
> > true if the conditions are met.
> >
> > Remove the comment about logging_active being guaranteed to never be
> > true for VM_PFNMAP memslots, since it's not actually correct.
> >
> > Move code that will be reused in the following patch into separate
> > functions.
> >
> > Other small instances of tidying up.
> >
> > No functional change intended.
> >
> > Signed-off-by: Fuad Tabba
> > ---
> >  arch/arm64/kvm/mmu.c | 100 ++++++++++++++++++++++++-------------------
> >  1 file changed, 55 insertions(+), 45 deletions(-)
> >
>
> One nitpick below in case v12 is needed. In either way, it looks good to me:
>
> Reviewed-by: Gavin Shan
>
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index eeda92330ade..ce80be116a30 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1466,13 +1466,56 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
> >  	return vma->vm_flags & VM_MTE_ALLOWED;
> >  }
> >
> > +static int prepare_mmu_memcache(struct kvm_vcpu *vcpu, bool topup_memcache,
> > +				void **memcache)
> > +{
> > +	int min_pages;
> > +
> > +	if (!is_protected_kvm_enabled())
> > +		*memcache = &vcpu->arch.mmu_page_cache;
> > +	else
> > +		*memcache = &vcpu->arch.pkvm_memcache;
> > +
> > +	if (!topup_memcache)
> > +		return 0;
> > +
>
> It's unnecessary to initialize 'memcache' when topup_memcache is false.

I thought about this before, and I _think_ you're right. However, I
couldn't completely convince myself that that's always the case for the
code to be functionally equivalent (looking at the condition for
kvm_pgtable_stage2_relax_perms() at the end of the function). Which is
why, if I were to do that, I'd do it as a separate patch.
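
For reference, the shape I have in mind is roughly the following (a
simplified sketch of the tail of user_mem_abort() written from memory,
not the exact code, so names and details may be off):

	/*
	 * Sketch only: topup_memcache being false doesn't imply we take
	 * the relax_perms path. If the existing mapping granule doesn't
	 * match, we fall through to the map path, which still passes
	 * memcache along, so leaving it uninitialized in that case would
	 * need auditing.
	 */
	if (fault_is_perm && vma_pagesize == fault_granule)
		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot, flags);
	else
		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
					     __pfn_to_phys(pfn), prot,
					     memcache, flags);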
Thanks,
/fuad

>	if (!topup_memcache)
>		return 0;
>
>	min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
>	if (!is_protected_kvm_enabled())
>		*memcache = &vcpu->arch.mmu_page_cache;
>	else
>		*memcache = &vcpu->arch.pkvm_memcache;
>
> Thanks,
> Gavin
>
> > +	min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
> > +
> > +	if (!is_protected_kvm_enabled())
> > +		return kvm_mmu_topup_memory_cache(*memcache, min_pages);
> > +
> > +	return topup_hyp_memcache(*memcache, min_pages);
> > +}
> > +
> > +/*
> > + * Potentially reduce shadow S2 permissions to match the guest's own S2. For
> > + * exec faults, we'd only reach this point if the guest actually allowed it (see
> > + * kvm_s2_handle_perm_fault).
> > + *
> > + * Also encode the level of the original translation in the SW bits of the leaf
> > + * entry as a proxy for the span of that translation. This will be retrieved on
> > + * TLB invalidation from the guest and used to limit the invalidation scope if a
> > + * TTL hint or a range isn't provided.
> > + */
> > +static void adjust_nested_fault_perms(struct kvm_s2_trans *nested,
> > +				      enum kvm_pgtable_prot *prot,
> > +				      bool *writable)
> > +{
> > +	*writable &= kvm_s2_trans_writable(nested);
> > +	if (!kvm_s2_trans_readable(nested))
> > +		*prot &= ~KVM_PGTABLE_PROT_R;
> > +
> > +	*prot |= kvm_encode_nested_level(nested);
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  			  struct kvm_s2_trans *nested,
> >  			  struct kvm_memory_slot *memslot, unsigned long hva,
> >  			  bool fault_is_perm)
> >  {
> >  	int ret = 0;
> > -	bool write_fault, writable, force_pte = false;
> > +	bool topup_memcache;
> > +	bool write_fault, writable;
> >  	bool exec_fault, mte_allowed;
> >  	bool device = false, vfio_allow_any_uc = false;
> >  	unsigned long mmu_seq;
> > @@ -1484,6 +1527,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	gfn_t gfn;
> >  	kvm_pfn_t pfn;
> >  	bool logging_active = memslot_is_logging(memslot);
> > +	bool force_pte = logging_active || is_protected_kvm_enabled();
> >  	long vma_pagesize, fault_granule;
> >  	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> >  	struct kvm_pgtable *pgt;
> > @@ -1501,28 +1545,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >
> > -	if (!is_protected_kvm_enabled())
> > -		memcache = &vcpu->arch.mmu_page_cache;
> > -	else
> > -		memcache = &vcpu->arch.pkvm_memcache;
> > -
> >  	/*
> >  	 * Permission faults just need to update the existing leaf entry,
> >  	 * and so normally don't require allocations from the memcache. The
> >  	 * only exception to this is when dirty logging is enabled at runtime
> >  	 * and a write fault needs to collapse a block entry into a table.
> >  	 */
> > -	if (!fault_is_perm || (logging_active && write_fault)) {
> > -		int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
> > -
> > -		if (!is_protected_kvm_enabled())
> > -			ret = kvm_mmu_topup_memory_cache(memcache, min_pages);
> > -		else
> > -			ret = topup_hyp_memcache(memcache, min_pages);
> > -
> > -		if (ret)
> > -			return ret;
> > -	}
> > +	topup_memcache = !fault_is_perm || (logging_active && write_fault);
> > +	ret = prepare_mmu_memcache(vcpu, topup_memcache, &memcache);
> > +	if (ret)
> > +		return ret;
> >
> >  	/*
> >  	 * Let's check if we will get back a huge page backed by hugetlbfs, or
> > @@ -1536,16 +1568,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >
> > -	/*
> > -	 * logging_active is guaranteed to never be true for VM_PFNMAP
> > -	 * memslots.
> > -	 */
> > -	if (logging_active || is_protected_kvm_enabled()) {
> > -		force_pte = true;
> > +	if (force_pte)
> >  		vma_shift = PAGE_SHIFT;
> > -	} else {
> > +	else
> >  		vma_shift = get_vma_page_shift(vma, hva);
> > -	}
> >
> >  	switch (vma_shift) {
> >  #ifndef __PAGETABLE_PMD_FOLDED
> > @@ -1597,7 +1623,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  			max_map_size = PAGE_SIZE;
> >
> >  		force_pte = (max_map_size == PAGE_SIZE);
> > -		vma_pagesize = min(vma_pagesize, (long)max_map_size);
> > +		vma_pagesize = min_t(long, vma_pagesize, max_map_size);
> >  	}
> >
> >  	/*
> > @@ -1626,7 +1652,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
> >  	 * with the smp_wmb() in kvm_mmu_invalidate_end().
> >  	 */
> > -	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
> > +	mmu_seq = kvm->mmu_invalidate_seq;
> >  	mmap_read_unlock(current->mm);
> >
> >  	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
> > @@ -1661,24 +1687,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  	if (exec_fault && device)
> >  		return -ENOEXEC;
> >
> > -	/*
> > -	 * Potentially reduce shadow S2 permissions to match the guest's own
> > -	 * S2. For exec faults, we'd only reach this point if the guest
> > -	 * actually allowed it (see kvm_s2_handle_perm_fault).
> > -	 *
> > -	 * Also encode the level of the original translation in the SW bits
> > -	 * of the leaf entry as a proxy for the span of that translation.
> > -	 * This will be retrieved on TLB invalidation from the guest and
> > -	 * used to limit the invalidation scope if a TTL hint or a range
> > -	 * isn't provided.
> > -	 */
> > -	if (nested) {
> > -		writable &= kvm_s2_trans_writable(nested);
> > -		if (!kvm_s2_trans_readable(nested))
> > -			prot &= ~KVM_PGTABLE_PROT_R;
> > -
> > -		prot |= kvm_encode_nested_level(nested);
> > -	}
> > +	if (nested)
> > +		adjust_nested_fault_perms(nested, &prot, &writable);
> >
> >  	kvm_fault_lock(kvm);
> >  	pgt = vcpu->arch.hw_mmu->pgt;
>