From: Fuad Tabba <tabba@google.com>
Date: Wed, 16 Jul 2025 12:26:13 +0100
Subject: Re: [PATCH v14 15/21] KVM: arm64: Refactor user_mem_abort()
To: David Hildenbrand
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 kvmarm@lists.linux.dev, pbonzini@redhat.com, chenhuacai@kernel.org,
 mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com,
 palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
 akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com,
 chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com,
 dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
 ira.weiny@intel.com
In-Reply-To: <39a217c1-29a9-4497-b3b6-bc0459e75a91@redhat.com>
References: <20250715093350.2584932-1-tabba@google.com>
 <20250715093350.2584932-16-tabba@google.com>
 <39a217c1-29a9-4497-b3b6-bc0459e75a91@redhat.com>
On Wed, 16 Jul 2025 at 11:36, David Hildenbrand wrote:
>
> On 15.07.25 11:33, Fuad Tabba wrote:
> > Refactor user_mem_abort() to improve code clarity and simplify
> > assumptions within the function.
> >
> > Key changes include:
> >
> > * Immediately set force_pte to true at the beginning of the function if
> >   logging_active is true. This simplifies the flow and makes the
> >   condition for forcing a PTE more explicit.
> >
> > * Remove the misleading comment stating that logging_active is
> >   guaranteed to never be true for VM_PFNMAP memslots, as this assertion
> >   is not entirely correct.
> >
> > * Extract reusable code blocks into new helper functions:
> >   * prepare_mmu_memcache(): Encapsulates the logic for preparing and
> >     topping up the MMU page cache.
> >   * adjust_nested_fault_perms(): Isolates the adjustments to shadow S2
> >     permissions and the encoding of nested translation levels.
> >
> > * Update min(a, (long)b) to min_t(long, a, b) for better type safety
> >   and consistency.
> >
> > * Perform other minor tidying up of the code.
> >
> > These changes primarily aim to simplify user_mem_abort() and make its
> > logic easier to understand and maintain, setting the stage for future
> > modifications.
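
(Side note on the min_t change called out above, for anyone skimming the
thread: a minimal userspace sketch of the signedness pitfall that casting
both operands avoids. The macro below is a simplified stand-in for the
kernel's min_t() from include/linux/minmax.h, not the real definition, and
it relies on GNU C statement expressions.)

#include <stdio.h>

/* Simplified stand-in for the kernel's min_t(): both operands are cast
 * to the named type, so the comparison can't silently go unsigned on
 * one side only. */
#define min_t(type, x, y) ({		\
	type __x = (x);			\
	type __y = (y);			\
	__x < __y ? __x : __y; })

int main(void)
{
	long a = -1;			/* e.g. a negative sentinel */
	unsigned long b = 4096;

	/* Naive comparison promotes a to unsigned long: -1 becomes
	 * ULONG_MAX, so the "minimum" comes out as 4096. */
	printf("naive: %lu\n", a < b ? a : b);		/* 4096 */
	printf("min_t: %ld\n", min_t(long, a, b));	/* -1 */
	return 0;
}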
> >
> > Reviewed-by: Gavin Shan
> > Reviewed-by: Marc Zyngier
> > Signed-off-by: Fuad Tabba
> > ---
> >  arch/arm64/kvm/mmu.c | 110 +++++++++++++++++++++++--------------------
> >  1 file changed, 59 insertions(+), 51 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 2942ec92c5a4..b3eacb400fab 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1470,13 +1470,56 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
> >          return vma->vm_flags & VM_MTE_ALLOWED;
> >  }
> >
> > +static int prepare_mmu_memcache(struct kvm_vcpu *vcpu, bool topup_memcache,
> > +                                void **memcache)
> > +{
> > +        int min_pages;
> > +
> > +        if (!is_protected_kvm_enabled())
> > +                *memcache = &vcpu->arch.mmu_page_cache;
> > +        else
> > +                *memcache = &vcpu->arch.pkvm_memcache;
> > +
> > +        if (!topup_memcache)
> > +                return 0;
> > +
> > +        min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
> > +
> > +        if (!is_protected_kvm_enabled())
> > +                return kvm_mmu_topup_memory_cache(*memcache, min_pages);
> > +
> > +        return topup_hyp_memcache(*memcache, min_pages);
> > +}
> > +
> > +/*
> > + * Potentially reduce shadow S2 permissions to match the guest's own S2. For
> > + * exec faults, we'd only reach this point if the guest actually allowed it (see
> > + * kvm_s2_handle_perm_fault).
> > + *
> > + * Also encode the level of the original translation in the SW bits of the leaf
> > + * entry as a proxy for the span of that translation. This will be retrieved on
> > + * TLB invalidation from the guest and used to limit the invalidation scope if a
> > + * TTL hint or a range isn't provided.
> > + */
> > +static void adjust_nested_fault_perms(struct kvm_s2_trans *nested,
> > +                                      enum kvm_pgtable_prot *prot,
> > +                                      bool *writable)
> > +{
> > +        *writable &= kvm_s2_trans_writable(nested);
> > +        if (!kvm_s2_trans_readable(nested))
> > +                *prot &= ~KVM_PGTABLE_PROT_R;
> > +
> > +        *prot |= kvm_encode_nested_level(nested);
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                            struct kvm_s2_trans *nested,
> >                            struct kvm_memory_slot *memslot, unsigned long hva,
> >                            bool fault_is_perm)
> >  {
> >          int ret = 0;
> > -        bool write_fault, writable, force_pte = false;
> > +        bool topup_memcache;
> > +        bool write_fault, writable;
> >          bool exec_fault, mte_allowed;
> >          bool device = false, vfio_allow_any_uc = false;
> >          unsigned long mmu_seq;
> > @@ -1488,6 +1531,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >          gfn_t gfn;
> >          kvm_pfn_t pfn;
> >          bool logging_active = memslot_is_logging(memslot);
> > +        bool force_pte = logging_active;
> >          long vma_pagesize, fault_granule;
> >          enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> >          struct kvm_pgtable *pgt;
> > @@ -1498,17 +1542,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >          fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
> >          write_fault = kvm_is_write_fault(vcpu);
> >          exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> > -        VM_BUG_ON(write_fault && exec_fault);
> > -
> > -        if (fault_is_perm && !write_fault && !exec_fault) {
> > -                kvm_err("Unexpected L2 read permission error\n");
> > -                return -EFAULT;
> > -        }
> > -
> > -        if (!is_protected_kvm_enabled())
> > -                memcache = &vcpu->arch.mmu_page_cache;
> > -        else
> > -                memcache = &vcpu->arch.pkvm_memcache;
> > +        VM_WARN_ON_ONCE(write_fault && exec_fault);
> >
> >          /*
> >           * Permission faults just need to update the existing leaf entry,
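
(Another aside, since the comment above adjust_nested_fault_perms() packs
the original translation level into the SW bits of the leaf entry: a toy
encode/decode pair showing the general idea. The bit positions and names
below are hypothetical illustrations, not KVM's actual
kvm_encode_nested_level() layout.)

#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: assume the descriptor reserves two
 * software-defined bits. Stashing the original translation level there
 * lets a later TLB-invalidation path recover the span of the mapping
 * without re-walking the guest's tables. */
#define SW_LEVEL_SHIFT	55
#define SW_LEVEL_MASK	(UINT64_C(3) << SW_LEVEL_SHIFT)

static uint64_t encode_level(uint64_t pte, unsigned int level)
{
	return (pte & ~SW_LEVEL_MASK) |
	       (((uint64_t)level << SW_LEVEL_SHIFT) & SW_LEVEL_MASK);
}

static unsigned int decode_level(uint64_t pte)
{
	return (unsigned int)((pte & SW_LEVEL_MASK) >> SW_LEVEL_SHIFT);
}

int main(void)
{
	uint64_t pte = 0x40000000000703ULL;	/* arbitrary leaf entry */

	pte = encode_level(pte, 2);
	printf("level=%u\n", decode_level(pte));	/* level=2 */
	return 0;
}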
> > @@ -1516,17 +1550,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >           * only exception to this is when dirty logging is enabled at runtime
> >           * and a write fault needs to collapse a block entry into a table.
> >           */
> > -        if (!fault_is_perm || (logging_active && write_fault)) {
> > -                int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
> > -
> > -                if (!is_protected_kvm_enabled())
> > -                        ret = kvm_mmu_topup_memory_cache(memcache, min_pages);
> > -                else
> > -                        ret = topup_hyp_memcache(memcache, min_pages);
> > -
> > -                if (ret)
> > -                        return ret;
> > -        }
> > +        topup_memcache = !fault_is_perm || (logging_active && write_fault);
> > +        ret = prepare_mmu_memcache(vcpu, topup_memcache, &memcache);
> > +        if (ret)
> > +                return ret;
> >
> >          /*
> >           * Let's check if we will get back a huge page backed by hugetlbfs, or
> > @@ -1540,16 +1567,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                  return -EFAULT;
> >          }
> >
> > -        /*
> > -         * logging_active is guaranteed to never be true for VM_PFNMAP
> > -         * memslots.
> > -         */
> > -        if (logging_active) {
> > -                force_pte = true;
> > +        if (force_pte)
> >                  vma_shift = PAGE_SHIFT;
> > -        } else {
> > +        else
> >                  vma_shift = get_vma_page_shift(vma, hva);
> > -        }
> >
> >          switch (vma_shift) {
> >  #ifndef __PAGETABLE_PMD_FOLDED
> > @@ -1601,7 +1622,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                          max_map_size = PAGE_SIZE;
> >
> >                  force_pte = (max_map_size == PAGE_SIZE);
> > -                vma_pagesize = min(vma_pagesize, (long)max_map_size);
> > +                vma_pagesize = min_t(long, vma_pagesize, max_map_size);
> >          }
> >
> >          /*
> > @@ -1630,7 +1651,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >           * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
> >           * with the smp_wmb() in kvm_mmu_invalidate_end().
> >           */
> > -        mmu_seq = vcpu->kvm->mmu_invalidate_seq;
> > +        mmu_seq = kvm->mmu_invalidate_seq;
> >          mmap_read_unlock(current->mm);
> >
> >          pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
> > @@ -1665,24 +1686,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >          if (exec_fault && device)
> >                  return -ENOEXEC;
> >
> > -        /*
> > -         * Potentially reduce shadow S2 permissions to match the guest's own
> > -         * S2. For exec faults, we'd only reach this point if the guest
> > -         * actually allowed it (see kvm_s2_handle_perm_fault).
> > -         *
> > -         * Also encode the level of the original translation in the SW bits
> > -         * of the leaf entry as a proxy for the span of that translation.
> > -         * This will be retrieved on TLB invalidation from the guest and
> > -         * used to limit the invalidation scope if a TTL hint or a range
> > -         * isn't provided.
> > -         */
> > -        if (nested) {
> > -                writable &= kvm_s2_trans_writable(nested);
> > -                if (!kvm_s2_trans_readable(nested))
> > -                        prot &= ~KVM_PGTABLE_PROT_R;
> > -
> > -                prot |= kvm_encode_nested_level(nested);
> > -        }
> > +        if (nested)
> > +                adjust_nested_fault_perms(nested, &prot, &writable);
> >
> >          kvm_fault_lock(kvm);
> >          pgt = vcpu->arch.hw_mmu->pgt;
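
(Last aside: the narrowing semantics the new helper relies on — shadow
stage-2 permissions are only ever reduced toward the guest's own stage-2,
never widened. A self-contained toy demo, with illustrative stand-ins
rather than KVM's actual types and helpers:)

#include <stdbool.h>
#include <stdio.h>

enum { PROT_R = 1 << 0, PROT_W = 1 << 1, PROT_X = 1 << 2 };

/* Stand-in for the guest's own stage-2 translation result. */
struct s2_trans { bool readable, writable; };

static void narrow_perms(const struct s2_trans *nested,
			 unsigned int *prot, bool *writable)
{
	*writable &= nested->writable;	/* W only if the guest S2 allows it */
	if (!nested->readable)
		*prot &= ~PROT_R;	/* drop R the guest never granted */
}

int main(void)
{
	struct s2_trans guest = { .readable = true, .writable = false };
	unsigned int prot = PROT_R | PROT_X;
	bool writable = true;

	narrow_perms(&guest, &prot, &writable);
	/* prints prot=0x5 writable=0: R|X kept, write dropped */
	printf("prot=%#x writable=%d\n", prot, writable);
	return 0;
}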
> > @@ -1953,6 +1958,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> >                  goto out_unlock;
> >          }
> >
> > +        VM_WARN_ON_ONCE(kvm_vcpu_trap_is_permission_fault(vcpu) &&
> > +                        !write_fault && !kvm_vcpu_trap_is_exec_fault(vcpu));
> > +
> >          ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
> >                               esr_fsc_is_permission_fault(esr));
> >          if (ret == 0)
>
> A note that on the KVM arm next tree
>
> https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/log/?h=next
>
> there are some user_mem_abort() changes. IIUC, only smaller conflicts.

Thanks for the heads-up. I can work with Marc and Oliver to resolve any
conflicts once we get there.

Cheers,
/fuad

> --
> Cheers,
>
> David / dhildenb
>