From: Fuad Tabba <tabba@google.com>
Date: Wed, 4 Jun 2025 14:30:18 +0100
Subject: Re: [PATCH v10 13/16] KVM: arm64: Handle guest_memfd-backed guest page faults
To: David Hildenbrand
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
    pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
    anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
    brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
    xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
    jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
    isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
    vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
    michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
    isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
    suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
    quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
    quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
    quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
    catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
    oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
    qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
    shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
    jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
    jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
    ira.weiny@intel.com
References: <20250527180245.1413463-1-tabba@google.com> <20250527180245.1413463-14-tabba@google.com>
Content-Type: text/plain; charset="UTF-8"

Hi David,

On Wed, 4 Jun 2025 at 14:17, David Hildenbrand wrote:
>
> On 27.05.25 20:02, Fuad Tabba wrote:
> > Add arm64 support for handling guest page faults on guest_memfd backed
> > memslots. Until guest_memfd supports huge pages, the fault granule is
> > restricted to PAGE_SIZE.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> >
> > ---
> >
> > Note: This patch introduces a new function, gmem_abort(), rather than
> > previous attempts at trying to expand user_mem_abort(). This is because
> > there are many differences in how faults are handled when backed by
> > guest_memfd vs regular memslots with anonymous memory, e.g., lack of a
> > VMA and, for now, lack of huge page support for guest_memfd. The
> > function user_mem_abort() is already big and unwieldy; adding more
> > complexity to it made things more difficult to understand.
> >
> > Once larger page size support is added to guest_memfd, we could factor
> > out the common code between these two functions.
> >
> > ---
> >  arch/arm64/kvm/mmu.c | 89 +++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 87 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 9865ada04a81..896c56683d88 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1466,6 +1466,87 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
> >  	return vma->vm_flags & VM_MTE_ALLOWED;
> >  }
> >
> > +static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > +		      struct kvm_memory_slot *memslot, bool is_perm)
>
> TBH, I have no idea why the existing function is called "_abort". I am
> sure there is a good reason :)
>

The reason is ARM.
They're called "memory aborts"; see section D8.15, "Memory aborts", in
the ARM ARM:
https://developer.arm.com/documentation/ddi0487/latest/

Warning: the PDF is 100 MB+, with almost 15k pages :)

> > +{
> > +	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED;
> > +	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> > +	bool logging, write_fault, exec_fault, writable;
> > +	struct kvm_pgtable *pgt;
> > +	struct page *page;
> > +	struct kvm *kvm;
> > +	void *memcache;
> > +	kvm_pfn_t pfn;
> > +	gfn_t gfn;
> > +	int ret;
> > +
> > +	if (!is_perm) {
> > +		int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
> > +
> > +		if (!is_protected_kvm_enabled()) {
> > +			memcache = &vcpu->arch.mmu_page_cache;
> > +			ret = kvm_mmu_topup_memory_cache(memcache, min_pages);
> > +		} else {
> > +			memcache = &vcpu->arch.pkvm_memcache;
> > +			ret = topup_hyp_memcache(memcache, min_pages);
> > +		}
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	kvm = vcpu->kvm;
> > +	gfn = fault_ipa >> PAGE_SHIFT;
>
> These two can be initialized directly above.
>

I was trying to go with reverse christmas tree order of declarations,
but I'll do that.

> > +
> > +	logging = memslot_is_logging(memslot);
> > +	write_fault = kvm_is_write_fault(vcpu);
> > +	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> > +	VM_BUG_ON(write_fault && exec_fault);
>
> No VM_BUG_ON please.
>
> VM_WARN_ON_ONCE() maybe. Or just handle it along the "Unexpected L2 read
> permission error" below cleanly.

I'm following the same pattern as the existing user_mem_abort(), but
I'll change it.
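As an aside, the direction suggested here can be sketched in a small
standalone mock. This is only an illustration, not the actual patch:
warn_on_once() and check_fault_flags() are hypothetical stand-ins for
VM_WARN_ON_ONCE() (which also evaluates to the condition's value) and
for the combined sanity checks, so the inconsistent-flags case fails
the fault with -EFAULT instead of taking the host down:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define EFAULT 14

/* Stand-in for VM_WARN_ON_ONCE(): warns once, returns the condition. */
static bool warn_on_once(bool cond, const char *msg)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Fail the fault with -EFAULT instead of BUGing the host. */
static int check_fault_flags(bool is_perm, bool write_fault, bool exec_fault)
{
	if (warn_on_once(write_fault && exec_fault,
			 "simultaneous write and exec fault"))
		return -EFAULT;
	if (is_perm && !write_fault && !exec_fault)
		return -EFAULT;	/* unexpected read permission fault */
	return 0;
}
```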
> > +
> > +	if (is_perm && !write_fault && !exec_fault) {
> > +		kvm_err("Unexpected L2 read permission error\n");
> > +		return -EFAULT;
> > +	}
> > +
> > +	ret = kvm_gmem_get_pfn(vcpu->kvm, memslot, gfn, &pfn, &page, NULL);
> > +	if (ret) {
> > +		kvm_prepare_memory_fault_exit(vcpu, fault_ipa, PAGE_SIZE,
> > +					      write_fault, exec_fault, false);
> > +		return ret;
> > +	}
> > +
> > +	writable = !(memslot->flags & KVM_MEM_READONLY) &&
> > +		   (!logging || write_fault);
> > +
> > +	if (writable)
> > +		prot |= KVM_PGTABLE_PROT_W;
> > +
> > +	if (exec_fault || cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
> > +		prot |= KVM_PGTABLE_PROT_X;
> > +
> > +	pgt = vcpu->arch.hw_mmu->pgt;
>
> Can probably also initialize directly above.

Ack.

> > +
> > +	kvm_fault_lock(kvm);
> > +	if (is_perm) {
> > +		/*
> > +		 * Drop the SW bits in favour of those stored in the
> > +		 * PTE, which will be preserved.
> > +		 */
> > +		prot &= ~KVM_NV_GUEST_MAP_SZ;
> > +		ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, fault_ipa, prot, flags);
> > +	} else {
> > +		ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, fault_ipa, PAGE_SIZE,
> > +							 __pfn_to_phys(pfn), prot,
> > +							 memcache, flags);
> > +	}
> > +	kvm_release_faultin_page(kvm, page, !!ret, writable);
> > +	kvm_fault_unlock(kvm);
> > +
> > +	if (writable && !ret)
> > +		mark_page_dirty_in_slot(kvm, memslot, gfn);
> > +
> > +	return ret != -EAGAIN ? ret : 0;
> > +}
> > +
>
> Nothing else jumped at me. But just like on the x86 code, I think we
> need some arch experts to take a look at this one ...

Thanks!
/fuad

> --
> Cheers,
>
> David / dhildenb
>
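For readers following along, the permission derivation in the hunk above
can be captured as a tiny standalone sketch. The flag values and the
compute_prot() helper below are illustrative only (not the kernel's
definitions), and the ARM64_HAS_CACHE_DIC shortcut for the exec bit is
elided: mappings are always readable, writable unless the slot is
read-only or dirty logging is on and this is not a write fault, and
executable on an exec fault:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values -- not the kernel's actual definitions. */
#define PROT_R       (1u << 0)
#define PROT_W       (1u << 1)
#define PROT_X       (1u << 2)
#define MEM_READONLY (1u << 0)

/*
 * Mirrors the prot derivation in gmem_abort(): always readable,
 * conditionally writable and executable.
 */
static unsigned int compute_prot(unsigned int slot_flags, bool logging,
				 bool write_fault, bool exec_fault,
				 bool *writable)
{
	unsigned int prot = PROT_R;

	*writable = !(slot_flags & MEM_READONLY) && (!logging || write_fault);
	if (*writable)
		prot |= PROT_W;
	if (exec_fault)
		prot |= PROT_X;
	return prot;
}
```

Note how a read fault on a slot with dirty logging enabled maps the page
read-only, so the later write faults and can be tracked via
mark_page_dirty_in_slot().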