From: Fuad Tabba <tabba@google.com>
Date: Thu, 1 May 2025 10:53:59 +0100
Subject: Re: [PATCH v8 06/13] KVM: x86: Generalize private fault lookups to
 guest_memfd fault lookups
To: Ackerley Tng
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, mail@maciej.szmigiero.name, david@redhat.com,
 michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
 isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
 suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
 catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
 oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
 jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com
References: <20250430165655.605595-7-tabba@google.com>
Hi Ackerley,

On Wed, 30 Apr 2025 at 19:58, Ackerley Tng wrote:
>
> Fuad Tabba writes:
>
> > Until now, faults to private memory backed by guest_memfd are always
> > consumed from guest_memfd whereas faults to shared memory are consumed
> > from anonymous memory. Subsequent patches will allow sharing guest_memfd
> > backed memory in-place, and mapping it by the host. Faults to in-place
> > shared memory should be consumed from guest_memfd as well.
> >
> > In order to facilitate that, generalize the fault lookups. Currently,
> > only private memory is consumed from guest_memfd and therefore as it
> > stands, this patch does not change the behavior.
> >
> > Co-developed-by: David Hildenbrand
> > Signed-off-by: David Hildenbrand
> > Signed-off-by: Fuad Tabba
> > ---
> >  arch/x86/kvm/mmu/mmu.c   | 19 +++++++++----------
> >  include/linux/kvm_host.h |  6 ++++++
> >  2 files changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 6d5dd869c890..08eebd24a0e1 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3258,7 +3258,7 @@ static int host_pfn_mapping_level(struct kvm *kvm, gfn_t gfn,
> >
> >  static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
> >  				       const struct kvm_memory_slot *slot,
> > -				       gfn_t gfn, int max_level, bool is_private)
> > +				       gfn_t gfn, int max_level, bool is_gmem)
> >  {
> >  	struct kvm_lpage_info *linfo;
> >  	int host_level;
> > @@ -3270,7 +3270,7 @@ static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
> >  			break;
> >  	}
> >
> > -	if (is_private)
> > +	if (is_gmem)
> >  		return max_level;
>
> I think this renaming isn't quite accurate.
>
> IIUC in __kvm_mmu_max_mapping_level(), we skip considering
> host_pfn_mapping_level() if the gfn is private because private memory
> will not be mapped to userspace, so there's no need to query userspace
> page tables in host_pfn_mapping_level().
>
> Renaming is_private to is_gmem in this function implies that as long as
> gmem is used, especially for shared pages from gmem, lpage_info will
> always be updated and there's no need to query userspace page tables.

I understand.
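To spell out my reading, here is a sketch of the tail of
__kvm_mmu_max_mapping_level() with the private-only semantics kept (the
pre-rename shape from the hunk above; illustration only, not something
this patch changes):

	if (is_private)
		/* Private memory is never mapped into userspace. */
		return max_level;

	if (max_level == PG_LEVEL_4K)
		return PG_LEVEL_4K;

	/* Shared memory, gmem-backed or not, may be mapped by the host. */
	host_level = host_pfn_mapping_level(kvm, gfn, slot);
	return min(host_level, max_level);

So once guest_memfd memory can also be mapped shared by the host, only
the genuinely private case can keep skipping the userspace page-table
walk.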
> >  	if (max_level == PG_LEVEL_4K)
> > @@ -3283,10 +3283,9 @@ static int __kvm_mmu_max_mapping_level(struct kvm *kvm,
> >  int kvm_mmu_max_mapping_level(struct kvm *kvm,
> >  			      const struct kvm_memory_slot *slot, gfn_t gfn)
> >  {
> > -	bool is_private = kvm_slot_has_gmem(slot) &&
> > -			  kvm_mem_is_private(kvm, gfn);
> > +	bool is_gmem = kvm_slot_has_gmem(slot) && kvm_mem_from_gmem(kvm, gfn);
>
> This renaming should probably be undone too.

Ack.
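I.e., keep the local named is_private, along the lines of the removed
form in the hunk above (whether the check itself stays
kvm_mem_is_private() or goes through the new helper is a separate
choice):

	bool is_private = kvm_slot_has_gmem(slot) &&
			  kvm_mem_is_private(kvm, gfn);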
> > -	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, PG_LEVEL_NUM, is_private);
> > +	return __kvm_mmu_max_mapping_level(kvm, slot, gfn, PG_LEVEL_NUM, is_gmem);
> >  }
> >
> >  void kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > @@ -4465,7 +4464,7 @@ static inline u8 kvm_max_level_for_order(int order)
> >  	return PG_LEVEL_4K;
> >  }
> >
> > -static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
> > +static u8 kvm_max_gmem_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
> >  					u8 max_level, int gmem_order)
> >  {
> >  	u8 req_max_level;
> > @@ -4491,7 +4490,7 @@ static void kvm_mmu_finish_page_fault(struct kvm_vcpu *vcpu,
> >  				      r == RET_PF_RETRY, fault->map_writable);
> >  }
> >
> > -static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
> > +static int kvm_mmu_faultin_pfn_gmem(struct kvm_vcpu *vcpu,
> >  				       struct kvm_page_fault *fault)
> >  {
> >  	int max_order, r;
> > @@ -4509,8 +4508,8 @@ static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
> >  	}
> >
> >  	fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY);
> > -	fault->max_level = kvm_max_private_mapping_level(vcpu->kvm, fault->pfn,
> > -							 fault->max_level, max_order);
> > +	fault->max_level = kvm_max_gmem_mapping_level(vcpu->kvm, fault->pfn,
> > +						      fault->max_level, max_order);
> >
> >  	return RET_PF_CONTINUE;
> >  }
> > @@ -4521,7 +4520,7 @@ static int __kvm_mmu_faultin_pfn(struct kvm_vcpu *vcpu,
> >  	unsigned int foll = fault->write ? FOLL_WRITE : 0;
> >
> >  	if (fault->is_private)
> > -		return kvm_mmu_faultin_pfn_private(vcpu, fault);
> > +		return kvm_mmu_faultin_pfn_gmem(vcpu, fault);
> >
> >  	foll |= FOLL_NOWAIT;
> >  	fault->pfn = __kvm_faultin_pfn(fault->slot, fault->gfn, foll,
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index d9616ee6acc7..cdcd7ac091b5 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2514,6 +2514,12 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> >  }
> >  #endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
> >
> > +static inline bool kvm_mem_from_gmem(struct kvm *kvm, gfn_t gfn)
> > +{
> > +	/* For now, only private memory gets consumed from guest_memfd. */
> > +	return kvm_mem_is_private(kvm, gfn);
> > +}
>
> Can I understand this function as "should fault from gmem"? And hence
> also "was faulted from gmem"?
>
> After this entire patch series, for arm64, KVM will always service stage
> 2 faults from gmem.
>
> Perhaps this function should retain your suggested name of
> kvm_mem_from_gmem() but only depend on
> kvm_arch_gmem_supports_shared_mem(), since this patch series doesn't
> update the MMU in X86. So something like this,

Ack.

> +static inline bool kvm_mem_from_gmem(struct kvm *kvm, gfn_t gfn)
> +{
> +	return kvm_arch_gmem_supports_shared_mem(kvm);
> +}
>
> with the only usage in arm64.
>
> When the MMU code for X86 is updated, we could then update the above
> with
>
> static inline bool kvm_mem_from_gmem(struct kvm *kvm, gfn_t gfn)
> {
> -	return kvm_arch_gmem_supports_shared_mem(kvm);
> +	return kvm_arch_gmem_supports_shared_mem(kvm) ||
> +	       kvm_gmem_should_always_use_gmem(gfn_to_memslot(kvm, gfn)->gmem.file) ||
> +	       kvm_mem_is_private(kvm, gfn);
> }
>
> where kvm_gmem_should_always_use_gmem() will read a guest_memfd flag?

I'm not sure I follow this one... Could you please explain what you
mean a bit more?

Thanks,
/fuad

> > +
> >  #ifdef CONFIG_KVM_GMEM
> >  int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
> >  		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,