From: Ackerley Tng
To: Sean Christopherson, Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	kvmarm@lists.linux.dev, pbonzini@redhat.com, chenhuacai@kernel.org,
	mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk,
	brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
	xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
	jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
	isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
	vannapurve@google.com, mail@maciej.szmigiero.name, david@redhat.com,
	michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
	isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
	suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
	oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
	qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
	shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
	rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
	hughd@google.com, jthoughton@google.com, peterx@redhat.com,
	pankaj.gupta@amd.com, ira.weiny@intel.com
Date: Fri, 27 Jun 2025 08:01:02 -0700
Subject: Re: [PATCH v12 10/18] KVM: x86/mmu: Handle guest page faults for
	guest_memfd with shared memory
References: <20250611133330.1514028-1-tabba@google.com>
	<20250611133330.1514028-11-tabba@google.com>

Ackerley Tng writes:

> [...]
>>> +/*
>>> + * Returns true if the given gfn's private/shared status (in the CoCo sense) is
>>> + * private.
>>> + *
>>> + * A return value of false indicates that the gfn is explicitly or implicitly
>>> + * shared (i.e., non-CoCo VMs).
>>> + */
>>>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>>  {
>>> -	return IS_ENABLED(CONFIG_KVM_GMEM) &&
>>> -	       kvm_get_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
>>> +	struct kvm_memory_slot *slot;
>>> +
>>> +	if (!IS_ENABLED(CONFIG_KVM_GMEM))
>>> +		return false;
>>> +
>>> +	slot = gfn_to_memslot(kvm, gfn);
>>> +	if (kvm_slot_has_gmem(slot) && kvm_gmem_memslot_supports_shared(slot)) {
>>> +		/*
>>> +		 * Without in-place conversion support, if a guest_memfd memslot
>>> +		 * supports shared memory, then all the slot's memory is
>>> +		 * considered not private, i.e., implicitly shared.
>>> +		 */
>>> +		return false;
>>
>> Why!?!?  Just make sure KVM_MEMORY_ATTRIBUTE_PRIVATE is mutually exclusive with
>> mappable guest_memfd.  You need to do that no matter what.
>
> Thanks, I agree that setting KVM_MEMORY_ATTRIBUTE_PRIVATE should be
> disallowed for gfn ranges whose slot is guest_memfd-only. Missed that
> out. Where do people think we should check the mutual exclusivity?
>
> In kvm_supported_mem_attributes() I'm thinking that we should still allow
> the use of KVM_MEMORY_ATTRIBUTE_PRIVATE for other non-guest_memfd-only
> gfn ranges. Or do people think we should just disallow
> KVM_MEMORY_ATTRIBUTE_PRIVATE for the entire VM as long as one memslot is
> a guest_memfd-only memslot?
>
> If we check mutual exclusivity when handling
> kvm_vm_set_memory_attributes(), as long as part of the range where
> KVM_MEMORY_ATTRIBUTE_PRIVATE is requested to be set intersects a range
> whose slot is guest_memfd-only, the ioctl will return EINVAL.
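
To make the ioctl-time option quoted above concrete, here is a minimal
userspace sketch of the range check. It is not kernel code and not taken
from the series: struct model_slot, slot_intersects() and
model_check_set_private() are hypothetical names, and the slot list is a
plain array standing in for KVM's memslots.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical, simplified stand-in for struct kvm_memory_slot. */
struct model_slot {
	gfn_t base_gfn;
	uint64_t npages;
	bool gmem_only;	/* assumption: all guest faults come from guest_memfd */
};

/* True if [start, end) overlaps the gfn range covered by the slot. */
static bool slot_intersects(const struct model_slot *slot,
			    gfn_t start, gfn_t end)
{
	return start < slot->base_gfn + slot->npages && slot->base_gfn < end;
}

/*
 * Models the check discussed in the quoted text: a request to set
 * KVM_MEMORY_ATTRIBUTE_PRIVATE on [start, end) is rejected as soon as
 * any part of the range intersects a guest_memfd-only slot.
 * Returns 0 if the request would be allowed, -EINVAL otherwise.
 */
static int model_check_set_private(const struct model_slot *slots,
				   size_t nr_slots, gfn_t start, gfn_t end)
{
	assert(start < end);

	for (size_t i = 0; i < nr_slots; i++) {
		if (slots[i].gmem_only &&
		    slot_intersects(&slots[i], start, end))
			return -EINVAL;
	}
	return 0;
}
```

With this shape, ranges that only touch non-guest_memfd-only slots can
still be marked private, which corresponds to the per-range alternative
rather than disallowing the attribute for the whole VM.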

At yesterday's (2025-06-26) guest_memfd upstream call discussion,

* Fuad brought up a possible use case where, within the *same* VM, we
  want to allow both memslots that support mmap in guest_memfd and
  memslots that do not.
* Shivank suggested a concrete use case for this: the user wants a
  guest_memfd memslot that supports mmap just so userspace addresses can
  be used as references for specifying memory policy.
* Sean then added that allowing both types of guest_memfd memslots
  (supporting and not supporting mmap) gives the user a second layer of
  protection: for some memslots, the user can rely on never being able
  to mmap from the memslot.

I agree it will be useful to allow both guest_memfd memslots that
support and do not support mmap in a single VM.

I think I found an issue with flags, which is that GUEST_MEMFD_FLAG_MMAP
should not imply that the guest_memfd will provide memory for all guest
faults within the memslot's gfn range (KVM_MEMSLOT_GMEM_ONLY).

For the use case Shivank raised, if the user wants a guest_memfd memslot
that supports mmap just so userspace addresses can be used as references
for specifying memory policy for legacy CoCo VMs where shared memory
should still come from other sources, GUEST_MEMFD_FLAG_MMAP will be set,
but KVM can't fault shared memory from guest_memfd. Hence,
GUEST_MEMFD_FLAG_MMAP should not imply KVM_MEMSLOT_GMEM_ONLY.

Thinking forward, if we want guest_memfd to provide (no-mmap) protection
even for non-CoCo VMs (such that perhaps the initial VM image is
populated and then VM memory should never be mmap-ed at all), we will
want guest_memfd to be the source of memory even if
GUEST_MEMFD_FLAG_MMAP is not set.

I propose that we should have a single VM-level flag to solve this (in
line with Sean's guideline that we should just move towards what we want
and not support non-existent use cases): something like
KVM_CAP_PREFER_GMEM_MEMORY.

If KVM_CAP_PREFER_GMEM_MEMORY is set,

* memory for any gfn range in a guest_memfd memslot will be requested
  from guest_memfd
* any privacy status queries will also be directed to guest_memfd
* KVM_MEMORY_ATTRIBUTE_PRIVATE will not be a valid attribute

KVM_CAP_PREFER_GMEM_MEMORY will be orthogonal to GUEST_MEMFD_FLAG_MMAP,
with no validation between the two; GUEST_MEMFD_FLAG_MMAP should purely
guard mmap support in guest_memfd.

Here's a table that I set up [1]. I believe the proposed
KVM_CAP_PREFER_GMEM_MEMORY (column 7) lines up with requirements
(columns 1 to 4) correctly.

[1] https://lpc.events/event/18/contributions/1764/attachments/1409/3710/guest_memfd%20use%20cases%20vs%20guest_memfd%20flags%20and%20privacy%20tracking.pdf

> [...]
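
The proposed semantics can be sketched as a small userspace model. This
is only an illustration under stated assumptions, not kernel code:
KVM_CAP_PREFER_GMEM_MEMORY is a proposal and not existing UAPI, every
struct and function name below is hypothetical, and the behaviour
without the capability (private memory from guest_memfd, shared memory
from other sources) is my reading of the legacy CoCo case described
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in KVM. */
struct model_vm {
	bool prefer_gmem;	/* proposed KVM_CAP_PREFER_GMEM_MEMORY */
};

struct model_gmem_slot {
	bool has_gmem;		/* memslot is backed by a guest_memfd */
	bool gmem_mmap;		/* guest_memfd has GUEST_MEMFD_FLAG_MMAP */
};

enum model_mem_source {
	MODEL_SRC_OTHER,	/* e.g. the memslot's userspace address */
	MODEL_SRC_GUEST_MEMFD,
};

/* Where memory for a guest fault in this slot would be requested from. */
static enum model_mem_source
model_fault_source(const struct model_vm *vm,
		   const struct model_gmem_slot *slot, bool gfn_is_private)
{
	assert(vm != NULL && slot != NULL);

	if (!slot->has_gmem)
		return MODEL_SRC_OTHER;
	if (vm->prefer_gmem)
		return MODEL_SRC_GUEST_MEMFD;	/* regardless of gmem_mmap */
	/* Assumed legacy CoCo behaviour: only private memory uses guest_memfd. */
	return gfn_is_private ? MODEL_SRC_GUEST_MEMFD : MODEL_SRC_OTHER;
}

/* KVM_MEMORY_ATTRIBUTE_PRIVATE is not a valid attribute once the cap is set. */
static bool model_private_attr_valid(const struct model_vm *vm)
{
	assert(vm != NULL);
	return !vm->prefer_gmem;
}

/* mmap support is guarded purely by the guest_memfd flag, not by the cap. */
static bool model_mmap_allowed(const struct model_gmem_slot *slot)
{
	assert(slot != NULL);
	return slot->has_gmem && slot->gmem_mmap;
}
```

The model keeps the two knobs independent on purpose: a slot with
GUEST_MEMFD_FLAG_MMAP in a VM without the capability still takes shared
memory from other sources (Shivank's memory policy case), and a slot
without the flag in a VM with the capability is still served from
guest_memfd (the no-mmap protection case).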