From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C471C4332F for ; Wed, 1 Nov 2023 14:20:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D0E5F8D0048; Wed, 1 Nov 2023 10:20:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CBF128D0001; Wed, 1 Nov 2023 10:20:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B383A8D0048; Wed, 1 Nov 2023 10:20:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A09B78D0001 for ; Wed, 1 Nov 2023 10:20:07 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 44731120CCD for ; Wed, 1 Nov 2023 14:20:07 +0000 (UTC) X-FDA: 81409594854.22.2E2F286 Received: from mail-qv1-f43.google.com (mail-qv1-f43.google.com [209.85.219.43]) by imf30.hostedemail.com (Postfix) with ESMTP id 5692580029 for ; Wed, 1 Nov 2023 14:20:04 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=C3QphY2t; spf=pass (imf30.hostedemail.com: domain of tabba@google.com designates 209.85.219.43 as permitted sender) smtp.mailfrom=tabba@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1698848404; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Bjt3QR9Oyt0AfbOrG8Ks+Z2RSKpEyu9CpvIK66pVo08=; b=uDCV0bDouzGEuj/AsIUlmpinJDzF6zZa7ypaJ7HKAhYCEUCo7ojVkw+AQQFNVlXrka6sVm JES8JGF7a0oPIG1zDzFVvgAyr682bqEPOmmWDJKdsAliYBkFRLAg1Aa5ep7cdOiIAojpGW CWnldhcXRU3ZKCYlVN7ACRZVyElvW4Y= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=C3QphY2t; spf=pass (imf30.hostedemail.com: domain of tabba@google.com designates 209.85.219.43 as permitted sender) smtp.mailfrom=tabba@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1698848404; a=rsa-sha256; cv=none; b=pkg9nq1QyU1oRNtGDMwshV+DUlgQ7ThN1bvExpSdeVQBzn5edtRplf+Dr5qOGGz/yHnEAP rspKQE1S7msJCHvsL35MRTx+GaoCaInRLKcPtARfMTjLQ6+jCaFzycDEtV2OJ7cMakSCLL KCmijqZ0320dnzgPZbXJEhaov+yizKs= Received: by mail-qv1-f43.google.com with SMTP id 6a1803df08f44-66fa16092c0so47382396d6.0 for ; Wed, 01 Nov 2023 07:20:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698848403; x=1699453203; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Bjt3QR9Oyt0AfbOrG8Ks+Z2RSKpEyu9CpvIK66pVo08=; b=C3QphY2tLy0m9Sej1V4NiORass2n+JpjmpPLMabJYY8nees0PeHm8A2Ua8Hl/z/z0K ZKguH5+lsCIs8hIBDyrfVhyMZKuhMi7eRAY66ylNol3xQxVcQXwOVWQ+cHfm+qP0K2xz Tf3s+Z9y4f2rxcEx88b5FG/Tf0zgZi4lvkL80ZIAVuaw6Uun6Q05Sf1PJM4HcchiMg2x O6/Yu/FETF5vlMxtKmqdJjSKDAgQLKdK/CVDoEQSFqHXdOOaDXcfg4SXSc1MlCT4VrLO aT0WHM3n37XNjkMBSoD6vLoE/AtBbeRvT020NOvG2GBqBTAPQTwL4kh9wyJ8mtFUKyyv P5lw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698848403; x=1699453203; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Bjt3QR9Oyt0AfbOrG8Ks+Z2RSKpEyu9CpvIK66pVo08=; b=iwz2+Dt9cekbRWJwlvKKZuzK1BqFI1G9SDZXuIFZnI26idTdXUz9NObbMBKWtoM9Mg L4zGGJ1e6IZJMs3bfkMh15BIi0sZ2zaevIiepUF/hsm/Nnv23F0MG0hN/SbpJb3hwIRc iQ4BFkt9EW294Cb6uYnye8LakUGuwb0V0Mtj2/Wbq2cSbR2cxijnozW6TCmQ8HlguKn9 01uHme+l6nLdydHi0soN2I7s8Vx3KU+6sDFIwjBs3yr8QAnCyA8EtLQkwCgCqq2MXIny TW7L6D2ZVVnopu3LnD2W0pC2V3t2qUEvQovW79PRyIQHzJysbwLviWKfXLoRK19m2WXW mHHQ== X-Gm-Message-State: AOJu0YzgWVmR0iB2gYPN8yGY/Q8coaF7QvDc/5qY1ILf/ZMWlBIzU6o0 Pt5i+uzUhc1G6ifkAHQwRAR3fI8Zb1WEuT+wvYPPyA== X-Google-Smtp-Source: AGHT+IHK1fieD/BgK48oE6vj2OwKc097KW4M2L6co0EoihJHDvl5lPyutAVieGa2E49DOF2qJpRq12aLhod9YaGzUi0= X-Received: by 2002:ad4:596c:0:b0:65b:220d:72a5 with SMTP id eq12-20020ad4596c000000b0065b220d72a5mr16187853qvb.51.1698848403231; Wed, 01 Nov 2023 07:20:03 -0700 (PDT) MIME-Version: 1.0 References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-9-seanjc@google.com> In-Reply-To: <20231027182217.3615211-9-seanjc@google.com> From: Fuad Tabba Date: Wed, 1 Nov 2023 14:19:26 +0000 Message-ID: Subject: Re: [PATCH v13 08/35] KVM: Introduce KVM_SET_USER_MEMORY_REGION2 To: Sean Christopherson Cc: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton , kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Jarkko Sakkinen , Anish Moorthy , David Matlack , Yu Zhang , Isaku Yamahata , =?UTF-8?B?TWlja2HDq2wgU2FsYcO8bg==?= , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A . Shutemov" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 5692580029 X-Rspam-User: X-Stat-Signature: 6jpdjqak1fws79abbjxu1d7e5i9uiax5 X-Rspamd-Server: rspam01 X-HE-Tag: 1698848404-282077 X-HE-Meta: U2FsdGVkX19Ks1fnW3l869kumaBYiNWFJDOc3mzTFkO/e0cl87ams06OFr5RA4RpBItNgrm6MQNPp9GZZUJhZoUkoVVJ+G0DwOLGk31sMPHZnMRLh5DX/Sc6VYbJv9L7CV5oWmeAytL339ZERfW6hNExVVOP9G9Fd5/gRqko1tq7Cp7+elDbO4Gz6yJf5N5YHxy3KEcad41kkPSxZN/bsRikSd520b2jyHoHAkGHSy+aye/ddoA2eLIVrTLLImBqwqTjxIPXIB7/GmFg4nmHU9YEyA94Vy0S9kRTEdngti5bi2XuvVtlroXfHuVTGMWhFeC9fM8buobQgAzRc+Olqxsei638aehpp6c403MqW8YYtyKntQo9TDcnqBi9wACbAhfKYeJTNzZY5Mb1lzpa8e8xiUaQxHrICMiAFgzb2uAdqqP2Wa6tIUUFoOIRil5F02sbUud6xZ+eK0Q73xHV9Chk2fSfHQtai7RZNXmfFP/kO5tY88VuuBpGG6L/wQO9Tq8WvZCU/9PsBRYdKzyCUUto9sr34fjQOLm4BP+coeGB+wLQ8wEz04BAyYn8ePuXXb2PwRtFC2t4FQpZMM7Vu1RTtiu4w763aZXzuCh8L2+oN1ME51eDlU0jCGBmYTi64/2Vx/b+TiYrjAWAw1PUwYtjNw2VmgxWnM5AO+h+6sJj1hJ18MgksOSts8UX3cTUvPj4uK8ZeRJ8CWlCPUc0ihWEhlnG4S/taAo9z3QAvmchTHp5CEzYJtos3bSxCm2hBLu6nSh9FJtb03kJPsDzGeFDD0jJ5TrM9/KlZeSKFAxdKCVrw8KBKjvC7F2koHr4E5BtpraunqZtvLKXbbgoKAoSaWGxtHKZArXYda1X70AUg+tCuZ/1s1pqIJczr/xgyAv7mbGRwx/vrUw8sfd/58S4aqqPXKsupseDXp7KZ12Pyxq/KOGge/u8CKk6EsD9VT3wGMbC644gf218IlA bEqMiZRm INQzqeBzJHoPDXnlPNey4WkylnN7KmXLpmuYTNLWRAJPFy3DQIoZQ/P+jSS9nWLq08pTkUGiEe/nj0rSKVNnYKuRNEo2i2l9cmgkyDqNqsOkFEtivvhJOfPRs2NYaxxMmO9M0D/oPP+S8pjs4cwiIBUQvdpKh9jxVBNlscgVRAa1eq/mlvVugrm+cMD23sNN0AkD9lD/aW4izxtJ5zKMnndghLlt+eF6sBGuT0uFhohQcvkCNsxTM4fJQKzgqNj6MfcSMhyLVvVhDZFQ9oR7Pjn7j7XA8oxGE799244bqNI+wVxi+XkY5BcuUNepj8D8TWWYJ5HHsEL2YBOMNTBG50iF7A3A5ytzwR1q/jOsiCXi//EkGs9w4+y3glJYMqlYtyLUTPJZQ/YbDtEXgqoXCK6AJxpFqBhTw4ZVVmba2Ksb1BjD01k+YVr5mUDuBXFZetTzaiyHirnpO9MqvRrQdwIovo3L+S7PLzCmu X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Oct 27, 2023 at 7:22=E2=80=AFPM Sean Christopherson wrote: > > Introduce a "version 2" of KVM_SET_USER_MEMORY_REGION so that additional > information can be supplied without setting userspace up to fail. The > padding in the new kvm_userspace_memory_region2 structure will be used to > pass a file descriptor in addition to the userspace_addr, i.e. allow > userspace to point at a file descriptor and map memory into a guest that > is NOT mapped into host userspace. > > Alternatively, KVM could simply add "struct kvm_userspace_memory_region2" > without a new ioctl(), but as Paolo pointed out, adding a new ioctl() > makes detection of bad flags a bit more robust, e.g. if the new fd field > is guarded only by a flag and not a new ioctl(), then a userspace bug > (setting a "bad" flag) would generate out-of-bounds access instead of an > -EINVAL error. > > Cc: Jarkko Sakkinen > Reviewed-by: Paolo Bonzini > Reviewed-by: Xiaoyao Li > Signed-off-by: Sean Christopherson > --- With the missing pad in api.rst fixed: Reviewed-by: Fuad Tabba Tested-by: Fuad Tabba Cheers, /fuad > Documentation/virt/kvm/api.rst | 21 +++++++++++++++++++ > arch/x86/kvm/x86.c | 2 +- > include/linux/kvm_host.h | 4 ++-- > include/uapi/linux/kvm.h | 13 ++++++++++++ > virt/kvm/kvm_main.c | 38 +++++++++++++++++++++++++++------- > 5 files changed, 67 insertions(+), 11 deletions(-) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.= rst > index 21a7578142a1..ace984acc125 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -6070,6 +6070,27 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers = using the SET_ONE_REG > interface. No error will be returned, but the resulting offset will not = be > applied. > > +4.139 KVM_SET_USER_MEMORY_REGION2 > +--------------------------------- > + > +:Capability: KVM_CAP_USER_MEMORY2 > +:Architectures: all > +:Type: vm ioctl > +:Parameters: struct kvm_userspace_memory_region2 (in) > +:Returns: 0 on success, -1 on error > + > +:: > + > + struct kvm_userspace_memory_region2 { > + __u32 slot; > + __u32 flags; > + __u64 guest_phys_addr; > + __u64 memory_size; /* bytes */ > + __u64 userspace_addr; /* start of the userspace allocated memory = */ > + }; > + > +See KVM_SET_USER_MEMORY_REGION. > + > 5. The kvm_run structure > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 41cce5031126..6409914428ca 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -12455,7 +12455,7 @@ void __user * __x86_set_memory_region(struct kvm = *kvm, int id, gpa_t gpa, > } > > for (i =3D 0; i < KVM_ADDRESS_SPACE_NUM; i++) { > - struct kvm_userspace_memory_region m; > + struct kvm_userspace_memory_region2 m; > > m.slot =3D id | (i << 16); > m.flags =3D 0; > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 5faba69403ac..4e741ff27af3 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -1146,9 +1146,9 @@ enum kvm_mr_change { > }; > > int kvm_set_memory_region(struct kvm *kvm, > - const struct kvm_userspace_memory_region *mem); > + const struct kvm_userspace_memory_region2 *mem)= ; > int __kvm_set_memory_region(struct kvm *kvm, > - const struct kvm_userspace_memory_region *mem= ); > + const struct kvm_userspace_memory_region2 *me= m); > void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot= ); > void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen); > int kvm_arch_prepare_memory_region(struct kvm *kvm, > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index 13065dd96132..bd1abe067f28 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -95,6 +95,16 @@ struct kvm_userspace_memory_region { > __u64 userspace_addr; /* start of the userspace allocated memory = */ > }; > > +/* for KVM_SET_USER_MEMORY_REGION2 */ > +struct kvm_userspace_memory_region2 { > + __u32 slot; > + __u32 flags; > + __u64 guest_phys_addr; > + __u64 memory_size; > + __u64 userspace_addr; > + __u64 pad[16]; > +}; > + > /* > * The bit 0 ~ bit 15 of kvm_userspace_memory_region::flags are visible = for > * userspace, other bits are reserved for kvm internal use which are def= ined > @@ -1192,6 +1202,7 @@ struct kvm_ppc_resize_hpt { > #define KVM_CAP_COUNTER_OFFSET 227 > #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228 > #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 > +#define KVM_CAP_USER_MEMORY2 230 > > #ifdef KVM_CAP_IRQ_ROUTING > > @@ -1473,6 +1484,8 @@ struct kvm_vfio_spapr_tce { > struct kvm_userspace_memory_regio= n) > #define KVM_SET_TSS_ADDR _IO(KVMIO, 0x47) > #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO, 0x48, __u64) > +#define KVM_SET_USER_MEMORY_REGION2 _IOW(KVMIO, 0x49, \ > + struct kvm_userspace_memory_regi= on2) > > /* enable ucontrol for s390 */ > struct kvm_s390_ucas_mapping { > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 6e708017064d..3f5b7c2c5327 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -1578,7 +1578,7 @@ static void kvm_replace_memslot(struct kvm *kvm, > } > } > > -static int check_memory_region_flags(const struct kvm_userspace_memory_r= egion *mem) > +static int check_memory_region_flags(const struct kvm_userspace_memory_r= egion2 *mem) > { > u32 valid_flags =3D KVM_MEM_LOG_DIRTY_PAGES; > > @@ -1980,7 +1980,7 @@ static bool kvm_check_memslot_overlap(struct kvm_me= mslots *slots, int id, > * Must be called holding kvm->slots_lock for write. > */ > int __kvm_set_memory_region(struct kvm *kvm, > - const struct kvm_userspace_memory_region *mem= ) > + const struct kvm_userspace_memory_region2 *me= m) > { > struct kvm_memory_slot *old, *new; > struct kvm_memslots *slots; > @@ -2084,7 +2084,7 @@ int __kvm_set_memory_region(struct kvm *kvm, > EXPORT_SYMBOL_GPL(__kvm_set_memory_region); > > int kvm_set_memory_region(struct kvm *kvm, > - const struct kvm_userspace_memory_region *mem) > + const struct kvm_userspace_memory_region2 *mem) > { > int r; > > @@ -2096,7 +2096,7 @@ int kvm_set_memory_region(struct kvm *kvm, > EXPORT_SYMBOL_GPL(kvm_set_memory_region); > > static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm, > - struct kvm_userspace_memory_reg= ion *mem) > + struct kvm_userspace_memory_reg= ion2 *mem) > { > if ((u16)mem->slot >=3D KVM_USER_MEM_SLOTS) > return -EINVAL; > @@ -4566,6 +4566,7 @@ static int kvm_vm_ioctl_check_extension_generic(str= uct kvm *kvm, long arg) > { > switch (arg) { > case KVM_CAP_USER_MEMORY: > + case KVM_CAP_USER_MEMORY2: > case KVM_CAP_DESTROY_MEMORY_REGION_WORKS: > case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS: > case KVM_CAP_INTERNAL_ERROR_DATA: > @@ -4821,6 +4822,14 @@ static int kvm_vm_ioctl_get_stats_fd(struct kvm *k= vm) > return fd; > } > > +#define SANITY_CHECK_MEM_REGION_FIELD(field) = \ > +do { = \ > + BUILD_BUG_ON(offsetof(struct kvm_userspace_memory_region, field) = !=3D \ > + offsetof(struct kvm_userspace_memory_region2, field)= ); \ > + BUILD_BUG_ON(sizeof_field(struct kvm_userspace_memory_region, fie= ld) !=3D \ > + sizeof_field(struct kvm_userspace_memory_region2, fi= eld)); \ > +} while (0) > + > static long kvm_vm_ioctl(struct file *filp, > unsigned int ioctl, unsigned long arg) > { > @@ -4843,15 +4852,28 @@ static long kvm_vm_ioctl(struct file *filp, > r =3D kvm_vm_ioctl_enable_cap_generic(kvm, &cap); > break; > } > + case KVM_SET_USER_MEMORY_REGION2: > case KVM_SET_USER_MEMORY_REGION: { > - struct kvm_userspace_memory_region kvm_userspace_mem; > + struct kvm_userspace_memory_region2 mem; > + unsigned long size; > + > + if (ioctl =3D=3D KVM_SET_USER_MEMORY_REGION) > + size =3D sizeof(struct kvm_userspace_memory_regio= n); > + else > + size =3D sizeof(struct kvm_userspace_memory_regio= n2); > + > + /* Ensure the common parts of the two structs are identic= al. */ > + SANITY_CHECK_MEM_REGION_FIELD(slot); > + SANITY_CHECK_MEM_REGION_FIELD(flags); > + SANITY_CHECK_MEM_REGION_FIELD(guest_phys_addr); > + SANITY_CHECK_MEM_REGION_FIELD(memory_size); > + SANITY_CHECK_MEM_REGION_FIELD(userspace_addr); > > r =3D -EFAULT; > - if (copy_from_user(&kvm_userspace_mem, argp, > - sizeof(kvm_userspace_mem)= )) > + if (copy_from_user(&mem, argp, size)) > goto out; > > - r =3D kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_= mem); > + r =3D kvm_vm_ioctl_set_memory_region(kvm, &mem); > break; > } > case KVM_GET_DIRTY_LOG: { > -- > 2.42.0.820.g83a721a137-goog >