Date: Wed, 16 Oct 2024 21:42:53 +0200
From: Greg Kroah-Hartman
To: Yuanchu Xie
Cc: Wei Liu, Rob Bradford, Theodore Ts'o, Pasha Tatashin, linux-kernel@vger.kernel.org, linux-mm@kvack.org, virtualization@lists.linux.dev, dev@lists.cloudhypervisor.org
Subject: Re: [PATCH v3 1/2] virt: pvmemcontrol: control guest physical memory properties
Message-ID: <2024101628-audibly-maverick-e1fe@gregkh>
References: <20241016193947.48534-1-yuanchu@google.com>
In-Reply-To: <20241016193947.48534-1-yuanchu@google.com>

On Wed, Oct 16, 2024 at 12:39:46PM -0700, Yuanchu Xie wrote:
> Pvmemcontrol provides a way for the guest to control its physical memory
> properties and enables optimizations and security features. For example,
> the guest can provide information to the host where parts of a hugepage
> may be unbacked, or sensitive data may not be swapped out, etc.
>
> Pvmemcontrol allows guests to manipulate its gPTE entries in the SLAT,
> and also some other properties of the memory mapping on the host.
> This is achieved by using the KVM_CAP_SYNC_MMU capability. When this
> capability is available, the changes in the backing of the memory region
> on the host are automatically reflected into the guest. For example, an
> mmap() or madvise() that affects the region will be made visible
> immediately.
>
> There are two components of the implementation: the guest Linux driver
> and Virtual Machine Monitor (VMM) device. A guest-allocated shared
> buffer is negotiated per-cpu through a few PCI MMIO registers; the VMM
> device assigns a unique command for each per-cpu buffer. The guest
> writes its pvmemcontrol request in the per-cpu buffer, then writes the
> corresponding command into the command register, calling into the VMM
> device to perform the pvmemcontrol request.
>
> The synchronous per-cpu shared buffer approach avoids the kick and busy
> waiting that the guest would have to do with virtio virtqueue transport.
>
> User API
> From the userland, the pvmemcontrol guest driver is controlled via the
> ioctl(2) call. It requires CAP_SYS_ADMIN.
>
> ioctl(fd, PVMEMCONTROL_IOCTL, struct pvmemcontrol_buf *buf);
>
> Guest userland applications can tag VMAs and guest hugepages, or advise
> the host on how to handle sensitive guest pages.
>
> Supported function codes and their use cases:
> PVMEMCONTROL_FREE/REMOVE/DONTNEED/PAGEOUT. For the guest. One can reduce
> the struct page and page table lookup overhead by using hugepages backed
> by smaller pages on the host. These pvmemcontrol commands can allow for
> partial freeing of private guest hugepages to save memory. They also
> allow kernel memory, such as kernel stacks and task_structs to be
> paravirtualized if we expose kernel APIs.
>
> PVMEMCONTROL_MERGEABLE can inform the host KSM to deduplicate VM pages.
>
> PVMEMCONTROL_UNMERGEABLE is useful for security, when the VM does not
> want to share its backing pages.
> The same with PVMEMCONTROL_DONTDUMP, so sensitive pages are not included
> in a dump.
> MLOCK/UNLOCK can advise the host that sensitive information is not
> swapped out on the host.
>
> PVMEMCONTROL_MPROTECT_NONE/R/W/RW. For guest stacks backed by hugepages,
> stack guard pages can be handled in the host and memory can be saved in
> the hugepage.
>
> PVMEMCONTROL_SET_VMA_ANON_NAME is useful for observability and debugging
> how guest memory is being mapped on the host.
>
> Sample program making use of PVMEMCONTROL_DONTNEED:
> https://github.com/Dummyc0m/pvmemcontrol-user
>
> The VMM implementation is part of Cloud Hypervisor, the feature
> pvmemcontrol can be enabled and the VMM can then provide the device to a
> supporting guest.
> https://github.com/cloud-hypervisor/cloud-hypervisor
>
> -
> Changelog
> PATCH v2 -> v3
> - added PVMEMCONTROL_MERGEABLE for memory dedupe.
> - updated link to the upstream Cloud Hypervisor repo, and specify the
>   feature required to enable the device.
> PATCH v1 -> v2
> - fixed byte order sparse warning. ioread/write already does
>   little-endian.
> - add include for linux/percpu.h
> RFC v1 -> PATCH v1
> - renamed memctl to pvmemcontrol
> - defined device endianness as little endian

As per the kernel documentation, this changelog is in the wrong place.
Please put it in the correct location.

thanks,

greg k-h