From: Fuad Tabba
Date: Mon, 26 Feb 2024 09:03:40 +0000
Subject: Re: [RFC PATCH v1 00/26] KVM: Restricted mapping of guest_memfd at the host and pKVM/arm64 support
To: Fuad Tabba, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, brauner@kernel.org,
 willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com,
 yilun.xu@intel.com, chao.p.peng@linux.intel.com, jarkko@kernel.org,
 amoorthy@google.com, dmatlack@google.com, yu.c.zhang@linux.intel.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
 catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
 oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com
References: <20240222161047.402609-1-tabba@google.com>
 <20240222141602976-0800.eberman@hu-eberman-lv.qualcomm.com>
In-Reply-To: <20240222141602976-0800.eberman@hu-eberman-lv.qualcomm.com>

Hi Elliot,

On Thu, Feb 22, 2024 at 11:44 PM Elliot Berman wrote:
>
> On Thu, Feb 22, 2024 at 04:10:21PM +0000, Fuad Tabba wrote:
> > This series adds restricted mmap() support to guest_memfd [1], as
> > well as support for guest_memfd on pKVM/arm64.
> >
> > We haven't started using this in Android yet, but we aim to move
> > away from anonymous memory to guest_memfd once we have the
> > necessary support merged upstream. Others (e.g., Gunyah [8]) are
> > also looking into guest_memfd for similar reasons as us.
>
> I'm especially interested to see if we can factor out much of the common
> implementation bits between KVM and Gunyah. In principle, we're doing
> the same thing: the difference is the exact mechanics to interact with
> the hypervisor which (I think) could be easily extracted into an ops
> structure.

I agree. We should share and reuse as much code as possible. I'll sync
with you before the V2 of this series.

> [...]
>
> > In addition to general feedback, we would like feedback on how we
> > handle mmap() and faulting-in guest pages at the host (KVM: Add
> > restricted support for mapping guest_memfd by the host).
> >
> > We don't enforce the invariant that only memory shared with the
> > host can be mapped by the host userspace in
> > file_operations::mmap(), but in vm_operations_struct::fault(). On
> > vm_operations_struct::fault(), we check whether the page is
> > shared with the host. If not, we deliver a SIGBUS to the current
> > task. The reason for enforcing this at fault() is that mmap()
> > does not elevate the pagecount(); it's the faulting in of the
> > page which does. Even if we were to check at mmap() whether an
> > address can be mapped, we would still need to check again on
> > fault(), since between mmap() and fault() the status of the page
> > can change.
> >
> > This creates the situation where access to successfully mmap()'d
> > memory might SIGBUS at page fault.
> > There is precedent for similar behavior in the kernel, I believe,
> > with MADV_HWPOISON and the hugetlbfs cgroups controller, which
> > could SIGBUS at page fault time depending on the accounting limit.
>
> I added a test, folio_mmapped() [1], which checks if there's a vma
> covering the corresponding offset into the guest_memfd. I use this
> test before trying to make a page private to the guest, and I've been
> able to ensure that userspace can't even mmap() private guest memory.
> If I try to make memory private, I can test that it's not mmapped and
> not allow memory to become private. In my testing so far, this is
> enough to prevent SIGBUS from happening.
>
> This test probably should be moved outside Gunyah-specific code, and I
> was looking for a maintainer to suggest the right home for it :)
>
> [1]: https://lore.kernel.org/all/20240222-gunyah-v17-20-1e9da6763d38@quicinc.com/

Let's see what the mm-folks think about this [*].

Thanks!
/fuad

[*] https://lore.kernel.org/all/ZdfoR3nCEP3HTtm1@casper.infradead.org/

> >
> > Another pKVM-specific aspect we would like feedback on is how to
> > handle memory mapped by the host being unshared by a guest. The
> > approach we've taken is that on an unshare call from the guest,
> > the host userspace is notified that the memory has been unshared,
> > in order to allow it to unmap it and mark it as PRIVATE as
> > acknowledgment. If the host does not unmap the memory, the
> > unshare call issued by the guest fails, which the guest is
> > informed about on return.
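
For readers following the thread, the two rules quoted above (check at
fault() rather than at mmap(), and refuse an unshare while the host
still maps the page) can be illustrated with a small userspace toy
model. This is only a sketch under stated assumptions: ToyGuestMemfd,
BusError, PageState and every method name are hypothetical and are not
kernel, KVM or Gunyah interfaces. The toy tracks faulted-in pages,
whereas the folio_mmapped() test described above looks for a covering
vma.

```python
from enum import Enum


class PageState(Enum):
    SHARED = "shared"    # host may fault the page in
    PRIVATE = "private"  # guest-only; host access must fail


class BusError(Exception):
    """Stands in for the SIGBUS delivered to the faulting task."""


class ToyGuestMemfd:
    """Userspace toy model; none of these names are real kernel APIs."""

    def __init__(self, nr_pages):
        self.state = [PageState.PRIVATE] * nr_pages
        self.host_mapped = set()  # page indices the host has faulted in

    def mmap(self, start, nr_pages):
        # No shared/private check here: setting up the mapping takes no
        # page references, and page state can change before first access.
        return range(start, start + nr_pages)

    def fault(self, index):
        # The invariant is enforced at fault time, when the page would
        # actually be mapped into the host.
        if self.state[index] is not PageState.SHARED:
            raise BusError(f"page {index} is not shared with the host")
        self.host_mapped.add(index)

    def host_unmap(self, index):
        # Host userspace acknowledges an unshare by dropping its mapping.
        self.host_mapped.discard(index)

    def share(self, index):
        self.state[index] = PageState.SHARED

    def unshare(self, index):
        # Guest-initiated unshare fails while the host still maps the page.
        if index in self.host_mapped:
            return False
        self.state[index] = PageState.PRIVATE
        return True
```

In this model, mmap() over a fully private range succeeds, fault() on a
private page raises BusError, and unshare() on a page returns False
until host_unmap() has been called for it.
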
> >
> > Cheers,
> > /fuad
> >
> > [1] https://lore.kernel.org/all/20231105163040.14904-1-pbonzini@redhat.com/
> >
> > [2] https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-core
> >
> > [3] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.8-rfc-v1
> >
> > [4] https://android-kvm.googlesource.com/kvmtool/+/refs/heads/tabba/guestmem-6.8
> >
> > [5] Protected KVM on arm64 (slides)
> > https://static.sched.com/hosted_files/kvmforum2022/88/KVM%20forum%202022%20-%20pKVM%20deep%20dive.pdf
> >
> > [6] Protected KVM on arm64 (video)
> > https://www.youtube.com/watch?v=9npebeVFbFw
> >
> > [7] Supporting guest private memory in Protected KVM on Android (presentation)
> > https://lpc.events/event/17/contributions/1487/
> >
> > [8] Drivers for Gunyah (patch series)
> > https://lore.kernel.org/all/20240109-gunyah-v16-0-634904bf4ce9@quicinc.com/
>
> As of 5 min ago when I sent this, there's a v17:
> https://lore.kernel.org/all/20240222-gunyah-v17-0-1e9da6763d38@quicinc.com/
>
> Thanks,
> Elliot
>