Date: Mon, 18 Sep 2023 17:08:40 -0700
In-Reply-To: <20230918180754.iomoaqnw75j3rrxb@amd.com>
References: <20230914015531.1419405-1-seanjc@google.com> <20230914015531.1419405-11-seanjc@google.com> <20230918180754.iomoaqnw75j3rrxb@amd.com>
Subject: Re: [RFC PATCH v12 10/33] KVM: Set the stage for handling only shared mappings in mmu_notifier events
From: Sean Christopherson
To: Michael Roth
Cc: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, "Matthew Wilcox (Oracle)", Andrew Morton, Paul Moore, James Morris, "Serge E. Hallyn", kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, Chao Peng, Fuad Tabba, Jarkko Sakkinen, Anish Moorthy, Yu Zhang, Isaku Yamahata, Xu Yilun, Vlastimil Babka, Vishal Annapurve, Ackerley Tng, Maciej Szmigiero, David Hildenbrand, Quentin Perret, Wang, Liam Merwick, Isaku Yamahata, "Kirill A . Shutemov", Ashish Kalra

On Mon, Sep 18, 2023, Michael Roth wrote:
> On Wed, Sep 13, 2023 at 06:55:08PM -0700, Sean Christopherson wrote:
> > Add flags to "struct kvm_gfn_range" to let notifier events target only
> > shared and only private mappings, and wire up the existing mmu_notifier
> > events to be shared-only (private memory is never associated with a
> > userspace virtual address, i.e. can't be reached via mmu_notifiers).
> >
> > Add two flags so that KVM can handle the three possibilities (shared,
> > private, and shared+private) without needing something like a tri-state
> > enum.
> >
> > Link: https://lore.kernel.org/all/ZJX0hk+KpQP0KUyB@google.com
> > Signed-off-by: Sean Christopherson
> > ---
> >  include/linux/kvm_host.h | 2 ++
> >  virt/kvm/kvm_main.c      | 7 +++++++
> >  2 files changed, 9 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index d8c6ce6c8211..b5373cee2b08 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -263,6 +263,8 @@ struct kvm_gfn_range {
> >  	gfn_t start;
> >  	gfn_t end;
> >  	union kvm_mmu_notifier_arg arg;
> > +	bool only_private;
> > +	bool only_shared;
> >  	bool may_block;
> >  };
> >  bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 174de2789657..a41f8658dfe0 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -635,6 +635,13 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
> >  			 * the second or later invocation of the handler).
> >  			 */
> >  			gfn_range.arg = range->arg;
> > +
> > +			/*
> > +			 * HVA-based notifications aren't relevant to private
> > +			 * mappings as they don't have a userspace mapping.
> > +			 */
> > +			gfn_range.only_private = false;
> > +			gfn_range.only_shared = true;
> >  			gfn_range.may_block = range->may_block;
>
> Who is supposed to read only_private/only_shared? Is it supposed to be
> plumbed into arch code and handled specially there?

Yeah, that's the idea.  Though I don't know that it's worth using for SNP, the
cost of checking the RMP may be higher than just eating the extra faults.

> I ask because I see elsewhere you have:
>
>     /*
>      * If one or more memslots were found and thus zapped, notify arch code
>      * that guest memory has been reclaimed.  This needs to be done *after*
>      * dropping mmu_lock, as x86's reclaim path is slooooow.
>      */
>     if (__kvm_handle_hva_range(kvm, &hva_range).found_memslot)
>             kvm_arch_guest_memory_reclaimed(kvm);
>
> and if there are any MMU notifier events that touch HVAs, then
> kvm_arch_guest_memory_reclaimed()->wbinvd_on_all_cpus() will get called,
> which causes the performance issues for SEV and SNP that Ashish had brought
> up. Technically that would only need to happen if there are GPAs in that
> memslot that aren't currently backed by gmem pages (and then gmem could handle
> its own wbinvd_on_all_cpus() (or maybe clflush per-page)).
>
> Actually, even if there are shared pages in the GPA range, the
> kvm_arch_guest_memory_reclaimed()->wbinvd_on_all_cpus() can be skipped for
> guests that only use gmem pages for private memory. Is that acceptable?

Yes, that was my original plan.  I may have forgotten that exact plan at one
point or another and not communicated it well.  But the idea is definitely that
if a VM type, a.k.a. SNP guests, is required to use gmem for private memory,
then there's no need to blast WBINVD because barring a KVM bug, the mmu_notifier
event can't have freed private memory, even if it *did* zap SPTEs.

For gmem, if KVM doesn't precisely zap only shared SPTEs for SNP (is that even
possible to do race-free?), then KVM needs to blast WBINVD when freeing memory
from gmem even if there are no SPTEs.  But that seems like a non-issue for a
well-behaved setup because the odds of there being *zero* SPTEs should be nil.

> Just trying to figure out where this only_private/only_shared handling ties
> into that (or if it's a separate thing entirely).

It's mostly a TDX thing.  I threw it in this series mostly to "formally"
document that the mmu_notifier path only affects shared mappings.  If the code
causes confusion without the TDX context, and won't be used by SNP, we can and
should drop it from the initial merge and have it go along with the TDX series.
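
To make the two-flag scheme and the WBINVD discussion concrete, here is a
small user-space model.  It is only a sketch under stated assumptions, not code
from the series: the only_private/only_shared field names come from the patch,
but struct gfn_range_filter, filter_is_valid(), filter_matches() and
needs_cache_flush() are hypothetical names invented for illustration, and the
flush rule is just one reading of the reply above.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two fields the patch adds to
 * struct kvm_gfn_range.  Both false means "shared + private".
 */
struct gfn_range_filter {
	bool only_private;
	bool only_shared;
};

/* Assumption: setting both flags at once is meaningless, so reject it. */
static bool filter_is_valid(const struct gfn_range_filter *f)
{
	return !(f->only_private && f->only_shared);
}

/*
 * Hypothetical consumer: should a mapping of the given kind be acted on
 * for this range?  Two bools cover all three cases without a tri-state enum.
 */
static bool filter_matches(const struct gfn_range_filter *f,
			   bool mapping_is_private)
{
	if (f->only_private && !mapping_is_private)
		return false;
	if (f->only_shared && mapping_is_private)
		return false;
	return true;
}

/*
 * Hypothetical model of the reclaim decision discussed in the thread: if the
 * VM type must back private memory with gmem, an mmu_notifier event cannot
 * have freed private memory, so the expensive flush can be skipped.
 */
static bool needs_cache_flush(bool found_memslot,
			      bool private_mem_is_gmem_only)
{
	return found_memslot && !private_mem_is_gmem_only;
}
```

With the values the patch hard-codes for HVA-based events (only_private =
false, only_shared = true), filter_matches() accepts shared mappings and
rejects private ones, which is the "shared-only" behavior the changelog
describes.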