From mboxrd@z Thu Jan  1 00:00:00 1970
From: Fuad Tabba <tabba@google.com>
Date: Thu, 2 Nov 2023 13:55:03 +0000
Subject: Re: [PATCH v13 10/35] KVM: Add a dedicated mmu_notifier flag for reclaiming freed memory
To: Sean Christopherson
Cc: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
    Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Viro,
    Christian Brauner, "Matthew Wilcox (Oracle)", Andrew Morton,
    kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li, Xu Yilun,
    Chao Peng, Jarkko Sakkinen, Anish Moorthy, David Matlack, Yu Zhang,
    Isaku Yamahata, Mickaël Salaün, Vlastimil Babka, Vishal Annapurve,
    Ackerley Tng, Maciej Szmigiero, David Hildenbrand, Quentin Perret,
    Michael Roth, Wang, Liam Merwick, Isaku Yamahata, "Kirill A. Shutemov"
In-Reply-To: <20231027182217.3615211-11-seanjc@google.com>
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-11-seanjc@google.com>
Content-Type: text/plain; charset="UTF-8"

On Fri, Oct 27, 2023 at 7:22 PM Sean Christopherson wrote:
>
> Handle AMD SEV's kvm_arch_guest_memory_reclaimed() hook by having
> __kvm_handle_hva_range() return whether or not an overlapping memslot
> was found, i.e. mmu_lock was acquired.  Using the .on_unlock() hook
> works, but kvm_arch_guest_memory_reclaimed() needs to run after dropping
> mmu_lock, which makes .on_lock() and .on_unlock() asymmetrical.
>
> Use a small struct to return the tuple of the notifier-specific return,
> plus whether or not overlap was found.  Because the iteration helpers are
> __always_inlined, practically speaking, the struct will never actually be
> returned from a function call (not to mention the size of the struct will
> be two bytes in practice).
>
> Signed-off-by: Sean Christopherson
> ---

Reviewed-by: Fuad Tabba
Tested-by: Fuad Tabba

Cheers,
/fuad

>  virt/kvm/kvm_main.c | 53 +++++++++++++++++++++++++++++++--------------
>  1 file changed, 37 insertions(+), 16 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 3f5b7c2c5327..2bc04c8ae1f4 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -561,6 +561,19 @@ struct kvm_mmu_notifier_range {
>  	bool may_block;
>  };
>
> +/*
> + * The inner-most helper returns a tuple containing the return value from the
> + * arch- and action-specific handler, plus a flag indicating whether or not at
> + * least one memslot was found, i.e. if the handler found guest memory.
> + *
> + * Note, most notifiers are averse to booleans, so even though KVM tracks the
> + * return from arch code as a bool, outer helpers will cast it to an int. :-(
> + */
> +typedef struct kvm_mmu_notifier_return {
> +	bool ret;
> +	bool found_memslot;
> +} kvm_mn_ret_t;
> +
>  /*
>   * Use a dedicated stub instead of NULL to indicate that there is no callback
>   * function/handler.  The compiler technically can't guarantee that a real
> @@ -582,22 +595,25 @@ static const union kvm_mmu_notifier_arg KVM_MMU_NOTIFIER_NO_ARG;
>  	     node;							\
>  	     node = interval_tree_iter_next(node, start, last))	\
>
> -static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
> -						  const struct kvm_mmu_notifier_range *range)
> +static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
> +							   const struct kvm_mmu_notifier_range *range)
>  {
> -	bool ret = false, locked = false;
> +	struct kvm_mmu_notifier_return r = {
> +		.ret = false,
> +		.found_memslot = false,
> +	};
>  	struct kvm_gfn_range gfn_range;
>  	struct kvm_memory_slot *slot;
>  	struct kvm_memslots *slots;
>  	int i, idx;
>
>  	if (WARN_ON_ONCE(range->end <= range->start))
> -		return 0;
> +		return r;
>
>  	/* A null handler is allowed if and only if on_lock() is provided. */
>  	if (WARN_ON_ONCE(IS_KVM_NULL_FN(range->on_lock) &&
>  			 IS_KVM_NULL_FN(range->handler)))
> -		return 0;
> +		return r;
>
>  	idx = srcu_read_lock(&kvm->srcu);
>
> @@ -631,8 +647,8 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>  			gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot);
>  			gfn_range.slot = slot;
>
> -			if (!locked) {
> -				locked = true;
> +			if (!r.found_memslot) {
> +				r.found_memslot = true;
>  				KVM_MMU_LOCK(kvm);
>  				if (!IS_KVM_NULL_FN(range->on_lock))
>  					range->on_lock(kvm);
> @@ -640,14 +656,14 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>  				if (IS_KVM_NULL_FN(range->handler))
>  					break;
>  			}
> -			ret |= range->handler(kvm, &gfn_range);
> +			r.ret |= range->handler(kvm, &gfn_range);
>  		}
>  	}
>
> -	if (range->flush_on_ret && ret)
> +	if (range->flush_on_ret && r.ret)
>  		kvm_flush_remote_tlbs(kvm);
>
> -	if (locked) {
> +	if (r.found_memslot) {
>  		KVM_MMU_UNLOCK(kvm);
>  		if (!IS_KVM_NULL_FN(range->on_unlock))
>  			range->on_unlock(kvm);
> @@ -655,8 +671,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
>
>  	srcu_read_unlock(&kvm->srcu, idx);
>
> -	/* The notifiers are averse to booleans. :-( */
> -	return (int)ret;
> +	return r;
>  }
>
>  static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
> @@ -677,7 +692,7 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn,
>  		.may_block	= false,
>  	};
>
> -	return __kvm_handle_hva_range(kvm, &range);
> +	return __kvm_handle_hva_range(kvm, &range).ret;
>  }
>
>  static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn,
> @@ -696,7 +711,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn
>  		.may_block	= false,
>  	};
>
> -	return __kvm_handle_hva_range(kvm, &range);
> +	return __kvm_handle_hva_range(kvm, &range).ret;
>  }
>
>  static bool kvm_change_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> @@ -798,7 +813,7 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
>  		.end		= range->end,
>  		.handler	= kvm_mmu_unmap_gfn_range,
>  		.on_lock	= kvm_mmu_invalidate_begin,
> -		.on_unlock	= kvm_arch_guest_memory_reclaimed,
> +		.on_unlock	= (void *)kvm_null_fn,
>  		.flush_on_ret	= true,
>  		.may_block	= mmu_notifier_range_blockable(range),
>  	};
> @@ -830,7 +845,13 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
>  	gfn_to_pfn_cache_invalidate_start(kvm, range->start, range->end,
>  					  hva_range.may_block);
>
> -	__kvm_handle_hva_range(kvm, &hva_range);
> +	/*
> +	 * If one or more memslots were found and thus zapped, notify arch code
> +	 * that guest memory has been reclaimed.  This needs to be done *after*
> +	 * dropping mmu_lock, as x86's reclaim path is slooooow.
> +	 */
> +	if (__kvm_handle_hva_range(kvm, &hva_range).found_memslot)
> +		kvm_arch_guest_memory_reclaimed(kvm);
>
>  	return 0;
>  }
> --
> 2.42.0.820.g83a721a137-goog