From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D3C4C636D6 for ; Thu, 23 Feb 2023 17:09:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5F6B46B0071; Thu, 23 Feb 2023 12:09:42 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5A68A6B0072; Thu, 23 Feb 2023 12:09:42 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 46EB86B0073; Thu, 23 Feb 2023 12:09:42 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 36D696B0071 for ; Thu, 23 Feb 2023 12:09:42 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 12AFD40442 for ; Thu, 23 Feb 2023 17:09:42 +0000 (UTC) X-FDA: 80499193404.24.0BD60DB Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) by imf22.hostedemail.com (Postfix) with ESMTP id 21F22C0003 for ; Thu, 23 Feb 2023 17:09:39 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eacT4PM6; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of 30533YwYKCBQCyu73w08805y.w86527EH-664Fuw4.8B0@flex--seanjc.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=30533YwYKCBQCyu73w08805y.w86527EH-664Fuw4.8B0@flex--seanjc.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677172180; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ESEYfucHSnOWaZatj5hNQPiLl3UqdNvq9+VGnD33XD0=; b=Uu4MQ1AuDMZg6psXs/A5VmuWutNuEdhb5fw2xWVR/O2/JBWi77ieJS+8GH9TnBy0f7yy0y xGnIzmkI3S+uQIRSP3ejIrOcSpdfT0AdKSUyx6BJqp0HOAbgAeIr0UM057mjXGTmHTTuA5 dfWJH3NrfKgLk8Bn4TMWm7mSXyAT0Bw= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=eacT4PM6; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of 30533YwYKCBQCyu73w08805y.w86527EH-664Fuw4.8B0@flex--seanjc.bounces.google.com designates 209.85.128.202 as permitted sender) smtp.mailfrom=30533YwYKCBQCyu73w08805y.w86527EH-664Fuw4.8B0@flex--seanjc.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1677172180; a=rsa-sha256; cv=none; b=f1uuMNutFP2+Ph0P9nGI1yW/UDkTweZWOyVtM85bGPdpBb5mtD3fvYmX/v8kjC2/KU16TH SwIKK6fCpoYsIt2XNc9C/FJpZFm3JigpJgJq3VvfZJLBozwZVXMc8LISiE6/U/uf8uSyMk 7pKbot6h1pl8XkLRQF/Cev0CwBnUXMY= Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-538116920c3so84379027b3.15 for ; Thu, 23 Feb 2023 09:09:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ESEYfucHSnOWaZatj5hNQPiLl3UqdNvq9+VGnD33XD0=; b=eacT4PM6bk+JAGYX8AtEreTJLTfD+mSJsjox9puZbiJhIzz/wIxHhoNqBFIAXYayu7 ix2AdCa9GlyQPz6wKG3LR8o4NT+280wasX8/AykGzPPWlG6F17mnxqyPEg7H8lCLwAiX eXH6zqI0g24xM5ZEDyA5U2YNv7s9H603+yWuiAeHry6HpCAAUQ5/Ypm3z9qWSoASZAJv GIl+kmJbNVHeS7mmPwXAdOExvjZqQ9ItW1uw3DrabrHDmj3KBFLmoGmyQYc+K7/u+5MN XBySfkomcCwXORKGamuW5yTLWSpURcpZFsI0uJbe36YJ/e/6tvZI8z6lN62S38fr6TXv yALQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ESEYfucHSnOWaZatj5hNQPiLl3UqdNvq9+VGnD33XD0=; b=E3wE8gXO/7zIoQKKnQEdS2BUDQHlIqP8koPOob3kw4o9oiFR16lXfsB96MJTwRW1Ex Ehe25mmgrd23hWn+zWWLiy+xr9CHQaQboJ7NK/4abQHO6FNk0WYAypL2R9gNVHhnLDqN KPJnKEN5J/CBj2OOgMUBe52O2V4g1YnGq9UUT8hTH1U6Qzt/aScuM6yIqy1X+u3Z5TYc x8dw/uMIM3bzyJyIiHW0bkA/dXsZGf9+kgSGnhrlLL57CWva+TuRIbjFnepcaqsz5Qfo SnRJCJKExt7BEv3m8iDmaqDfDuOvT8L0fOigqHZOYkqm0K00VFIMaq60Xl5Ke/24b/G/ 3ebA== X-Gm-Message-State: AO0yUKWf680JFfHo9jw4wWBNDJ5//JSkTR8FJQF3GGp86thFMiARodHP MN8Y1S5cGxwa8MS5k6HdBsKaAvrpIzU= X-Google-Smtp-Source: AK7set+o4nZnorUIfwBQUKqlnrrJeoYQJz7sG9NME5jj+i7H5Ea4MTUvMN+YCmQEaMe31ZnG3Fe5siKK8IQ= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a81:b60b:0:b0:52f:45a:5b00 with SMTP id u11-20020a81b60b000000b0052f045a5b00mr2611344ywh.2.1677172179211; Thu, 23 Feb 2023 09:09:39 -0800 (PST) Date: Thu, 23 Feb 2023 09:09:37 -0800 In-Reply-To: Mime-Version: 1.0 References: <20230217041230.2417228-1-yuzhao@google.com> <20230217041230.2417228-3-yuzhao@google.com> Message-ID: Subject: Re: [PATCH mm-unstable v1 2/5] kvm/x86: add kvm_arch_test_clear_young() From: Sean Christopherson To: Yu Zhao Cc: Andrew Morton , Paolo Bonzini , Jonathan Corbet , Michael Larabel , kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-mm@google.com Content-Type: text/plain; charset="us-ascii" X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 21F22C0003 X-Stat-Signature: wa4n5omidkctpwci7potb3foxupgkwkc X-HE-Tag: 1677172179-207731 X-HE-Meta: U2FsdGVkX1/COwt/IuCRT0FI47lCD2hb5vyw7U2eRbVHZ0XpYmDs4eJrdQYxDtgoazE0bqu51g96u6A7pU7EWYMwIproV7rFFebTtgwmGzbDgAcfdHFatIVNwvdshEWYR3DAKQ7315NRNi2bE3i12pErKUPZ/aOJ2gdQIm5rCGfIii6LTr0LAoYQSCTkCg/+LhaytIrQVwk5V22AZEGmm9fIV/EwDZDRN3S3kUeRF0aZ0s+zjmHsFuwwCT5kNgfhGAq2EKsawO0Otf6oUvsdcOc6bmteUqi47b+NcHwbReKTajMxAzJlp2mEOTcYMduE86c0IA3K0JlvW98S7OgsoyQhHf+TdM2k+1tLaC93gC7d/GfFGlHItBFEncwZxTlmteMpMOb1FKV+exgXxkxg0lmzzcphbwbzRAOOwtdxI5pQEmT8ScWa/pXbnXI2gH3QxLBtM03bqqqsizU2vL5eif6nBcrt8w7PKU47G4v9G7Z2Dq0qJL19kPg/7dYo148Wlw8ED7gKonkEap6j7dAXXojD9FDajG3HKOE0kSp+G5QWhqYAkR0YF3FGPs8Vxn8YdPFYHs/MZt5qTAqaIK29gqstmnnl0CZpIEPYcImA8mpWCDPBas5Bmqh4+99ajAGRugiQgUjh3ZkmIZCcG5Oevoil0+/ZYXr2f5z+DHiYBqxvgikDy+EjkLKnFDaV/BIUtoieYw9Vzu5KwZPdu53KavBq1ssXbI6tEQTklgBC2vfKu5xw0QKwwfqJmo/5oLO0DWdzjkAmirfSWEatrOx0u5yB/im8Jsvi4FvGn8/2NRVjro9jvCU3SVN9HYrva1Z54SWnXN18v3/iOOvJiqC8c3g9PRpRLnhPMMZjc6Noi07k7jL6F3N44N51kZvJOvVjbueAhSbqVBMtk2AXIXf3wmIxdtMrkYra1mpeXn3Pq45AIVCwWpXX9TffuJ3CoWQZMIeSVukfMObwenbrotV uLiFuN3N vWVLjFuka1HPu1fzR5qpQ0En4GX9T9hr1OlEd7WP+dSEf5/qYGpzfbdb0E2WEylwPTL6eBPk397z6SCDOM3e+SH6jDU8ufEjnrt0cvZ80FjVDBouDyY2519yjqQSOFhkW1om9Da2ss98ioficPxB7A+fw1Ger571ei+T2p0Tf8eugjNzmOV0LsDPQvJRIjbau7AKaYwEPIBKaEn2EVivnExC+bWpzYfp49O16m2wFrRfcrqp0JaV2JAwD6sjYM7c3opKbxeMSgrlXNDJfDAaZ3xT59RvxgywRpcg5SSaAqUrBcb/ykCvXsth/q90200Pf55lhmO3LAXSFQE5XRNEYZRTYc9egiVwctIGFM4SMaFvgoNcjNIGn9kMbQ1m8vHHf7g7YHzNhx/oUJzP2uYk4TIDUap5UQpLBhJnFaC3q5+rs5VX1vTGBGUXfaPCjsdFECKARBMVhNFb6ohu7jgchzVFS3b560xCqAWahMsQVxsYQ6KKvgG+2LC3JT3SdPvD8DlagCZXeMBcHlUrab5U+NKp/ZPwapaJ632wO24I4jh+G8+qE5LkF9nxj/Tl6mBuIkJUvD52cCq4A3DfWig80gHddnnUPYvjZ4vc1ZCTKhpJ7RWzmGc9xNR1+ujZlVANQiiOEpi0JNK51Uk9R77faHzprwKRvOJZZHgQdAn/De6oyP+FqtHWFm4DeLTqLziP5asX4GSLoykqyw1dVvPi8HS8ppG3B8TWHnKV613hU07wkdeY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Feb 22, 2023, Yu Zhao wrote: > On Fri, Feb 17, 2023 at 9:27 AM Sean Christopherson wrote: > > > > On Thu, Feb 16, 2023, Yu Zhao wrote: > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > index 6aaae18f1854..d2995c9e8f07 100644 > > > --- a/arch/x86/include/asm/kvm_host.h > > > +++ b/arch/x86/include/asm/kvm_host.h > > > @@ -1367,6 +1367,12 @@ struct kvm_arch { > > > * the MMU lock in read mode + the tdp_mmu_pages_lock or > > > * the MMU lock in write mode > > > * > > > + * kvm_arch_test_clear_young() is a special case. It relies on two > > > > No, it's not. The TDP MMU already employs on RCU and CMPXCHG. > > It is -- you read it out of context :) Ah, the special case is that it's fully lockless. That's still not all that special, e.g. see kvm_tdp_mmu_walk_lockless_{begin,end}(). > * For reads, this list is protected by: > * the MMU lock in read mode + RCU or > * the MMU lock in write mode > * > * For writes, this list is protected by: > * the MMU lock in read mode + the tdp_mmu_pages_lock or > * the MMU lock in write mode > * > * kvm_arch_test_clear_young() is a special case. > ... > > struct list_head tdp_mmu_roots; > > > Just drop the > > entire comment. > > Let me move it into kvm_arch_test_clear_young(). No, I do not want kvm_arch_test_clear_young(), or any other one-off function, to be "special". I love the idea of a lockless walk, but I want it to be a formal, documented way to walk TDP MMU roots. I.e. add macro to go with for_each_tdp_mmu_root() and the yield-safe variants. /* blah blah blah */ #define for_each_tdp_mmu_root_lockless(_kvm, _root, _as_id) \ list_for_each_entry_rcu(_root, &kvm->arch.tdp_mmu_roots, link) \ if (refcount_read(&root->tdp_mmu_root_count) && \ kvm_mmu_page_as_id(_root) != _as_id) { \ } else > Also I want to be clear: > 1. We can't just focus on here and now; we need to consider the distant future. I 100% agree, but those words need to be backed up by actions. This series is littered with code that is not maintainable long term, e.g. open coding stuff that belongs in helpers and/or for which KVM already provides helpers, copy-pasting __kvm_handle_hva_range() instead of extending it to have a lockless option, poking directly into KVM from mm/ code, etc. I apologize for being so blunt. My intent isn't to be rude/snarky, it's to set very clear expectations for getting any of these changes merges. I asbolutely do want to land improvments to KVM's test+clear young flows, but it needs to be done in a way that is maintainable and doesn't saddle KVM with more tech debt. > 2. From my POV, "see the comments on ..." is like the index of a book. And my _very_ strong preference is to provide the "index" via code, not comments. > > Clearing a single bit doesn't need a CMPXCHG. Please weigh in on a relevant series > > that is modifying the aging flows[*], I want to have exactly one helper for aging > > TDP MMU SPTEs. > > > > [*] https://lore.kernel.org/all/20230211014626.3659152-5-vipinsh@google.com > > I'll take a look at that series. clear_bit() probably won't cause any > practical damage but is technically wrong because, for example, it can > end up clearing the A-bit in a non-leaf PMD. (cmpxchg will just fail > in this case, obviously.) Eh, not really. By that argument, clearing an A-bit in a huge PTE is also technically wrong because the target gfn may or may not have been accessed. The only way for KVM to clear a A-bit in a non-leaf entry is if the entry _was_ a huge PTE, but was replaced between the "is leaf" and the clear_bit().