Date: Thu, 23 Feb 2023 10:23:56 -0800
References: <20230217041230.2417228-1-yuzhao@google.com> <20230217041230.2417228-3-yuzhao@google.com>
Subject: Re: [PATCH mm-unstable v1 2/5] kvm/x86: add kvm_arch_test_clear_young()
From: Sean Christopherson
To: Yu Zhao
Cc: Andrew Morton, Paolo Bonzini, Jonathan Corbet, Michael Larabel,
    kvmarm@lists.linux.dev, kvm@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
    linux-mm@google.com

On Thu, Feb 23, 2023, Yu Zhao wrote:
> On Thu, Feb 23, 2023 at 10:09 AM Sean Christopherson wrote:
> > > I'll take a look at that series. clear_bit() probably won't cause any
> > > practical damage but is technically wrong because, for example, it can
> > > end up clearing the A-bit in a non-leaf PMD. (cmpxchg will just fail
> > > in this case, obviously.)
> >
> > Eh, not really. By that argument, clearing an A-bit in a huge PTE is also technically
> > wrong because the target gfn may or may not have been accessed.
>
> Sorry, I don't understand. You mean clear_bit() on a huge PTE is
> technically wrong? Yes, that's what I mean. (cmpxchg() on a huge PTE
> is not.)
>
> > The only way for
> > KVM to clear a A-bit in a non-leaf entry is if the entry _was_ a huge PTE, but was
> > replaced between the "is leaf" and the clear_bit().
>
> I think there is a misunderstanding here. Let me be more specific:
> 1. Clearing the A-bit in a non-leaf entry is technically wrong because
> that's not our intention.
> 2. When we try to clear_bit() on a leaf PMD, it can at the same time
> become a non-leaf PMD, which causes 1) above, and therefore is
> technically wrong.
> 3. I don't think 2) could do any real harm, so no practically no problem.
> 4. cmpxchg() can avoid 2).
>
> Does this make sense?

I understand what you're saying, but clearing an A-bit on a non-leaf PMD that
_just_ got converted from a leaf PMD is "wrong" if and only if the intended
behavior is nonsensical.

Without an explicit granularity from the caller, the intent is to either (a)
reap A-bit on leaf PTEs, or (b) reap A-bit at the highest possible granularity. (a)
is nonsensical because it provides zero guarantees to the caller as to the
granularity of the information. Leaf vs. non-leaf matters for the life cycle of
page tables and guest accesses, e.g. KVM needs to zap _only_ leaf SPTEs when
handling an mmu_notifier invalidation, but when it comes to the granularity of
the A-bit, leaf vs. non-leaf has no meaning.
On KVM x86, a PMD covers 2MiB of GPAs regardless of whether it's a leaf or
non-leaf PMD.

If the intent is (b), then clearing the A-bit on a PMD a few cycles after the PMD
was converted from leaf to non-leaf is a pointless distinction, because it yields
the same end result as clearing the A-bit just a few cycles earlier, when the PMD
was a leaf.

Actually, if I'm reading patch 5 correctly, this is all much ado about nothing,
because the MGLRU code kicks in only for non-huge PTEs, and KVM cannot have
larger mappings than the primary MMU, i.e. KVM should not encounter huge PTEs.

On that topic, if the assumption is that the bitmap is used only for non-huge PTEs,
then x86's kvm_arch_test_clear_young() needs to be hardened to process only 4KiB
PTEs, and probably to WARN if a huge PTE is encountered. That assumption should
also be documented.

If that assumption is incorrect, then kvm_arch_test_clear_young() is broken and/or
the expected behavior of the bitmap isn't fully defined.
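
To make the clear_bit() vs. cmpxchg() distinction and the suggested hardening concrete, here is a rough userspace model in C11. It is not KVM code and not what the patch does: the bit positions, ENTRY_ACCESSED, ENTRY_LEAF, LEVEL_4K/LEVEL_2M, warn_count and all three helpers are made up for illustration, and warn_count merely stands in for a kernel WARN.

```c
/* Userspace model only, NOT KVM code. All names and bit positions are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_ACCESSED (UINT64_C(1) << 5)   /* illustrative A-bit position */
#define ENTRY_LEAF     (UINT64_C(1) << 7)   /* illustrative "maps a page" flag */
#define LEVEL_4K 1
#define LEVEL_2M 2

static int warn_count;                      /* stand-in for a kernel WARN */

/* clear_bit()-style: clears the A-bit in whatever the entry holds right now. */
static bool clear_young_blind(_Atomic uint64_t *entry)
{
    uint64_t old = atomic_fetch_and(entry, ~ENTRY_ACCESSED);

    return (old & ENTRY_ACCESSED) != 0;
}

/* cmpxchg()-style: clears the A-bit only if the entry still equals `seen`. */
static bool clear_young_cmpxchg(_Atomic uint64_t *entry, uint64_t seen)
{
    uint64_t want = seen & ~ENTRY_ACCESSED;

    if (!(seen & ENTRY_LEAF) || !(seen & ENTRY_ACCESSED))
        return false;
    /* Fails, leaving the entry untouched, if it changed after the snapshot. */
    return atomic_compare_exchange_strong(entry, &seen, want);
}

/* Hardened variant: process only 4KiB leaf entries, "WARN" on anything else. */
static bool test_clear_young_4k_only(_Atomic uint64_t *entry, int level)
{
    uint64_t seen;

    assert(entry != NULL);
    seen = atomic_load(entry);
    if (level != LEVEL_4K || !(seen & ENTRY_LEAF)) {
        warn_count++;
        return false;
    }
    return clear_young_cmpxchg(entry, seen);
}
```

If an entry is snapshotted as a leaf and then replaced by a non-leaf entry, clear_young_blind() still clears the A-bit in the new entry, whereas clear_young_cmpxchg() fails and leaves it alone. With the 4KiB-only variant the question largely goes away, since huge entries are rejected up front.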