References: <20240529180510.2295118-1-jthoughton@google.com>
 <20240529180510.2295118-3-jthoughton@google.com>
From: James Houghton
Date: Mon, 3 Jun 2024 16:16:41 -0700
Subject: Re: [PATCH v4 2/7] mm: multi-gen LRU: Have secondary MMUs participate in aging
To: Sean Christopherson
Cc: Yu Zhao, Andrew Morton, Paolo Bonzini, Albert Ou, Ankit Agrawal,
 Anup Patel, Atish Patra, Axel Rasmussen, Bibo Mao, Catalin Marinas,
 David Matlack, David Rientjes, Huacai Chen, James Morse, Jonathan Corbet,
 Marc Zyngier, Michael Ellerman, Nicholas Piggin, Oliver Upton,
 Palmer Dabbelt, Paul Walmsley, Raghavendra Rao Ananta, Ryan Roberts,
 Shaoqin Huang, Shuah Khan, Suzuki K Poulose, Tianrui Zhao, Will Deacon,
 Zenghui Yu, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
 linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 loongarch@lists.linux.dev

On Mon, Jun 3, 2024 at 4:03 PM Sean Christopherson wrote:
>
> On Mon, Jun 03, 2024, James Houghton wrote:
> > On Thu, May 30, 2024 at 11:06 PM Yu Zhao wrote:
> > > What I don't think is acceptable is simplifying those optimizations
> > > out without documenting your justifications (I would even call it a
> > > design change, rather than simplification, from v3 to v4).
> >
> > I'll put back something similar to what you had before (like a
> > test_clear_young() with a "fast" parameter instead of "bitmap"). I
> > like the idea of having a new mmu notifier, like
> > fast_test_clear_young(), while leaving test_young() and clear_young()
> > unchanged (where "fast" means "prioritize speed over accuracy").
>
> Those two statements are contradicting each other, aren't they?

I guess it depends on how you define "similar". :)

> Anyways, I vote
> for a "fast only" variant, e.g. test_clear_young_fast_only() or so.  gup() has
> already established that terminology in mm/, so hopefully it would be familiar
> to readers.  We could pass a param, but then the MGLRU code would likely end up
> doing a bunch of useless indirect calls into secondary MMUs, whereas a dedicated
> hook allows implementations to nullify the pointer if the API isn't supported
> for whatever reason.
>
> And pulling in Oliver's comments about locking, I think it's important that the
> mmu_notifier API express its requirement that the operation be "fast", not that
> it be lockless.  E.g. if a secondary MMU can guarantee that a lock will be
> contended only in rare, slow cases, then taking a lock is a-ok.  Or a secondary
> MMU could do try-lock and bail if the lock is contended.
>
> That way KVM can honor the intent of the API with an implementation that works
> best for KVM _and_ for MGLRU.  I'm sure there will be future adjustments and fixes,
> but that's just more motivation for using something like "fast only" instead of
> "lockless".

Yes, thanks, this is exactly what I meant. I really should have "only"
in the name to signify that it is a requirement that it be fast.
Thanks for wording it so clearly.

>
> > > > I made this logic change as part of removing batching.
> > > >
> > > > I'd really appreciate guidance on what the correct thing to do is.
> > > >
> > > > In my mind, what would work great is: by default, do aging exactly
> > > > when KVM can do it locklessly, and then have a Kconfig to always have
> > > > MGLRU to do aging with KVM if a user really cares about proactive
> > > > reclaim (when the feature bit is set). The selftest can check the
> > > > Kconfig + feature bit to know for sure if aging will be done.
> > >
> > > I still don't see how that Kconfig helps. Or why the new static branch
> > > isn't enough?
> >
> > Without a special Kconfig, the feature bit just tells us that aging
> > with KVM is possible, not that it will necessarily be done. For the
> > self-test, it'd be good to know exactly when aging is being done or
> > not, so having a Kconfig like LRU_GEN_ALWAYS_WALK_SECONDARY_MMU would
> > help make the self-test set the right expectations for aging.
> >
> > The Kconfig would also allow a user to know that, no matter what,
> > we're going to get correct age data for VMs, even if, say, we're using
> > the shadow MMU.
>
> Heh, unless KVM flushes, you won't get "correct" age data.
>
> > This is somewhat important for me/Google Cloud. Is that reasonable? Maybe
> > there's a better solution.
>
> Hmm, no?  There's no reason to use a Kconfig, e.g. if we _really_ want to prioritize
> accuracy over speed, then a KVM (x86?) module param to have KVM walk nested TDP
> page tables would give us what we want.
>
> But before we do that, I think we need to perform due diligence (or provide data)
> showing that having KVM take mmu_lock for write in the "fast only" API provides
> better total behavior.  I.e. that the additional accuracy is indeed worth the cost.

That sounds good to me. I'll drop the Kconfig. I'm not really sure
what to do about the self-test, but that's not really all that
important.
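
For illustration only, here is a rough userspace sketch of the two ideas
in the "fast only" discussion: a dedicated hook that an implementation
can leave NULL so the caller skips the indirect call, and a try-lock
that bails when the lock is contended. This is not kernel code. The
struct and function names (secondary_mmu, notifier_ops, mmu_trylock(),
age_secondary_mmu(), and so on) are hypothetical, a single bool stands
in for the accessed bits, and a C11 atomic_flag stands in for a lock
such as mmu_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a secondary MMU; not a kernel type. */
struct secondary_mmu {
	atomic_flag lock;	/* set while held; stands in for mmu_lock */
	bool young;		/* one "accessed" bit for the whole range */
};

/* Hypothetical ops table, loosely modeled on the idea of a notifier. */
struct notifier_ops {
	/* Left NULL when the implementation cannot honor "fast only". */
	bool (*test_clear_young_fast_only)(struct secondary_mmu *mmu);
};

/* Returns true if the lock was acquired, false if already held. */
bool mmu_trylock(struct secondary_mmu *mmu)
{
	return !atomic_flag_test_and_set_explicit(&mmu->lock,
						  memory_order_acquire);
}

void mmu_unlock(struct secondary_mmu *mmu)
{
	atomic_flag_clear_explicit(&mmu->lock, memory_order_release);
}

/*
 * "Fast only" implementation: try the lock and bail on contention,
 * trading accuracy (a possibly missed young bit) for speed.
 */
bool fast_only_test_clear_young(struct secondary_mmu *mmu)
{
	bool was_young;

	if (!mmu_trylock(mmu))
		return false;
	was_young = mmu->young;
	mmu->young = false;
	mmu_unlock(mmu);
	return was_young;
}

/* Caller side (the aging path): no indirect call if the hook is NULL. */
bool age_secondary_mmu(const struct notifier_ops *ops,
		       struct secondary_mmu *mmu)
{
	assert(ops != NULL && mmu != NULL);
	if (ops->test_clear_young_fast_only == NULL)
		return false;
	return ops->test_clear_young_fast_only(mmu);
}
```

In this sketch a contended lock and a missing hook both report "not
young", so the caller cannot tell "skipped" apart from "idle". Whether
that loss of accuracy is acceptable is the trade-off the thread leaves
to be settled with data.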