Date: Mon, 3 Jun 2024 16:03:05 -0700
References: <20240529180510.2295118-1-jthoughton@google.com>
 <20240529180510.2295118-3-jthoughton@google.com>
Subject: Re: [PATCH v4 2/7] mm: multi-gen LRU: Have secondary MMUs participate in aging
From: Sean Christopherson
To: James Houghton
Cc: Yu Zhao, Andrew Morton, Paolo Bonzini, Albert Ou, Ankit Agrawal,
 Anup Patel, Atish Patra, Axel Rasmussen, Bibo Mao, Catalin Marinas,
 David Matlack, David Rientjes, Huacai Chen, James Morse, Jonathan Corbet,
 Marc Zyngier, Michael Ellerman, Nicholas Piggin, Oliver Upton,
 Palmer Dabbelt, Paul Walmsley, Raghavendra Rao Ananta, Ryan Roberts,
 Shaoqin Huang, Shuah Khan, Suzuki K Poulose, Tianrui Zhao, Will Deacon,
 Zenghui Yu, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mips@vger.kernel.org,
 linux-mm@kvack.org, linux-riscv@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev

On Mon, Jun 03, 2024, James Houghton wrote:
> On Thu, May 30, 2024 at 11:06 PM Yu Zhao wrote:
> > What I don't think is acceptable is simplifying those optimizations
> > out without documenting your justifications (I would even call it a
> > design change, rather than simplification, from v3 to v4).
>
> I'll put back something similar to what you had before (like a
> test_clear_young() with a "fast" parameter instead of "bitmap"). I
> like the idea of having a new mmu notifier, like
> fast_test_clear_young(), while leaving test_young() and clear_young()
> unchanged (where "fast" means "prioritize speed over accuracy").

Those two statements are contradicting each other, aren't they? Anyways, I vote
for a "fast only" variant, e.g. test_clear_young_fast_only() or so. gup() has
already established that terminology in mm/, so hopefully it would be familiar
to readers. We could pass a param, but then the MGLRU code would likely end up
doing a bunch of useless indirect calls into secondary MMUs, whereas a dedicated
hook allows implementations to nullify the pointer if the API isn't supported
for whatever reason.

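To make the "dedicated hook vs. param" point concrete, here is a rough userspace sketch. It is not the mmu_notifier API: struct demo_notifier_ops, demo_fast_hook(), demo_age_range() and demo_fast_calls are all hypothetical names made up for illustration. The only thing it tries to show is that a NULL hook lets the caller skip the indirect call entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, userspace-only model of a notifier ops table. */
struct demo_notifier_ops {
	/* Optional hook: left NULL when there is no fast path. */
	bool (*test_clear_young_fast_only)(unsigned long start,
					   unsigned long end);
};

/* Counts how often the indirect call is actually made. */
int demo_fast_calls;

/* Hypothetical implementation: treats any non-empty range as young. */
bool demo_fast_hook(unsigned long start, unsigned long end)
{
	demo_fast_calls++;
	return end > start;
}

/*
 * Caller side: a single NULL check replaces an indirect call that
 * could only have answered "not supported".
 */
bool demo_age_range(const struct demo_notifier_ops *ops,
		    unsigned long start, unsigned long end)
{
	if (!ops->test_clear_young_fast_only)
		return false;

	return ops->test_clear_young_fast_only(start, end);
}
```

With a "fast" param on a shared hook, the caller would have to make the call just to learn that the implementation can't do it; here the ops table itself carries that information.
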
And pulling in Oliver's comments about locking, I think it's important that the
mmu_notifier API express its requirement that the operation be "fast", not that
it be lockless. E.g. if a secondary MMU can guarantee that a lock will be
contended only in rare, slow cases, then taking a lock is a-ok. Or a secondary
MMU could do try-lock and bail if the lock is contended.

That way KVM can honor the intent of the API with an implementation that works
best for KVM _and_ for MGLRU. I'm sure there will be future adjustments and
fixes, but that's just more motivation for using something like "fast only"
instead of "lockless".

> > > I made this logic change as part of removing batching.
> > >
> > > I'd really appreciate guidance on what the correct thing to do is.
> > >
> > > In my mind, what would work great is: by default, do aging exactly
> > > when KVM can do it locklessly, and then have a Kconfig to always have
> > > MGLRU to do aging with KVM if a user really cares about proactive
> > > reclaim (when the feature bit is set). The selftest can check the
> > > Kconfig + feature bit to know for sure if aging will be done.
> >
> > I still don't see how that Kconfig helps. Or why the new static branch
> > isn't enough?
>
> Without a special Kconfig, the feature bit just tells us that aging
> with KVM is possible, not that it will necessarily be done. For the
> self-test, it'd be good to know exactly when aging is being done or
> not, so having a Kconfig like LRU_GEN_ALWAYS_WALK_SECONDARY_MMU would
> help make the self-test set the right expectations for aging.
>
> The Kconfig would also allow a user to know that, no matter what,
> we're going to get correct age data for VMs, even if, say, we're using
> the shadow MMU.

Heh, unless KVM flushes, you won't get "correct" age data.

> This is somewhat important for me/Google Cloud. Is that reasonable? Maybe
> there's a better solution.

Hmm, no? There's no reason to use a Kconfig, e.g. if we _really_ want to
prioritize accuracy over speed, then a KVM (x86?) module param to have KVM walk
nested TDP page tables would give us what we want.

But before we do that, I think we need to perform due diligence (or provide
data) showing that having KVM take mmu_lock for write in the "fast only" API
provides better total behavior. I.e. that the additional accuracy is indeed
worth the cost.
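
For reference, the "try-lock and bail" option mentioned above could look roughly like the sketch below. This is a userspace model only, not KVM code: struct demo_mmu and the demo_*() helpers are hypothetical, and a C11 atomic_flag stands in for whatever lock a real secondary MMU would use. The hook never waits; if the lock is held it reports "not young" and leaves the state untouched.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a secondary MMU's state. */
struct demo_mmu {
	atomic_flag busy;	/* set while the "lock" is held */
	bool young;		/* models an accessed bit */
};

/* Returns true if the lock was acquired, false if already held. */
bool demo_trylock(struct demo_mmu *mmu)
{
	return !atomic_flag_test_and_set_explicit(&mmu->busy,
						  memory_order_acquire);
}

void demo_unlock(struct demo_mmu *mmu)
{
	atomic_flag_clear_explicit(&mmu->busy, memory_order_release);
}

/*
 * "Fast only" aging: never wait for the lock. On contention, bail and
 * report "not young", trading accuracy for speed.
 */
bool demo_test_clear_young_fast_only(struct demo_mmu *mmu)
{
	bool young;

	if (!demo_trylock(mmu))
		return false;	/* contended: give up rather than wait */

	young = mmu->young;
	mmu->young = false;
	demo_unlock(mmu);

	return young;
}
```

The cost of bailing is a missed accessed bit for this pass, which is exactly the accuracy-vs-speed tradeoff that would need data behind it.
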