From: Yu Zhao
Date: Tue, 15 Oct 2024 16:47:39 -0600
Subject: Re: [PATCH v7 00/18] mm: multi-gen LRU: Walk secondary MMU page tables while aging
To: James Houghton
Cc: Sean Christopherson, Paolo Bonzini, Andrew Morton, David Matlack, David Rientjes, Jason Gunthorpe, Jonathan Corbet, Marc Zyngier, Oliver Upton, Wei Xu, Axel Rasmussen, kvm@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, David Stevens
References: <20240926013506.860253-1-jthoughton@google.com>

On Mon, Oct 14, 2024 at 6:07 PM James Houghton wrote:
>
> On Mon, Oct 14, 2024 at 4:22 PM Sean Christopherson wrote:
> >
> > On Thu, Sep 26, 2024, James Houghton wrote:
> > > This patchset makes it possible for MGLRU to consult secondary MMUs
> > > while doing aging, not just during eviction. This allows for more
> > > accurate reclaim decisions, which is especially important for proactive
> > > reclaim.
> >
> > ...
> >
> > > James Houghton (14):
> > >   KVM: Remove kvm_handle_hva_range helper functions
> > >   KVM: Add lockless memslot walk to KVM
> > >   KVM: x86/mmu: Factor out spte atomic bit clearing routine
> > >   KVM: x86/mmu: Relax locking for kvm_test_age_gfn and kvm_age_gfn
> > >   KVM: x86/mmu: Rearrange kvm_{test_,}age_gfn
> > >   KVM: x86/mmu: Only check gfn age in shadow MMU if
> > >     indirect_shadow_pages > 0
> > >   mm: Add missing mmu_notifier_clear_young for !MMU_NOTIFIER
> > >   mm: Add has_fast_aging to struct mmu_notifier
> > >   mm: Add fast_only bool to test_young and clear_young MMU notifiers
> >
> > Per offline discussions, there's a non-zero chance that fast_only won't be needed,
> > because it may be preferable to incorporate secondary MMUs into MGLRU, even if
> > they don't support "fast" aging.
> >
> > What's the status on that front? Even if the status is "TBD", it'd be very helpful
> > to let others know, so that they don't spend time reviewing code that might be
> > completely thrown away.
>
> The fast_only MMU notifier changes will probably be removed in v8.
>
> ChromeOS folks found that the way MGLRU *currently* interacts with KVM
> is problematic. That is, today, with the MM_WALK MGLRU capability
> enabled, normal PTEs have their Accessed bits cleared via a page table
> scan and then during an rmap walk upon attempted eviction, whereas,
> KVM SPTEs only have their Accessed bits cleared via the rmap walk at
> eviction time. So KVM SPTEs have their Accessed bits cleared less
> frequently than normal PTEs, and therefore they appear younger than
> they should.
>
> It turns out that this causes tab open latency regressions on ChromeOS
> where a significant amount of memory is being used by a VM. IIUC, the
> fix for this is to have MGLRU age SPTEs as often as it ages normal
> PTEs; i.e., it should call the correct MMU notifiers each time it
> clears A bits on PTEs.
> The final patch in this series sort of does
> this, but instead of calling the new fast_only notifier, we need to
> call the normal test/clear_young() notifiers regardless of how fast
> they are.
>
> This also means that the MGLRU changes no longer depend on the KVM
> optimizations, as they can be motivated independently.
>
> Yu, have I gotten anything wrong here? Do you have any more details to share?

Yes, that's precisely the problem. My original justification [1] for
not scanning KVM MMU when lockless is not supported turned out to be
harmful to some workloads too.

On one hand, scanning KVM MMU when not lockless can cause KVM MMU lock
contention; on the other hand, not scanning KVM MMU can skew anon/file
LRU aging and thrash page cache. Given that the lock contention is
being tackled, the latter seems to be the lesser of two evils.

[1] https://lore.kernel.org/linux-mm/CAOUHufYFHKLwt1PWp2uS6g174GZYRZURWJAmdUWs5eaKmhEeyQ@mail.gmail.com/
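
To make the aging-frequency point concrete, here is a toy model in Python. It is not kernel code: `Mapping`, `passes_until_seen_idle`, and the 4:1 visit ratio are all made up for illustration. It assumes a page that was touched once before aging starts and is never touched again, and it counts how many aging passes go by before the page's Accessed bit is first found clear, depending on how often the mapping is actually visited.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    # Accessed bit; starts set because the page was touched once
    # before aging began (an assumption of this toy model).
    accessed: bool = True


def passes_until_seen_idle(clear_every: int, max_passes: int = 100) -> int:
    """Return the first aging pass at which the page is observed idle.

    clear_every=1 models a mapping whose Accessed bit is tested and
    cleared on every aging pass (roughly: a normal PTE reached by the
    page table scan). A larger value models a mapping that is only
    visited on some passes (roughly: an SPTE reached only by the rmap
    walk at eviction time). The exact ratio is hypothetical.
    """
    m = Mapping()
    for n in range(1, max_passes + 1):
        if n % clear_every != 0:
            continue  # this pass never looks at the mapping
        if not m.accessed:
            return n  # bit was already clear: the page is seen as idle
        m.accessed = False  # bit was set: clear it, page looks recently used
    raise RuntimeError("page never observed idle within max_passes")


if __name__ == "__main__":
    print("visited every pass:     ", passes_until_seen_idle(1))
    print("visited every 4th pass: ", passes_until_seen_idle(4))
```

With these assumed numbers, the mapping visited on every pass is seen idle at pass 2, while the one visited on every fourth pass is seen idle only at pass 8. That gap is the sense in which less frequently aged SPTEs appear younger than they should.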