From: "Huang, Ying"
To: Yuanchu Xie
Cc: David Hildenbrand, "Aneesh Kumar K.V", Khalid Aziz, Henry Huang,
	Yu Zhao, Dan Williams, Gregory Price, Wei Xu, David Rientjes,
	Greg Kroah-Hartman, "Rafael J. Wysocki", Andrew Morton,
	Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Shuah Khan, Yosry Ahmed, Matthew Wilcox, Sudarshan Rajagopalan,
	Kairui Song, "Michael S. Tsirkin", Vasily Averin, Nhat Pham,
	Miaohe Lin, Qi Zheng, Abel Wu, "Vishal Moola (Oracle)", Kefeng Wang,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH v3 1/8] mm: multi-gen LRU: ignore non-leaf pmd_young for force_scan=true
In-Reply-To: (Yuanchu Xie's message of "Tue, 9 Apr 2024 15:36:04 -0700")
References: <20240327213108.2384666-1-yuanchu@google.com>
	<20240327213108.2384666-2-yuanchu@google.com>
	<875xwr81x9.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Wed, 10 Apr 2024 14:15:18 +0800
Message-ID: <87plux68w9.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yuanchu Xie writes:

> On Mon, Apr 8, 2024 at 11:52 PM Huang, Ying wrote:
>>
>> Yuanchu Xie writes:
>>
>> > When non-leaf pmd accessed bits are available, MGLRU page table walks
>> > can clear the accessed bit and promptly ignore the accessed bit on the
>> > pte because it's on a different node, so the walk does not update the
>> > generation of said page. When the next scan comes around on the right
>> > node, the non-leaf pmd accessed bit might remain cleared and the pte
>> > accessed bits won't be checked. While this is sufficient for
>> > reclaim-driven aging, where the goal is to select a reasonably cold
>> > page, the access can be missed when aging proactively for measuring the
>> > working set size of a node/memcg.
>> >
>> > Since force_scan disables various other optimizations, we check
>> > force_scan to ignore the non-leaf pmd accessed bit.
>> >
>> > Signed-off-by: Yuanchu Xie
>> > ---
>> >  mm/vmscan.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 4f9c854ce6cc..1a7c7d537db6 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -3522,7 +3522,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
>> >
>> >  		walk->mm_stats[MM_NONLEAF_TOTAL]++;
>> >
>> > -		if (should_clear_pmd_young()) {
>> > +		if (!walk->force_scan && should_clear_pmd_young()) {
>> >  			if (!pmd_young(val))
>> >  				continue;
>>
>> Sorry, I don't understand why we need this.  If !pmd_young(val), we
>> don't need to update the generation.  If pmd_young(val), the bloom
>> filter will be ignored if force_scan == true.  Or do I miss something?
> If !pmd_young(val), we still might need to update the generation.
>
> The get_pfn_folio function returns NULL if the folio's nid != node
> under scanning, so the pte accessed bit does not get cleared and the
> generation is not updated. Now the pmd_young flag of this pmd is
> cleared, and if none of the pte's are accessed before another round of
> scanning occurs on the folio's node, the pmd_young check fails and the
> pte accessed bit is skipped.
>
> This is fine for kswapd but can introduce inaccuracies when scanning
> proactively for workingset estimation.

Got it!  Thanks for the detailed explanation.  Can you add these details
to the patch description too?

It's unfortunate, because the PMD young check helps scanning performance
a lot.  It doesn't need to be done in this patchset, but I hope we can
find some way to get it back at some point.

--
Best Regards,
Huang, Ying
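
Illustrative note: the missed-access scenario explained in the quoted
reply above can be reproduced with a small userspace toy model. This is
only a sketch under stated assumptions, not kernel code. The names
toy_pte, toy_pmd and toy_walk are hypothetical, the node check merely
stands in for get_pfn_folio() returning NULL, and the bloom filter and
locking are left out. The model also assumes that a force_scan walk
leaves the non-leaf accessed bit untouched, which is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PTES 4

/* Hypothetical stand-in for a PTE: accessed bit plus the node of its folio. */
struct toy_pte {
	bool young;
	int nid;
};

/* Hypothetical stand-in for a non-leaf PMD entry and the PTEs below it. */
struct toy_pmd {
	bool young;
	struct toy_pte pte[TOY_NR_PTES];
};

/*
 * Walk one PMD range on behalf of @node.  Returns the number of pages
 * whose access was observed, i.e. whose generation would be updated.
 */
int toy_walk(struct toy_pmd *pmd, int node, bool force_scan)
{
	int updated = 0;
	size_t i;

	if (!force_scan) {
		/* Non-leaf shortcut: a clear PMD accessed bit skips every PTE. */
		if (!pmd->young)
			return 0;
		pmd->young = false;
	}

	for (i = 0; i < TOY_NR_PTES; i++) {
		struct toy_pte *pte = &pmd->pte[i];

		/* Toy analogue of get_pfn_folio() returning NULL: wrong node. */
		if (pte->nid != node)
			continue;

		/* Right node: consume the accessed bit and count the update. */
		if (pte->young) {
			pte->young = false;
			updated++;
		}
	}

	return updated;
}
```

In this model, take a PMD whose accessed bit is set and whose only young
PTE belongs to node 1.  A walk for node 0 followed by a walk for node 1,
both with force_scan false, reports no update either time: the first walk
clears the PMD bit and skips the PTE, and the second walk stops at the
cleared PMD bit.  With force_scan true, the walk for node 1 observes the
access.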