From: MengEn Sun
To: ying.huang@intel.com
Cc: akpm@linux-foundation.org, alexjlzheng@tencent.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mengensun88@gmail.com, mengensun@tencent.com
Subject: Re: [PATCH linux-mm v2] mm: make pcp_decay_high working better with NOHZ full
Date: Tue, 22 Oct 2024 13:14:06 +0800
Message-Id: <1729574046-3392-1-git-send-email-mengensun@tencent.com>
In-Reply-To: <87msix6e8c.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <87msix6e8c.fsf@yhuang6-desk2.ccr.corp.intel.com>

Thank you for your suggestion. I understand and am ready to make some
changes.

> Have verified the issue with some test?  If not, I suggest you to do
> that.
>

I have conducted tests: applying this patch or not does not have a
significant impact on the test results. Perhaps my testing was not
thorough enough.
#^_^

But the logic of the code is as follows:

CPU0                              CPUx
----                              -----
                                  T0: vmstat_work is pending
T1: vmstat_shepherd checks
    vmstat_work and does nothing
                                  T2: vmstat_work is no longer pending
                                  T3: allocates many pages
                                  T4: frees all the pages allocated at T3
                                  T5: enters NOHZ, flushing all
                                      zonestats and nodestats
T6: next vmstat_shepherd fires

In my opinion, there is indeed an issue here. As I understand it, the
diffs have already been flushed at T5, so at T6 need_update() finds
nothing for CPUx, vmstat_work is not queued there, and decay_pcp_high()
does not run. Is there something I have misunderstood?

By the way, I have two other questions:

Q1:
vmstat_work is a **deferrable work**, so it may be delayed for a long
time by NOHZ. As a result, vmstat_update() may not be executed once
every second in the above scenario. Therefore, I'm not sure whether
using a deferrable work to reduce pcp->high is appropriate. In my
tests, without deferrable work it takes about a minute to reduce high
to high_min, but with deferrable work it may take several minutes.

Q2:
On a big machine, for example one with 1TB of memory, the default
maximum memory held on the PCP can be 1TB * 0.125 = 128GB. This memory
is not accounted for in MemFree in /proc/meminfo. Users can see it in
/proc/zoneinfo, but the free memory reported by the `free` command is
reduced by that amount. Can we include the PCP memory in the MemFree
statistic in /proc/meminfo?

> > While, This seems to be fine:
> > - if freeing and allocating memory occur later, it may the
> >   high_max may be adjust automatically
> > - If memory is tight, the memory reclamation process will
> >   release the pcp
>
> This could be a real issue for me.

Thanks, I will test these issues more carefully.

>
> > Whatever, we make vmstat_shepherd to checking whether we need
> > decay pcp high_max, and fire pcp_decay_high early if we need.
> >
> > Fixes: 51a755c56dc0 ("mm: tune PCP high automatically")
> > Reviewed-by: Jinliang Zheng
> > Signed-off-by: MengEn Sun
> > ---
> > changelog:
> > v1: https://lore.kernel.org/lkml/20241012154328.015f57635566485ad60712f3@linux-foundation.org/T/#t
> > v2: Make the commit message clearer by adding some comments.
> > ---
> >  mm/vmstat.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index 1917c034c045..07b494b06872 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -2024,8 +2024,17 @@ static bool need_update(int cpu)
> >
> >  	for_each_populated_zone(zone) {
> >  		struct per_cpu_zonestat *pzstats = per_cpu_ptr(zone->per_cpu_zonestats, cpu);
> > +		struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
> >  		struct per_cpu_nodestat *n;
> >
> > +		/* per_cpu_nodestats and per_cpu_zonestats maybe flush when cpu
> > +		 * entering NOHZ full, see quiet_vmstat. so, we check pcp
> > +		 * high_{min,max} to determine whether it is necessary to run
> > +		 * decay_pcp_high on the corresponding CPU
> > +		 */
>
> Please follow the comments coding style.
>
> 	/*
> 	 * comments line 1
> 	 * comments line 2
> 	 */
>

Thank you for your suggestion. I understand and am ready to make some
changes.

> > +		if (pcp->high_max > pcp->high_min)
> > +			return true;
> > +
>
> We don't tune pcp->high_max/min in fact.  Instead, we tune pcp->high.
> Your code may make need_update() return true in most cases.

You are right, using high_max is incorrect. May I use
pcp->high > pcp->high_min instead?

>
> >  		/*
> >  		 * The fast way of checking if there are any vmstat diffs.
> >  		 */
>
> --
> Best Regards,
> Huang, Ying

Best Regards,
MengEn, Sun
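
To make the difference between the two checks concrete, here is a
minimal userspace sketch. It is not kernel code: struct pcp_model and
both helper names are hypothetical stand-ins, and only the field names
high, high_min and high_max are taken from the patch. It assumes, as the
review comment says, that only high is tuned at runtime.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the three fields of
 * struct per_cpu_pages discussed in this thread.
 */
struct pcp_model {
	int high;	/* tuned at runtime, decays towards high_min */
	int high_min;	/* lower bound, assumed fixed here */
	int high_max;	/* upper bound, assumed fixed here */
};

/* Check used in v2: true whenever the two bounds differ. */
static bool needs_decay_v2(const struct pcp_model *pcp)
{
	return pcp->high_max > pcp->high_min;
}

/* Check proposed in this reply: true only while high can still decay. */
static bool needs_decay_proposed(const struct pcp_model *pcp)
{
	return pcp->high > pcp->high_min;
}
```

With high == high_min == 64 and high_max == 512, the v2 check still
returns true although nothing is left to decay, while the proposed
check returns false. With high == 300 both return true.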