Date: Tue, 3 Jun 2025 12:28:03 +0200
From: Michal Hocko
To: Baolin Wang
Cc: Andrew Morton, david@redhat.com, shakeel.butt@linux.dev,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com, donettom@linux.ibm.com,
	aboorvad@linux.ibm.com, sj@kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix the inaccurate memory statistics issue for users
References: <4f0fd51eb4f48c1a34226456b7a8b4ebff11bf72.1748051851.git.baolin.wang@linux.alibaba.com>
	<20250529205313.a1285b431bbec2c54d80266d@linux-foundation.org>
	<72f0dc8c-def3-447c-b54e-c390705f8c26@linux.alibaba.com>

On Tue 03-06-25 16:32:35, Baolin Wang wrote:
>
>
> On 2025/6/3 16:15, Michal Hocko wrote:
> > On Tue 03-06-25 16:08:21, Baolin Wang wrote:
> > >
> > >
> > > On 2025/5/30 21:39, Michal Hocko wrote:
> > > > On Thu 29-05-25 20:53:13, Andrew Morton wrote:
> > > > > On Sat, 24 May 2025 09:59:53 +0800 Baolin Wang wrote:
> > > > >
> > > > > > On some large machines with a high number of CPUs running a 64K pagesize
> > > > > > kernel, we found that the 'RES' field is always 0 displayed by the top
> > > > > > command for some processes, which will cause a lot of confusion for users.
> > > > > >
> > > > > >     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> > > > > >  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
> > > > > >       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
> > > > > >
> > > > > > The main reason is that the batch size of the percpu counter is quite large
> > > > > > on these machines, caching a significant percpu value, since converting mm's
> > > > > > rss stats into percpu_counter by commit f1a7941243c1 ("mm: convert mm's rss
> > > > > > stats into percpu_counter"). Intuitively, the batch number should be optimized,
> > > > > > but on some paths, performance may take precedence over statistical accuracy.
> > > > > > Therefore, introducing a new interface to add the percpu statistical count
> > > > > > and display it to users, which can remove the confusion. In addition, this
> > > > > > change is not expected to be on a performance-critical path, so the modification
> > > > > > should be acceptable.
> > > > > >
> > > > > > Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
> > > > >
> > > > > Three years ago.
> > > > >
> > > > > > Tested-by: Donet Tom
> > > > > > Reviewed-by: Aboorva Devarajan
> > > > > > Tested-by: Aboorva Devarajan
> > > > > > Acked-by: Shakeel Butt
> > > > > > Acked-by: SeongJae Park
> > > > > > Signed-off-by: Baolin Wang
> > > > >
> > > > > Thanks, I added cc:stable to this.
> > > >
> > > > I have only noticed this new posting now. I do not think this is a
> > > > stable material. I am also not convinced that the impact of the pcp lock
> > > > exposure to the userspace has been properly analyzed and documented in
> > > > the changelog. I am not nacking the patch (yet) but I would like to see
> > > > a serious analysis that this has been properly thought through.
> > >
> > > Good point.
> > > I did a quick measurement on my 32 cores Arm machine. I ran two
> > > workloads, one is the 'top' command: top -d 1 (updating every second).
> > > Another workload is kernel building (time make -j32).
> > >
> > > From the following data, I did not see any significant impact of the patch
> > > changes on the execution of the kernel building workload.
> >
> > I do not think this is really representative of an adverse workload. I
> > believe you need to have a look which potentially sensitive kernel code
> > paths run with the lock held and how would a busy loop over affected proc
> > files influence those in the worst case. Maybe there are none of such
> > kernel code paths to really worry about. This should be a part of the
> > changelog though.
>
> IMO, kernel code paths usually have batch caching to avoid lock contention,
> so I think the impact on kernel code paths is not that obvious.

This is a very generic statement. Does this refer to the existing pcp
locking usage in the kernel? Have you evaluated existing users?

> Therefore, I
> also think it's hard to find an adverse workload.
>
> How about adding the following comments in the commit log?
> "
> I did a quick measurement on my 32 cores Arm machine. I ran two workloads,
> one is the 'top' command: top -d 1 (updating every second). Another workload
> is kernel building (time make -j32).

This test doesn't really do much to trigger any actual lock contention, as
already mentioned.
-- 
Michal Hocko
SUSE Labs
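
The batching effect described in the quoted changelog can be illustrated
with a small userspace toy model. This is only a sketch under stated
assumptions, not kernel code: ToyPercpuCounter, default_batch() and demo()
are hypothetical names made up for illustration, and the batch formula
max(32, 2 * nr_cpus) as well as the 128-CPU machine are assumptions of the
model. It shows how a cheap read of the shared value can report 0 while a
precise sum over all per-CPU slots reports the real count.

```python
def default_batch(nr_cpus):
    """Assumed batch size for the model: max(32, 2 * number of CPUs)."""
    return max(32, 2 * nr_cpus)


class ToyPercpuCounter:
    """Toy model of a batched per-CPU counter (illustration only)."""

    def __init__(self, nr_cpus, batch):
        self.batch = batch
        self.count = 0            # shared value, what a cheap read returns
        self.pcp = [0] * nr_cpus  # per-CPU deltas not yet folded into count

    def add(self, cpu, amount):
        """Accumulate locally; fold into the shared value once |delta| >= batch."""
        value = self.pcp[cpu] + amount
        if abs(value) >= self.batch:
            self.count += value
            self.pcp[cpu] = 0
        else:
            self.pcp[cpu] = value

    def read(self):
        """Cheap, approximate read: ignores everything still cached per CPU."""
        return self.count

    def precise_sum(self):
        """Precise read: has to visit every CPU's slot."""
        return self.count + sum(self.pcp)


def demo(nr_cpus=128, cpus_used=4, pages_per_cpu=75):
    """A small task touches a few pages on a few CPUs of a large machine."""
    rss = ToyPercpuCounter(nr_cpus, default_batch(nr_cpus))
    for cpu in range(cpus_used):
        for _ in range(pages_per_cpu):
            rss.add(cpu, 1)
    return rss.read(), rss.precise_sum()


if __name__ == "__main__":
    approx, precise = demo()
    print("approximate read:", approx, "pages")
    print("precise sum:     ", precise, "pages")
```

In this model the task has 300 pages resident (18.75 MiB with 64K pages),
yet the approximate read stays at 0 because no per-CPU delta ever reaches
the batch of 256. The precise sum gets the right answer only by walking
all per-CPU slots; in the kernel the equivalent walk is done with the
counter's lock held, which is the exposure to userspace questioned above.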