Date: Tue, 3 Jun 2025 10:15:27 +0200
From: Michal Hocko
To: Baolin Wang
Cc: Andrew Morton, david@redhat.com, shakeel.butt@linux.dev,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com, donettom@linux.ibm.com,
	aboorvad@linux.ibm.com, sj@kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix the inaccurate memory statistics issue for users
References: <4f0fd51eb4f48c1a34226456b7a8b4ebff11bf72.1748051851.git.baolin.wang@linux.alibaba.com>
	<20250529205313.a1285b431bbec2c54d80266d@linux-foundation.org>
	<72f0dc8c-def3-447c-b54e-c390705f8c26@linux.alibaba.com>
In-Reply-To: <72f0dc8c-def3-447c-b54e-c390705f8c26@linux.alibaba.com>

On Tue 03-06-25 16:08:21, Baolin Wang wrote:
> 
> 
> On 2025/5/30 21:39, Michal Hocko wrote:
> > On Thu 29-05-25 20:53:13, Andrew Morton wrote:
> > > On Sat, 24 May 2025 09:59:53 +0800 Baolin Wang wrote:
> > > 
> > > > On some large machines with a high number of CPUs running a 64K pagesize
> > > > kernel, we found that the 'RES' field is always 0 displayed by the top
> > > > command for some processes, which will cause a lot of confusion for users.
> > > > 
> > > >     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> > > >  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
> > > >       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
> > > > 
> > > > The main reason is that the batch size of the percpu counter is quite large
> > > > on these machines, caching a significant percpu value, since converting mm's
> > > > rss stats into percpu_counter by commit f1a7941243c1 ("mm: convert mm's rss
> > > > stats into percpu_counter"). Intuitively, the batch number should be optimized,
> > > > but on some paths, performance may take precedence over statistical accuracy.
> > > > Therefore, introducing a new interface to add the percpu statistical count
> > > > and display it to users, which can remove the confusion. In addition, this
> > > > change is not expected to be on a performance-critical path, so the modification
> > > > should be acceptable.
> > > > 
> > > > Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
> > > 
> > > Three years ago.
> > > 
> > > > Tested-by Donet Tom
> > > > Reviewed-by: Aboorva Devarajan
> > > > Tested-by: Aboorva Devarajan
> > > > Acked-by: Shakeel Butt
> > > > Acked-by: SeongJae Park
> > > > Signed-off-by: Baolin Wang
> > > 
> > > Thanks, I added cc:stable to this.
> > 
> > I have only noticed this new posting now. I do not think this is a
> > stable material. I am also not convinced that the impact of the pcp lock
> > exposure to the userspace has been properly analyzed and documented in
> > the changelog. I am not nacking the patch (yet) but I would like to see
> > a serious analyses that this has been properly thought through.
> 
> Good point. I did a quick measurement on my 32 cores Arm machine. I ran two
> workloads, one is the 'top' command: top -d 1 (updating every second).
> Another workload is kernel building (time make -j32).
> 
> From the following data, I did not see any significant impact of the patch
> changes on the execution of the kernel building workload.

I do not think this is really representative of an adverse workload. I
believe you need to have a look at which potentially sensitive kernel code
paths run with the lock held, and how a busy loop over the affected proc
files would influence those in the worst case. Maybe there are no such
kernel code paths to really worry about. This should be a part of the
changelog though.

Thanks!
-- 
Michal Hocko
SUSE Labs
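
For readers following the thread, the batching effect described in the quoted
changelog can be illustrated with a toy user-space model. This is only a
sketch and not kernel code: the class BatchedCounter, its methods, and demo()
are hypothetical names, and the batch heuristic max(32, 2 * nr_cpus) is an
assumption about the default rather than something stated in this thread.
read() stands in for the cheap approximate read, while sum() stands in for
the exact read, which in the kernel walks all CPUs under the counter's lock.

```python
# Toy user-space model of a batched per-CPU counter. NOT kernel code:
# the names below are hypothetical and only mirror the idea of a cheap,
# approximate read versus an exact sum over all per-CPU deltas.

class BatchedCounter:
    def __init__(self, nr_cpus, batch):
        self.batch = batch
        self.count = 0                # shared value, cheap to read
        self.local = [0] * nr_cpus    # per-CPU deltas not yet folded in

    def add(self, cpu, amount):
        delta = self.local[cpu] + amount
        if abs(delta) >= self.batch:
            # The delta reached the batch size: fold it into the shared count.
            self.count += delta
            self.local[cpu] = 0
        else:
            # Otherwise keep it cached on this CPU only.
            self.local[cpu] = delta

    def read(self):
        # Approximate read: ignores whatever is still cached per CPU.
        return max(self.count, 0)

    def sum(self):
        # Exact read: adds every CPU's cached delta to the shared count.
        return self.count + sum(self.local)


def demo(nr_cpus=128, pages_per_cpu=100):
    batch = max(32, nr_cpus * 2)      # assumed default batch heuristic
    rss = BatchedCounter(nr_cpus, batch)
    for cpu in range(nr_cpus):
        for _ in range(pages_per_cpu):
            rss.add(cpu, 1)           # one page accounted on this CPU
    return rss.read(), rss.sum()


if __name__ == "__main__":
    approx, exact = demo()
    print(f"approximate read: {approx} pages, exact sum: {exact} pages")
```

In this model the approximate read can lag the exact value by up to
nr_cpus * (batch - 1), which is why a small process on a machine with many
CPUs can appear to have no resident pages at all. The exact sum removes that
error at the cost of touching every CPU's slot, which is the cost the worst
case analysis requested above would need to cover.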