From: Baolin Wang
Date: Tue, 3 Jun 2025 16:08:21 +0800
Subject: Re: [PATCH] mm: fix the inaccurate memory statistics issue for users
To: Michal Hocko, Andrew Morton
Cc: david@redhat.com, shakeel.butt@linux.dev, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org, surenb@google.com, donettom@linux.ibm.com, aboorvad@linux.ibm.com, sj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Message-ID: <72f0dc8c-def3-447c-b54e-c390705f8c26@linux.alibaba.com>
References: <4f0fd51eb4f48c1a34226456b7a8b4ebff11bf72.1748051851.git.baolin.wang@linux.alibaba.com> <20250529205313.a1285b431bbec2c54d80266d@linux-foundation.org>

On 2025/5/30 21:39, Michal Hocko wrote:
> On Thu 29-05-25 20:53:13, Andrew Morton wrote:
>> On Sat, 24 May 2025 09:59:53 +0800 Baolin Wang wrote:
>>
>>> On some large machines with a high number of CPUs running a 64K pagesize
>>> kernel, we found that the 'RES' field is always 0 displayed by the top
>>> command for some processes, which will cause a lot of confusion for users.
>>>
>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
>>>       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
>>>
>>> The main reason is that the batch size of the percpu counter is quite large
>>> on these machines, caching a significant percpu value, since converting mm's
>>> rss stats into percpu_counter by commit f1a7941243c1 ("mm: convert mm's rss
>>> stats into percpu_counter"). Intuitively, the batch number should be optimized,
>>> but on some paths, performance may take precedence over statistical accuracy.
>>> Therefore, introducing a new interface to add the percpu statistical count
>>> and display it to users, which can remove the confusion. In addition, this
>>> change is not expected to be on a performance-critical path, so the modification
>>> should be acceptable.
>>>
>>> Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
>>
>> Three years ago.
>>
>>> Tested-by Donet Tom
>>> Reviewed-by: Aboorva Devarajan
>>> Tested-by: Aboorva Devarajan
>>> Acked-by: Shakeel Butt
>>> Acked-by: SeongJae Park
>>> Signed-off-by: Baolin Wang
>>
>> Thanks, I added cc:stable to this.
>
> I have only noticed this new posting now. I do not think this is a
> stable material. I am also not convinced that the impact of the pcp lock
> exposure to the userspace has been properly analyzed and documented in
> the changelog. I am not nacking the patch (yet) but I would like to see
> a serious analyses that this has been properly thought through.

Good point.
I did a quick measurement on my 32-core Arm machine. I ran two
workloads: one is the 'top' command, 'top -d 1' (updating every
second); the other is a kernel build ('time make -j32').

From the following data, I did not see any significant impact of the
patch on the execution time of the kernel build workload.

w/o patch:
real	4m33.887s
user	118m24.153s
sys	9m51.402s

w/ patch:
real	4m34.495s
user	118m21.739s
sys	9m39.232s
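
To put the two runs side by side, here is a minimal sketch that turns the timings into relative changes. The timing strings are copied from this message; the helper names (to_seconds, relative_change) are made up for illustration and are not part of the patch or of any tool.

```python
import re


def to_seconds(duration: str) -> float:
    """Convert a `time`-style duration such as '4m33.887s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def relative_change(before: str, after: str) -> float:
    """Percentage change from `before` to `after` (positive means slower)."""
    base = to_seconds(before)
    return (to_seconds(after) - base) / base * 100.0


# Timings copied verbatim from the measurement in this message.
without_patch = {"real": "4m33.887s", "user": "118m24.153s", "sys": "9m51.402s"}
with_patch = {"real": "4m34.495s", "user": "118m21.739s", "sys": "9m39.232s"}

changes = {
    field: relative_change(without_patch[field], with_patch[field])
    for field in without_patch
}

for field, pct in changes.items():
    print(f"{field:>4}: {pct:+.2f}%")
```

By this calculation the wall-clock time differs by roughly 0.2% and the sys time with the patch is about 2% lower. This only restates the numbers above; it says nothing about run-to-run variance, which would need repeated runs to estimate.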