Date: Mon, 9 Jun 2025 09:35:42 +0200
From: Michal Hocko
To: Ritesh Harjani
Cc: Baolin Wang, akpm@linux-foundation.org, david@redhat.com,
    shakeel.butt@linux.dev, lorenzo.stoakes@oracle.com,
    Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
    surenb@google.com, donettom@linux.ibm.com, aboorvad@linux.ibm.com,
    sj@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users
In-Reply-To: <87bjqx4h82.fsf@gmail.com>

On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> Baolin Wang writes:
>
> > On some large machines with a high number of CPUs running a 64K pagesize
> > kernel, we found that the 'RES' field is always 0 displayed by the top
> > command for some processes, which will cause a lot of confusion for users.
> >
> >     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> >  875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
> >       1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd
> >
> > The main reason is that the batch size of the percpu counter is quite large
> > on these machines, caching a significant percpu value, since converting mm's
> > rss stats into percpu_counter by commit f1a7941243c1 ("mm: convert mm's rss
> > stats into percpu_counter"). Intuitively, the batch number should be optimized,
> > but on some paths, performance may take precedence over statistical accuracy.
> > Therefore, introducing a new interface to add the percpu statistical count
> > and display it to users, which can remove the confusion. In addition, this
> > change is not expected to be on a performance-critical path, so the modification
> > should be acceptable.
> >
> > In addition, the 'mm->rss_stat' is updated by using add_mm_counter() and
> > dec/inc_mm_counter(), which are all wrappers around percpu_counter_add_batch().
> > In percpu_counter_add_batch(), there is percpu batch caching to avoid 'fbc->lock'
> > contention. This patch changes task_mem() and task_statm() to get the accurate
> > mm counters under the 'fbc->lock', but this should not exacerbate kernel
> > 'mm->rss_stat' lock contention due to the percpu batch caching of the mm
> > counters. The following test also confirms the theoretical analysis.
> >
> > I ran stress-ng, stressing anon page faults in 32 threads on my 32-core
> > machine, while simultaneously running a script that starts 32 threads to
> > busy-loop pread each stress-ng thread's /proc/pid/status interface. From the
> > following data, I did not observe any obvious impact of this patch on the
> > stress-ng tests.
> >
> > w/o patch:
> > stress-ng: info: [6848] 4,399,219,085,152 CPU Cycles         67.327 B/sec
> > stress-ng: info: [6848] 1,616,524,844,832 Instructions       24.740 B/sec (0.367 instr. per cycle)
> > stress-ng: info: [6848]        39,529,792 Page Faults Total   0.605 M/sec
> > stress-ng: info: [6848]        39,529,792 Page Faults Minor   0.605 M/sec
> >
> > w/patch:
> > stress-ng: info: [2485] 4,462,440,381,856 CPU Cycles         68.382 B/sec
> > stress-ng: info: [2485] 1,615,101,503,296 Instructions       24.750 B/sec (0.362 instr. per cycle)
> > stress-ng: info: [2485]        39,439,232 Page Faults Total   0.604 M/sec
> > stress-ng: info: [2485]        39,439,232 Page Faults Minor   0.604 M/sec
> >
> > Tested-by: Donet Tom
> > Reviewed-by: Aboorva Devarajan
> > Tested-by: Aboorva Devarajan
> > Acked-by: Shakeel Butt
> > Acked-by: SeongJae Park
> > Acked-by: Michal Hocko
> > Signed-off-by: Baolin Wang
> > ---
> > Changes from v1:
> > - Update the commit message to add some measurements.
> > - Add acked tag from Michal. Thanks.
> > - Drop the Fixes tag.
>
> Any reason why we dropped the Fixes tag? I see there were a series of
> discussions on v1 and it got concluded that the fix was correct, then why
> drop the Fixes tag?

This seems more like an improvement than a bug fix.
-- 
Michal Hocko
SUSE Labs
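
For readers following the thread, the batching effect described in the
quoted commit message can be modelled in a few lines of userspace Python.
This is only a toy sketch, not kernel code: the class ToyPercpuCounter, its
method names, and the values used (64 CPUs, a batch of 128, 100 pages per
CPU) are assumptions chosen for illustration. It shows how a cheap read of
the shared count can report 0 while the per-CPU deltas still hold the real
total, and how summing those deltas recovers it.

```python
class ToyPercpuCounter:
    """Userspace toy model of a batched per-CPU counter (hypothetical, not kernel code)."""

    def __init__(self, nr_cpus, batch):
        self.count = 0              # shared value, all that a cheap read looks at
        self.local = [0] * nr_cpus  # per-CPU deltas not yet folded into count
        self.batch = batch          # fold threshold (illustrative value)

    def add(self, cpu, amount):
        pending = self.local[cpu] + amount
        if abs(pending) >= self.batch:
            # Threshold reached: fold the whole per-CPU delta into the shared count.
            self.count += pending
            self.local[cpu] = 0
        else:
            # Below the threshold: the update stays cached on this CPU.
            self.local[cpu] = pending

    def read_positive(self):
        # Cheap approximate read: ignores every cached per-CPU delta.
        return max(self.count, 0)

    def sum_positive(self):
        # Accurate read: shared count plus all cached per-CPU deltas.
        return max(self.count + sum(self.local), 0)


# Every CPU accounts 100 pages, which stays below the batch of 128.
counter = ToyPercpuCounter(nr_cpus=64, batch=128)
for cpu in range(64):
    counter.add(cpu, 100)

approx = counter.read_positive()
exact = counter.sum_positive()
print(approx, exact)
```

In this model the approximate read stays at 0 until some CPU crosses the
batch threshold, which resembles the 'RES' value of 0 reported by top. The
accurate sum walks every per-CPU slot, so it costs more per read; that is
the trade-off the commit message argues is acceptable on the /proc read
path.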