Date: Tue, 18 Mar 2025 23:19:42 -0700
From: Shakeel Butt
To: linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Vlastimil Babka, Yosry Ahmed, Meta kernel team
Subject: [LSF/MM/BPF Topic] Performance improvement for Memory Cgroups

A bit late, but let me still propose a session on topics related to memory cgroups. Last year at LSFMM 2024, we discussed [1] the potential deprecation of memcg v1. Since then we have made very good progress in that regard.
We have moved the v1-only code into a separate file and made it not compile by default, have added warnings to many v1-only interfaces, and have removed a lot of v1-only code.

This year, I want to focus on the performance of memory cgroups, particularly on reducing the cost of charging and stats.

At a high level, we can partition memory charging into three cases. The first is user memory (anon & file), the second is kernel memory (slub mostly), and the third is network memory. For network memory, [2] has described some of the challenges. Similarly, for kernel memory, we had to revert patches where memcg charging was too expensive [3,4]. I want to discuss and brainstorm different ways to further optimize memcg charging for all these types of memory. I am at the moment prototyping multi-memcg support for per-cpu memcg stocks and would like to see what else we can do.

One additional interesting observation from our fleet is that the cost of memory charging increases for users of memory.low and memory.min. Basically, propagate_protected_usage() becomes very prominently visible in the perf traces.

Other than charging, the memcg stats infra is also very expensive, and a lot of CPU cycles in our fleet are spent on maintaining these stats. Memcg stats use the rstat infrastructure, which is designed for fast updates and slow readers. The updaters put the cgroup in a per-cpu update tree, while the stats readers flush the update trees of all the cpus. For memcg, the flushes have become very expensive, and over the years we have added ratelimiting to limit the cost. I want to discuss what else we can do to further improve the memcg stats.

Other than the performance of charging and memcg stats, time permitting, we can discuss other memcg topics like new features or something still lacking.
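
To make the "fast updates, slow readers" trade-off of the stats path concrete, here is a rough userspace sketch of the update/flush split. This is not the kernel's rstat code. All names (cg_sketch, sketch_stat_update, sketch_stat_flush, NR_CPUS as a fixed constant) are hypothetical. It also simplifies in two ways: it keeps a flat per-cpu list where rstat keeps a per-cpu tree, and it flushes everything rather than one subtree. The point is only that an update touches one cpu's state, while a flush has to walk the state of every cpu.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical fixed cpu count for this sketch */

/* Hypothetical stand-in for a cgroup with a single stat counter. */
struct cg_sketch {
	struct cg_sketch *parent;
	long total;                              /* hierarchical value, valid after a flush */
	long pending[NR_CPUS];                   /* per-cpu deltas not yet flushed */
	struct cg_sketch *next_updated[NR_CPUS]; /* link in the per-cpu updated list */
	int on_list[NR_CPUS];                    /* already queued on that cpu? */
};

/* One "updated" list head per cpu (a flat list, not a tree as in rstat). */
static struct cg_sketch *updated_head[NR_CPUS];

/* Updater side: cheap, touches only the state of the given cpu. */
static void sketch_stat_update(struct cg_sketch *cg, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cg->pending[cpu] += delta;
	if (!cg->on_list[cpu]) {
		cg->on_list[cpu] = 1;
		cg->next_updated[cpu] = updated_head[cpu];
		updated_head[cpu] = cg;
	}
}

/*
 * Reader side: expensive, walks the updated list of every cpu and
 * folds each pending delta into the cgroup and all of its ancestors.
 * Returns the number of (cgroup, cpu) entries it had to visit.
 */
static int sketch_stat_flush(void)
{
	int visited = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct cg_sketch *cg = updated_head[cpu];

		updated_head[cpu] = NULL;
		while (cg) {
			struct cg_sketch *next = cg->next_updated[cpu];
			long delta = cg->pending[cpu];

			cg->pending[cpu] = 0;
			cg->on_list[cpu] = 0;
			cg->next_updated[cpu] = NULL;
			for (struct cg_sketch *p = cg; p; p = p->parent)
				p->total += delta;
			visited++;
			cg = next;
		}
	}
	return visited;
}
```

In this model the flush cost grows with the number of cpus times the number of cgroups updated since the last flush, which is the kind of cost that ratelimiting the flushers tries to bound.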

[1] https://lwn.net/Articles/974575/
[2] https://lore.kernel.org/all/20250307055936.3988572-1-shakeel.butt@linux.dev/
[3] 3754707bcc3e ("Revert "memcg: enable accounting for file lock caches"")
[4] 0bcfe68b8767 ("Revert "memcg: enable accounting for pollfd and select bits arrays"")