From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 18 Jul 2024 18:34:37 -0700
From: Shakeel Butt
To: Kairui Song
Cc: linux-mm@kvack.org, Andrew Morton, Matthew Wilcox, Johannes Weiner,
 Roman Gushchin, Waiman Long, Shakeel Butt, Nhat Pham, Michal Hocko,
 Chengming Zhou, Qi Zheng, Muchun Song, Chris Li, Yosry Ahmed,
 "Huang, Ying"
Subject: Re: [PATCH 1/7] mm/swap, workingset: make anon workingset nodes memcg aware
Message-ID: <7gzevefivueqtebzvikzbucnrnpurmh3scmfuiuo2tnrs37xso@haj7gzepjur2>
References: <20240624175313.47329-1-ryncsn@gmail.com>
 <20240624175313.47329-2-ryncsn@gmail.com>
In-Reply-To: <20240624175313.47329-2-ryncsn@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline


On Tue, Jun 25, 2024 at 01:53:07AM GMT, Kairui Song wrote:
> From: Kairui Song
>
> Currently, the (shadow) nodes of the swap cache are not accounted to
> their corresponding memory cgroup, instead, they are all accounted
> to the root cgroup. This leads to inaccurate accounting and
> ineffective reclaiming.
>
> This issue is similar to commit 7b785645e8f1 ("mm: fix page cache
> convergence regression"), where page cache shadow nodes were incorrectly
> accounted. That was due to the accidental dropping of the accounting
> flag during the XArray conversion in commit a28334862993
> ("page cache: Finish XArray conversion").
>
> However, this fix has a different cause. Swap cache shadow nodes were
> never accounted even before the XArray conversion, since they did not
> exist until commit 3852f6768ede ("mm/swapcache: support to handle the
> shadow entries"), which was years after the XArray conversion. Without
> shadow nodes, swap cache nodes can only use a very small amount of memory
> and so reclaiming is not very important.
>
> But now with shadow nodes, if a cgroup swaps out a large amount of
> memory, it could take up a lot of memory.
>
> This can be easily fixed by adding proper flags and LRU setters.
>
> Signed-off-by: Kairui Song

As Muchun said, please send this patch separately. However, the more I
think about this patch, the more I think it is incomplete, and the full
solution may be much more involved and might not be worth it.

One of the differences between the file page cache and the swap cache is
the context in which the underlying xarray node can be allocated. For
file pages, such allocations happen in the context of the process/cgroup
owning the file pages, and thus the memcg of current is used for
charging. For swap, however, xarray node allocations happen when a page
is added to the swap cache, which often happens in reclaim, and reclaim
can run in any context, i.e. a kernel thread, an unrelated
process/cgroup, etc. So we may charge an unrelated memcg for these nodes.

Now you may argue that we can use the memcg of the page which is being
swapped out, but then you may have an xarray node containing pointers to
pages (or shadows) of different memcgs. Who should be charged?

BTW a filesystem shared between multiple cgroups can face this issue as
well, but users have more control over a shared filesystem compared to a
shared swap address space.

Shakeel
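
The charging question in the reply can be made concrete with a toy model.
The sketch below is plain Python, not kernel code: ToySwapCache, Node, and
the add() signature are all hypothetical names invented for illustration,
and the 64-slot node size is an assumption based on the usual XArray chunk
size. It only shows the two effects described above: the node is charged
to whichever context happens to run reclaim, and one node can end up
holding entries that belong to several memcgs.

```python
from collections import Counter
from dataclasses import dataclass, field

SLOTS_PER_NODE = 64  # assumption: typical XArray chunk size


@dataclass
class Node:
    # memcg of the context that allocated the node (hypothetical field)
    charged_to: str
    # slot offset -> memcg that owns the page or shadow stored there
    slot_owner: dict = field(default_factory=dict)


class ToySwapCache:
    """One shared index space, loosely standing in for a swap address space."""

    def __init__(self):
        self.nodes = {}

    def add(self, index, page_memcg, reclaim_context):
        key = index // SLOTS_PER_NODE
        if key not in self.nodes:
            # The node is allocated, and therefore charged, in whatever
            # context is running reclaim, not in the page owner's context.
            self.nodes[key] = Node(charged_to=reclaim_context)
        self.nodes[key].slot_owner[index % SLOTS_PER_NODE] = page_memcg
        return self.nodes[key]


cache = ToySwapCache()
# A task in cgroup "C" enters reclaim and swaps out a page owned by "A";
# later a kernel thread swaps out pages of "B" and "A" into nearby slots.
node = cache.add(0, page_memcg="A", reclaim_context="C")
cache.add(1, page_memcg="B", reclaim_context="kswapd")
cache.add(2, page_memcg="A", reclaim_context="kswapd")

owners = Counter(node.slot_owner.values())
print("node charged to:", node.charged_to)
print("entries per memcg:", dict(owners))
```

In this model the single node is charged to "C" even though none of its
entries belong to "C". Switching the charge to the memcg of the first page
added would pick "A", which still leaves the entries of "B" accounted to
someone else.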