From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhongkun He
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, mhocko@suse.com, yosry.ahmed@linux.dev,
	muchun.song@linux.dev, yuzhao@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Zhongkun He
Subject: [PATCH V4 1/4] mm: add swappiness=max arg to memory.reclaim for only anon reclaim
Date: Mon, 21 Apr 2025 17:13:28 +0800
Message-Id: <519e12b9b1f8c31a01e228c8b4b91a2419684f77.1745225696.git.hezhongkun.hzk@bytedance.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

With commit 68cd9050d871 ("mm: add swappiness= arg to memory.reclaim"),
we can pass an additional swappiness= argument to memory.reclaim.
It is very useful because we can dynamically adjust the reclamation
ratio based on the anonymous folios and file folios of each cgroup.
For example, when swappiness is set to 0, we only reclaim from file
folios.

However, we have also encountered a new issue: when swappiness is set
to MAX_SWAPPINESS, reclaim may still take only file folios.

So, we hope to add a new arg 'swappiness=max' to memory.reclaim, where
proactive memory reclaim only reclaims from anonymous folios when
swappiness is set to max. The swappiness semantics from a user
perspective remain unchanged.

For example, something like this:

echo "2M swappiness=max" > /sys/fs/cgroup/memory.reclaim

will perform reclaim on the rootcg with a swappiness setting of 'max'
(a new mode), regardless of the file folios. Users have a more
comprehensive view of the application's memory distribution because
there are many metrics available. For example, if we find that a
certain cgroup has a large number of inactive anon folios, we can
reclaim only those and skip file folios, because with zram/zswap the
IO tradeoff that cache_trim_mode or other file-first logic is making
doesn't hold: file refaults will cause IO, whereas anon decompression
will not.

With this patch, the swappiness argument of memory.reclaim has a new
mode 'max', which means reclaiming only from anonymous folios in both
the traditional LRU and MGLRU.

Here is the previous discussion:
https://lore.kernel.org/all/20250314033350.1156370-1-hezhongkun.hzk@bytedance.com/
https://lore.kernel.org/all/20250312094337.2296278-1-hezhongkun.hzk@bytedance.com/
https://lore.kernel.org/all/20250318135330.3358345-1-hezhongkun.hzk@bytedance.com/

Suggested-by: Yosry Ahmed
Signed-off-by: Zhongkun He
---
 Documentation/admin-guide/cgroup-v2.rst | 3 +++
 include/linux/swap.h                    | 4 ++++
 mm/memcontrol.c                         | 5 +++++
 mm/vmscan.c                             | 7 +++++++
 4 files changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 1a16ce68a4d7..472c01e0eb2c 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1348,6 +1348,9 @@ The following nested keys are defined.
 	same semantics as vm.swappiness applied to memcg reclaim with
 	all the existing limitations and potential future extensions.
 
+	The valid range for swappiness is [0-200, max], setting
+	swappiness=max exclusively reclaims anonymous memory.
+
   memory.peak
 	A read-write single value file which exists on non-root cgroups.
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index db46b25a65ae..f57c7e0012ba 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -414,6 +414,10 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 #define MEMCG_RECLAIM_PROACTIVE (1 << 2)
 #define MIN_SWAPPINESS 0
 #define MAX_SWAPPINESS 200
+
+/* Just reclaim from anon folios in proactive memory reclaim */
+#define SWAPPINESS_ANON_ONLY (MAX_SWAPPINESS + 1)
+
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 						  unsigned long nr_pages,
 						  gfp_t gfp_mask,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c96c1f2b9cf5..796c78b26e43 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4395,11 +4395,13 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 
 enum {
 	MEMORY_RECLAIM_SWAPPINESS = 0,
+	MEMORY_RECLAIM_SWAPPINESS_MAX,
 	MEMORY_RECLAIM_NULL,
 };
 
 static const match_table_t tokens = {
 	{ MEMORY_RECLAIM_SWAPPINESS, "swappiness=%d"},
+	{ MEMORY_RECLAIM_SWAPPINESS_MAX, "swappiness=max"},
 	{ MEMORY_RECLAIM_NULL, NULL },
 };
 
@@ -4433,6 +4435,9 @@ static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
 			if (swappiness < MIN_SWAPPINESS || swappiness > MAX_SWAPPINESS)
 				return -EINVAL;
 			break;
+		case MEMORY_RECLAIM_SWAPPINESS_MAX:
+			swappiness = SWAPPINESS_ANON_ONLY;
+			break;
 		default:
 			return -EINVAL;
 		}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3783e45bfc92..ebe1407f6741 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2503,6 +2503,13 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		goto out;
 	}
 
+	/* Proactive reclaim initiated by userspace for anonymous memory only */
+	if (swappiness == SWAPPINESS_ANON_ONLY) {
+		WARN_ON_ONCE(!sc->proactive);
+		scan_balance = SCAN_ANON;
+		goto out;
+	}
+
 	/*
 	 * Do not apply any pressure balancing cleverness when the
 	 * system is close to OOM, scan both anon and file equally
-- 
2.39.5
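
For readers who want to poke at the idea outside the kernel, the sentinel
scheme can be modelled in a few lines of userspace C. This is only a rough
sketch, not the kernel code: parse_swappiness_token() and
pick_scan_balance() are hypothetical helpers invented for illustration,
strtol() stands in for the kernel's match_token()/match_int() parsing, and
the scan_balance enum is a simplified stand-in that models only the two
extremes described in the commit message. The point it tries to show is
that SWAPPINESS_ANON_ONLY (MAX_SWAPPINESS + 1) lies outside the accepted
numeric range, so it should only be reachable through the literal "max".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MIN_SWAPPINESS 0
#define MAX_SWAPPINESS 200
/* Sentinel just outside the numeric range, same value as in the patch. */
#define SWAPPINESS_ANON_ONLY (MAX_SWAPPINESS + 1)

static_assert(SWAPPINESS_ANON_ONLY > MAX_SWAPPINESS,
	      "sentinel must not collide with a valid swappiness value");

/* Hypothetical, simplified stand-in for the kernel's scan_balance values. */
enum scan_balance { SCAN_FRACT, SCAN_FILE, SCAN_ANON };

/*
 * Hypothetical helper (not a kernel function): parse one "swappiness=..."
 * token. Returns 0 and stores the result in *out, or -EINVAL on bad input.
 */
static int parse_swappiness_token(const char *tok, int *out)
{
	static const char prefix[] = "swappiness=";
	const char *val;
	char *end;
	long n;

	if (strncmp(tok, prefix, sizeof(prefix) - 1) != 0)
		return -EINVAL;
	val = tok + sizeof(prefix) - 1;

	/* The literal "max" selects the anon-only mode. */
	if (strcmp(val, "max") == 0) {
		*out = SWAPPINESS_ANON_ONLY;
		return 0;
	}

	/* Otherwise require a whole decimal number within [0, 200]. */
	errno = 0;
	n = strtol(val, &end, 10);
	if (errno || end == val || *end != '\0')
		return -EINVAL;
	if (n < MIN_SWAPPINESS || n > MAX_SWAPPINESS)
		return -EINVAL;

	*out = (int)n;
	return 0;
}

/*
 * Hypothetical, heavily simplified decision: the sentinel forces anon-only
 * scanning, 0 forces file-only scanning, and everything else is treated as
 * a proportional split. The real get_scan_count() has more cases.
 */
static enum scan_balance pick_scan_balance(int swappiness)
{
	if (swappiness == SWAPPINESS_ANON_ONLY)
		return SCAN_ANON;
	if (swappiness == 0)
		return SCAN_FILE;
	return SCAN_FRACT;
}
```

In this model "swappiness=201" is rejected with -EINVAL even though 201 is
numerically equal to the sentinel, which mirrors the range check that the
patch leaves in place for the numeric token.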