From: Zhongkun He
Date: Wed, 19 Mar 2025 10:34:54 +0800
Subject: Re: [External] Re: [PATCH] mm: add swappiness=max arg to memory.reclaim for only anon reclaim
To: Yosry Ahmed
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, yuzhao@google.com, mhocko@suse.com, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20250318135330.3358345-1-hezhongkun.hzk@bytedance.com>

On Tue, Mar 18, 2025 at 10:10 PM Yosry Ahmed wrote:
>
> On Tue, Mar 18, 2025 at 09:53:30PM +0800, Zhongkun He wrote:
> > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > memory.reclaim")', we can submit an additional swappiness= argument
> > to memory.reclaim. It is very useful because we can dynamically adjust
> > the reclamation ratio based on the anonymous folios and file folios of
> > each cgroup. For example, when swappiness is set to 0, we only reclaim
> > from file folios.
> >
> > However, we have also encountered a new issue: when swappiness is set to
> > the MAX_SWAPPINESS, it may still only reclaim file folios.
> >
> > So, we hope to add a new arg 'swappiness=max' in memory.reclaim where
> > proactive memory reclaim only reclaims from anonymous folios when
> > swappiness is set to max. The swappiness semantics from a user
> > perspective remain unchanged.
> >
> > For example, something like this:
> >
> > echo "2M swappiness=max" > /sys/fs/cgroup/memory.reclaim
> >
> > will perform reclaim on the rootcg with a swappiness setting of 'max' (a
> > new mode) regardless of the file folios. Users have a more comprehensive
> > view of the application's memory distribution because there are many
> > metrics available. For example, if we find that a certain cgroup has a
> > large number of inactive anon folios, we can reclaim only those and skip
> > file folios, because with the zram/zswap, the IO tradeoff that
> > cache_trim_mode or other file first logic is making doesn't hold -
> > file refaults will cause IO, whereas anon decompression will not.
> >
> > With this patch, the swappiness argument of memory.reclaim has a new
> > mode 'max', means reclaiming just from anonymous folios both in traditional
> > LRU and MGLRU.
>
> Is MGLRU handled in this patch?

Yes. The value of ONLY_ANON_RECLAIM_MODE is 201, and MGLRU selects the
evictable type like this:

#define evictable_min_seq(min_seq, swappiness)              \
        min((min_seq)[!(swappiness)], (min_seq)[(swappiness) <= MAX_SWAPPINESS])

#define for_each_evictable_type(type, swappiness)           \
        for ((type) = !(swappiness); (type) <= ((swappiness) <= MAX_SWAPPINESS); (type)++)

If swappiness=0, the loop starts at type 1 and the only type visited is
LRU_GEN_FILE(1).

If swappiness=201 (> MAX_SWAPPINESS), the loop becomes

        for ((type) = 0; (type) <= 0; (type)++)

so the type is always LRU_GEN_ANON(0).
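
To make the macro behaviour easier to check, here is a minimal userspace
sketch, not kernel code. It copies for_each_evictable_type from above and
records which types the loop visits. The helper name evictable_type_mask
and the bitmask encoding are assumptions made for illustration only; the
constants mirror the values quoted in this thread.

```c
#include <assert.h>

#define MAX_SWAPPINESS          200
#define ONLY_ANON_RECLAIM_MODE  (MAX_SWAPPINESS + 1)   /* 201, as in the patch */

/* Type indices as used in the thread: anon is 0, file is 1. */
enum { LRU_GEN_ANON = 0, LRU_GEN_FILE = 1 };

/* Copied from the macro quoted above. */
#define for_each_evictable_type(type, swappiness) \
        for ((type) = !(swappiness); (type) <= ((swappiness) <= MAX_SWAPPINESS); (type)++)

/*
 * Hypothetical helper: returns a bitmask of the types the loop visits.
 * Bit LRU_GEN_ANON is set if anon is visited, bit LRU_GEN_FILE if file is.
 */
unsigned int evictable_type_mask(int swappiness)
{
        unsigned int mask = 0;
        int type;

        /* Only 0..200 and the special value 201 are meaningful here. */
        assert(swappiness >= 0 && swappiness <= ONLY_ANON_RECLAIM_MODE);

        for_each_evictable_type(type, swappiness)
                mask |= 1u << type;

        return mask;
}
```

With this helper, swappiness=0 yields only the file bit, any value in
1..200 yields both bits, and 201 yields only the anon bit, which matches
the walk-through above.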
>
> >
> > Here is the previous discussion:
> > https://lore.kernel.org/all/20250314033350.1156370-1-hezhongkun.hzk@bytedance.com/
> > https://lore.kernel.org/all/20250312094337.2296278-1-hezhongkun.hzk@bytedance.com/
> >
> > Suggested-by: Yosry Ahmed
> > Signed-off-by: Zhongkun He
> > ---
> >  Documentation/admin-guide/cgroup-v2.rst |  4 ++++
> >  include/linux/swap.h                    |  4 ++++
> >  mm/memcontrol.c                         |  5 +++++
> >  mm/vmscan.c                             | 10 ++++++++++
> >  4 files changed, 23 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index cb1b4e759b7e..c39ef4314499 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1343,6 +1343,10 @@ The following nested keys are defined.
> >         same semantics as vm.swappiness applied to memcg reclaim with
> >         all the existing limitations and potential future extensions.
> >
> > +       If set swappiness=max, memory reclamation will exclusively
> > +       target the anonymous folio list for both traditional LRU and
> > +       MGLRU reclamation algorithms.
> > +
>
> I don't think we need to specify LRU and MGLRU here. What about:
>
> Setting swappiness=max exclusively reclaims anonymous memory.
>

Agreed, thanks.

> >    memory.peak
> >         A read-write single value file which exists on non-root cgroups.
> >
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index b13b72645db3..a94efac10fe5 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -419,6 +419,10 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> >  #define MEMCG_RECLAIM_PROACTIVE (1 << 2)
> >  #define MIN_SWAPPINESS 0
> >  #define MAX_SWAPPINESS 200
> > +
> > +/* Just reclaim from anon folios in proactive memory reclaim */
> > +#define ONLY_ANON_RECLAIM_MODE (MAX_SWAPPINESS + 1)
> > +
>
> This is a swappiness value so let's keep that clear, e.g.
> SWAPPINESS_ANON_ONLY or similar.
>

OK.

> >  extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
> >                                                   unsigned long nr_pages,
> >                                                   gfp_t gfp_mask,
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 4de6acb9b8ec..0d0400f141d1 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -4291,11 +4291,13 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
> >
> >  enum {
> >         MEMORY_RECLAIM_SWAPPINESS = 0,
> > +       MEMORY_RECLAIM_ONLY_ANON_MODE,
> >         MEMORY_RECLAIM_NULL,
> >  };
> >
> >  static const match_table_t tokens = {
> >         { MEMORY_RECLAIM_SWAPPINESS, "swappiness=%d"},
> > +       { MEMORY_RECLAIM_ONLY_ANON_MODE, "swappiness=max"},
>
> MEMORY_RECLAIM_SWAPPINESS_MAX?
>

OK.

> >         { MEMORY_RECLAIM_NULL, NULL },
> >  };
> >
> > @@ -4329,6 +4331,9 @@ static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
> >                         if (swappiness < MIN_SWAPPINESS || swappiness > MAX_SWAPPINESS)
> >                                 return -EINVAL;
> >                         break;
> > +               case MEMORY_RECLAIM_ONLY_ANON_MODE:
> > +                       swappiness = ONLY_ANON_RECLAIM_MODE;
> > +                       break;
> >                 default:
> >                         return -EINVAL;
> >                 }
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c767d71c43d7..779a9a3cf715 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2438,6 +2438,16 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
> >                 goto out;
> >         }
> >
> > +       /*
> > +        * Do not bother scanning file folios if the memory reclaim
> > +        * invoked by userspace through memory.reclaim and set
> > +        * 'swappiness=max'.
> > +        */
>
> /* Proactive reclaim initiated by userspace for anonymous memory only */
>

That reads more clearly, thanks.

> > +       if (sc->proactive && (swappiness == ONLY_ANON_RECLAIM_MODE)) {
>
> Do we need to check sc->proactive here? Supposedly this swappiness value
> can only be passed in from proactive reclaim. Instead of silently
> ignoring the value from other paths, I wonder if we should WARN on
> !sc->proactive instead.
>

I was also unsure how to handle this check. A WARN on !sc->proactive
looks good to me.

> > +               scan_balance = SCAN_ANON;
> > +               goto out;
> > +       }
> > +
> >         /*
> >          * Do not apply any pressure balancing cleverness when the
> >          * system is close to OOM, scan both anon and file equally
> > --
> > 2.39.5
> >
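
For reference, the direction agreed on in the review comments could look
roughly like the userspace sketch below. It is not the kernel
implementation and not the next version of the patch. The names
parse_swappiness_arg, pick_scan_balance and warn_count are hypothetical,
SWAPPINESS_ANON_ONLY is the name suggested in the review, and the real
get_scan_count() has many more cases than the three shown here. The
sketch also assumes that reclaim still scans anon after warning, which
the thread does not settle; in the kernel the warning itself would
presumably be one of the WARN macros rather than a counter.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MIN_SWAPPINESS        0
#define MAX_SWAPPINESS        200
#define SWAPPINESS_ANON_ONLY  (MAX_SWAPPINESS + 1)   /* name suggested in review */

/* Same enumerator names as the kernel's scan_balance; values are local. */
enum scan_balance { SCAN_EQUAL, SCAN_FRACT, SCAN_ANON, SCAN_FILE };

/* Stand-in for a kernel WARN: counts how often the check fired. */
int warn_count;

/*
 * Hypothetical parser for a single "swappiness=..." token.
 * Returns 0..200, SWAPPINESS_ANON_ONLY for "max", or -EINVAL.
 */
int parse_swappiness_arg(const char *tok)
{
        static const char prefix[] = "swappiness=";
        const char *val;
        char *end;
        long n;

        if (strncmp(tok, prefix, sizeof(prefix) - 1) != 0)
                return -EINVAL;
        val = tok + sizeof(prefix) - 1;

        if (strcmp(val, "max") == 0)
                return SWAPPINESS_ANON_ONLY;

        n = strtol(val, &end, 10);
        if (end == val || *end != '\0')
                return -EINVAL;         /* empty or trailing garbage */
        if (n < MIN_SWAPPINESS || n > MAX_SWAPPINESS)
                return -EINVAL;         /* 201 is reachable only via "max" */
        return (int)n;
}

/* Heavily simplified stand-in for the decision in get_scan_count(). */
enum scan_balance pick_scan_balance(bool proactive, int swappiness)
{
        if (swappiness == SWAPPINESS_ANON_ONLY) {
                /* Only proactive reclaim should ever pass this value. */
                if (!proactive)
                        warn_count++;
                return SCAN_ANON;       /* assumption: still scan anon only */
        }
        if (swappiness == 0)
                return SCAN_FILE;
        return SCAN_FRACT;              /* everything else, in this sketch */
}
```

Keeping 201 out of the numeric range check means userspace can only reach
the anon-only mode by writing swappiness=max, so the warning should fire
only if some in-kernel caller passes the value by mistake.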