From: Zhongkun He
Date: Thu, 13 Mar 2025 19:55:39 +0800
Subject: Re: [External] Re: Re: [PATCH V1] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, muchun.song@linux.dev, linux-mm, LKML
References: <20250313034812.3910627-1-hezhongkun.hzk@bytedance.com>

On Thu, Mar 13, 2025 at 5:43 PM Michal Hocko wrote:
>
> On Thu 13-03-25 16:57:34, Zhongkun He wrote:
> > On Thu, Mar 13, 2025 at 3:57 PM Michal Hocko wrote:
> > >
> > > On Thu 13-03-25 11:48:12, Zhongkun He wrote:
> > > > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > > > memory.reclaim")', we can submit an additional swappiness= argument
> > > > to memory.reclaim.
> > > > It is very useful because we can dynamically adjust
> > > > the reclamation ratio based on the anonymous folios and file folios of
> > > > each cgroup. For example,when swappiness is set to 0, we only reclaim
> > > > from file folios.
> > > >
> > > > However,we have also encountered a new issue: when swappiness is set to
> > > > the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> > > > to the knob of cache_trim_mode, which depends solely on the ratio of
> > > > inactive folios, regardless of whether there are a large number of cold
> > > > folios in anonymous folio list.
> > > >
> > > > So, we hope to add a new control logic where proactive memory reclaim only
> > > > reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> > > > For example, something like this:
> > > >
> > > > echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
> > > >
> > > > will perform reclaim on the rootcg with a swappiness setting of 200 (max
> > > > swappiness) regardless of the file folios. Users have a more comprehensive
> > > > view of the application's memory distribution because there are many
> > > > metrics available. For example, if we find that a certain cgroup has a
> > > > large number of inactive anon folios, we can reclaim only those and skip
> > > > file folios, because with the zram/zswap, the IO tradeoff that
> > > > cache_trim_mode is making doesn't hold - file refaults will cause IO,
> > > > whereas anon decompression will not.
> > > >
> > > > With this patch, the swappiness argument of memory.reclaim has a more
> > > > precise semantics: 0 means reclaiming only from file pages, while 200
> > > > means reclaiming just from anonymous pages.
> > >
> > > Well, with this patch we have 0 - always swap, 200 - never swap and
> > > anything inbetween behaves more or less arbitrary, right?
> > > Not a new problem with swappiness but would it make more sense to drop
> > > all the heuristics for scanning LRUs and simply use the given swappiness
> > > when doing the pro active reclaim?
> >
> > Thanks for your suggestion! I totally agree with you. I'm preparing to send
> > another patch to do this and a new thread to discuss, because I think the
> > implementation doesn't conflict with this one. Do you think so ?
>
> If the change will enforce SCAN_FRACT for proactive reclaim with
> swappiness given then it will make the balancing much smoother but I do
> not think the behavior at both ends of the scale would imply only single
> LRU scanning mode.

Hi Michal,

I'm confused by the statement 'I do not think the behavior at both ends of
the scale would imply only single LRU scanning mode.' and about what we
should do at the max value of swappiness.

Besides that, I have found another issue. If we drop all the heuristics for
scanning LRUs, the swappiness value will each time accurately represent the
ratio of memory to be reclaimed. This means that before each proactive
reclaim operation we would need a fairly clear picture of the current memory
ratio, and we would have to change the swappiness value more often, because
with proactive reclaim the ratio is always changing. As a result,
flexibility would be reduced.

However, at both ends of the scale, we would have a clearer intention to
reclaim from a single list. For example, in a cgroup with 10G of anon pages
and 3G of file pages, I would prefer to set swappiness=200 to reclaim anon
pages only. Once the amounts of file and anon pages become roughly equal, we
can set swappiness=100 and rely on the system's original heuristics to
determine the appropriate amount to reclaim. On the other hand, if we have
1G of anon pages and 10G of page cache, we would like to set swappiness=0 to
reclaim only from file pages, even with cache_trim_mode.
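
As a rough illustration of the two points above (this is user-space Python,
not kernel code): strict_split() models the simplified case where swappiness
is applied as an exact anon/file ratio and ignores any further weighting the
kernel may apply, and pick_swappiness() is one possible policy for choosing
between 0, 100 and 200 from the anon/file sizes of a cgroup. The function
names and the skew threshold of 3 are assumptions made up for this sketch;
only the swappiness range 0..200 and the "2M swappiness=200" request format
come from the thread.

```python
MAX_SWAPPINESS = 200  # upper bound of the swappiness= argument in this thread


def strict_split(nr_bytes, swappiness):
    """Split a reclaim target into (anon, file) bytes.

    Simplified model, assumed for illustration: swappiness / 200 of the
    target comes from anon, the rest from file.
    """
    if not 0 <= swappiness <= MAX_SWAPPINESS:
        raise ValueError("swappiness must be in 0..200")
    anon = nr_bytes * swappiness // MAX_SWAPPINESS
    return anon, nr_bytes - anon


def pick_swappiness(anon_bytes, file_bytes, skew=3.0):
    """Hypothetical policy: go to an extreme only when one side dominates."""
    if anon_bytes >= skew * file_bytes:
        return MAX_SWAPPINESS  # mostly anon: reclaim anon only
    if file_bytes >= skew * anon_bytes:
        return 0               # mostly page cache: reclaim file only
    return 100                 # roughly balanced: leave it to the heuristics


def reclaim_request(amount, swappiness):
    """Build the string that would be written to memory.reclaim."""
    return f"{amount} swappiness={swappiness}"


if __name__ == "__main__":
    GiB = 1 << 30
    for anon, file in ((10 * GiB, 3 * GiB), (5 * GiB, 5 * GiB), (1 * GiB, 10 * GiB)):
        s = pick_swappiness(anon, file)
        print(reclaim_request("2M", s), strict_split(2 << 20, s))
```

With the strict model, any value between the extremes fixes the anon/file
split of every request, which is why the value would have to be recomputed
as the cgroup's memory mix changes.
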
At least from a semantic perspective this is clear, and users don't need to
worry about the threshold of cache_trim_mode, or even know that
cache_trim_mode exists.

Overall, making swappiness=0 and swappiness=200 reclaim from a single LRU
list is intended to address the extreme cases we have actually encountered.
As Johannes mentioned above, with zram/zswap the IO tradeoff that
cache_trim_mode is making doesn't hold - file refaults will cause IO,
whereas anon decompression will not. We would like to set swappiness=200 to
reclaim only from the anon list, which really makes sense to us.

Thanks.

> --
> Michal Hocko
> SUSE Labs