Re: [PATCH v1] mm/vmscan: Add retry logic for cgroups with memory.low in kswapd

From: Michal Hocko <mhocko@suse.com>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	David Hildenbrand <david@redhat.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm/vmscan: Add retry logic for cgroups with memory.low in kswapd
Date: Thu, 16 Oct 2025 16:49:51 +0200
Message-ID: <aPEGDwiA_LhuLZmX@tiehlicka>
In-Reply-To: <a6cd4eb712f3b9f8898e9a2e511b397e8dc397fc@linux.dev>

On Tue 14-10-25 12:56:06, Jiayuan Chen wrote:
> October 14, 2025 at 17:33, "Michal Hocko" <mhocko@suse.com> wrote:
>
> > On Tue 14-10-25 16:18:49, Jiayuan Chen wrote:
> >
> > > We can set memory.low for cgroups as a soft protection limit. When the
> > >  kernel cannot reclaim any pages from other cgroups, it retries reclaim
> > >  while ignoring the memory.low protection of the skipped cgroups.
> > >  
> > >  Currently, this retry logic only works in direct reclaim path, but is
> > >  missing in the kswapd asynchronous reclaim. Typically, a cgroup may
> > >  contain some cold pages that could be reclaimed even when memory.low is
> > >  set.
> > >  
> > >  This change adds retry logic to kswapd: if the first reclaim attempt fails
> > >  to reclaim any pages and some cgroups were skipped due to memory.low
> > >  protection, kswapd will perform a second reclaim pass ignoring memory.low
> > >  restrictions.
> > >  
> > >  This ensures more consistent reclaim behavior between direct reclaim and
> > >  kswapd. By allowing kswapd to reclaim more proactively from protected
> > >  cgroups under global memory pressure, this optimization can help reduce
> > >  the occurrence of direct reclaim, which is more disruptive to application
> > >  performance.
> > > 
> > Could you describe the problem you are trying to address in more detail
> > please? Because your patch is significantly changing the behavior of the
> > low limit. I would even go as far as to say it breaks its expectations
> > because the low limit should provide a certain level of protection and
> > your patch would allow kswapd to reclaim from those cgroups much sooner
> > now. If this is really needed then we need a much more detailed
> > justification and also an evaluation of how that influences existing users.
> > 
> > 
> 
> 
> Thanks Michal, let me explain the issue I encountered:
> 
> 1. When kswapd is triggered and there's no reclaimable memory (sc.nr_reclaimed == 0),
> this causes kswapd_failures counter to continuously accumulate until it reaches
> MAX_RECLAIM_RETRIES. This makes the kswapd thread stop running until a direct memory
> reclaim is triggered.

The definition of the low limit is rather vague:
        Best-effort memory protection.  If the memory usage of a
        cgroup is within its effective low boundary, the cgroup's
        memory won't be reclaimed unless there is no reclaimable
        memory available in unprotected cgroups.
        Above the effective low boundary (or
        effective min boundary if it is higher), pages are reclaimed
        proportionally to the overage, reducing reclaim pressure for
        smaller overages.
It doesn't explicitly rule out reclaim from the kswapd context, but
historically we have relied on direct reclaim to detect the "no
reclaimable memory" situation, as it is much easier to achieve in that
context. Also, you do not really explain why backing off kswapd when all
the reclaimable memory is low-limit protected is bad.
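
To make the two behaviours under discussion concrete, here is a toy model
(plain Python, not kernel code). It assumes a single pass that skips any
cgroup at or below its effective low boundary, an optional second pass that
ignores the protection, and a failure counter that backs kswapd off at
MAX_RECLAIM_RETRIES. The Cgroup class and all function names are
hypothetical, and treating only the overage as eligible is a simplification
of the "proportional to the overage" wording quoted above.

```python
from dataclasses import dataclass

MAX_RECLAIM_RETRIES = 16  # same value as the kernel constant of that name


@dataclass
class Cgroup:
    name: str
    usage: int        # pages charged to the cgroup
    low: int          # effective memory.low, in pages
    reclaimable: int  # pages that reclaim could actually free


def reclaim_pass(cgroups, target, ignore_low):
    """One pass over all cgroups; returns (pages reclaimed, any cgroup skipped)."""
    reclaimed, skipped = 0, False
    for cg in cgroups:
        if reclaimed >= target:
            break
        if not ignore_low and cg.usage <= cg.low:
            skipped = True  # within the low boundary: leave it alone
            continue
        # Simplification: while honouring memory.low, only the overage is eligible.
        eligible = cg.reclaimable if ignore_low else min(cg.reclaimable, cg.usage - cg.low)
        took = min(eligible, target - reclaimed)
        cg.usage -= took
        cg.reclaimable -= took
        reclaimed += took
    return reclaimed, skipped


def reclaim(cgroups, target, retry_ignoring_low):
    """First pass honours memory.low; optionally retry once ignoring it."""
    reclaimed, skipped = reclaim_pass(cgroups, target, ignore_low=False)
    if reclaimed == 0 and skipped and retry_ignoring_low:
        reclaimed, _ = reclaim_pass(cgroups, target, ignore_low=True)
    return reclaimed


def kswapd_failures_after(cgroups, target, runs, retry_ignoring_low=False):
    """Failure count after up to `runs` wakeups; stops once kswapd backs off."""
    failures = 0
    for _ in range(runs):
        if failures >= MAX_RECLAIM_RETRIES:
            break  # kswapd has given up until something else makes progress
        if reclaim(cgroups, target, retry_ignoring_low) == 0:
            failures += 1
        else:
            failures = 0
    return failures
```

With a single cgroup whose usage is below its low boundary,
reclaim(..., retry_ignoring_low=False) frees nothing and the counter climbs
to MAX_RECLAIM_RETRIES, which is the back-off described in point 1. With
retry_ignoring_low=True the same call frees pages from the protected cgroup,
which is the behaviour change the patch proposes for kswapd.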

> 2. We observed a phenomenon where kswapd is triggered by watermark_boost rather
> than by actual memory watermarks being insufficient. For boost-triggered
> reclamation, the maximum priority can only be DEF_PRIORITY - 2, making memory
> reclamation more difficult compared to when priority is 1.

Do I get it right that you would like to break low limits on
watermark_boost reclaim? I am not sure I follow your priority argument.
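
For reference, the priority point presumably concerns how the scan window
grows as the reclaim priority drops. A rough sketch, assuming the per-pass
scan target is approximately the LRU size shifted right by the priority and
that DEF_PRIORITY is 12; scan_target is a hypothetical helper and ignores
the scaling applied for memcg protection:

```python
DEF_PRIORITY = 12  # default starting reclaim priority (assumed value)


def scan_target(lru_pages, priority):
    """Approximate number of pages considered in one pass at this priority."""
    return lru_pages >> priority


lru_pages = 1 << 20  # hypothetical LRU of 2**20 pages (4 GiB with 4 KiB pages)

boost_floor = scan_target(lru_pages, DEF_PRIORITY - 2)  # lowest priority for boost reclaim
near_full = scan_target(lru_pages, 1)                   # priority 1 scans half the list

print(boost_floor, near_full)
```

In this model, boost-triggered reclaim never looks at more than 1/1024 of
the list per pass, while priority 1 looks at half of it.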

-- 
Michal Hocko
SUSE Labs

