linux-mm.kvack.org archive mirror
From: Shakeel Butt <shakeel.butt@linux.dev>
To: Matt Fleming <matt@readmodwrite.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <axboe@kernel.dk>,
	 Sergey Senozhatsky <senozhatsky@chromium.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Minchan Kim <minchan@kernel.org>,
	kernel-team@cloudflare.com,
	 Matt Fleming <mfleming@cloudflare.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	 Barry Song <baohua@kernel.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	 Brendan Jackman <jackmanb@google.com>, Zi Yan <ziy@nvidia.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	 David Hildenbrand <david@kernel.org>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	 Lorenzo Stoakes <ljs@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Require LRU reclaim progress before retrying direct reclaim
Date: Wed, 15 Apr 2026 18:01:54 -0700
Message-ID: <aeAtOUIhFv5hXOyb@linux.dev>
In-Reply-To: <20260410101550.2930139-1-matt@readmodwrite.com>

On Fri, Apr 10, 2026 at 11:15:49AM +0100, Matt Fleming wrote:
> From: Matt Fleming <mfleming@cloudflare.com>
> 
> should_reclaim_retry() uses zone_reclaimable_pages() to estimate whether
> retrying reclaim could eventually satisfy an allocation. It's possible
> for reclaim to make minimal or no progress on an LRU type despite having
> ample reclaimable pages, e.g. anonymous pages when the only swap is
> RAM-backed (zram). 

Or incompressible memory on zswap with writeback disabled, or an overcommitted
memory.min.

> This can cause the reclaim path to loop indefinitely.
> 
> Track LRU reclaim progress (anon vs file) through a new struct
> reclaim_progress passed out of try_to_free_pages(), and only count a
> type's reclaimable pages if at least reclaim_progress_pct% was actually
> reclaimed in the last cycle.
> 
> The threshold is exposed as /proc/sys/vm/reclaim_progress_pct (default
> 1, range 0-100). 

Let's not expose any sysctl or user-visible API for this heuristic. The
heuristic will evolve, and the interface would then be awkward and hard to
remove.

> Setting 0 disables the gate and restores the previous
> behaviour. Environments with only RAM-backed swap (zram) and small
> memory may need a higher value to prevent futile anon LRU churn from
> keeping the allocator spinning.
> 
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
> ---

[...]

>  
> @@ -4637,7 +4672,24 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
>  			!__cpuset_zone_allowed(zone, gfp_mask))
>  				continue;
>  
> -		available = reclaimable = zone_reclaimable_pages(zone);
> +		/*
> +		 * Only count reclaimable pages from an LRU type if reclaim
> +		 * actually made headway on that type in the last cycle.
> +		 * This prevents the allocator from looping endlessly on
> +		 * account of a large pool of pages that reclaim cannot make
> +		 * progress on, e.g. anonymous pages when the only swap is
> +		 * RAM-backed (zram).
> +		 */
> +		reclaimable = 0;
> +		reclaimable_file = zone_reclaimable_file_pages(zone);
> +		reclaimable_anon = zone_reclaimable_anon_pages(zone);

Here we are getting the current reclaimable pages.

> +
> +		if (reclaim_progress_sufficient(progress->nr_file, reclaimable_file))
> +			reclaimable += reclaimable_file;
> +		if (reclaim_progress_sufficient(progress->nr_anon, reclaimable_anon))
> +			reclaimable += reclaimable_anon;

And here we are comparing the current reclaimable pages with the progress made
in the last iteration. Is this intentional, to keep things simple?
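
To make the question concrete, here is a minimal user-space sketch of the
comparison as it reads from the changelog. The body of
reclaim_progress_sufficient() is not quoted in this mail, so the function
name progress_sufficient_sketch(), its pct parameter and the integer
arithmetic are all assumptions, not the patch's code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the patch's reclaim_progress_sufficient().
 *
 * nr_reclaimed:   pages the PREVIOUS reclaim cycle freed from one LRU type
 * nr_reclaimable: that type's reclaimable pages sampled NOW
 * pct:            threshold in percent; 0 disables the gate
 *
 * Returns true if nr_reclaimed is at least pct% of nr_reclaimable.
 */
static bool progress_sufficient_sketch(unsigned long long nr_reclaimed,
				       unsigned long long nr_reclaimable,
				       unsigned int pct)
{
	if (pct == 0)
		return true;

	/* nr_reclaimed / nr_reclaimable >= pct / 100, without dividing */
	return nr_reclaimed * 100 >= nr_reclaimable * pct;
}
```

In this sketch the numerator comes from the previous cycle while the
denominator is sampled afterwards, so a cycle that freed 1000 pages out of
100000 is later judged against the remaining 99000. With small thresholds the
difference is likely negligible, which may be why the simpler form is fine.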

> +
> +		available = reclaimable;
>  		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  

Another heuristic we can play with is to also pass the vmscan scan count
through. If we continue to see low reclaim efficiency for a couple of
consecutive iterations, go for OOM. Also, maybe compare the scan count with
the watermark: I expect the scan count not to differ much between consecutive
reclaim iterations, so it is a good proxy for reclaimable memory.

The reclaim efficiency heuristic should handle the swap-on-zram and
incompressible-zswap-with-no-writeback cases. Treating the scan count as a
proxy for reclaimable memory should handle the overcommitted memory.min case.
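
A rough user-space sketch of the two ideas combined, only to illustrate the
shape of the heuristic. Everything here is hypothetical: the struct, the
function name, the two constants, and the plain "free + scanned > watermark"
test, which stands in for the real watermark check in should_reclaim_retry().

```c
#include <stdbool.h>

#define LOW_EFF_PCT	1	/* assumed: below 1% reclaimed/scanned is "low" */
#define LOW_EFF_STREAK	2	/* assumed: "a couple of" consecutive iterations */

/* Hypothetical state carried across reclaim retries. */
struct retry_state_sketch {
	unsigned int low_eff_streak;
};

/*
 * Decide whether another direct reclaim retry is worthwhile.
 * Returns false when the caller should stop retrying (i.e. go for OOM).
 */
static bool should_retry_sketch(struct retry_state_sketch *st,
				unsigned long long nr_scanned,
				unsigned long long nr_reclaimed,
				unsigned long long nr_free,
				unsigned long long wmark)
{
	/*
	 * Scan count as a proxy for reclaimable memory: if even reclaiming
	 * everything that was scanned could not lift us above the watermark,
	 * retrying is futile (e.g. memory protected by memory.min).
	 */
	if (nr_free + nr_scanned <= wmark)
		return false;

	/* Reclaim efficiency: count consecutive low-efficiency iterations. */
	if (nr_reclaimed * 100 < nr_scanned * LOW_EFF_PCT)
		st->low_eff_streak++;
	else
		st->low_eff_streak = 0;

	return st->low_eff_streak < LOW_EFF_STREAK;
}
```

The streak counter keeps a single bad iteration from triggering OOM, while
the scan-count test does not depend on the LRU sizes at all.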


