Re: [PATCH v9] mm/page_alloc: boost watermarks on atomic allocation failure

From: SeongJae Park <sj@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: SeongJae Park <sj@kernel.org>,
	Qiliang Yuan <realwujing@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
	Lance Yang <lance.yang@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9] mm/page_alloc: boost watermarks on atomic allocation failure
Date: Fri, 13 Feb 2026 07:07:20 -0800
Message-ID: <20260213150721.72997-1-sj@kernel.org>
In-Reply-To: <c2388a14-9ef1-419b-b86a-56629708be15@suse.cz>

On Fri, 13 Feb 2026 09:46:14 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 2/13/26 04:17, Qiliang Yuan wrote:
> > Atomic allocations (GFP_ATOMIC) are prone to failure under heavy memory
> > pressure as they cannot enter direct reclaim. This patch introduces a
> > watermark boost mechanism to mitigate this issue.
> > 
> > When a GFP_ATOMIC request enters the slowpath, the preferred zone's
> > watermark_boost is increased under zone->lock protection. This triggers
> > kswapd to proactively reclaim memory, creating a safety buffer for
> > future atomic allocations. A 1-second debounce timer prevents excessive
> > boosts during traffic bursts.
> > 
> > This approach reuses existing watermark_boost infrastructure with
> > minimal overhead and proper locking to ensure thread safety.
[...]
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index c380f063e8b7..8af88584a8bd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -218,6 +218,13 @@ unsigned int pageblock_order __read_mostly;
> >  static void __free_pages_ok(struct page *page, unsigned int order,
> >  			    fpi_t fpi_flags);
> >  
> > +/*
> > + * Boost watermarks by ~0.1% of zone size on atomic allocation pressure.
> > + * This provides zone-proportional safety buffers: ~1MB per 1GB of zone size.
> > + * Larger zones under GFP_ATOMIC pressure need proportionally larger reserves.
> > + */
> > +#define ATOMIC_BOOST_FACTOR 1
> 
> ... so now we #define 1 but it makes little sense without that hardcoded
> 1000 below.

I agree.  I think it could be easier to understand if we use 10000 as the
denominator, consistent with other similar ones, like watermark_scale_factor.
Or, defining it as a constant local variable or a hard-coded value right before
its single real use might be easier to read, for the reason mentioned below.
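
For illustration only, here is a minimal userspace sketch of the
10000-denominator idea.  It is not part of the patch: mult_frac_ul() is a
stand-in for the kernel's mult_frac() macro, and atomic_boost_pages() and the
factor value 10 are hypothetical names and values chosen so that 10/10000
keeps the same ~0.1% ratio as the posted 1/1000.

```c
/*
 * Userspace stand-in for the kernel's mult_frac(x, n, d): computes
 * x * n / d via quotient and remainder so the intermediate product
 * stays small for large x.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long n,
				  unsigned long d)
{
	unsigned long q = x / d;
	unsigned long r = x % d;

	return q * n + r * n / d;
}

/* Hypothetical helper: boost amount for a zone, in pages. */
static unsigned long atomic_boost_pages(unsigned long managed_pages,
					unsigned long pageblock_pages)
{
	/* Assumed value: 10/10000 == 0.1%, the same ratio as 1/1000. */
	const unsigned long atomic_boost_factor = 10;
	unsigned long boost = mult_frac_ul(managed_pages,
					   atomic_boost_factor, 10000);

	/* Never boost by less than one pageblock, as the patch does. */
	return boost > pageblock_pages ? boost : pageblock_pages;
}
```

With the factor kept as a local constant next to its only use, the ratio and
the denominator can be read together in one place.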

> 
> > +
> >  /*
> >   * results with 256, 32 in the lowmem_reserve sysctl:
> >   *	1G machine -> (16M dma, 800M-16M normal, 1G-800M high)
> > @@ -2161,6 +2168,9 @@ bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *pag
> >  static inline bool boost_watermark(struct zone *zone)
> >  {
> >  	unsigned long max_boost;
> > +	unsigned long boost_amount;
> > +
> > +	lockdep_assert_held(&zone->lock);
> >  
> >  	if (!watermark_boost_factor)
> >  		return false;
> > @@ -2189,12 +2199,43 @@ static inline bool boost_watermark(struct zone *zone)
> >  
> >  	max_boost = max(pageblock_nr_pages, max_boost);
> >  
> > -	zone->watermark_boost = min(zone->watermark_boost + pageblock_nr_pages,
> > -		max_boost);
> > +	boost_amount = max(pageblock_nr_pages,
> > +			   mult_frac(zone_managed_pages(zone), ATOMIC_BOOST_FACTOR, 1000));
> 
> I don't think mult_frac() was a great suggestion. We're talking about right
> shifting by a constant 10. In the other cases of mult_frac() we use dynamic
> values for x and n so it's justified. But this IMHO is unnecessary complication.

This file uses mult_frac() in two places with a hard-coded denominator of 10000.
Hence I feel it is more consistent to use mult_frac() with the same denominator
(10000) and consistent naming.  In terms of overhead, I think the added
overhead is negligible, since this is called at most once per second.

No strong opinion, though; this is just a trivial matter of personal taste.
Right shifting would also be fine with me. :)
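
To make the two options concrete, a small userspace sketch follows.  It is an
illustration under assumptions, not kernel code: both function names are
hypothetical, and boost_by_mult_frac() spells out the arithmetic that
mult_frac(x, 1, 1000) performs.

```c
/* Shift variant: divides by 1024 instead of 1000. */
static unsigned long boost_by_shift(unsigned long managed_pages)
{
	return managed_pages >> 10;
}

/*
 * Fraction variant: the same quotient/remainder arithmetic as
 * mult_frac(managed_pages, 1, 1000), written out for userspace.
 */
static unsigned long boost_by_mult_frac(unsigned long managed_pages)
{
	const unsigned long n = 1, d = 1000;
	unsigned long q = managed_pages / d;
	unsigned long r = managed_pages % d;

	return q * n + r * n / d;
}
```

For a zone of 262144 pages (1 GiB with 4 KiB pages, an assumed example), the
shift gives 256 pages and the fraction gives 262 pages, a difference of about
2%.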

And now I realize I was thinking ATOMIC_BOOST_SHIFT could be better made
consistent with other similar code because it is defined as a macro.  That is,
I was assuming it would be used in multiple places and should therefore be
easily understood by readers.  Now I find it is actually used only here.
What about defining it as a constant local variable here, or just hard-coding it?


Thanks,
SJ

[...]
