linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Arjan Van De Ven <arjan@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	David Hildenbrand <david@redhat.com>,
	Johannes Weiner <jweiner@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark
Date: Mon, 23 Oct 2023 10:26:47 +0100	[thread overview]
Message-ID: <20231023092647.dxu45uq4lncw5asy@techsingularity.net> (raw)
In-Reply-To: <87o7guezrm.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Fri, Oct 20, 2023 at 11:30:53AM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@techsingularity.net> writes:
> 
> > On Mon, Oct 16, 2023 at 01:30:01PM +0800, Huang Ying wrote:
> >> One target of PCP is to minimize pages in PCP if the system free pages
> >> is too few.  To reach that target, when page reclaiming is active for
> >> the zone (ZONE_RECLAIM_ACTIVE), we will stop increasing PCP high in
> >> allocating path, decrease PCP high and free some pages in freeing
> >> path.  But this may be too late because the background page reclaiming
> >> may introduce latency for some workloads.  So, in this patch, during
> >> page allocation we will detect whether the number of free pages of the
> >> zone is below high watermark.  If so, we will stop increasing PCP high
> >> in allocating path, decrease PCP high and free some pages in freeing
> >> path.  With this, we can reduce the possibility of the premature
> >> background page reclaiming caused by too large PCP.
> >> 
> >> The high watermark checking is done in allocating path to reduce the
> >> overhead in hotter freeing path.
> >> 
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Mel Gorman <mgorman@techsingularity.net>
> >> Cc: Vlastimil Babka <vbabka@suse.cz>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Johannes Weiner <jweiner@redhat.com>
> >> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> >> Cc: Matthew Wilcox <willy@infradead.org>
> >> Cc: Christoph Lameter <cl@linux.com>
> >> ---
> >>  include/linux/mmzone.h |  1 +
> >>  mm/page_alloc.c        | 33 +++++++++++++++++++++++++++++++--
> >>  2 files changed, 32 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index ec3f7daedcc7..c88770381aaf 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -1018,6 +1018,7 @@ enum zone_flags {
> >>  					 * Cleared when kswapd is woken.
> >>  					 */
> >>  	ZONE_RECLAIM_ACTIVE,		/* kswapd may be scanning the zone. */
> >> +	ZONE_BELOW_HIGH,		/* zone is below high watermark. */
> >>  };
> >>  
> >>  static inline unsigned long zone_managed_pages(struct zone *zone)
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 8382ad2cdfd4..253fc7d0498e 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2407,7 +2407,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
> >>  		return min(batch << 2, pcp->high);
> >>  	}
> >>  
> >> -	if (pcp->count >= high && high_min != high_max) {
> >> +	if (high_min == high_max)
> >> +		return high;
> >> +
> >> +	if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
> >> +		pcp->high = max(high - (batch << pcp->free_factor), high_min);
> >> +		high = max(pcp->count, high_min);
> >> +	} else if (pcp->count >= high) {
> >>  		int need_high = (batch << pcp->free_factor) + batch;
> >>  
> >>  		/* pcp->high should be large enough to hold batch freed pages */
> >> @@ -2457,6 +2463,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
> >>  	if (pcp->count >= high) {
> >>  		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
> >>  				   pcp, pindex);
> >> +		if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
> >> +		    zone_watermark_ok(zone, 0, high_wmark_pages(zone),
> >> +				      ZONE_MOVABLE, 0))
> >> +			clear_bit(ZONE_BELOW_HIGH, &zone->flags);
> >>  	}
> >>  }
> >>  
> >
> > This is a relatively fast path and freeing pages should not need to check
> > watermarks.
> 
> Another stuff that mitigate the overhead is that the watermarks checking
> only occurs when we free pages from PCP to buddy.  That is, in most
> cases, every 63 page freeing.
> 

True

> > While the overhead is mitigated because it applies only when
> > the watermark is below high, that is also potentially an unbounded condition
> > if a workload is sized precisely enough. Why not clear this bit when kswapd
> > is going to sleep after reclaiming enough pages in a zone?
> 
> IIUC, if the number of free pages is kept larger than the low watermark,
> then kswapd will have no opportunity to be waken up even if the number
> of free pages was ever smaller than the high watermark.
> 

Also true and I did think of that later. I guess it's ok, the chances
are that the series overall offsets any micro-costs like this so I'm
happy. If, for some reason, this overhead is noticable (doubtful), then
it can be revisted.

Thanks.

-- 
Mel Gorman
SUSE Labs


  reply	other threads:[~2023-10-23  9:26 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-16  5:29 [PATCH -V3 0/9] mm: PCP high auto-tuning Huang Ying
2023-10-16  5:29 ` [PATCH -V3 1/9] mm, pcp: avoid to drain PCP when process exit Huang Ying
2023-10-16  5:29 ` [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice Huang Ying
2023-10-19 12:11   ` Mel Gorman
2023-10-16  5:29 ` [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages Huang Ying
2023-10-27  6:23   ` kernel test robot
2023-11-06  6:22   ` kernel test robot
2023-11-06  6:38     ` Huang, Ying
2023-10-16  5:29 ` [PATCH -V3 4/9] mm: restrict the pcp batch scale factor to avoid too long latency Huang Ying
2023-10-19 12:12   ` Mel Gorman
2023-10-16  5:29 ` [PATCH -V3 5/9] mm, page_alloc: scale the number of pages that are batch allocated Huang Ying
2023-10-16  5:29 ` [PATCH -V3 6/9] mm: add framework for PCP high auto-tuning Huang Ying
2023-10-19 12:16   ` Mel Gorman
2023-10-16  5:30 ` [PATCH -V3 7/9] mm: tune PCP high automatically Huang Ying
2023-10-31  2:50   ` kernel test robot
2023-10-16  5:30 ` [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark Huang Ying
2023-10-19 12:33   ` Mel Gorman
2023-10-20  3:30     ` Huang, Ying
2023-10-23  9:26       ` Mel Gorman [this message]
2023-10-16  5:30 ` [PATCH -V3 9/9] mm, pcp: reduce detecting time of consecutive high order page freeing Huang Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231023092647.dxu45uq4lncw5asy@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=cl@linux.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=jweiner@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox