From: Mel Gorman <mgorman@techsingularity.net>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Arjan Van De Ven <arjan@linux.intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
David Hildenbrand <david@redhat.com>,
Johannes Weiner <jweiner@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Michal Hocko <mhocko@suse.com>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Matthew Wilcox <willy@infradead.org>,
Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark
Date: Mon, 23 Oct 2023 10:26:47 +0100 [thread overview]
Message-ID: <20231023092647.dxu45uq4lncw5asy@techsingularity.net> (raw)
In-Reply-To: <87o7guezrm.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Fri, Oct 20, 2023 at 11:30:53AM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@techsingularity.net> writes:
>
> > On Mon, Oct 16, 2023 at 01:30:01PM +0800, Huang Ying wrote:
> >> One goal of the PCP is to minimize the number of pages held in the
> >> PCP when the system's free pages are too few. To reach that goal,
> >> when page reclaim is active for the zone (ZONE_RECLAIM_ACTIVE), we
> >> stop increasing PCP high in the allocation path, and decrease PCP
> >> high and free some pages in the freeing path. But this may be too
> >> late, because background page reclaim may introduce latency for some
> >> workloads. So, in this patch, during page allocation we detect
> >> whether the number of free pages in the zone is below the high
> >> watermark. If so, we stop increasing PCP high in the allocation
> >> path, and decrease PCP high and free some pages in the freeing path.
> >> With this, we reduce the possibility of premature background page
> >> reclaim caused by a too-large PCP.
> >>
> >> The high watermark check is done in the allocation path to reduce
> >> overhead in the hotter freeing path.
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Mel Gorman <mgorman@techsingularity.net>
> >> Cc: Vlastimil Babka <vbabka@suse.cz>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Johannes Weiner <jweiner@redhat.com>
> >> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> >> Cc: Matthew Wilcox <willy@infradead.org>
> >> Cc: Christoph Lameter <cl@linux.com>
> >> ---
> >> include/linux/mmzone.h | 1 +
> >> mm/page_alloc.c | 33 +++++++++++++++++++++++++++++++--
> >> 2 files changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index ec3f7daedcc7..c88770381aaf 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -1018,6 +1018,7 @@ enum zone_flags {
> >> * Cleared when kswapd is woken.
> >> */
> >> ZONE_RECLAIM_ACTIVE, /* kswapd may be scanning the zone. */
> >> + ZONE_BELOW_HIGH, /* zone is below high watermark. */
> >> };
> >>
> >> static inline unsigned long zone_managed_pages(struct zone *zone)
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 8382ad2cdfd4..253fc7d0498e 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2407,7 +2407,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
> >> return min(batch << 2, pcp->high);
> >> }
> >>
> >> - if (pcp->count >= high && high_min != high_max) {
> >> + if (high_min == high_max)
> >> + return high;
> >> +
> >> + if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
> >> + pcp->high = max(high - (batch << pcp->free_factor), high_min);
> >> + high = max(pcp->count, high_min);
> >> + } else if (pcp->count >= high) {
> >> int need_high = (batch << pcp->free_factor) + batch;
> >>
> >> /* pcp->high should be large enough to hold batch freed pages */
> >> @@ -2457,6 +2463,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
> >> if (pcp->count >= high) {
> >> free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
> >> pcp, pindex);
> >> + if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
> >> + zone_watermark_ok(zone, 0, high_wmark_pages(zone),
> >> + ZONE_MOVABLE, 0))
> >> + clear_bit(ZONE_BELOW_HIGH, &zone->flags);
> >> }
> >> }
> >>
> >
> > This is a relatively fast path and freeing pages should not need to check
> > watermarks.
>
> Another thing that mitigates the overhead is that the watermark check
> only occurs when we free pages from the PCP back to the buddy
> allocator. That is, in most cases, only once per 63 pages freed.
>
True
> > While the overhead is mitigated because it applies only when free
> > pages are below the high watermark, that is also potentially an
> > unbounded condition if a workload is sized precisely enough. Why not
> > clear this bit when kswapd is going to sleep after reclaiming enough
> > pages in a zone?
>
> IIUC, if the number of free pages is kept above the low watermark,
> then kswapd will never be woken up, even if the number of free pages
> has dropped below the high watermark.
>
Also true, and I did think of that later. I guess it's ok; the chances
are that the series overall offsets any micro-costs like this, so I'm
happy. If, for some reason, this overhead is noticeable (doubtful), it
can be revisited.
Thanks.
--
Mel Gorman
SUSE Labs
Thread overview:
2023-10-16 5:29 [PATCH -V3 0/9] mm: PCP high auto-tuning Huang Ying
2023-10-16 5:29 ` [PATCH -V3 1/9] mm, pcp: avoid to drain PCP when process exit Huang Ying
2023-10-16 5:29 ` [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice Huang Ying
2023-10-19 12:11 ` Mel Gorman
2023-10-16 5:29 ` [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages Huang Ying
2023-10-27 6:23 ` kernel test robot
2023-11-06 6:22 ` kernel test robot
2023-11-06 6:38 ` Huang, Ying
2023-10-16 5:29 ` [PATCH -V3 4/9] mm: restrict the pcp batch scale factor to avoid too long latency Huang Ying
2023-10-19 12:12 ` Mel Gorman
2023-10-16 5:29 ` [PATCH -V3 5/9] mm, page_alloc: scale the number of pages that are batch allocated Huang Ying
2023-10-16 5:29 ` [PATCH -V3 6/9] mm: add framework for PCP high auto-tuning Huang Ying
2023-10-19 12:16 ` Mel Gorman
2023-10-16 5:30 ` [PATCH -V3 7/9] mm: tune PCP high automatically Huang Ying
2023-10-31 2:50 ` kernel test robot
2023-10-16 5:30 ` [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high watermark Huang Ying
2023-10-19 12:33 ` Mel Gorman
2023-10-20 3:30 ` Huang, Ying
2023-10-23 9:26 ` Mel Gorman [this message]
2023-10-16 5:30 ` [PATCH -V3 9/9] mm, pcp: reduce detecting time of consecutive high order page freeing Huang Ying