From: Andrew Morton <akpm@linux-foundation.org>
To: Huang Ying <ying.huang@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Christoph Lameter <cl@linux.com>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH -V2] mm: fix draining PCP of remote zone
Date: Mon, 9 Oct 2023 17:41:35 -0700 [thread overview]
Message-ID: <20231009174135.2357dcfcdc691a6ef61dbd9a@linux-foundation.org> (raw)
In-Reply-To: <20231007062356.187621-1-ying.huang@intel.com>
On Sat, 7 Oct 2023 14:23:56 +0800 Huang Ying <ying.huang@intel.com> wrote:
> If there is no memory allocation/freeing in the PCP (Per-CPU Pageset)
> of a remote zone (zone in remote NUMA node) after some time (3 seconds
> for now), the pages of the PCP of the remote zone will be drained to
> avoid memory wastage.
>
> This behavior was introduced in commit 4ae7c03943fc ("[PATCH]
> Periodically drain non local pagesets") and commit
> 4037d452202e ("Move remote node draining out of slab allocators").
>
> But after commit 7cc36bbddde5 ("vmstat: on-demand vmstat workers
> V8"), the vmstat updater worker, which is used to drain the PCP of
> remote zones, may not be re-queued while we are waiting for the
> timeout (pcp->expire != 0) if there are no vmstat changes on this
> CPU, for example when the CPU goes idle or runs only user-space
> workloads. This can keep the pages of a remote zone in the PCP of
> this CPU for a long time, so page reclaim in the remote zone may be
> triggered prematurely. This isn't a severe problem in practice,
> because the PCP of the remote zone will be drained once some memory
> is allocated or freed again on this CPU, and the PCP will eventually
> be drained during direct reclaim if necessary.
>
> Still, the problem deserves a fix: guarantee that the vmstat updater
> worker is always re-queued while we are waiting for the timeout. In
> effect, this restores the original behavior from before commit
> 7cc36bbddde5.
>
> We can reproduce the bug by allocating/freeing pages in a remote
> zone and then going idle, as follows. The patch fixes it.
>
> - Run some workloads, using `numactl` to bind the CPU to node 0 and
>   memory to node 1, so the PCP of the node 0 CPU for the node 1
>   zone is filled.
>
> - After workloads finish, idle for 60s
>
> - Check /proc/zoneinfo
>
> With the original kernel, the number of pages in the PCP of the CPU
> on node 0 for the zone on node 1 is non-zero after idling. With the
> patched kernel, it becomes 0. That is, we avoid keeping pages in
> the remote PCP while idle.
>
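The quoted reproduction steps can be sketched as a shell script. This is a hypothetical recipe, not part of the patch: it assumes a two-node NUMA machine with `numactl` installed, uses `dd` as a stand-in workload, and exits early on machines without `numactl`.

```shell
#!/bin/sh
set -e
# This repro only makes sense on a NUMA machine with numactl available.
if ! command -v numactl >/dev/null 2>&1; then
    echo "numactl not installed; skipping repro" >&2
    exit 0
fi
# Step 1: run a workload with the CPU bound to node 0 and memory bound
# to node 1, filling the node-0 CPU's PCP for node-1 zones.  The dd
# workload here is just an example allocator/freer.
numactl --cpunodebind=0 --membind=1 \
    dd if=/dev/zero of=/dev/null bs=1M count=1024
# Step 2: idle for 60 seconds, giving the vmstat updater worker time
# to (with the fix) drain the remote PCP.
sleep 60
# Step 3: check the per-CPU pageset counts for node-1 zones; with the
# patched kernel the counts should have dropped to 0.
grep -E 'Node 1|cpu:|count:' /proc/zoneinfo
```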
Thanks, I updated the changelog in place and queued this for mm-stable.