Re: [PATCH] mm/vmscan: check references from all memcgs for swapbacked memory

From: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Greg Thelen <gthelen@google.com>,
	David Rientjes <rientjes@google.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/vmscan: check references from all memcgs for swapbacked memory
Date: Wed, 5 Oct 2022 10:04:05 -0400
Message-ID: <Yz2O1dGeBGBTh6SM@cmpxchg.org>
In-Reply-To: <20221004233446.787056-1-yosryahmed@google.com>

Hi Yosry,

On Tue, Oct 04, 2022 at 11:34:46PM +0000, Yosry Ahmed wrote:
> During page/folio reclaim, we check if a folio is referenced using
> folio_referenced() to avoid reclaiming folios that have been recently
> accessed (hot memory). The rationale is that this memory is likely to be
> accessed soon, and hence reclaiming it will cause a refault.
> 
> For memcg reclaim, we pass in sc->target_mem_cgroup to
> folio_referenced(), which means we only check accesses to the folio
> from processes in the subtree of the target memcg. This behavior was
> originally introduced by commit bed7161a519a ("Memory controller: make
> page_referenced() cgroup aware") a long time ago. Back then, refaulted
> pages would get charged to the memcg of the process that was faulting them
> in. It made sense to only consider accesses coming from processes in the
> subtree of target_mem_cgroup. If a page was charged to memcg A but only
> being accessed by a sibling memcg B, we would reclaim it if memcg A is
> under pressure. memcg B can then fault it back in and get charged for it
> appropriately.
> 
> Today, this behavior still makes sense for file pages. However, unlike
> file pages, when swapbacked pages are refaulted they are charged to the
> memcg that was originally charged for them during swapout. Which
> means that if a swapbacked page is charged to memcg A but only used by
> memcg B, and we reclaim it when memcg A is under pressure, it would
> simply be faulted back in and charged again to memcg A once memcg B
> accesses it. In that sense, accesses from all memcgs matter equally when
> considering if a swapbacked page/folio is a viable reclaim target.
> 
> Add folio_referenced_memcg() which decides what memcg we should pass to
> folio_referenced() based on the folio type, and includes an elaborate
> comment about why we should do so. This should help reclaim make better
> decisions and reduce refaults when reclaiming swapbacked memory that is
> used by multiple memcgs.

Great observation, and I agree with this change.

Just one nitpick:

> Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> ---
>  mm/vmscan.c | 38 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c5a4bff11da6..f9fa0f9287e5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1443,14 +1443,43 @@ enum folio_references {
>  	FOLIOREF_ACTIVATE,
>  };
>  
> +/* What memcg should we pass to folio_referenced()? */
> +static struct mem_cgroup *folio_referenced_memcg(struct folio *folio,
> +						 struct mem_cgroup *target_memcg)
> +{
> +	/*
> +	 * We check references to folios to make sure we don't reclaim hot
> +	 * folios that are likely to be refaulted soon. We pass a memcg to
> +	 * folio_referenced() to only check references coming from processes in
> +	 * that memcg's subtree.
> +	 *
> +	 * For file folios, we only consider references from processes in the
> +	 * subtree of the target memcg. If a folio is charged to
> +	 * memcg A but is only referenced by processes in memcg B, we reclaim it
> +	 * if memcg A is under pressure. If it is later accessed by memcg B it
> +	 * will be faulted back in and charged to memcg B. For memcg A, this is
> +	 * cold memory that should be reclaimed.
> +	 *
> +	 * On the other hand, when swapbacked folios are faulted in, they get
> +	 * charged to the memcg that was originally charged for them at the time
> +	 * of swapping out. This means that if a folio that is charged to
> +	 * memcg A gets swapped out, it will get charged back to A when *any*
> +	 * memcg accesses it. In that sense, we need to consider references from
> +	 * *all* processes when considering whether to reclaim a swapbacked
> +	 * folio.
> +	 */
> +	return folio_test_swapbacked(folio) ? NULL : target_memcg;
> +}
> +
>  static enum folio_references folio_check_references(struct folio *folio,
>  						  struct scan_control *sc)
>  {
>  	int referenced_ptes, referenced_folio;
>  	unsigned long vm_flags;
> +	struct mem_cgroup *memcg = folio_referenced_memcg(folio,
> +						sc->target_mem_cgroup);
>  
> -	referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
> -					   &vm_flags);
> +	referenced_ptes = folio_referenced(folio, 1, memcg, &vm_flags);
>  	referenced_folio = folio_test_clear_referenced(folio);
>  
>  	/*
> @@ -2581,6 +2610,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  
>  	while (!list_empty(&l_hold)) {
>  		struct folio *folio;
> +		struct mem_cgroup *memcg;
>  
>  		cond_resched();
>  		folio = lru_to_folio(&l_hold);
> @@ -2600,8 +2630,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  		}
>  
>  		/* Referenced or rmap lock contention: rotate */
> -		if (folio_referenced(folio, 0, sc->target_mem_cgroup,
> -				     &vm_flags) != 0) {
> +		memcg = folio_referenced_memcg(folio, sc->target_mem_cgroup);
> +		if (folio_referenced(folio, 0, memcg, &vm_flags) != 0) {

Would you mind moving this to folio_referenced() directly? There is
already a comment and branch in there that IMO would extend quite
naturally to cover the new exception:

	/*
	 * If we are reclaiming on behalf of a cgroup, skip
	 * counting on behalf of references from different
	 * cgroups
	 */
	if (memcg) {
		rwc.invalid_vma = invalid_folio_referenced_vma;
	}

That would keep the decision-making and doc in one place.
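
To illustrate the shape I have in mind, here is a rough userspace
model, not kernel code. The struct layouts, struct mapping and
count_references() are hypothetical stand-ins for the folio, the
rmap walk and folio_referenced(); the model also compares memcgs by
pointer equality and ignores the subtree check the real code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real layouts. */
struct mem_cgroup {
	const char *name;
};

struct folio {
	bool swapbacked;
};

/* One mapping of the folio: who maps it, and was it accessed recently? */
struct mapping {
	const struct mem_cgroup *memcg;
	bool young;
};

/*
 * Model of the decision living inside folio_referenced(): references
 * from other cgroups are skipped only when reclaiming on behalf of a
 * cgroup AND the folio is not swapbacked. Swapbacked folios are charged
 * back to their original memcg on refault, so every reference counts.
 */
static int count_references(const struct folio *folio,
			    const struct mapping *maps, size_t n,
			    const struct mem_cgroup *memcg)
{
	bool filter = memcg && !folio->swapbacked;
	int referenced = 0;
	size_t i;

	assert(maps != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Simplification: exact match instead of a subtree test. */
		if (filter && maps[i].memcg != memcg)
			continue;
		if (maps[i].young)
			referenced++;
	}
	return referenced;
}
```

With a folio mapped and accessed only from memcg B, reclaim targeting
memcg A would see no references for a file folio, but one reference
for a swapbacked folio. Callers keep passing sc->target_mem_cgroup
unchanged.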

Thanks!
Johannes

