From: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Greg Thelen <gthelen@google.com>,
	David Rientjes <rientjes@google.com>,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/vmscan: check references from all memcgs for swapbacked memory
Date: Wed, 5 Oct 2022 10:04:05 -0400
Message-ID: <Yz2O1dGeBGBTh6SM@cmpxchg.org>
In-Reply-To: <20221004233446.787056-1-yosryahmed@google.com>
Hi Yosry,
On Tue, Oct 04, 2022 at 11:34:46PM +0000, Yosry Ahmed wrote:
> During page/folio reclaim, we check whether a folio is referenced using
> folio_referenced() to avoid reclaiming folios that have been recently
> accessed (hot memory). The rationale is that this memory is likely to be
> accessed soon, and hence reclaiming it will cause a refault.
>
> For memcg reclaim, we pass in sc->target_mem_cgroup to
> folio_referenced(), which means we only check accesses to the folio
> from processes in the subtree of the target memcg. This behavior was
> originally introduced by commit bed7161a519a ("Memory controller: make
> page_referenced() cgroup aware") a long time ago. Back then, refaulted
> pages would get charged to the memcg of the process that was faulting them
> in. It made sense to only consider accesses coming from processes in the
> subtree of target_mem_cgroup. If a page was charged to memcg A but only
> being accessed by a sibling memcg B, we would reclaim it if memcg A is
> under pressure. memcg B can then fault it back in and get charged for it
> appropriately.
>
> Today, this behavior still makes sense for file pages. However, unlike
> file pages, when swapbacked pages are refaulted they are charged to the
> memcg that was originally charged for them during swapout. This
> means that if a swapbacked page is charged to memcg A but only used by
> memcg B, and we reclaim it when memcg A is under pressure, it would
> simply be faulted back in and charged again to memcg A once memcg B
> accesses it. In that sense, accesses from all memcgs matter equally when
> considering if a swapbacked page/folio is a viable reclaim target.
>
> Add folio_referenced_memcg() which decides what memcg we should pass to
> folio_referenced() based on the folio type, and includes an elaborate
> comment about why we should do so. This should help reclaim make better
> decisions and reduce refaults when reclaiming swapbacked memory that is
> used by multiple memcgs.

Great observation, and I agree with this change.

Just one nitpick:

> Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
> ---
> mm/vmscan.c | 38 ++++++++++++++++++++++++++++++++++----
> 1 file changed, 34 insertions(+), 4 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c5a4bff11da6..f9fa0f9287e5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1443,14 +1443,43 @@ enum folio_references {
> FOLIOREF_ACTIVATE,
> };
>
> +/* What memcg should we pass to folio_referenced()? */
> +static struct mem_cgroup *folio_referenced_memcg(struct folio *folio,
> + struct mem_cgroup *target_memcg)
> +{
> + /*
> + * We check references to folios to make sure we don't reclaim hot
> + * folios that are likely to be refaulted soon. We pass a memcg to
> + * folio_referenced() to only check references coming from processes in
> + * that memcg's subtree.
> + *
> + * For file folios, we only consider references from processes in the
> + * subtree of the target memcg. If a folio is charged to
> + * memcg A but is only referenced by processes in memcg B, we reclaim it
> + * if memcg A is under pressure. If it is later accessed by memcg B it
> + * will be faulted back in and charged to memcg B. For memcg A, this is
> + * cold memory that should be reclaimed.
> + *
> + * On the other hand, when swapbacked folios are faulted in, they get
> + * charged to the memcg that was originally charged for them at the time
> + * of swapping out. This means that if a folio that is charged to
> + * memcg A gets swapped out, it will get charged back to A when *any*
> + * memcg accesses it. In that sense, we need to consider references from
> + * *all* processes when considering whether to reclaim a swapbacked
> + * folio.
> + */
> + return folio_test_swapbacked(folio) ? NULL : target_memcg;
> +}
> +
> static enum folio_references folio_check_references(struct folio *folio,
> struct scan_control *sc)
> {
> int referenced_ptes, referenced_folio;
> unsigned long vm_flags;
> + struct mem_cgroup *memcg = folio_referenced_memcg(folio,
> + sc->target_mem_cgroup);
>
> - referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
> - &vm_flags);
> + referenced_ptes = folio_referenced(folio, 1, memcg, &vm_flags);
> referenced_folio = folio_test_clear_referenced(folio);
>
> /*
> @@ -2581,6 +2610,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>
> while (!list_empty(&l_hold)) {
> struct folio *folio;
> + struct mem_cgroup *memcg;
>
> cond_resched();
> folio = lru_to_folio(&l_hold);
> @@ -2600,8 +2630,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
> }
>
> /* Referenced or rmap lock contention: rotate */
> - if (folio_referenced(folio, 0, sc->target_mem_cgroup,
> - &vm_flags) != 0) {
> + memcg = folio_referenced_memcg(folio, sc->target_mem_cgroup);
> + if (folio_referenced(folio, 0, memcg, &vm_flags) != 0) {
Would you mind moving this to folio_referenced() directly? There is
already a comment and branch in there that IMO would extend quite
naturally to cover the new exception:

	/*
	 * If we are reclaiming on behalf of a cgroup, skip
	 * counting on behalf of references from different
	 * cgroups
	 */
	if (memcg) {
		rwc.invalid_vma = invalid_folio_referenced_vma;
	}

That would keep the decision-making and doc in one place.
Thanks!
Johannes
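
For illustration, folding the swapbacked exception into the existing
branch could look roughly like the user-space model below. This is only
a sketch under assumptions, not the kernel code and not necessarily what
a follow-up patch would do: struct folio and struct mem_cgroup are
reduced to hypothetical stand-ins, folio_test_swapbacked() is stubbed,
and reference_filter_memcg() is a made-up helper that stands in for the
"if (memcg)" branch in folio_referenced(). It returns the memcg whose
subtree should limit reference counting, or NULL to count references
from every memcg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mem_cgroup {
	const char *name;
};

struct folio {
	bool swapbacked;
};

/* Stub modelling the kernel's folio_test_swapbacked() on the stand-in type. */
static bool folio_test_swapbacked(const struct folio *folio)
{
	return folio->swapbacked;
}

/*
 * Hypothetical helper modelling the branch inside folio_referenced().
 * Returns the memcg used to filter references, or NULL when references
 * from all memcgs should be counted.
 */
static struct mem_cgroup *reference_filter_memcg(const struct folio *folio,
						 struct mem_cgroup *memcg)
{
	/*
	 * If we are reclaiming on behalf of a cgroup, skip counting
	 * references from different cgroups. Swapbacked folios are the
	 * exception: on swapin they are charged back to the memcg that
	 * was charged at swapout, whoever faults them in, so references
	 * from all cgroups matter.
	 */
	if (memcg && !folio_test_swapbacked(folio))
		return memcg;

	return NULL;
}
```

With this shape, callers such as folio_check_references() and
shrink_active_list() would keep passing sc->target_mem_cgroup unchanged,
and the file versus swapbacked decision would sit next to the comment
that explains it.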