From: Nhat Pham <nphamcs@gmail.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
akpm@linux-foundation.org, riel@surriel.com, mhocko@kernel.org,
roman.gushchin@linux.dev, shakeelb@google.com,
muchun.song@linux.dev, tj@kernel.org, lizefan.x@bytedance.com,
shuah@kernel.org, yosryahmed@google.com, fvdl@google.com,
linux-mm@kvack.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller
Date: Tue, 3 Oct 2023 16:26:10 -0700
Message-ID: <CAKEwX=MqV5CThRxTXs3DKqGNw04w2j=4hmE+Wi7x4Gu_ykATmw@mail.gmail.com>
In-Reply-To: <20231003224214.GE314430@monkey>
On Tue, Oct 3, 2023 at 3:42 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 10/03/23 15:09, Nhat Pham wrote:
> > On Tue, Oct 3, 2023 at 11:39 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > On Tue, Oct 03, 2023 at 11:01:24AM -0700, Nhat Pham wrote:
> > > > On Tue, Oct 3, 2023 at 10:13 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > > > On 10/02/23 17:18, Nhat Pham wrote:
> > > > >
> > > > > IIUC, huge page usage is charged in alloc_hugetlb_folio and uncharged in
> > > > > free_huge_folio. During migration, huge pages are allocated via
> > > > > alloc_migrate_hugetlb_folio, not alloc_hugetlb_folio. So, there is no
> > > > > charging for the migration target page and we uncharge the source page.
> > > > > It looks like there will be no charge for the huge page after migration?
> > > > >
> > > >
> > > > Ah I see! This is a bit subtle indeed.
> > > >
> > > > For the hugetlb controller, it looks like they update the cgroup info
> > > > inside move_hugetlb_state(), which calls hugetlb_cgroup_migrate()
> > > > to transfer the hugetlb cgroup info to the destination folio.
> > > >
> > > > Perhaps we can do something analogous here.
> > > >
> > > > > If my analysis above is correct, then we may need to be careful about
> > > > > this accounting. We may not want both source and target pages to be
> > > > > charged at the same time.
> > > >
> > > > We can create a variant of mem_cgroup_migrate that does not double
> > > > charge, but instead just copy the mem_cgroup information to the new
> > > > folio, and then clear that info from the old folio. That way the memory
> > > > usage counters are untouched.
> > > >
> > > > Somebody with more expertise on migration should fact check me
> > > > of course :)
> > >
> > > The only reason mem_cgroup_migrate() double charges right now is
> > > because it's used by replace_page_cache_folio(). In that context, the
> > > isolation of the old page isn't quite as thorough as with migration,
> > > so it cannot transfer and uncharge directly. This goes back a long
> > > time: 0a31bc97c80c3fa87b32c091d9a930ac19cd0c40
> > >
> > > If you rename the current implementation to mem_cgroup_replace_page()
> > > for that one caller, you can add a mem_cgroup_migrate() variant which
> > > is charge neutral and clears old->memcg_data. This can be used for
> > > regular and hugetlb page migration. Something like this (totally
> > > untested):
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index a4d3282493b6..17ec45bf3653 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -7226,29 +7226,14 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
> > > >  	if (mem_cgroup_disabled())
> > > >  		return;
> > > >
> > > > -	/* Page cache replacement: new folio already charged? */
> > > > -	if (folio_memcg(new))
> > > > -		return;
> > > > -
> > > >  	memcg = folio_memcg(old);
> > > >  	VM_WARN_ON_ONCE_FOLIO(!memcg, old);
> > > >  	if (!memcg)
> > > >  		return;
> > > >
> > > > -	/* Force-charge the new page. The old one will be freed soon */
> > > > -	if (!mem_cgroup_is_root(memcg)) {
> > > > -		page_counter_charge(&memcg->memory, nr_pages);
> > > > -		if (do_memsw_account())
> > > > -			page_counter_charge(&memcg->memsw, nr_pages);
> > > > -	}
> > > > -
> > > > -	css_get(&memcg->css);
> > > > +	/* Transfer the charge and the css ref */
> > > >  	commit_charge(new, memcg);
> > > > -
> > > > -	local_irq_save(flags);
> > > > -	mem_cgroup_charge_statistics(memcg, nr_pages);
> > > > -	memcg_check_events(memcg, folio_nid(new));
> > > > -	local_irq_restore(flags);
> > > > +	old->memcg_data = 0;
> > > >  }
> > >
> > > DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
> > >
> >
> > Ah, I like this. Will send a fixlet based on this :)
> > I was scratching my head trying to figure out why we were
> > doing the double charging in the first place. Thanks for the context,
> > Johannes!
>
> Be sure to check for code similar to this in folio_migrate_flags:
>
> void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
> {
> 	...
> 	if (!folio_test_hugetlb(folio))
> 		mem_cgroup_migrate(folio, newfolio);
> }
>
> There are many places where hugetlb is special cased.

Yeah, makes sense. I'm actually gonna take advantage of this
and remove the folio_test_hugetlb() check here, so that the memcg
metadata gets migrated for hugetlb folios too. See the new patch
I just sent out.
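
To make the accounting argument concrete, here is a minimal userspace
sketch of the charge-neutral transfer discussed above. It is not kernel
code: struct toy_memcg, struct toy_folio and the toy_*() helpers are
hypothetical stand-ins that only model the page counter, the css
reference and folio->memcg_data, and it assumes the source and target
folios are the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct toy_memcg {
	long charged_pages;	/* models the memcg->memory page counter */
	int refs;		/* models the css reference count */
};

struct toy_folio {
	struct toy_memcg *memcg;	/* models folio->memcg_data */
	long nr_pages;
};

/* Charge at allocation time: bump the counter and take a reference. */
static void toy_charge(struct toy_folio *folio, struct toy_memcg *memcg)
{
	memcg->charged_pages += folio->nr_pages;
	memcg->refs++;
	folio->memcg = memcg;
}

/* Uncharge at free time; a folio that owns no charge is a no-op. */
static void toy_uncharge(struct toy_folio *folio)
{
	if (!folio->memcg)
		return;
	folio->memcg->charged_pages -= folio->nr_pages;
	folio->memcg->refs--;
	folio->memcg = NULL;
}

/*
 * Charge-neutral migration: hand the existing charge and reference
 * from src to dst. Neither the counter nor the refcount changes.
 */
static void toy_migrate(struct toy_folio *src, struct toy_folio *dst)
{
	/* Assumption of this sketch: both folios have the same size. */
	assert(src->nr_pages == dst->nr_pages);

	if (!src->memcg)
		return;
	dst->memcg = src->memcg;
	src->memcg = NULL;	/* analogous to old->memcg_data = 0 */
}
```

In this model, freeing the source folio after toy_migrate() uncharges
nothing, and freeing the target later drops the single charge, so the
two folios are never charged at the same time and the charge does not
get lost across the migration.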
> --
> Mike Kravetz
Thread overview: 26+ messages
2023-10-03 0:18 [PATCH v3 0/3] hugetlb memcg accounting Nhat Pham
2023-10-03 0:18 ` [PATCH v3 1/3] memcontrol: add helpers for " Nhat Pham
2023-10-03 11:50 ` Michal Hocko
2023-10-03 12:47 ` Johannes Weiner
2023-10-03 0:18 ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller Nhat Pham
2023-10-03 0:26 ` Nhat Pham
2023-10-03 12:54 ` Johannes Weiner
2023-10-03 12:58 ` Michal Hocko
2023-10-03 15:59 ` Johannes Weiner
2023-10-03 17:13 ` Mike Kravetz
2023-10-03 18:01 ` Nhat Pham
2023-10-03 18:39 ` Johannes Weiner
2023-10-03 22:09 ` Nhat Pham
2023-10-03 22:42 ` Mike Kravetz
2023-10-03 23:26 ` Nhat Pham [this message]
2023-10-03 23:14 ` [PATCH] memcontrol: only transfer the memcg data for migration Nhat Pham
2023-10-03 23:22 ` Yosry Ahmed
2023-10-03 23:31 ` Nhat Pham
2023-10-03 23:54 ` Yosry Ahmed
2023-10-04 0:02 ` Nhat Pham
2023-10-04 0:02 ` Nhat Pham
2023-10-04 14:17 ` Johannes Weiner
2023-10-04 19:45 ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller (fix) Nhat Pham
2023-10-06 17:25 ` Andrew Morton
2023-10-06 18:23 ` Nhat Pham
2023-10-03 0:18 ` [PATCH v3 3/3] selftests: add a selftest to verify hugetlb usage in memcg Nhat Pham