From: Mike Kravetz <mike.kravetz@oracle.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
akpm@linux-foundation.org, riel@surriel.com, mhocko@kernel.org,
roman.gushchin@linux.dev, shakeelb@google.com,
muchun.song@linux.dev, tj@kernel.org, lizefan.x@bytedance.com,
shuah@kernel.org, yosryahmed@google.com, fvdl@google.com,
linux-mm@kvack.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller
Date: Tue, 3 Oct 2023 15:42:14 -0700 [thread overview]
Message-ID: <20231003224214.GE314430@monkey> (raw)
In-Reply-To: <CAKEwX=O8X+ZRNEzzs+NYBAh9QJLObxKPOpmFo-ci4Aw=MR=0+Q@mail.gmail.com>
On 10/03/23 15:09, Nhat Pham wrote:
> On Tue, Oct 3, 2023 at 11:39 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Tue, Oct 03, 2023 at 11:01:24AM -0700, Nhat Pham wrote:
> > > On Tue, Oct 3, 2023 at 10:13 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > > On 10/02/23 17:18, Nhat Pham wrote:
> > > >
> > > > IIUC, huge page usage is charged in alloc_hugetlb_folio and uncharged in
> > > > free_huge_folio. During migration, huge pages are allocated via
> > > > alloc_migrate_hugetlb_folio, not alloc_hugetlb_folio. So, there is no
> > > > charging for the migration target page and we uncharge the source page.
> > > > It looks like there will be no charge for the huge page after migration?
> > > >
> > >
> > > Ah I see! This is a bit subtle indeed.
> > >
> > > For the hugetlb controller, it looks like they update the cgroup info
> > > inside move_hugetlb_state(), which calls hugetlb_cgroup_migrate()
> > > to transfer the hugetlb cgroup info to the destination folio.
> > >
> > > Perhaps we can do something analogous here.
> > >
> > > > If my analysis above is correct, then we may need to be careful about
> > > > this accounting. We may not want both source and target pages to be
> > > > charged at the same time.
> > >
> > > We can create a variant of mem_cgroup_migrate that does not double
> > > charge, but instead just copy the mem_cgroup information to the new
> > > folio, and then clear that info from the old folio. That way the memory
> > > usage counters are untouched.
> > >
> > > Somebody with more expertise on migration should fact check me
> > > of course :)
> >
> > The only reason mem_cgroup_migrate() double charges right now is
> > because it's used by replace_page_cache_folio(). In that context, the
> > isolation of the old page isn't quite as thorough as with migration,
> > so it cannot transfer and uncharge directly. This goes back a long
> > time: 0a31bc97c80c3fa87b32c091d9a930ac19cd0c40
> >
> > If you rename the current implementation to mem_cgroup_replace_page()
> > for that one caller, you can add a mem_cgroup_migrate() variant which
> > is charge neutral and clears old->memcg_data. This can be used for
> > regular and hugetlb page migration. Something like this (totally
> > untested):
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index a4d3282493b6..17ec45bf3653 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -7226,29 +7226,14 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
> > if (mem_cgroup_disabled())
> > return;
> >
> > - /* Page cache replacement: new folio already charged? */
> > - if (folio_memcg(new))
> > - return;
> > -
> > memcg = folio_memcg(old);
> > VM_WARN_ON_ONCE_FOLIO(!memcg, old);
> > if (!memcg)
> > return;
> >
> > - /* Force-charge the new page. The old one will be freed soon */
> > - if (!mem_cgroup_is_root(memcg)) {
> > - page_counter_charge(&memcg->memory, nr_pages);
> > - if (do_memsw_account())
> > - page_counter_charge(&memcg->memsw, nr_pages);
> > - }
> > -
> > - css_get(&memcg->css);
> > + /* Transfer the charge and the css ref */
> > commit_charge(new, memcg);
> > -
> > - local_irq_save(flags);
> > - mem_cgroup_charge_statistics(memcg, nr_pages);
> > - memcg_check_events(memcg, folio_nid(new));
> > - local_irq_restore(flags);
> > + old->memcg_data = 0;
> > }
> >
> > DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key);
> >
>
> Ah, I like this. Will send a fixlet based on this :)
> I was scratching my head trying to figure out why we were
> doing the double charging in the first place. Thanks for the context,
> Johannes!
Be sure to check for code similar to this in folio_migrate_flags:
void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
{
...
if (!folio_test_hugetlb(folio))
mem_cgroup_migrate(folio, newfolio);
}
There are many places where hugetlb is special cased.
--
Mike Kravetz
next prev parent reply other threads:[~2023-10-03 22:42 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-03 0:18 [PATCH v3 0/3] hugetlb memcg accounting Nhat Pham
2023-10-03 0:18 ` [PATCH v3 1/3] memcontrol: add helpers for " Nhat Pham
2023-10-03 11:50 ` Michal Hocko
2023-10-03 12:47 ` Johannes Weiner
2023-10-03 0:18 ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller Nhat Pham
2023-10-03 0:26 ` Nhat Pham
2023-10-03 12:54 ` Johannes Weiner
2023-10-03 12:58 ` Michal Hocko
2023-10-03 15:59 ` Johannes Weiner
2023-10-03 17:13 ` Mike Kravetz
2023-10-03 18:01 ` Nhat Pham
2023-10-03 18:39 ` Johannes Weiner
2023-10-03 22:09 ` Nhat Pham
2023-10-03 22:42 ` Mike Kravetz [this message]
2023-10-03 23:26 ` Nhat Pham
2023-10-03 23:14 ` [PATCH] memcontrol: only transfer the memcg data for migration Nhat Pham
2023-10-03 23:22 ` Yosry Ahmed
2023-10-03 23:31 ` Nhat Pham
2023-10-03 23:54 ` Yosry Ahmed
2023-10-04 0:02 ` Nhat Pham
2023-10-04 0:02 ` Nhat Pham
2023-10-04 14:17 ` Johannes Weiner
2023-10-04 19:45 ` [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller (fix) Nhat Pham
2023-10-06 17:25 ` Andrew Morton
2023-10-06 18:23 ` Nhat Pham
2023-10-03 0:18 ` [PATCH v3 3/3] selftests: add a selftest to verify hugetlb usage in memcg Nhat Pham
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231003224214.GE314430@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=fvdl@google.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lizefan.x@bytedance.com \
--cc=mhocko@kernel.org \
--cc=muchun.song@linux.dev \
--cc=nphamcs@gmail.com \
--cc=riel@surriel.com \
--cc=roman.gushchin@linux.dev \
--cc=shakeelb@google.com \
--cc=shuah@kernel.org \
--cc=tj@kernel.org \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox