From: Nhat Pham <nphamcs@gmail.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: akpm@linux-foundation.org, riel@surriel.com, hannes@cmpxchg.org,
mhocko@kernel.org, roman.gushchin@linux.dev,
shakeelb@google.com, muchun.song@linux.dev, tj@kernel.org,
lizefan.x@bytedance.com, shuah@kernel.org,
yosryahmed@google.com, fvdl@google.com, linux-mm@kvack.org,
kernel-team@meta.com, linux-kernel@vger.kernel.org,
cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/3] hugetlb: memcg: account hugetlb-backed memory in memory controller
Date: Tue, 3 Oct 2023 11:01:24 -0700 [thread overview]
Message-ID: <CAKEwX=POd1DZc2K5ym14R2DpU74DqV30_A6QGfsCAaOTMK2WJA@mail.gmail.com> (raw)
In-Reply-To: <20231003171329.GB314430@monkey>
On Tue, Oct 3, 2023 at 10:13 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 10/02/23 17:18, Nhat Pham wrote:
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index de220e3ff8be..74472e911b0a 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1902,6 +1902,7 @@ void free_huge_folio(struct folio *folio)
> > pages_per_huge_page(h), folio);
> > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > pages_per_huge_page(h), folio);
> > + mem_cgroup_uncharge(folio);
> > if (restore_reserve)
> > h->resv_huge_pages++;
> >
> > @@ -3009,11 +3010,20 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > struct hugepage_subpool *spool = subpool_vma(vma);
> > struct hstate *h = hstate_vma(vma);
> > struct folio *folio;
> > - long map_chg, map_commit;
> > + long map_chg, map_commit, nr_pages = pages_per_huge_page(h);
> > long gbl_chg;
> > - int ret, idx;
> > + int memcg_charge_ret, ret, idx;
> > struct hugetlb_cgroup *h_cg = NULL;
> > + struct mem_cgroup *memcg;
> > bool deferred_reserve;
> > + gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
> > +
> > + memcg = get_mem_cgroup_from_current();
> > + memcg_charge_ret = mem_cgroup_hugetlb_try_charge(memcg, gfp, nr_pages);
> > + if (memcg_charge_ret == -ENOMEM) {
> > + mem_cgroup_put(memcg);
> > + return ERR_PTR(-ENOMEM);
> > + }
> >
> > idx = hstate_index(h);
> > /*
> > @@ -3022,8 +3032,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > * code of zero indicates a reservation exists (no change).
> > */
> > map_chg = gbl_chg = vma_needs_reservation(h, vma, addr);
> > - if (map_chg < 0)
> > + if (map_chg < 0) {
> > + if (!memcg_charge_ret)
> > + mem_cgroup_cancel_charge(memcg, nr_pages);
> > + mem_cgroup_put(memcg);
> > return ERR_PTR(-ENOMEM);
> > + }
> >
> > /*
> > * Processes that did not create the mapping will have no
> > @@ -3034,10 +3048,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > */
> > if (map_chg || avoid_reserve) {
> > gbl_chg = hugepage_subpool_get_pages(spool, 1);
> > - if (gbl_chg < 0) {
> > - vma_end_reservation(h, vma, addr);
> > - return ERR_PTR(-ENOSPC);
> > - }
> > + if (gbl_chg < 0)
> > + goto out_end_reservation;
> >
> > /*
> > * Even though there was no reservation in the region/reserve
> > @@ -3119,6 +3131,11 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
> > pages_per_huge_page(h), folio);
> > }
> > +
> > + if (!memcg_charge_ret)
> > + mem_cgroup_commit_charge(folio, memcg);
> > + mem_cgroup_put(memcg);
> > +
> > return folio;
> >
> > out_uncharge_cgroup:
> > @@ -3130,7 +3147,11 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
> > out_subpool_put:
> > if (map_chg || avoid_reserve)
> > hugepage_subpool_put_pages(spool, 1);
> > +out_end_reservation:
> > vma_end_reservation(h, vma, addr);
> > + if (!memcg_charge_ret)
> > + mem_cgroup_cancel_charge(memcg, nr_pages);
> > + mem_cgroup_put(memcg);
> > return ERR_PTR(-ENOSPC);
> > }
> >
>
> IIUC, huge page usage is charged in alloc_hugetlb_folio and uncharged in
> free_huge_folio. During migration, huge pages are allocated via
> alloc_migrate_hugetlb_folio, not alloc_hugetlb_folio. So, there is no
> charging for the migration target page and we uncharge the source page.
> It looks like there will be no charge for the huge page after migration?
>
Ah, I see! This is a bit subtle indeed.
For the hugetlb controller, it looks like the cgroup info is updated
inside move_hugetlb_state(), which calls hugetlb_cgroup_migrate()
to transfer the hugetlb cgroup info to the destination folio.
Perhaps we can do something analogous here.
> If my analysis above is correct, then we may need to be careful about
> this accounting. We may not want both source and target pages to be
> charged at the same time.
We can create a variant of mem_cgroup_migrate() that does not
double-charge, but instead just copies the mem_cgroup information to
the new folio and then clears that info from the old folio. That way
the memory usage counters are untouched.
Somebody with more expertise on migration should fact-check me,
of course :)
> --
> Mike Kravetz