linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nhat Pham <nphamcs@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: akpm@linux-foundation.org, riel@surriel.com, hannes@cmpxchg.org,
	 roman.gushchin@linux.dev, shakeelb@google.com,
	muchun.song@linux.dev,  tj@kernel.org, lizefan.x@bytedance.com,
	shuah@kernel.org,  mike.kravetz@oracle.com,
	yosryahmed@google.com, linux-mm@kvack.org,  kernel-team@meta.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH 0/2] hugetlb memcg accounting
Date: Wed, 27 Sep 2023 16:33:59 -0700	[thread overview]
Message-ID: <CAKEwX=MVcHSHRTzHMo7W6PCiWkTNdR8zp0c4UxR4xKDZCVbPQQ@mail.gmail.com> (raw)
In-Reply-To: <ZRQQMABiVIcXXcrg@dhcp22.suse.cz>

On Wed, Sep 27, 2023 at 4:21 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Tue 26-09-23 12:49:47, Nhat Pham wrote:
> > Currently, hugetlb memory usage is not acounted for in the memory
> > controller, which could lead to memory overprotection for cgroups with
> > hugetlb-backed memory. This has been observed in our production system.
> >
> > This patch series rectifies this issue by charging the memcg when the
> > hugetlb folio is allocated, and uncharging when the folio is freed. In
> > addition, a new selftest is added to demonstrate and verify this new
> > behavior.
>
> The primary reason why hugetlb is living outside of memcg (and the core
> MM as well) is that it doesn't really fit the whole scheme. In several
> aspects. First and the foremost it is an independently managed resource
> with its own pool management, use and lifetime.
>
> There is no notion of memory reclaim and this makes a huge difference
> for the pool that might consume considerable amount of memory. While
> this is the case for many kernel allocations as well they usually do not
> consume considerable portions of the accounted memory. This makes it
> really tricky to handle limit enforcement gracefully.
>
> Another important aspect comes from the lifetime semantics when a proper
> reservations accounting and managing needs to handle mmap time rather
> than than usual allocation path. While pages are allocated they do not
> belong to anybody and only later at the #PF time (or read for the fs
> backed mapping) the ownership is established. That makes it really hard
> to manage memory as whole under the memcg anyway as a large part of
> that pool sits without an ownership yet it cannot be used for any other
> purpose.
>
> These and more reasons where behind the earlier decision o have a
> dedicated hugetlb controller.

While I believe all of these are true, I think they are not reasons not to
have memcg accounting. As everyone has pointed out, memcg
accounting by itself cannot handle all situations - it is not a fix-all.
Other mechanisms, such as the HugeTLB controller, could be the better
solution in these cases, and hugetlb memcg accounting is definitely not
an attempt to infringe upon these control domains.

However, memcg accounting is still necessary for certain memory limits
enforcement to work cleanly and properly - such as the use cases we have
(as Johannes has beautifully described). It will certainly help
administrators simplify their control workflow a lot (assuming we do not
surprise them with this change - a new mount option to opt-in should
help with the transition).

>
> Also I will also Nack involving hugetlb pages being accounted by
> default. This would break any setups which mix normal and hugetlb memory
> with memcg limits applied.

Got it! I'll introduce some opt-in mechanisms in the next version. This is
my oversight.


> --
> Michal Hocko
> SUSE Labs


  parent reply	other threads:[~2023-09-27 23:34 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-26 19:49 Nhat Pham
2023-09-26 19:49 ` [PATCH 1/2] hugetlb: memcg: account hugetlb-backed memory in memory controller Nhat Pham
2023-09-26 20:50 ` [PATCH 0/2] hugetlb memcg accounting Frank van der Linden
2023-09-26 22:14   ` Johannes Weiner
2023-09-27 12:50     ` Michal Hocko
2023-09-27 16:44       ` Johannes Weiner
2023-09-27 17:22         ` Nhat Pham
2023-09-26 23:31   ` Nhat Pham
2023-09-27 11:21 ` Michal Hocko
2023-09-27 18:47   ` Johannes Weiner
2023-09-27 21:37     ` Roman Gushchin
2023-09-28 12:52       ` Johannes Weiner
2023-10-01 23:27     ` Mike Kravetz
2023-10-02 14:42       ` Johannes Weiner
2023-10-02 14:58         ` Michal Hocko
2023-10-02 15:36           ` Johannes Weiner
2023-09-27 23:33   ` Nhat Pham [this message]
2023-09-28  1:00 ` Nhat Pham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKEwX=MVcHSHRTzHMo7W6PCiWkTNdR8zp0c4UxR4xKDZCVbPQQ@mail.gmail.com' \
    --to=nphamcs@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan.x@bytedance.com \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=riel@surriel.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=shuah@kernel.org \
    --cc=tj@kernel.org \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox