From: Mina Almasry <almasrymina@google.com>
To: "Michal Koutný" <mkoutny@suse.com>
Cc: mike.kravetz@oracle.com, shuah <shuah@kernel.org>,
David Rientjes <rientjes@google.com>,
Shakeel Butt <shakeelb@google.com>,
Greg Thelen <gthelen@google.com>,
akpm@linux-foundation.org, khalid.aziz@oracle.com,
open list <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
cgroups@vger.kernel.org
Subject: Re: [RFC PATCH] hugetlbfs: Add hugetlb_cgroup reservation limits
Date: Fri, 9 Aug 2019 11:05:43 -0700
Message-ID: <CAHS8izNM3jYFWHY5UJ7cmJ402f-RKXzQ=JFHpD7EkvpAdC2_SA@mail.gmail.com>
In-Reply-To: <20190809112738.GB13061@blackbody.suse.cz>
On Fri, Aug 9, 2019 at 4:27 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> (+CC cgroups@vger.kernel.org)
>
> On Thu, Aug 08, 2019 at 12:40:02PM -0700, Mina Almasry <almasrymina@google.com> wrote:
> > We have developers interested in using hugetlb_cgroups, and they have expressed
> > dissatisfaction regarding this behavior.
> I assume you still want to enforce a limit on a particular group and the
> application must be able to handle resource scarcity (but better
> notified than SIGBUS).
>
> > Alternatives considered:
> > [...]
> (I did not try that but) have you considered:
> 3) MAP_POPULATE while you're making the reservation,
I have tried this, and the behaviour is not great. Basically, if
userspace mmaps more memory than its cgroup limit allows with
MAP_POPULATE, the kernel reserves the total amount userspace
requested, faults in pages up to the cgroup limit, and then sends
SIGBUS to the task when it tries to access the rest of its
'reserved' memory.
So for example:
- if /proc/sys/vm/nr_hugepages == 10, and
- your cgroup limit is 5 pages, and
- you mmap(MAP_POPULATE) 7 pages.
Then the kernel will reserve 7 pages, fault in 5 of those 7 pages,
and SIGBUS you when you try to access the remaining 2 pages. So the
problem persists. Folks would still like to know at mmap time that
they are crossing the limit.
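The arithmetic in that example can be sketched as a toy model. This is not kernel code: the function name and parameters are made up for illustration, and it assumes the global pool is otherwise unused and that the reservation is checked only against the global pool, not against the cgroup limit.

```python
def populate_outcome(nr_hugepages, cgroup_limit_pages, requested_pages):
    """Toy model of the MAP_POPULATE scenario; hypothetical helper.

    Returns (reserved, faulted, sigbus_on_access), all in pages.
    """
    # Assumption: mmap fails outright only when the global pool
    # cannot cover the reservation.
    if requested_pages > nr_hugepages:
        raise MemoryError("mmap would fail: global pool too small")
    reserved = requested_pages
    # Faults are charged to the cgroup, so they stop at its limit.
    faulted = min(requested_pages, cgroup_limit_pages)
    # Pages that are reserved but can never be faulted in.
    sigbus_on_access = reserved - faulted
    return reserved, faulted, sigbus_on_access


# nr_hugepages == 10, cgroup limit 5 pages, mmap of 7 pages
print(populate_outcome(10, 5, 7))  # (7, 5, 2)
```

The nonzero third value is the gap at issue: mmap succeeds, and the task only finds out about the limit when it touches those pages.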
> 4) Using multiple hugetlbfs mounts with respective limits.
>
I assume you mean the size=<value> option on the hugetlbfs mount. This
would only limit hugetlb memory used through that hugetlbfs mount.
Tasks can still allocate hugetlb memory without any mount via
mmap(MAP_HUGETLB) and the shmget/shmat APIs, and all these calls will
deplete the global, shared hugetlb memory pool.
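A toy model of why per-mount limits do not bound total usage. The class and method names are hypothetical; it only illustrates that allocations made without a mount skip the mount's size check but still draw from the same global pool.

```python
class HugetlbPool:
    """Toy global hugetlb pool with optional per-mount size limits."""

    def __init__(self, nr_hugepages):
        self.free = nr_hugepages
        self.mounts = {}

    def add_mount(self, name, size_pages):
        # Stands in for a hugetlbfs mount with size=<value>.
        self.mounts[name] = {"size": size_pages, "used": 0}

    def alloc_via_mount(self, name, pages):
        # Checked against both the mount limit and the global pool.
        m = self.mounts[name]
        if m["used"] + pages > m["size"] or pages > self.free:
            return False
        m["used"] += pages
        self.free -= pages
        return True

    def alloc_anonymous(self, pages):
        # Stands in for mmap(MAP_HUGETLB) or shmget/shmat:
        # no mount is involved, so only the global pool is checked.
        if pages > self.free:
            return False
        self.free -= pages
        return True


pool = HugetlbPool(10)
pool.add_mount("a", 3)
print(pool.alloc_via_mount("a", 4))  # False: over the mount limit
print(pool.alloc_anonymous(8))       # True: bypasses the mount limit
print(pool.alloc_via_mount("a", 3))  # False: only 2 pages left globally
```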
> > Caveats:
> > 1. This support is implemented for cgroups-v1. I have not tried
> > hugetlb_cgroups with cgroups v2, and AFAICT it's not supported yet.
> > This is largely because we use cgroups-v1 for now.
> Adding something new into v1 without v2 counterpart, is making migration
> harder, that's one of the reasons why v1 API is rather frozen now. (I'm
> not sure whether current hugetlb controller fits into v2 at all though.)
>
In my estimation it's maybe fine to make this change in v1 because, as
far as I understand, hugetlb_cgroups are a little-used feature of the
kernel (although we do see it requested), they aren't supported in v2
yet, and I don't *think* this change makes it any harder to port
hugetlb_cgroups to v2.
But, like I said, if there is consensus that this must not be checked
in unless hugetlb_cgroups v2 support is added alongside, I can take a
look at that.
> Michal