From: Michal Hocko <mhocko@suse.cz>
To: Marian Marinov <mm@yuhu.biz>
Cc: Richard Davies <richard@arachsys.com>,
Dwight Engen <dwight.engen@oracle.com>,
Tim Hockin <thockin@google.com>,
Vladimir Davydov <vdavydov@parallels.com>,
David Rientjes <rientjes@google.com>,
Max Kellermann <mk@cm4all.com>, Tim Hockin <thockin@hockin.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
containers@lists.linux-foundation.org,
Serge Hallyn <serge.hallyn@ubuntu.com>,
Glauber Costa <glommer@parallels.com>,
linux-mm@kvack.org, William Dauchy <wdauchy@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
cgroups@vger.kernel.org, Daniel Walsh <dwalsh@redhat.com>
Subject: Re: Protection against container fork bombs [WAS: Re: memcg with kmem limit doesn't recover after disk i/o causes limit to be hit]
Date: Wed, 30 Apr 2014 15:31:22 +0200
Message-ID: <20140430133122.GG4357@dhcp22.suse.cz>
In-Reply-To: <53601B68.60906@yuhu.biz>
On Wed 30-04-14 00:36:40, Marian Marinov wrote:
> On 04/29/2014 09:27 PM, Michal Hocko wrote:
> >On Tue 29-04-14 19:09:27, Richard Davies wrote:
> >>Dwight Engen wrote:
> >>>Michal Hocko wrote:
> >>>>Tim Hockin wrote:
> >>>>>Here's the reason it doesn't work for us: It doesn't work.
> >>>>
> >>>>There is a "simple" solution for that. Help us to fix it.
> >>>>
> >>>>>It was something like 2 YEARS since we first wanted this, and it
> >>>>>STILL does not work.
> >>>>
> >>>>My recollection is that it was primarily Parallels and Google asking
> >>>>for the kmem accounting. The reason why I didn't fight against
> >>>>inclusion although the implementation at the time didn't have a
> >>>>proper slab shrinking implemented was that that would happen later.
> >>>>Well, that later hasn't happened yet and we are slowly getting there.
> >>>>
> >>>>>You're postponing a pretty simple request indefinitely in
> >>>>>favor of a much more complex feature, which still doesn't really
> >>>>>give me what I want.
> >>>>
> >>>>But we cannot simply add a new interface that will have to be
> >>>>maintained forever just because of something else that is supposed to
> >>>>work around bugs.
> >>>>
> >>>>>What I want is an API that works like rlimit but per-cgroup, rather
> >>>>>than per-UID.
> >>>>
> >>>>You can use an out-of-tree patchset for the time being or help to get
> >>>>kmem into shape. If there are principal reasons why kmem cannot be
> >>>>used then you better articulate them.
> >>>
> >>>Is there a plan to separately account/limit stack pages vs kmem in
> >>>general? Richard would have to verify, but I suspect kmem is not currently
> >>>viable as a process limiter for him because icache/dcache/stack is all
> >>>accounted together.
> >>
> >>Certainly I would like to be able to limit container fork-bombs without
> >>limiting the amount of disk IO caching for processes in those containers.
> >>
> >>In my testing of kmem limits, I needed a limit of 256MB or lower to
> >>catch fork bombs early enough. I would definitely like more than 256MB of
> >>disk caching.
> >>
> >>So if we go the "working kmem" route, I would like to be able to specify a
> >>limit excluding disk cache.
> >
> >Page cache (which is probably what you mean by disk cache) is
> >userspace memory accounted by the memory cgroup controller, and you
> >do not have to limit it. Kmem accounting refers to kernel-internal
> >allocations: slab memory and the per-process kernel stack. You can see
> >how much memory is allocated per container in memory.kmem.usage_in_bytes,
> >or look at /proc/slabinfo to see what kinds of memory the kernel
> >allocates globally and which might be accounted to a container as well.
> >
> >The primary problem with kmem accounting right now is that such
> >memory is not "reclaimed", so once the kmem limit is reached all
> >further kmem allocations fail. The biggest user of kmem allocations
> >on many systems is the dentry and inode cache, which is easily reclaimable.
> >Once reclaim is implemented, the kmem limit will be usable to prevent
> >both fork bombs and other DoS scenarios in which the kernel is pushed to
> >allocate a huge amount of memory.
>
> I would have to disagree here.
> If a container starts to create many processes it will use kmem; however, in my use cases, memory is not the problem.
> Simply scheduling so many processes generates heavy load on the machine.
> Even if I have the memory to handle this... the problem becomes the scheduling of all of these processes.
What prevents you from setting the kmem limit to NR_PROC * 8K + slab_pillow?
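
To make the arithmetic concrete, here is a minimal sketch of that formula. It
assumes an 8K kernel stack per process, as above. The function name, the
1024-process cap and the 64MB slab pillow are illustrative assumptions only,
not recommended values.

```python
# Sketch of: kmem limit = NR_PROC * 8K + slab_pillow
# Assumption: each process costs one 8K kernel stack; all other per-process
# kernel allocations are expected to fit inside the slab pillow.
KERNEL_STACK_BYTES = 8 * 1024


def kmem_limit_bytes(nr_proc: int, slab_pillow_bytes: int) -> int:
    """Return NR_PROC * 8K + slab_pillow, in bytes."""
    if nr_proc < 0 or slab_pillow_bytes < 0:
        raise ValueError("nr_proc and slab_pillow_bytes must be non-negative")
    return nr_proc * KERNEL_STACK_BYTES + slab_pillow_bytes


if __name__ == "__main__":
    # Hypothetical container: at most 1024 processes, 64MB of slab headroom.
    limit = kmem_limit_bytes(1024, 64 * 1024 * 1024)
    print(limit)  # 75497472 bytes, i.e. 72MB
```

The resulting value is what would be set as the kmem limit of the container's
memory cgroup (memory.kmem.limit_in_bytes in the cgroup v1 memory controller).
How large the pillow has to be depends on the workload and is not something
this sketch can tell you.
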
> A typical rsync of 2-3TB of small files (1-100k) will generate heavy pressure
> on the kmem, but would not produce many processes.
Once we have a proper slab reclaim implementation this shouldn't be a
problem.
> On the other hand, forking thousands of processes with a low memory footprint
> will hit the scheduler a lot faster than hitting the kmem limit.
>
> Kmem limit is something that we need! But I firmly believe that we need
> a simple NPROC limit for cgroups.
Once again: if you feel that your use case is not covered by the kmem
limit, follow up on the original email thread I referenced earlier
in this thread. Splitting up the discussion doesn't help at all.
> -hackman
>
> >
> >HTH
> >
> >>I am also somewhat worried that normal software use could legitimately go
> >>above 256MB of kmem (even excluding disk cache) - I got to 50MB in testing
> >>just by booting a distro with a few daemons in a container.
> >>
> >>Richard.
> >
>
--
Michal Hocko
SUSE Labs