From: Michal Hocko <mhocko@suse.cz>
To: Marian Marinov <mm@yuhu.biz>
Cc: Richard Davies <richard@arachsys.com>,
	Dwight Engen <dwight.engen@oracle.com>,
	Tim Hockin <thockin@google.com>,
	Vladimir Davydov <vdavydov@parallels.com>,
	David Rientjes <rientjes@google.com>,
	Max Kellermann <mk@cm4all.com>, Tim Hockin <thockin@hockin.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	containers@lists.linux-foundation.org,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	Glauber Costa <glommer@parallels.com>,
	linux-mm@kvack.org, William Dauchy <wdauchy@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	cgroups@vger.kernel.org, Daniel Walsh <dwalsh@redhat.com>
Subject: Re: Protection against container fork bombs [WAS: Re: memcg with kmem limit doesn't recover after disk i/o causes limit to be hit]
Date: Wed, 30 Apr 2014 15:31:22 +0200
Message-ID: <20140430133122.GG4357@dhcp22.suse.cz>
In-Reply-To: <53601B68.60906@yuhu.biz>

On Wed 30-04-14 00:36:40, Marian Marinov wrote:
> On 04/29/2014 09:27 PM, Michal Hocko wrote:
> >On Tue 29-04-14 19:09:27, Richard Davies wrote:
> >>Dwight Engen wrote:
> >>>Michal Hocko wrote:
> >>>>Tim Hockin wrote:
> >>>>>Here's the reason it doesn't work for us: It doesn't work.
> >>>>
> >>>>There is a "simple" solution for that. Help us to fix it.
> >>>>
> >>>>>It was something like 2 YEARS since we first wanted this, and it
> >>>>>STILL does not work.
> >>>>
> >>>>My recollection is that it was primarily Parallels and Google asking
> >>>>for the kmem accounting. The reason why I didn't fight against
> >>>>inclusion although the implementation at the time didn't have a
> >>>>proper slab shrinking implemented was that that would happen later.
> >>>>Well, that later hasn't happened yet and we are slowly getting there.
> >>>>
> >>>>>You're postponing a pretty simple request indefinitely in
> >>>>>favor of a much more complex feature, which still doesn't really
> >>>>>give me what I want.
> >>>>
> >>>>But we cannot simply add a new interface that will have to be
> >>>>maintained forever just to work around bugs in something else.
> >>>>
> >>>>>What I want is an API that works like rlimit but per-cgroup, rather
> >>>>>than per-UID.
> >>>>
> >>>>You can use an out-of-tree patchset for the time being or help to get
> >>>>kmem into shape. If there are principal reasons why kmem cannot be
> >>>>used then you better articulate them.
> >>>
> >>>Is there a plan to separately account/limit stack pages vs kmem in
> >>>general? Richard would have to verify, but I suspect kmem is not currently
> >>>viable as a process limiter for him because icache/dcache/stack is all
> >>>accounted together.
> >>
> >>Certainly I would like to be able to limit container fork-bombs without
> >>limiting the amount of disk IO caching for processes in those containers.
> >>
> >>In my testing of kmem limits, I needed a limit of 256MB or lower to
> >>catch fork bombs early enough. I would definitely like more than 256MB of
> >>disk caching.
> >>
> >>So if we go the "working kmem" route, I would like to be able to specify a
> >>limit excluding disk cache.
> >
> >Page cache (which is probably what you mean by disk cache) is accounted
> >as userspace memory by the memory cgroup controller, and you
> >do not have to limit it. Kmem accounting refers to kernel-internal
> >allocations: slab memory and the per-process kernel stack. You can see how
> >much kernel memory a container has allocated in memory.kmem.usage_in_bytes, or
> >have a look at /proc/slabinfo to see what kinds of memory the kernel
> >allocates globally and that might be accounted to a container as well.
> >
> >The primary problem with the kmem accounting right now is that such
> >memory is not "reclaimed", so once the kmem limit is reached all
> >further kmem allocations fail. The biggest user of kmem allocations
> >on many systems is the dentry and inode cache, which is easily reclaimable.
> >Once that reclaim is implemented, the kmem limit will be usable to prevent
> >both fork bombs and other DoS scenarios in which the kernel is pushed to
> >allocate a huge amount of memory.
> 
> I would have to disagree here.
> If a container starts to create many processes it will use kmem; however, in my use cases memory is not the problem.
> Simply scheduling so many processes generates heavy load on the machine.
> Even if I have the memory to handle this, the problem becomes the scheduling of all of these processes.

What prevents you from setting the kmem limit to NR_PROC * 8K + slab_pillow?
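
To make that arithmetic concrete, here is a minimal sketch in Python. The
function name, the 1024-process cap and the 64 MiB slab cushion are
illustrative assumptions, not recommended values; only the
NR_PROC * 8K + slab_pillow formula comes from the line above.

```python
# Sketch of the formula: kmem limit = NR_PROC * 8K + slab_pillow.
# The 8K figure is the per-process kernel stack size used in the formula;
# all concrete numbers below are hypothetical.
KERNEL_STACK_BYTES = 8 * 1024


def kmem_limit_bytes(nr_proc, slab_pillow_bytes, stack_bytes=KERNEL_STACK_BYTES):
    """Return a kmem limit sized for nr_proc kernel stacks plus a slab cushion."""
    if nr_proc < 0 or slab_pillow_bytes < 0 or stack_bytes <= 0:
        raise ValueError("sizes must be non-negative and the stack size positive")
    return nr_proc * stack_bytes + slab_pillow_bytes


if __name__ == "__main__":
    # Hypothetical container: at most 1024 processes, 64 MiB left for slab.
    limit = kmem_limit_bytes(nr_proc=1024, slab_pillow_bytes=64 * 1024 * 1024)
    print(limit)  # 75497472 bytes, i.e. 8 MiB of stacks + 64 MiB pillow
```

The result is the value one would write as the container's kmem limit. How
large the pillow has to be depends on the workload, so the 64 MiB here is
only a placeholder.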

> A typical rsync of 2-3TB of small files (1-100k) will generate heavy pressure
> on kmem, but would not produce many processes.

Once we have a proper slab reclaim implementation this shouldn't be a
problem.

> On the other hand, forking thousands of processes with a low memory footprint
> will hit the scheduler a lot faster than it hits the kmem limit.
>
> Kmem limit is something that we need! But I firmly believe that we need
> a simple NPROC limit for cgroups.

Once again: if you feel that your use case is not covered by the kmem
limit, follow up on the original email thread I referenced earlier
in this thread. Splitting up the discussion doesn't help at all.

> -hackman
> 
> >
> >HTH
> >
> >>I am also somewhat worried that normal software use could legitimately go
> >>above 256MB of kmem (even excluding disk cache) - I got to 50MB in testing
> >>just by booting a distro with a few daemons in a container.
> >>
> >>Richard.
> >
> 

-- 
Michal Hocko
SUSE Labs

