From: Michal Hocko <mhocko@suse.cz>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>, Hugh Dickins <hughd@google.com>,
	Greg Thelen <gthelen@google.com>,
	Glauber Costa <glommer@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [RFC PATCH] memcg: export knobs for the defaul cgroup hierarchy
Date: Mon, 21 Jul 2014 14:09:38 +0200
Message-ID: <20140721120938.GC8393@dhcp22.suse.cz>
In-Reply-To: <20140721114839.GA11848@esperanza>

On Mon 21-07-14 15:48:39, Vladimir Davydov wrote:
> On Mon, Jul 21, 2014 at 11:07:24AM +0200, Michal Hocko wrote:
> > On Fri 18-07-14 19:44:43, Vladimir Davydov wrote:
> > > On Wed, Jul 16, 2014 at 11:58:14AM -0400, Johannes Weiner wrote:
> > > > On Wed, Jul 16, 2014 at 04:39:38PM +0200, Michal Hocko wrote:
> > > > > +#ifdef CONFIG_MEMCG_KMEM
> > > > > +	{
> > > > > +		.name = "kmem.limit_in_bytes",
> > > > > +		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > > > > +		.write = mem_cgroup_write,
> > > > > +		.read_u64 = mem_cgroup_read_u64,
> > > > > +	},
> > > > 
> > > > Does it really make sense to have a separate limit for kmem only?
> > > > IIRC, the reason we introduced this was that this memory is not
> > > > reclaimable and so we need to limit it.
> > > > 
> > > > But the opposite effect happened: because it's not reclaimable, the
> > > > separate kmem limit is actually unusable for any values smaller than
> > > > the overall memory limit: because there is no reclaim mechanism for
> > > > that limit, once you hit it, it's over, there is nothing you can do
> > > > anymore.  The problem isn't so much unreclaimable memory, the problem
> > > > is unreclaimable limits.
> > > > 
> > > > If the global case produces memory pressure through kernel memory
> > > > allocations, we reclaim page cache, anonymous pages, inodes, dentries
> > > > etc.  I think the same should happen for kmem: kmem should just be
> > > > accounted and limited in the overall memory limit of a group, and when
> > > > pressure arises, we go after anything that's reclaimable.
> > > 
> > > Personally, I don't think there's much sense in having a separate knob
> > > for kmem limit either. Until we have a user with a sane use case for it,
> > > let's not propagate it to the new interface.
> > 
> > What about fork-bomb protection? I thought that was the primary use case
> > for K < U? Or how can we handle that use case with a single limit? A
> > special gfp flag to not trigger the OOM path when called from some kmem
> > charge paths?
> 
> Hmm, for a moment I thought that putting a fork-bomb inside a memory
> cgroup with kmem accounting enabled and K=U would isolate it from the
> rest of the system and therefore there's no need for K<U, but now I
> realize that's not quite right.
> 
> In contrast to user memory, thread stack allocations have costly order,
> they cannot be swapped out, and on 32-bit systems they will consume a
> limited resource of low mem. Although the latter two don't look like
> much of a concern, the costly order of stack pages certainly does, I
> think.
> 
> Is this what you mean by saying we have to disable OOM from some kmem
> charge paths? To prevent OOM on the global level that might trigger due
> to lack of high order pages for task stack?

No, I meant it for a different reason. If a charge for e.g. a task stack
simply triggers OOM, then you DoS your own cgroup before you start
effectively stopping the fork-bomb, because the fork-bomb tasks will
usually have a much smaller RSS than anything else in the group, so
other tasks get killed first. So this is a case where you really want
to fail the allocation.
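
To make the "fail the allocation" behaviour concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: struct toy_counter
and toy_try_charge() are hypothetical names, and the only point is that
hitting the limit returns -ENOMEM to the caller (so e.g. fork fails)
rather than selecting an OOM victim.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model of a kmem counter; hypothetical, not the kernel's res_counter. */
struct toy_counter {
	size_t usage;	/* bytes currently charged */
	size_t limit;	/* hard limit in bytes */
};

/*
 * Try to charge nr_bytes against the counter.
 * Returns 0 on success. If the charge would exceed the limit, nothing
 * is charged and -ENOMEM is returned, so the caller fails the
 * allocation instead of any OOM handling being invoked.
 */
static int toy_try_charge(struct toy_counter *c, size_t nr_bytes)
{
	assert(c->usage <= c->limit);

	/* Written as a subtraction so the check itself cannot overflow. */
	if (nr_bytes > c->limit - c->usage)
		return -ENOMEM;

	c->usage += nr_bytes;
	return 0;
}
```

With a 16 KiB limit, two 8 KiB stack charges succeed and the third gets
-ENOMEM while the usage stays at 16 KiB.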

Maybe I just didn't understand what a single-limit proposal meant...

> > What about task_count or what was the name of the controller which was
> > dropped and suggested to be replaced by kmem accounting? I can imagine
> > that to be implemented by a separate K limit which would be roughly
> > stack_size * task_count + pillow for slab.
> 
> I wonder how big this pillow for slab should be...

Well, it obviously depends on the load running in the group. It depends
on the amount of unreclaimable slab plus the amount of reclaimable slab
the group needs to keep without thrashing. So the pillow should be quite
large, but that shouldn't be a big deal as kernel allocations are usually
a small part of U.
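
As a rough illustration of the "stack_size * task_count + pillow"
estimate quoted above, a small helper could look like the sketch below.
toy_kmem_limit() and its parameter names are hypothetical, and any
concrete numbers fed to it (stack size, task count, pillow) are
assumptions that depend on the architecture and the workload.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate a K (kmem) limit in bytes as
 *   stack_size * task_count + slab_pillow
 * Hypothetical helper for illustration only; slab_pillow has to be
 * sized per workload and is not derived here.
 */
static uint64_t toy_kmem_limit(uint64_t stack_size, uint64_t task_count,
			       uint64_t slab_pillow)
{
	/* Refuse inputs whose result would not fit in 64 bits. */
	assert(task_count == 0 ||
	       stack_size <= (UINT64_MAX - slab_pillow) / task_count);

	return stack_size * task_count + slab_pillow;
}
```

Assuming 16 KiB stacks, a cap of 1024 tasks and a 64 MiB pillow, this
gives 16 MiB + 64 MiB = 80 MiB for K.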
-- 
Michal Hocko
SUSE Labs

