linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: linux-kernel@vger.kernel.org, Michal Hocko <mhocko@suse.cz>,
	Greg Thelen <gthelen@google.com>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH 2/2] memcg: add memory and swap knobs to the default cgroup hierarchy
Date: Sun, 28 Dec 2014 15:30:23 -0500	[thread overview]
Message-ID: <20141228203023.GB9385@phnom.home.cmpxchg.org> (raw)
In-Reply-To: <9aeed65ee700e81abde90c20570415a40acb36e2.1419782051.git.vdavydov@parallels.com>

On Sun, Dec 28, 2014 at 07:19:13PM +0300, Vladimir Davydov wrote:
> This patch adds the following files to the default cgroup hierarchy:
> 
>   memory.usage:         read memory usage
>   memory.limit:         read/set memory limit

These names are one hell of a lot better than what we currently have,
but I'm not happy with "usage" and "limit" as the basic memcg knobs.

Statically limiting groups as a means of partitioning a system doesn't
reflect the reality that a) memory consumption is elastic, b) varies
over the course of a workload, and c) working set estimation is
incredibly hard - and inaccurate.  We need gradual degredation on the
configuration, not OOM kills, to allow the admin to make it tight,
monitor groups and system, and intervene when performance degrades.

That's why in v2 the user should instead be able to configure the
groups' ranges of memory consumption, and then leave it to global
reclaim and memcg reclaim to balance memory pressure accordingly.
Groups that are below their normal range will be spared by global
pressure, as long as there are other groups available for reclaim.
The admin can monitor global overcommit by looking at allocation
latencies and how often groups get pushed below their comfort zone.
On the other hand, groups that exceed their normal range will be
throttled in direct reclaim.  The admin can monitor group overcommit
by looking at the charge latency.  A hard upper limit will still be
available, but only for emergency containment of buggy or malicious
workloads, where the admin/job scheduler is not considered fast enough
to protect the system from harm.  This allows packing groups very
tightly with monitorable gradual degredation, and at the same time
turns the OOM killer back into the last-resort measure it should be.

We could add those low and high boundary knobs to the usage and limit
knobs, but I really don't want the flawed assumptions of the old model
to be reflected in the new interface.  As such, my proposals would be:

  memory.low:        the expected lower end of the workload size
  memory.high:       the expected upper end
  memory.max:        the absolute OOM-enforced maximum size
  memory.current:    the current size

And then, in the same vein:

  swap.max
  swap.current

These names are short, but they should be unambiguous and descriptive
in their context, and users will have to consult the documentation on
how to configure this stuff anyway.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-12-28 20:30 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-28 16:19 [RFC PATCH 1/2] memcg: account swap instead of memory+swap Vladimir Davydov
2014-12-28 16:19 ` [RFC PATCH 2/2] memcg: add memory and swap knobs to the default cgroup hierarchy Vladimir Davydov
2014-12-28 20:30   ` Johannes Weiner [this message]
2014-12-29  8:54     ` Vladimir Davydov
2014-12-28 19:00 ` [RFC PATCH 1/2] memcg: account swap instead of memory+swap Johannes Weiner
2014-12-29  8:47   ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141228203023.GB9385@phnom.home.cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=gthelen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=tj@kernel.org \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox