From: Michal Hocko <mhocko@kernel.org>
To: akpm@linux-foundation.org
Cc: mm-commits@vger.kernel.org, tj@kernel.org, hannes@cmpxchg.org,
	guro@fb.com, dennis@kernel.org, chris@chrisdown.name,
	cgroups mailinglist <cgroups@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: + mm-consider-subtrees-in-memoryevents.patch added to -mm tree
Date: Wed, 13 Feb 2019 13:47:29 +0100
Message-ID: <20190213124729.GI4525@dhcp22.suse.cz>
In-Reply-To: <20190212224542.ZW63a%akpm@linux-foundation.org>

On Tue 12-02-19 14:45:42, Andrew Morton wrote:
[...]
> From: Chris Down <chris@chrisdown.name>
> Subject: mm, memcg: consider subtrees in memory.events
>
> memory.stat and other files already consider subtrees in their output, and
> we should too in order to not present an inconsistent interface.
>
> The current situation is fairly confusing, because people interacting with
> cgroups expect hierarchical behaviour in the vein of memory.stat,
> cgroup.events, and other files. For example, this causes confusion when
> debugging reclaim events under low, as currently these always read "0" at
> non-leaf memcg nodes, which frequently causes people to misdiagnose breach
> behaviour. The same confusion applies to other counters in this file when
> debugging issues.
>
> Aggregation is done at write time instead of at read-time since these
> counters aren't hot (unlike memory.stat which is per-page, so it does it
> at read time), and it makes sense to bundle this with the file
> notifications.
>
> After this patch, events are propagated up the hierarchy:
>
> [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> low 0
> high 0
> max 0
> oom 0
> oom_kill 0
> [root@ktst ~]# systemd-run -p MemoryMax=1 true
> Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service
> [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> low 0
> high 0
> max 7
> oom 1
> oom_kill 1
>
> As this is a change in behaviour, this can be reverted to the old
> behaviour by mounting with the `memory_localevents' flag set. However, we
> use the new behaviour by default as there's a lack of evidence that there
> are any current users of memory.events that would find this change
> undesirable.
>
> Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.name
> Signed-off-by: Chris Down <chris@chrisdown.name>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Dennis Zhou <dennis@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

FTR: As I've already said here [1] I can live with this change as long
as there is a larger consensus among cgroup v2 users. So let's give this
some more time before merging to see whether there is such a consensus.

[1] http://lkml.kernel.org/r/20190201102515.GK11599@dhcp22.suse.cz

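For readers following the changelog's point about aggregating at write time, here is a minimal user-space sketch of the propagation walk. It is an illustration only, not kernel code: `struct group`, `record_event()`, `print_events()` and the `local_events` variable are hypothetical stand-ins for `struct mem_cgroup`, `memcg_memory_event()`, the memory.events show path and the `CGRP_ROOT_MEMORY_LOCAL_EVENTS` root flag, and the file-notification step is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the counters exposed in memory.events. */
enum event { EV_LOW, EV_HIGH, EV_MAX, EV_OOM, EV_OOM_KILL, EV_NR };

static const char *const event_names[EV_NR] = {
	"low", "high", "max", "oom", "oom_kill",
};

/* Hypothetical stand-in for a memcg node; parent is NULL for the root. */
struct group {
	const char *name;
	struct group *parent;
	long events[EV_NR];
};

/* Models the memory_localevents mount flag (assumption: one global bit). */
bool local_events;

/*
 * Count the event on the group where it happened and, unless the legacy
 * local mode is enabled, on every ancestor below the root. The cost is
 * paid when the event is recorded, so a read is a plain counter load.
 */
void record_event(struct group *g, enum event ev)
{
	do {
		g->events[ev]++;

		if (local_events)
			break;

		g = g->parent;
	} while (g && g->parent);	/* stop before touching the root */
}

/* Print the counters in the same "name value" layout as memory.events. */
void print_events(const struct group *g)
{
	for (int i = 0; i < EV_NR; i++)
		printf("%s %ld\n", event_names[i], g->events[i]);
}
```

With a root, a `system.slice` child and a service below it, recording seven `EV_MAX` events on the service leaves `system.slice` at `max 7` and the root at zero. Setting `local_events` before recording keeps the count on the service alone, which is the legacy behaviour the mount flag restores.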
> ---
>
> Documentation/admin-guide/cgroup-v2.rst | 9 +++++++++
> include/linux/cgroup-defs.h | 5 +++++
> include/linux/memcontrol.h | 10 ++++++++--
> kernel/cgroup/cgroup.c | 16 ++++++++++++++--
> 4 files changed, 36 insertions(+), 4 deletions(-)
>
> --- a/Documentation/admin-guide/cgroup-v2.rst~mm-consider-subtrees-in-memoryevents
> +++ a/Documentation/admin-guide/cgroup-v2.rst
> @@ -177,6 +177,15 @@ cgroup v2 currently supports the followi
> ignored on non-init namespace mounts. Please refer to the
> Delegation section for details.
>
> + memory_localevents
> +
> + Only populate memory.events with data for the current cgroup,
> + and not any subtrees. This is legacy behaviour, the default
> + behaviour without this option is to include subtree counts.
> + This option is system wide and can only be set on mount or
> + modified through remount from the init namespace. The mount
> + option is ignored on non-init namespace mounts.
> +
>
> Organizing Processes and Threads
> --------------------------------
> --- a/include/linux/cgroup-defs.h~mm-consider-subtrees-in-memoryevents
> +++ a/include/linux/cgroup-defs.h
> @@ -83,6 +83,11 @@ enum {
>  	 * Enable cpuset controller in v1 cgroup to use v2 behavior.
>  	 */
>  	CGRP_ROOT_CPUSET_V2_MODE = (1 << 4),
> +
> +	/*
> +	 * Enable legacy local memory.events.
> +	 */
> +	CGRP_ROOT_MEMORY_LOCAL_EVENTS = (1 << 5),
>  };
>
> /* cftype->flags */
> --- a/include/linux/memcontrol.h~mm-consider-subtrees-in-memoryevents
> +++ a/include/linux/memcontrol.h
> @@ -789,8 +789,14 @@ static inline void count_memcg_event_mm(
>  static inline void memcg_memory_event(struct mem_cgroup *memcg,
>  				      enum memcg_memory_event event)
>  {
> -	atomic_long_inc(&memcg->memory_events[event]);
> -	cgroup_file_notify(&memcg->events_file);
> +	do {
> +		atomic_long_inc(&memcg->memory_events[event]);
> +		cgroup_file_notify(&memcg->events_file);
> +
> +		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> +			break;
> +	} while ((memcg = parent_mem_cgroup(memcg)) &&
> +		 !mem_cgroup_is_root(memcg));
>  }
>
> static inline void memcg_memory_event_mm(struct mm_struct *mm,
> --- a/kernel/cgroup/cgroup.c~mm-consider-subtrees-in-memoryevents
> +++ a/kernel/cgroup/cgroup.c
> @@ -1775,11 +1775,13 @@ int cgroup_show_path(struct seq_file *sf
>
>  enum cgroup2_param {
>  	Opt_nsdelegate,
> +	Opt_memory_localevents,
>  	nr__cgroup2_params
>  };
>
>  static const struct fs_parameter_spec cgroup2_param_specs[] = {
> -	fsparam_flag ("nsdelegate", Opt_nsdelegate),
> +	fsparam_flag("nsdelegate", Opt_nsdelegate),
> +	fsparam_flag("memory_localevents", Opt_memory_localevents),
>  	{}
>  };
>
> @@ -1802,6 +1804,9 @@ static int cgroup2_parse_param(struct fs
>  	case Opt_nsdelegate:
>  		ctx->flags |= CGRP_ROOT_NS_DELEGATE;
>  		return 0;
> +	case Opt_memory_localevents:
> +		ctx->flags |= CGRP_ROOT_MEMORY_LOCAL_EVENTS;
> +		return 0;
>  	}
>  	return -EINVAL;
>  }
> @@ -1813,6 +1818,11 @@ static void apply_cgroup_root_flags(unsi
>  			cgrp_dfl_root.flags |= CGRP_ROOT_NS_DELEGATE;
>  		else
>  			cgrp_dfl_root.flags &= ~CGRP_ROOT_NS_DELEGATE;
> +
> +		if (root_flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> +			cgrp_dfl_root.flags |= CGRP_ROOT_MEMORY_LOCAL_EVENTS;
> +		else
> +			cgrp_dfl_root.flags &= ~CGRP_ROOT_MEMORY_LOCAL_EVENTS;
>  	}
>  }
>
> @@ -1820,6 +1830,8 @@ static int cgroup_show_options(struct se
>  {
>  	if (cgrp_dfl_root.flags & CGRP_ROOT_NS_DELEGATE)
>  		seq_puts(seq, ",nsdelegate");
> +	if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> +		seq_puts(seq, ",memory_localevents");
>  	return 0;
>  }
>
> @@ -6207,7 +6219,7 @@ static struct kobj_attribute cgroup_dele
>  static ssize_t features_show(struct kobject *kobj, struct kobj_attribute *attr,
>  			     char *buf)
>  {
> -	return snprintf(buf, PAGE_SIZE, "nsdelegate\n");
> +	return snprintf(buf, PAGE_SIZE, "nsdelegate\nmemory_localevents\n");
>  }
>  static struct kobj_attribute cgroup_features_attr = __ATTR_RO(features);
>
> _
>
> Patches currently in -mm which might be from chris@chrisdown.name are
>
> mm-create-mem_cgroup_from_seq.patch
> mm-extract-memcg-maxable-seq_file-logic-to-seq_show_memcg_tunable.patch
> mm-proportional-memorylowmin-reclaim.patch
> mm-proportional-memorylowmin-reclaim-fix.patch
> mm-memcontrol-expose-thp-events-on-a-per-memcg-basis.patch
> mm-memcontrol-expose-thp-events-on-a-per-memcg-basis-fix-2.patch
> mm-make-memoryemin-the-baseline-for-utilisation-determination.patch
> mm-rename-ambiguously-named-memorystat-counters-and-functions.patch
> mm-consider-subtrees-in-memoryevents.patch
--
Michal Hocko
SUSE Labs