From: Michal Hocko <mhocko@kernel.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH] mm, memcg: show memcg min setting in oom messages
Date: Wed, 20 Nov 2019 12:40:43 +0100 [thread overview]
Message-ID: <20191120114043.GH23213@dhcp22.suse.cz> (raw)
In-Reply-To: <CALOAHbA4VwSg3m0AUeV3kfNr7wiibdPsxR9_6AtjQLrvfjxi4g@mail.gmail.com>
On Wed 20-11-19 18:53:44, Yafang Shao wrote:
> On Wed, Nov 20, 2019 at 6:22 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Wed 20-11-19 03:53:05, Yafang Shao wrote:
> > > A task running in a memcg may OOM because of the memory.min settings of
> > > its siblings and parent. When this happens, the current OOM messages
> > > can't show why the file page cache couldn't be reclaimed.
> >
> > min limit is not the only way to protect memory from being reclaim. The
> > memory might be pinned or unreclaimable for other reasons (e.g. swap
> > quota exceeded for memcg).
>
> Both swap and unreclaimable (unevictable) counters are printed in the OOM messages.
Not really. Consider a memcg which has reached its swap limit. The
anonymous memory is not really reclaimable even when there is a lot of
swap space available.
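To make the swap-limit case concrete, here is a toy model (illustrative
Python, not kernel code; the function name and numbers are made up):
anonymous pages can only be reclaimed by swapping them out, so the memcg's
remaining swap quota caps what is reclaimable, regardless of global free
swap.

```python
# Illustrative sketch: why anon memory becomes unreclaimable once a
# memcg hits its own swap limit, even with plenty of global swap free.
def reclaimable_anon(anon_kb, swap_used_kb, swap_limit_kb,
                     global_swap_free_kb):
    # the memcg may only swap out up to its remaining quota
    quota_left_kb = max(0, swap_limit_kb - swap_used_kb)
    return min(anon_kb, quota_left_kb, global_swap_free_kb)

# memcg already at its swap limit: nothing reclaimable despite ~8G free swap
print(reclaimable_anon(anon_kb=400_000,
                       swap_used_kb=200_000,
                       swap_limit_kb=200_000,
                       global_swap_free_kb=8_000_000))
```

The point of the sketch is that no per-page state is printed in the OOM
report for this case either, so the dump alone cannot tell you the anon
memory was pinned by the swap quota.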
> If something else can prevent the file cache being reclaimed, we'd
> better show them as well.
How are you going to do that? How do you track pins on pages?
> > Besides that, there is the very same problem
> > with the global OOM killer, right? And I do not expect we want to print
> > all memcgs in the system (this might be hundreds).
> >
>
> I forgot the global oom...
>
> Why not just print the memcgs which are under memory.min protection, or
> something like the total amount of min-protected memory?
Yes, this would likely help. But the main question remains: is
this really worth it?
> > > So it is better to show the memcg
> > > min settings.
> > > Let's take an example.
> > >         bar              bar/memory.max = 1200M, memory.min = 800M
> > >        /   \
> > >     barA   barB          barA/memory.min = 800M, memory.current = 1G (file page cache)
> > >                          barB/memory.min = 0 (the process in this memcg is allocating pages)
> > >
> > > The process will do memcg reclaim if bar/memory.max is reached. Once
> > > barA/memory.min is reached, reclaim stops shrinking the file page caches
> > > in barA, and if there are no reclaimable pages left in bar and bar/barB
> > > it will then enter memcg OOM.
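The arithmetic in that scenario can be sketched like this (illustrative
Python, not kernel code; the helper name and barB's usage are made up):
reclaim can shrink each child only down to its memory.min, so the
hierarchy has a hard usage floor, and once allocations push total usage
past memory.max minus that floor, the charge cannot succeed.

```python
# Hedged sketch of the example hierarchy: bar/memory.max = 1200M,
# barA holds 1G of file cache protected down to 800M by memory.min.
MAX_KB = 1_228_800                     # bar/memory.max = 1200M

def reclaim_floor_kb(cgroups):
    """Lowest usage reclaim can reach: each child keeps at least
    min(memory.min, current usage)."""
    return sum(min(c["min_kb"], c["usage_kb"]) for c in cgroups)

children = [
    {"name": "barA", "min_kb": 819_200, "usage_kb": 1_048_576},  # 1G file cache
    {"name": "barB", "min_kb": 0,       "usage_kb": 0},          # hypothetical
]

# barB can grow only this far before bar hits memory.max and OOMs
headroom_kb = MAX_KB - reclaim_floor_kb(children)
print(headroom_kb)
```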
> > > After this patch, the messages below will be shown (only including the
> > > relevant messages here). The lines beginning with '#' are newly added
> > > info (the '#' symbol is not in the original messages).
> > > memory: usage 1228800kB, limit 1228800kB, failcnt 18337
> > > ...
> > > # Memory cgroup min setting:
> > > # /bar: min 819200KB emin 0KB
> > > # /bar/barA: min 819200KB emin 819200KB
> > > # /bar/barB: min 0KB emin 0KB
> > > ...
> > > Memory cgroup stats for /bar:
> > > anon 418328576
> > > file 835756032
> > > ...
> > > unevictable 0
> > > ...
> > > oom-kill:constraint=CONSTRAINT_MEMCG..oom_memcg=/bar,task_memcg=/bar/barB
> > >
> > > With the newly added information, we can see that memory.min in bar/barA
> > > has been reached and the processes in bar/barB can't reclaim file page
> > > cache from bar/barA any more. Without this information we don't know
> > > why the file page cache in bar can't be reclaimed.
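One way to read the min/emin lines in the dump above is with a simplified
model of effective protection (illustrative Python, not the kernel's
effective_protection(); the function name is made up): the reclaim root's
own memory.min does not protect it from itself, hence emin 0 for /bar,
while each child's effective protection is capped by its own usage and,
deeper in the tree, by what the parent can pass down.

```python
# Hedged, simplified model of the "min ... emin ..." values in the dump.
def effective_min(min_kb, usage_kb, parent_emin_kb, parent_is_reclaim_root):
    protected = min(min_kb, usage_kb)   # cannot protect more than is used
    if parent_is_reclaim_root:
        # children of the reclaim root keep their own protection
        return protected
    # otherwise the parent's effective protection bounds the child's
    return min(protected, parent_emin_kb)

bar_emin = 0   # /bar is the reclaim root: its own min is ignored
barA_emin = effective_min(819_200, 1_048_576, bar_emin, True)
barB_emin = effective_min(0, 0, bar_emin, True)
print(bar_emin, barA_emin, barB_emin)
```

This is only a reading aid; the kernel additionally distributes a
parent's protection proportionally among competing children.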
> >
> > Well, I am not sure this is really useful enough TBH. It doesn't give
> > you the whole picture and it potentially generates a lot of output in
> > the oom report. FYI we used to have a more precise breakdown of
> > counters in the memcg hierarchy, see 58cf188ed649 ("memcg, oom: provide more
> > precise dump info while memcg oom happening") which was later rewritten
> > by c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup OOM")
> >
>
> At least we'd better print a total protected memory in the oom messages.
>
> > Could you be more specific why do you really need this piece of
> > information?
>
> I have said in the commit log that we don't know why the file cache
> can't be reclaimed (when unevictable is 0 and dirty is 0 as well).
And the counter argument is that this will not help you there much in
many large and much more common cases.
I argue, and I might be wrong here so feel free to correct me, that the
reclaim protection guarantee (min) is something that should be under the
admin's control. It shouldn't really happen willy-nilly because it has
really large consequences, OOM included. So if there is a suspicious
amount of memory that should be reclaimable normally, then the reclaim
protection is really the first suspect to go after.
--
Michal Hocko
SUSE Labs