linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Greg Thelen <gthelen@google.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -mm 2/2] memcg: fix css reference leak and endless loop in mem_cgroup_iter
Date: Tue, 21 Jan 2014 13:18:42 -0800 (PST)	[thread overview]
Message-ID: <alpine.LSU.2.11.1401211218010.5688@eggly.anvils> (raw)
In-Reply-To: <20140121114219.8c34256dfbe7c2470b36ced8@linux-foundation.org>

On Tue, 21 Jan 2014, Andrew Morton wrote:
> On Tue, 21 Jan 2014 11:45:43 +0100 Michal Hocko <mhocko@suse.cz> wrote:
> 
> > 19f39402864e (memcg: simplify mem_cgroup_iter) has reorganized
> > mem_cgroup_iter code in order to simplify it. A part of that change was
> > dropping an optimization which didn't call css_tryget on the root of
> > the walked tree. The patch however didn't change the css_put part in
> > mem_cgroup_iter which excludes root.
> > This wasn't an issue at the time because __mem_cgroup_iter_next bailed
> > out for root early without taking a reference as cgroup iterators
> > (css_next_descendant_pre) didn't visit root themselves.
> > 
> > Nevertheless cgroup iterators have been reworked to visit root by
> > bd8815a6d802 (cgroup: make css_for_each_descendant() and friends include
> > the origin css in the iteration) when the root bypass have been dropped
> > in __mem_cgroup_iter_next. This means that css_put is not called for
> > root and so css along with mem_cgroup and other cgroup internal object
> > tied by css lifetime are never freed.
> > 
> > Fix the issue by reintroducing root check in __mem_cgroup_iter_next
> > and do not take css reference for it.
> > 
> > This reference counting magic protects us also from another issue, an
> > endless loop reported by Hugh Dickins when reclaim races with root
> > removal and css_tryget called by iterator internally would fail. There
> > would be no other nodes to visit so __mem_cgroup_iter_next would return
> > NULL and mem_cgroup_iter would interpret it as "start looping from root
> > again" and so mem_cgroup_iter would loop forever internally.
> 
> I grabbed these two patches but I will sit on them for a week or so,
> pending review-n-test.

Thank you, yes, I'm about to give them more testing.

> 
> > Cc: stable@vger.kernel.org # mem_leak part 3.12+
> 
> What does this mean?

It's certainly a confusing comment.

I suggest just deleting the "mem_leak part ": Michal isn't referring to
any two parts of the patch itself, but to parts of his commit comment;
but it's still unclear what he's claiming.

We do have a confusing situation.  The hang goes back to 3.10 but takes
two different forms, because of intervening changes: in 3.10 and 3.11
mem_cgroup_iter repeatedly returns root memcg to its caller, in 3.12 and
3.13 mem_cgroup_iter repeatedly gets NULL memcg from mem_cgroup_iter_next
and cannot return to its caller.

Patch 1/2 is what's needed to fix 3.10 and 3.11 (and applies correctly
to 3.11, but will have to be rediffed for 3.10 because of rearrangement
in between).  Patch 2/2 is what's needed to fix 3.12 and 3.13 (but applies
correctly to neither of them because it's diffed on top of my CSS_ONLINE
fix).  Patch 1/2 is correct but unnecessary in 3.12 and 3.13: I'm unclear
whether Michal is claiming that it would also fix the hang in 3.12 and
3.13 if we didn't have 2/2: I doubt that, and haven't tested that.

Given how Michal has diffed this patch on top of my CSS_ONLINE one
(mm-memcg-iteration-skip-memcgs-not-yet-fully-initialized.patch),
it would be helpful if you could mark that one also for stable 3.12+,
to save us from having to rediff this one for stable.  We don't have
a concrete example of a problem it solves in the vanilla kernel, but
it makes more sense to include it than to exclude it.

(You would be right to point out that the CSS_ONLINE one fixes
something that goes back to 3.10: I'm saying 3.12+ because I'm not
motivated to rediff it for 3.10 and 3.11 when there's nothing to
go on top; but that's not a very good reason to lie - overrule me.)

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-01-21 21:19 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-14  1:50 [PATCH 1/3] mm/memcg: fix last_dead_count memory wastage Hugh Dickins
2014-01-14  1:52 ` [PATCH 2/3] mm/memcg: fix endless iteration in reclaim Hugh Dickins
2014-01-14 13:27   ` Michal Hocko
2014-01-14 13:34     ` Michal Hocko
2014-01-14 14:26     ` Michal Hocko
2014-01-14 20:42       ` Hugh Dickins
2014-01-15  9:58         ` Michal Hocko
2014-01-15 12:17           ` Michal Hocko
2014-01-15 21:24             ` Hugh Dickins
2014-01-16  8:17               ` Michal Hocko
2014-01-16 15:22                 ` Michal Hocko
2014-01-16 19:15                   ` Hugh Dickins
2014-01-17 15:41                     ` Michal Hocko
2014-01-21  5:16                       ` Hugh Dickins
2014-01-21  8:34                         ` Michal Hocko
2014-01-21 10:45                           ` [PATCH -mm 1/2] memcg: fix endless loop caused by mem_cgroup_iter Michal Hocko
2014-01-21 10:45                             ` [PATCH -mm 2/2] memcg: fix css reference leak and endless loop in mem_cgroup_iter Michal Hocko
2014-01-21 19:42                               ` Andrew Morton
2014-01-21 21:18                                 ` Hugh Dickins [this message]
2014-01-22  8:27                                   ` Michal Hocko
2014-01-23 10:42                                     ` Hugh Dickins
2014-01-23 11:09                                       ` Michal Hocko
2014-01-23 12:53                                         ` Hugh Dickins
2014-01-22  8:12                                 ` Michal Hocko
2014-01-14  1:54 ` [PATCH 3/3] mm/memcg: iteration skip memcgs not yet fully initialized Hugh Dickins
2014-01-14 13:30   ` Michal Hocko
2014-01-14 14:29     ` Tejun Heo
2014-01-15  8:20       ` Michal Hocko
2014-01-15  8:21   ` Michal Hocko
2014-01-14 13:03 ` [PATCH 1/3] mm/memcg: fix last_dead_count memory wastage Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1401211218010.5688@eggly.anvils \
    --to=hughd@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox