Re: [PATCH v2] mm: memcontrol: fix possible memcg leak due to interrupted reclaim

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Johannes Weiner <hannes@cmpxchg.org>
To: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	stable@vger.kernel.org, Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: memcontrol: fix possible memcg leak due to interrupted reclaim
Date: Fri, 18 Dec 2015 11:00:41 -0500	[thread overview]
Message-ID: <20151218160041.GA4201@cmpxchg.org> (raw)
In-Reply-To: <20151218153202.GS28521@esperanza>

On Fri, Dec 18, 2015 at 06:32:02PM +0300, Vladimir Davydov wrote:
> On Thu, Dec 17, 2015 at 03:02:17PM -0800, Andrew Morton wrote:
> > On Tue, 15 Dec 2015 15:31:37 +0300 Vladimir Davydov <vdavydov@virtuozzo.com> wrote:
> > > @@ -859,14 +859,20 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
> > >  		if (prev && reclaim->generation != iter->generation)
> > >  			goto out_unlock;
> > >  
> > > -		do {
> > > +		while (1) {
> > >  			pos = READ_ONCE(iter->position);
> > > +			if (!pos || css_tryget(&pos->css))
> > > +				break;
> > >  			/*
> > > -			 * A racing update may change the position and
> > > -			 * put the last reference, hence css_tryget(),
> > > -			 * or retry to see the updated position.
> > > +			 * css reference reached zero, so iter->position will
> > > +			 * be cleared by ->css_released. However, we should not
> > > +			 * rely on this happening soon, because ->css_released
> > > +			 * is called from a work queue, and by busy-waiting we
> > > +			 * might block it. So we clear iter->position right
> > > +			 * away.
> > >  			 */
> > > -		} while (pos && !css_tryget(&pos->css));
> > > +			cmpxchg(&iter->position, pos, NULL);
> > > +		}
> > 
> > It's peculiar to use cmpxchg() without actually checking that it did
> > anything.  Should we use xchg() here?  And why aren't we using plain
> > old "=", come to that?
> 
> Well, it's obvious why we need the 'compare' part - the iter could have
> been already advanced by a competing process, in which case we shouldn't
> touch it, otherwise we would reclaim some cgroup twice during the same
> reclaim generation. However, it's not that clear why it must be atomic.
> Before this patch, atomicity was necessary to guarantee that we adjust
> the reference counters correctly, but now we don't do it anymore. If a
> competing process happens to update iter->position between the compare
> and set steps, we might reclaim from the same cgroup twice at worst, and
> this extremely unlikely to happen.
> 
> So I think we can replace the atomic operation with a non-atomic one,
> like the patch below does. Any objections?

I don't think the race window is actually that small and reclaiming a
group twice could cause sporadic latency issues in the victim group.
Think about the group not just trimming caches but already swapping.

The cmpxchg()s without checking the return values look odd without a
comment, but that doesn't mean that they're wrong in this situation:
advance the iterator from what we think is the current position, and
don't if somebody beat us to that. That's what cmpxchg() does. So I'd
rather we kept them here.

> @@ -902,7 +903,15 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
>  	}
>  
>  	if (reclaim) {
> -		cmpxchg(&iter->position, pos, memcg);
> +		/*
> +		 * The position could have already been updated by a competing
> +		 * thread, so check that the value hasn't changed since we read
> +		 * it. This operation doesn't need to be atomic, because a race
> +		 * is extremely unlikely and in the worst case can only result
> +		 * in the same cgroup reclaimed twice.

But it would be good to add the first half of that comment to the
cmpxchg to explain why we don't have to check the return value.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2015-12-18 16:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-15 12:31 Vladimir Davydov
2015-12-15 14:53 ` Johannes Weiner
2015-12-17 23:02 ` Andrew Morton
2015-12-18 15:32   ` Vladimir Davydov
2015-12-18 16:00     ` Johannes Weiner [this message]
2015-12-18 16:24       ` Vladimir Davydov
2015-12-18 22:40         ` Andrew Morton
2015-12-19  8:51           ` Vladimir Davydov
2015-12-23 22:19 ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151218160041.GA4201@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vdavydov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox