Re: [PATCH 1/3] mm: migrate: do not touch page->mem_cgroup of live pages

From: Johannes Weiner <hannes@cmpxchg.org>
To: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Mateusz Guzik <mguzik@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	Greg Thelen <gthelen@google.com>, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 1/3] mm: migrate: do not touch page->mem_cgroup of live pages
Date: Wed, 3 Feb 2016 13:35:47 -0500
Message-ID: <20160203183547.GA4007@cmpxchg.org>
In-Reply-To: <20160203140824.GJ21016@esperanza>

CCing Hugh and Greg, who have worked on the memcg migration code most
recently. AFAIK the only reason newpage->mem_cgroup had to be set up
that early in migration was the way dirty accounting used to work.
But Hugh took memcg out of the equation there, so moving
mem_cgroup_migrate() to the end should be safe, as long as the pages
are still locked and off the LRU.

Full quote:

On Wed, Feb 03, 2016 at 05:08:24PM +0300, Vladimir Davydov wrote:
> On Wed, Feb 03, 2016 at 02:17:49PM +0100, Mateusz Guzik wrote:
> > On Fri, Jan 29, 2016 at 06:19:31PM -0500, Johannes Weiner wrote:
> > > Changing a page's memcg association complicates dealing with the page,
> > > so we want to limit this as much as possible. Page migration e.g. does
> > > not have to do that. Just like page cache replacement, it can forcibly
> > > charge a replacement page, and then uncharge the old page when it gets
> > > freed. Temporarily overcharging the cgroup by a single page is not an
> > > issue in practice, and charging is so cheap nowadays that this is much
> > > preferable to the headache of messing with live pages.
> > > 
> > > The only place that still changes the page->mem_cgroup binding of live
> > > pages is when pages move along with a task to another cgroup. But that
> > > path isolates the page from the LRU, takes the page lock, and the move
> > > lock (lock_page_memcg()). That means page->mem_cgroup is always stable
> > > in callers that have the page isolated from the LRU or locked. Lighter
> > > unlocked paths, like writeback accounting, can use lock_page_memcg().
> > > 
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > [..]
> > > @@ -372,12 +373,13 @@ int migrate_page_move_mapping(struct address_space *mapping,
> > >  	 * Now we know that no one else is looking at the page:
> > >  	 * no turning back from here.
> > >  	 */
> > > -	set_page_memcg(newpage, page_memcg(page));
> > >  	newpage->index = page->index;
> > >  	newpage->mapping = page->mapping;
> > >  	if (PageSwapBacked(page))
> > >  		SetPageSwapBacked(newpage);
> > >  
> > > +	mem_cgroup_migrate(page, newpage);
> > > +
> > >  	get_page(newpage);	/* add cache reference */
> > >  	if (PageSwapCache(page)) {
> > >  		SetPageSwapCache(newpage);
> > > @@ -457,9 +459,11 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
> > >  		return -EAGAIN;
> > >  	}
> > >  
> > > -	set_page_memcg(newpage, page_memcg(page));
> > >  	newpage->index = page->index;
> > >  	newpage->mapping = page->mapping;
> > > +
> > > +	mem_cgroup_migrate(page, newpage);
> > > +
> > >  	get_page(newpage);
> > >  
> > >  	radix_tree_replace_slot(pslot, newpage);
> > 
> > I ran trinity on recent linux-next and got the lockdep splat below and if I
> > read it right, this is the culprit.  In particular, mem_cgroup_migrate was put
> > in an area covered by spin_lock_irq(&mapping->tree_lock), but stuff it calls
> > enables and disables interrupts on its own.
> 
> It must be safe to move these calls outside tree_lock:
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 307e95ece622..17db63b2dd36 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -379,8 +379,6 @@ int migrate_page_move_mapping(struct address_space *mapping,
>  	if (PageSwapBacked(page))
>  		SetPageSwapBacked(newpage);
>  
> -	mem_cgroup_migrate(page, newpage);
> -
>  	get_page(newpage);	/* add cache reference */
>  	if (PageSwapCache(page)) {
>  		SetPageSwapCache(newpage);
> @@ -430,6 +428,8 @@ int migrate_page_move_mapping(struct address_space *mapping,
>  	}
>  	local_irq_enable();
>  
> +	mem_cgroup_migrate(page, newpage);
> +
>  	return MIGRATEPAGE_SUCCESS;
>  }
>  
> @@ -463,8 +463,6 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
>  	newpage->index = page->index;
>  	newpage->mapping = page->mapping;
>  
> -	mem_cgroup_migrate(page, newpage);
> -
>  	get_page(newpage);
>  
>  	radix_tree_replace_slot(pslot, newpage);
> @@ -472,6 +470,9 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
>  	page_unfreeze_refs(page, expected_count - 1);
>  
>  	spin_unlock_irq(&mapping->tree_lock);
> +
> +	mem_cgroup_migrate(page, newpage);
> +
>  	return MIGRATEPAGE_SUCCESS;
>  }
>  
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

Thread overview: 13+ messages
2016-01-29 23:19 [PATCH 0/3] mm: memcontrol: simplify page->mem_cgroup pinning Johannes Weiner
2016-01-29 23:19 ` [PATCH 1/3] mm: migrate: do not touch page->mem_cgroup of live pages Johannes Weiner
2016-02-03  9:20   ` Vladimir Davydov
2016-02-03 13:17   ` Mateusz Guzik
2016-02-03 14:08     ` Vladimir Davydov
2016-02-03 18:35       ` Johannes Weiner [this message]
2016-02-04  1:39         ` Hugh Dickins
2016-02-04 19:53           ` Johannes Weiner
2016-02-28 23:57             ` Hugh Dickins
2016-01-29 23:19 ` [PATCH 2/3] mm: simplify lock_page_memcg() Johannes Weiner
2016-02-03  9:25   ` Vladimir Davydov
2016-01-29 23:19 ` [PATCH 3/3] mm: remove unnecessary uses of lock_page_memcg() Johannes Weiner
2016-02-03  9:29   ` Vladimir Davydov