linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Bharata B Rao <bharata@linux.ibm.com>
Cc: <linux-kernel@vger.kernel.org>, <cl@linux.com>,
	<rientjes@google.com>, <iamjoonsoo.kim@lge.com>,
	<akpm@linux-foundation.org>, <linux-mm@kvack.org>
Subject: Re: [PATCH FIX v0] mm: memcg/slab: Uncharge during kmem_cache_free_bulk()
Date: Mon, 12 Oct 2020 17:40:07 -0700	[thread overview]
Message-ID: <20201013004007.GB3687509@carbon.dhcp.thefacebook.com> (raw)
In-Reply-To: <20201012033326.GJ185637@in.ibm.com>

On Mon, Oct 12, 2020 at 09:03:26AM +0530, Bharata B Rao wrote:
> On Fri, Oct 09, 2020 at 11:45:51AM -0700, Roman Gushchin wrote:
> > On Fri, Oct 09, 2020 at 11:34:23AM +0530, Bharata B Rao wrote:
> > 
> > Hi Bharata,
> > 
> > > Object cgroup charging is done for all the objects during
> > > allocation, but during freeing, uncharging ends up happening
> > > for only one object in the case of bulk allocation/freeing.
> > 
> > Yes, it's definitely a problem. Thank you for catching it!
> > 
> > I'm curious, did you find it in the wild or by looking into the code?
> 
> Found by looking at the code.
> 
> > 
> > > 
> > > Fix this by having a separate call to uncharge all the
> > > objects from kmem_cache_free_bulk() and by modifying
> > > memcg_slab_free_hook() to take care of bulk uncharging.
> > >
> > > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> > 
> > Please, add:
> > 
> > Fixes: 964d4bd370d5 ("mm: memcg/slab: save obj_cgroup for non-root slab objects")
> > Cc: stable@vger.kernel.org
> > 
> > > ---
> > >  mm/slab.c |  2 +-
> > >  mm/slab.h | 42 +++++++++++++++++++++++++++---------------
> > >  mm/slub.c |  3 ++-
> > >  3 files changed, 30 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/mm/slab.c b/mm/slab.c
> > > index f658e86ec8cee..5c70600d8b1cc 100644
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -3440,7 +3440,7 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
> > >  		memset(objp, 0, cachep->object_size);
> > >  	kmemleak_free_recursive(objp, cachep->flags);
> > >  	objp = cache_free_debugcheck(cachep, objp, caller);
> > > -	memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
> > > +	memcg_slab_free_hook(cachep, &objp, 1);
> > >  
> > >  	/*
> > >  	 * Skip calling cache_free_alien() when the platform is not numa.
> > > diff --git a/mm/slab.h b/mm/slab.h
> > > index 6cc323f1313af..6dd4b702888a7 100644
> > > --- a/mm/slab.h
> > > +++ b/mm/slab.h
> > > @@ -345,30 +345,42 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >  	obj_cgroup_put(objcg);
> > >  }
> > >  
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -					void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> > > +					void **p, int objects)
> > >  {
> > > +	struct kmem_cache *s;
> > >  	struct obj_cgroup *objcg;
> > > +	struct page *page;
> > >  	unsigned int off;
> > > +	int i;
> > >  
> > >  	if (!memcg_kmem_enabled())
> > >  		return;
> > >  
> > > -	if (!page_has_obj_cgroups(page))
> > > -		return;
> > > +	for (i = 0; i < objects; i++) {
> > > +		if (unlikely(!p[i]))
> > > +			continue;
> > >  
> > > -	off = obj_to_index(s, page, p);
> > > -	objcg = page_obj_cgroups(page)[off];
> > > -	page_obj_cgroups(page)[off] = NULL;
> > > +		page = virt_to_head_page(p[i]);
> > > +		if (!page_has_obj_cgroups(page))
> > > +			continue;
> > >  
> > > -	if (!objcg)
> > > -		return;
> > > +		if (!s_orig)
> > > +			s = page->slab_cache;
> > > +		else
> > > +			s = s_orig;
> > >  
> > > -	obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > -	mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > -			-obj_full_size(s));
> > > +		off = obj_to_index(s, page, p[i]);
> > > +		objcg = page_obj_cgroups(page)[off];
> > > +		if (!objcg)
> > > +			continue;
> > >  
> > > -	obj_cgroup_put(objcg);
> > > +		page_obj_cgroups(page)[off] = NULL;
> > > +		obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > +		mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > +				-obj_full_size(s));
> > > +		obj_cgroup_put(objcg);
> > > +	}
> > >  }
> > >  
> > >  #else /* CONFIG_MEMCG_KMEM */
> > > @@ -406,8 +418,8 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >  {
> > >  }
> > >  
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -					void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s,
> > > +					void **p, int objects)
> > >  {
> > >  }
> > >  #endif /* CONFIG_MEMCG_KMEM */
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 6d3574013b2f8..0cbe67f13946e 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3091,7 +3091,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> > >  	struct kmem_cache_cpu *c;
> > >  	unsigned long tid;
> > >  
> > > -	memcg_slab_free_hook(s, page, head);
> > > +	memcg_slab_free_hook(s, &head, 1);
> > 
> > Hm, I wonder if it's better to teach do_slab_free() to handle the (cnt > 1) case correctly?
> 
> Possible, but we will have to walk the detached freelist there while
> here it is much easier to just walk the array of objects?
> 
> > 
> > >  redo:
> > >  	/*
> > >  	 * Determine the currently cpus per cpu slab.
> > > @@ -3253,6 +3253,7 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> > >  	if (WARN_ON(!size))
> > >  		return;
> > >  
> > > +	memcg_slab_free_hook(s, p, size);
> > 
> > Then you don't need this change.
> > 
> > Otherwise memcg_slab_free_hook() can be called twice for the same object. It's ok from
> > accounting correctness perspective, because the first call will zero out the objcg pointer,
> > but still much better to be avoided.
> 
> Yes, for that one object there will be one additional uncharge call,
> but as you note it is handled by the !objcg check.

Got it. Thanks for the explanation!

Acked-by: Roman Gushchin <guro@fb.com>

Thanks!


      reply	other threads:[~2020-10-13  0:40 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-09  6:04 Bharata B Rao
2020-10-09 18:45 ` Roman Gushchin
2020-10-12  3:33   ` Bharata B Rao
2020-10-13  0:40     ` Roman Gushchin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201013004007.GB3687509@carbon.dhcp.thefacebook.com \
    --to=guro@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=bharata@linux.ibm.com \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox