From: Roman Gushchin <guro@fb.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Kernel Team <Kernel-team@fb.com>,
"Shakeel Butt" <shakeelb@google.com>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Waiman Long <longman@redhat.com>
Subject: Re: [PATCH v6 07/10] mm: synchronize access to kmem_cache dying flag using a spinlock
Date: Wed, 5 Jun 2019 22:02:06 +0000 [thread overview]
Message-ID: <20190605220201.GA16188@tower.DHCP.thefacebook.com> (raw)
In-Reply-To: <20190605165615.GC12453@cmpxchg.org>
On Wed, Jun 05, 2019 at 12:56:16PM -0400, Johannes Weiner wrote:
> On Tue, Jun 04, 2019 at 07:44:51PM -0700, Roman Gushchin wrote:
> > Currently the memcg_params.dying flag and the corresponding
> > workqueue used for the asynchronous deactivation of kmem_caches
> > is synchronized using the slab_mutex.
> >
> > It makes impossible to check this flag from the irq context,
> > which will be required in order to implement asynchronous release
> > of kmem_caches.
> >
> > So let's switch over to the irq-save flavor of the spinlock-based
> > synchronization.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > ---
> > mm/slab_common.c | 19 +++++++++++++++----
> > 1 file changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 09b26673b63f..2914a8f0aa85 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -130,6 +130,7 @@ int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
> > #ifdef CONFIG_MEMCG_KMEM
> >
> > LIST_HEAD(slab_root_caches);
> > +static DEFINE_SPINLOCK(memcg_kmem_wq_lock);
> >
> > void slab_init_memcg_params(struct kmem_cache *s)
> > {
> > @@ -629,6 +630,7 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
> > struct memcg_cache_array *arr;
> > struct kmem_cache *s = NULL;
> > char *cache_name;
> > + bool dying;
> > int idx;
> >
> > get_online_cpus();
> > @@ -640,7 +642,13 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
> > * The memory cgroup could have been offlined while the cache
> > * creation work was pending.
> > */
> > - if (memcg->kmem_state != KMEM_ONLINE || root_cache->memcg_params.dying)
> > + if (memcg->kmem_state != KMEM_ONLINE)
> > + goto out_unlock;
> > +
> > + spin_lock_irq(&memcg_kmem_wq_lock);
> > + dying = root_cache->memcg_params.dying;
> > + spin_unlock_irq(&memcg_kmem_wq_lock);
> > + if (dying)
> > goto out_unlock;
>
> What does this lock protect? The dying flag could get set right after
> the unlock.
>
Hi Johannes!
Here is my logic:
1) flush_memcg_workqueue() must guarantee that no new memcg kmem_caches
will be created and that no queued works will touch the root
kmem_cache, so that it can be released.
2) So it sets the dying flag, waits for an RCU grace period, and flushes
the workqueue (which waits for all in-flight works).
3) The dying flag is checked in kmemcg_cache_shutdown() and
kmemcg_cache_deactivate(), so that if it is set, no new works/RCU
callbacks will be queued. The corresponding queue_work()/call_rcu()
calls are all made under the memcg_kmem_wq_lock lock.
4) memcg_schedule_kmem_cache_create() doesn't check the dying flag
(probably to avoid taking locks on a hot path), but the check is
performed in memcg_create_kmem_cache(), which is the scheduled work
itself. And it's done at the very beginning of the work, so even if
new kmem_cache creations are scheduled, the root kmem_cache won't be
touched.
Previously the flag was checked under slab_mutex, but now we set it
under the memcg_kmem_wq_lock lock, so I'm not sure we can read it
without taking this lock.
If the flag is set after the unlock, that's fine: it means the work
has already been scheduled, and flush_workqueue() in
flush_memcg_workqueue() will wait for it. The only problem would be if
we didn't see the flag set after flush_workqueue() has been called,
but I don't see how that's possible.
Does it make sense? I'm sure there are ways to make it more obvious.
Please let me know if you have any ideas.
Thank you!
Thread overview: 31+ messages
2019-06-05 2:44 [PATCH v6 00/10] mm: reparent slab memory on cgroup removal Roman Gushchin
2019-06-05 2:44 ` [PATCH v6 01/10] mm: add missing smp read barrier on getting memcg kmem_cache pointer Roman Gushchin
2019-06-05 4:35 ` Shakeel Butt
2019-06-05 17:14 ` Roman Gushchin
2019-06-05 19:51 ` Shakeel Butt
2019-06-05 16:42 ` Johannes Weiner
2019-06-09 12:10 ` Vladimir Davydov
2019-06-10 20:33 ` Johannes Weiner
2019-06-10 20:38 ` Roman Gushchin
2019-06-05 2:44 ` [PATCH v6 02/10] mm: postpone kmem_cache memcg pointer initialization to memcg_link_cache() Roman Gushchin
2019-06-05 2:44 ` [PATCH v6 03/10] mm: rename slab delayed deactivation functions and fields Roman Gushchin
2019-06-09 12:13 ` Vladimir Davydov
2019-06-05 2:44 ` [PATCH v6 04/10] mm: generalize postponed non-root kmem_cache deactivation Roman Gushchin
2019-06-09 12:23 ` Vladimir Davydov
2019-06-05 2:44 ` [PATCH v6 05/10] mm: introduce __memcg_kmem_uncharge_memcg() Roman Gushchin
2019-06-09 12:29 ` Vladimir Davydov
2019-06-05 2:44 ` [PATCH v6 06/10] mm: unify SLAB and SLUB page accounting Roman Gushchin
2019-06-05 2:44 ` [PATCH v6 07/10] mm: synchronize access to kmem_cache dying flag using a spinlock Roman Gushchin
2019-06-05 16:56 ` Johannes Weiner
2019-06-05 22:02 ` Roman Gushchin [this message]
2019-06-06 0:48 ` Roman Gushchin
2019-06-09 14:31 ` Vladimir Davydov
2019-06-10 20:46 ` Roman Gushchin
2019-06-05 2:44 ` [PATCH v6 08/10] mm: rework non-root kmem_cache lifecycle management Roman Gushchin
2019-06-09 17:09 ` Vladimir Davydov
2019-06-05 2:44 ` [PATCH v6 09/10] mm: stop setting page->mem_cgroup pointer for slab pages Roman Gushchin
2019-06-09 17:09 ` Vladimir Davydov
2019-06-05 2:44 ` [PATCH v6 10/10] mm: reparent slab memory on cgroup removal Roman Gushchin
2019-06-09 17:18 ` Vladimir Davydov
2019-06-05 4:14 ` [PATCH v6 00/10] " Andrew Morton
2019-06-05 20:45 ` Roman Gushchin