From: David Rientjes <rientjes@google.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [BUG] lockdep splat with kernfs lockdep annotations and slab mutex from drm patch??
Date: Thu, 11 Jul 2019 14:22:17 -0700 (PDT) [thread overview]
Message-ID: <alpine.DEB.2.21.1907111419120.157247@chino.kir.corp.google.com> (raw)
In-Reply-To: <156282587317.12280.11217721447100506162@skylake-alporthouse-com>
On Thu, 11 Jul 2019, Chris Wilson wrote:
> Quoting Steven Rostedt (2019-07-11 03:57:20)
> > On Fri, 14 Jun 2019 08:38:37 -0700
> > Tejun Heo <tj@kernel.org> wrote:
> >
> > > Hello,
> > >
> > > On Fri, Jun 14, 2019 at 04:08:33PM +0100, Chris Wilson wrote:
> > > > #ifdef CONFIG_MEMCG
> > > > if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
> > > > struct kmem_cache *c;
> > > >
> > > > mutex_lock(&slab_mutex);
> > > >
> > > > so it happens to hit the error + FULL case with the additional slabcaches?
> > > >
> > > > Anyway, according to lockdep, it is dangerous to use the slab_mutex inside
> > > > slab_attr_store().
> > >
> > > Didn't really look into the code but it looks like slab_mutex is held
> > > while trying to remove sysfs files. sysfs file removal flushes
> > > on-going accesses, so if a file operation then tries to grab a mutex
> > > which is held during removal, it leads to a deadlock.
> > >
> >
> > Looks like this never got fixed and now this bug is in 5.2.
>
> git blame gives
>
> commit 107dab5c92d5f9c3afe962036e47c207363255c7
> Author: Glauber Costa <glommer@parallels.com>
> Date: Tue Dec 18 14:23:05 2012 -0800
>
> slub: slub-specific propagation changes
>
> for adding the mutex underneath sysfs read, and I think
>
> commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0
> Author: Mikulas Patocka <mpatocka@redhat.com>
> Date: Wed Jun 27 23:26:09 2018 -0700
>
> slub: fix failure when we delete and create a slab cache
>
> added the sysfs removal underneath the slab_mutex.
>
> > Just got this:
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 5.2.0-test #15 Not tainted
> > ------------------------------------------------------
> > slub_cpu_partia/899 is trying to acquire lock:
> > 000000000f6f2dd7 (slab_mutex){+.+.}, at: slab_attr_store+0x6d/0xe0
> >
> > but task is already holding lock:
> > 00000000b23ffe3d (kn->count#160){++++}, at: kernfs_fop_write+0x125/0x230
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (kn->count#160){++++}:
> > __kernfs_remove+0x413/0x4a0
> > kernfs_remove_by_name_ns+0x40/0x80
> > sysfs_slab_add+0x1b5/0x2f0
> > __kmem_cache_create+0x511/0x560
> > create_cache+0xcd/0x1f0
> > kmem_cache_create_usercopy+0x18a/0x240
> > kmem_cache_create+0x12/0x20
> > is_active_nid+0xdb/0x230 [snd_hda_codec_generic]
> > snd_hda_get_path_idx+0x55/0x80 [snd_hda_codec_generic]
> > get_nid_path+0xc/0x170 [snd_hda_codec_generic]
> > do_one_initcall+0xa2/0x394
> > do_init_module+0xfd/0x370
> > load_module+0x38c6/0x3bd0
> > __do_sys_finit_module+0x11a/0x1b0
> > do_syscall_64+0x68/0x250
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
Which slab cache is getting created here? I assume that sysfs_slab_add()
is only trying to do kernfs_remove_by_name_ns() becasue its unmergeable
with other slab caches.
> > -> #0 (slab_mutex){+.+.}:
> > lock_acquire+0xbd/0x1d0
> > __mutex_lock+0xfc/0xb70
> > slab_attr_store+0x6d/0xe0
> > kernfs_fop_write+0x170/0x230
> > vfs_write+0xe1/0x240
> > ksys_write+0xba/0x150
> > do_syscall_64+0x68/0x250
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > other info that might help us debug this:
> >
> > Possible unsafe locking scenario:
> >
> > CPU0 CPU1
> > ---- ----
> > lock(kn->count#160);
> > lock(slab_mutex);
> > lock(kn->count#160);
> > lock(slab_mutex);
> >
> > *** DEADLOCK ***
> >
prev parent reply other threads:[~2019-07-11 21:22 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20190614093914.58f41d8f@gandalf.local.home>
[not found] ` <156052491337.7796.17642747687124632554@skylake-alporthouse-com>
[not found] ` <20190614153837.GE538958@devbig004.ftw2.facebook.com>
[not found] ` <20190710225720.58246f8e@oasis.local.home>
2019-07-11 6:17 ` Chris Wilson
2019-07-11 21:22 ` David Rientjes [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.21.1907111419120.157247@chino.kir.corp.google.com \
--to=rientjes@google.com \
--cc=akpm@linux-foundation.org \
--cc=chris@chris-wilson.co.uk \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=penberg@kernel.org \
--cc=rostedt@goodmis.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox