From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, borntraeger@de.ibm.com, guro@fb.com,
hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com,
mm-commits@vger.kernel.org, shakeelb@google.com,
stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 02/86] mm: memcg/slab: wait for !root kmem_cache refcnt killing on root kmem_cache destruction
Date: Wed, 04 Dec 2019 16:49:46 -0800
Message-ID: <20191205004946.YQxscs-3o%akpm@linux-foundation.org>
In-Reply-To: <20191204164858.fe4ed8886e34ad9f3b34ea00@linux-foundation.org>
From: Roman Gushchin <guro@fb.com>
Subject: mm: memcg/slab: wait for !root kmem_cache refcnt killing on root kmem_cache destruction
Christian reported a warning like the following, obtained while running
some KVM-related tests on s390:
WARNING: CPU: 8 PID: 208 at lib/percpu-refcount.c:108 percpu_ref_exit+0x50/0x58
Modules linked in: kvm(-) xt_CHECKSUM xt_MASQUERADE bonding xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_na>
CPU: 8 PID: 208 Comm: kworker/8:1 Not tainted 5.2.0+ #66
Hardware name: IBM 2964 NC9 712 (LPAR)
Workqueue: events sysfs_slab_remove_workfn
Krnl PSW : 0704e00180000000 0000001529746850 (percpu_ref_exit+0x50/0x58)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 00000000ffff8808 0000001529746740 000003f4e30e8e18 0036008100000000
0000001f00000000 0035008100000000 0000001fb3573ab8 0000000000000000
0000001fbdb6de00 0000000000000000 0000001529f01328 0000001fb3573b00
0000001fbb27e000 0000001fbdb69300 000003e009263d00 000003e009263cd0
Krnl Code: 0000001529746842: f0a0000407fe srp 4(11,%r0),2046,0
0000001529746848: 47000700 bc 0,1792
#000000152974684c: a7f40001 brc 15,152974684e
>0000001529746850: a7f4fff2 brc 15,1529746834
0000001529746854: 0707 bcr 0,%r7
0000001529746856: 0707 bcr 0,%r7
0000001529746858: eb8ff0580024 stmg %r8,%r15,88(%r15)
000000152974685e: a738ffff lhi %r3,-1
Call Trace:
([<000003e009263d00>] 0x3e009263d00)
[<00000015293252ea>] slab_kmem_cache_release+0x3a/0x70
[<0000001529b04882>] kobject_put+0xaa/0xe8
[<000000152918cf28>] process_one_work+0x1e8/0x428
[<000000152918d1b0>] worker_thread+0x48/0x460
[<00000015291942c6>] kthread+0x126/0x160
[<0000001529b22344>] ret_from_fork+0x28/0x30
[<0000001529b2234c>] kernel_thread_starter+0x0/0x10
Last Breaking-Event-Address:
[<000000152974684c>] percpu_ref_exit+0x4c/0x58
---[ end trace b035e7da5788eb09 ]---
The problem occurs because kmem_cache_destroy() is called immediately
after a memcg is deleted, so it races with the memcg kmem_cache
deactivation.
flush_memcg_workqueue() at the beginning of kmem_cache_destroy() is
supposed to guarantee that all deactivation processes are finished, but
it fails to do so. It waits for an rcu grace period, after which all
children kmem_caches should be deactivated. During the deactivation,
percpu_ref_kill() is called for non-root kmem_cache refcounters, but it
requires yet another rcu grace period to finish the transition to the
atomic (dead) state.
So in a rare case when not all children kmem_caches are destroyed at the
moment when the root kmem_cache is about to be gone, we need to wait
another rcu grace period before destroying the root kmem_cache.
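The two-grace-period ordering described above can be illustrated with a
small userspace toy model. This is only a sketch, not kernel code: the
toy_* names, the ref_mode enum and the single-child setup are all
hypothetical, and a "grace period" is reduced to one function call that
runs whatever callback was queued before it started.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
enum ref_mode {
	REF_PERCPU,        /* refcounter still in per-cpu mode */
	REF_KILL_PENDING,  /* percpu_ref_kill() called, switch not finished */
	REF_ATOMIC         /* switch to atomic (dead) mode completed */
};

struct toy_cache {
	bool deactivate_queued;  /* deactivation waits for a grace period */
	enum ref_mode mode;
};

/*
 * One simulated rcu grace period. Only work queued before the period
 * started is completed; work queued during it needs the next period.
 */
static void toy_grace_period(struct toy_cache *child)
{
	if (child->mode == REF_KILL_PENDING) {
		child->mode = REF_ATOMIC;
	} else if (child->deactivate_queued) {
		child->deactivate_queued = false;
		child->mode = REF_KILL_PENDING;  /* models percpu_ref_kill() */
	}
}

/*
 * Models root cache destruction. Returns true if the child's refcounter
 * has reached atomic mode by the time the root would be released.
 */
static bool toy_destroy_root(struct toy_cache *child, bool extra_barrier)
{
	toy_grace_period(child);          /* the wait that already existed */
	if (extra_barrier)
		toy_grace_period(child);  /* models the added rcu_barrier() */
	return child->mode == REF_ATOMIC;
}

/* Runs both scenarios on a child whose deactivation is still queued. */
static void toy_check(void)
{
	struct toy_cache a = { .deactivate_queued = true, .mode = REF_PERCPU };
	struct toy_cache b = a;

	assert(!toy_destroy_root(&a, false));  /* one wait is not enough */
	assert(toy_destroy_root(&b, true));    /* second wait completes it */
}
```

In this model, a child whose deactivation is still queued reaches only
the kill-pending state after a single wait, which corresponds to the
percpu_ref_exit() warning; with the second wait it reaches atomic mode
before the root is released.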
This issue can be triggered only with dynamically created kmem_caches
which are used with memcg accounting. In this case per-memcg child
kmem_caches are created. They are deactivated from the cgroup removal
path. If the destruction of the root kmem_cache races with the
removal of the cgroup (both are quite complicated multi-stage processes),
the described issue can occur. The only known way to trigger it in
real life is to unload a kernel module which creates a dedicated
kmem_cache that is used from different memory cgroups with the
GFP_ACCOUNT flag. If the unloading happens immediately after calling
rmdir on the corresponding cgroup, there is some chance of triggering
the issue.
Link: http://lkml.kernel.org/r/20191129025011.3076017-1-guro@fb.com
Fixes: f0a3a24b532d ("mm: memcg/slab: rework non-root kmem_cache lifecycle management")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/slab_common.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
--- a/mm/slab_common.c~mm-memcg-slab-wait-for-root-kmem_cache-refcnt-killing-on-root-kmem_cache-destruction
+++ a/mm/slab_common.c
@@ -904,6 +904,18 @@ static void flush_memcg_workqueue(struct
* previous workitems on workqueue are processed.
*/
flush_workqueue(memcg_kmem_cache_wq);
+
+ /*
+ * If we're racing with children kmem_cache deactivation, it might
+ * take another rcu grace period to complete their destruction.
+ * At this moment the corresponding percpu_ref_kill() call should be
+ * done, but it might take another rcu grace period to complete
+ * switching to the atomic mode.
+ * Please, note that we check without grabbing the slab_mutex. It's safe
+ * because at this moment the children list can't grow.
+ */
+ if (!list_empty(&s->memcg_params.children))
+ rcu_barrier();
}
#else
static inline int shutdown_memcg_caches(struct kmem_cache *s)
_