From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Kernel Team <Kernel-team@fb.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
Waiman Long <longman@redhat.com>,
"Christoph Lameter" <cl@linux.com>,
Michal Hocko <mhocko@suse.com>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Pekka Enberg <penberg@kernel.org>
Subject: Re: [PATCH] mm: memcg/slab: properly handle kmem_caches reparented to root_mem_cgroup
Date: Thu, 20 Jun 2019 21:26:37 +0000 [thread overview]
Message-ID: <20190620212624.GA3494@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <CALvZod6MzPvX67AxrGddNWhr99oVY7_v6tXh_7yXdf-g24b6nQ@mail.gmail.com>
On Thu, Jun 20, 2019 at 08:48:00AM -0700, Shakeel Butt wrote:
> On Wed, Jun 19, 2019 at 6:57 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > As a result of reparenting a kmem_cache might belong to the root
> > memory cgroup. It happens when a top-level memory cgroup is removed,
> > and all associated kmem_caches are reparented to the root memory
> > cgroup.
> >
> > The root memory cgroup is special, and requires a special handling.
> > Let's make sure that we don't try to charge or uncharge it,
> > and we handle system-wide vmstats exactly as for root kmem_caches.
> >
> > Note, that we still need to alter the kmem_cache reference counter,
> > so that the kmem_cache can be released properly.
> >
> > The issue was discovered by running CRIU tests; the following warning
> > did appear:
> >
> > [ 381.345960] WARNING: CPU: 0 PID: 11655 at mm/page_counter.c:62
> > page_counter_cancel+0x26/0x30
> > [ 381.345992] Modules linked in:
> > [ 381.345998] CPU: 0 PID: 11655 Comm: kworker/0:8 Not tainted
> > 5.2.0-rc5-next-20190618+ #1
> > [ 381.346001] Hardware name: Google Google Compute Engine/Google
> > Compute Engine, BIOS Google 01/01/2011
> > [ 381.346010] Workqueue: memcg_kmem_cache kmemcg_workfn
> > [ 381.346013] RIP: 0010:page_counter_cancel+0x26/0x30
> > [ 381.346017] Code: 1f 44 00 00 0f 1f 44 00 00 48 89 f0 53 48 f7 d8
> > f0 48 0f c1 07 48 29 f0 48 89 c3 48 89 c6 e8 61 ff ff ff 48 85 db 78
> > 02 5b c3 <0f> 0b 5b c3 66 0f 1f 44 00 00 0f 1f 44 00 00 48 85 ff 74 41
> > 41 55
> > [ 381.346019] RSP: 0018:ffffb3b34319f990 EFLAGS: 00010086
> > [ 381.346022] RAX: fffffffffffffffc RBX: fffffffffffffffc RCX: 0000000000000004
> > [ 381.346024] RDX: 0000000000000000 RSI: fffffffffffffffc RDI: ffff9c2cd7165270
> > [ 381.346026] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000001
> > [ 381.346028] R10: 00000000000000c8 R11: ffff9c2cd684e660 R12: 00000000fffffffc
> > [ 381.346030] R13: 0000000000000002 R14: 0000000000000006 R15: ffff9c2c8ce1f200
> > [ 381.346033] FS: 0000000000000000(0000) GS:ffff9c2cd8200000(0000)
> > knlGS:0000000000000000
> > [ 381.346039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 381.346041] CR2: 00000000007be000 CR3: 00000001cdbfc005 CR4: 00000000001606f0
> > [ 381.346043] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 381.346045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 381.346047] Call Trace:
> > [ 381.346054] page_counter_uncharge+0x1d/0x30
> > [ 381.346065] __memcg_kmem_uncharge_memcg+0x39/0x60
> > [ 381.346071] __free_slab+0x34c/0x460
> > [ 381.346079] deactivate_slab.isra.80+0x57d/0x6d0
> > [ 381.346088] ? add_lock_to_list.isra.36+0x9c/0xf0
> > [ 381.346095] ? __lock_acquire+0x252/0x1410
> > [ 381.346106] ? cpumask_next_and+0x19/0x20
> > [ 381.346110] ? slub_cpu_dead+0xd0/0xd0
> > [ 381.346113] flush_cpu_slab+0x36/0x50
> > [ 381.346117] ? slub_cpu_dead+0xd0/0xd0
> > [ 381.346125] on_each_cpu_mask+0x51/0x70
> > [ 381.346131] ? ksm_migrate_page+0x60/0x60
> > [ 381.346134] on_each_cpu_cond_mask+0xab/0x100
> > [ 381.346143] __kmem_cache_shrink+0x56/0x320
> > [ 381.346150] ? ret_from_fork+0x3a/0x50
> > [ 381.346157] ? unwind_next_frame+0x73/0x480
> > [ 381.346176] ? __lock_acquire+0x252/0x1410
> > [ 381.346188] ? kmemcg_workfn+0x21/0x50
> > [ 381.346196] ? __mutex_lock+0x99/0x920
> > [ 381.346199] ? kmemcg_workfn+0x21/0x50
> > [ 381.346205] ? kmemcg_workfn+0x21/0x50
> > [ 381.346216] __kmemcg_cache_deactivate_after_rcu+0xe/0x40
> > [ 381.346220] kmemcg_cache_deactivate_after_rcu+0xe/0x20
> > [ 381.346223] kmemcg_workfn+0x31/0x50
> > [ 381.346230] process_one_work+0x23c/0x5e0
> > [ 381.346241] worker_thread+0x3c/0x390
> > [ 381.346248] ? process_one_work+0x5e0/0x5e0
> > [ 381.346252] kthread+0x11d/0x140
> > [ 381.346255] ? kthread_create_on_node+0x60/0x60
> > [ 381.346261] ret_from_fork+0x3a/0x50
> > [ 381.346275] irq event stamp: 10302
> > [ 381.346278] hardirqs last enabled at (10301): [<ffffffffb2c1a0b9>]
> > _raw_spin_unlock_irq+0x29/0x40
> > [ 381.346282] hardirqs last disabled at (10302): [<ffffffffb2182289>]
> > on_each_cpu_mask+0x49/0x70
> > [ 381.346287] softirqs last enabled at (10262): [<ffffffffb2191f4a>]
> > cgroup_idr_replace+0x3a/0x50
> > [ 381.346290] softirqs last disabled at (10260): [<ffffffffb2191f2d>]
> > cgroup_idr_replace+0x1d/0x50
> > [ 381.346293] ---[ end trace b324ba73eb3659f0 ]---
> >
> > Reported-by: Andrei Vagin <avagin@gmail.com>
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Shakeel Butt <shakeelb@google.com>
> > Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > Cc: Pekka Enberg <penberg@kernel.org>
> > ---
> > mm/slab.h | 17 +++++++++++++----
> > 1 file changed, 13 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/slab.h b/mm/slab.h
> > index a4c9b9d042de..c02e7f44268b 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -294,8 +294,12 @@ static __always_inline int memcg_charge_slab(struct page *page,
> > memcg = parent_mem_cgroup(memcg);
> > rcu_read_unlock();
> >
> > - if (unlikely(!memcg))
> > + if (unlikely(!memcg || mem_cgroup_is_root(memcg))) {
> > + mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
> > + (1 << order));
> > + percpu_ref_get_many(&s->memcg_params.refcnt, 1 << order);
> > return true;
>
> Should the above be "return 0;" instead of true?
Yeah... Good catch!
Somehow I missed this previously...
I'll send v2 with the fix.
Thank you!
prev parent reply other threads:[~2019-06-20 21:26 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-20 1:55 Roman Gushchin
2019-06-20 15:48 ` Shakeel Butt
2019-06-20 21:26 ` Roman Gushchin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190620212624.GA3494@castle.DHCP.thefacebook.com \
--to=guro@fb.com \
--cc=Kernel-team@fb.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=hannes@cmpxchg.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=longman@redhat.com \
--cc=mhocko@suse.com \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=shakeelb@google.com \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox