linux-mm.kvack.org archive mirror
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Tejun Heo <tj@kernel.org>
Cc: vdavydov.dev@gmail.com, cl@linux.com, penberg@kernel.org,
	rientjes@google.com, akpm@linux-foundation.org, jsvana@fb.com,
	hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCHSET v2] slab: make memcg slab destruction scalable
Date: Tue, 17 Jan 2017 09:12:57 +0900	[thread overview]
Message-ID: <20170117001256.GB25218@js1304-P5Q-DELUXE> (raw)
In-Reply-To: <20170114184834.8658-1-tj@kernel.org>

On Sat, Jan 14, 2017 at 01:48:26PM -0500, Tejun Heo wrote:
> This is v2.  Changes from the last version[L] are
> 
> * 0002-slab-remove-synchronous-rcu_barrier-call-in-memcg-ca.patch was
>   incorrect and dropped.
> 
> * 0006-slab-don-t-put-memcg-caches-on-slab_caches-list.patch
>   incorrectly converted places which needed to walk all caches.
>   Replaced with 0005-slab-implement-slab_root_caches-list.patch, which
>   adds a separate root-only list instead of converting the slab_caches
>   list to carry only root caches.
> 
> * Misc fixes.
> 
> With kmem cgroup support enabled, kmem_caches can be created and
> destroyed frequently and a great number of near empty kmem_caches can
> accumulate if there are a lot of transient cgroups and the system is
> not under memory pressure.  When memory reclaim starts under such
> conditions, it can lead to consecutive deactivation and destruction of
> many kmem_caches, easily hundreds of thousands on moderately large
> systems, exposing scalability issues in the current slab management
> code.
> 
> I've seen machines which end up with hundreds of thousands of caches
> and many millions of kernfs_nodes.  The current code is O(N^2) on the
> total number of caches and has synchronous rcu_barrier() and
> synchronize_sched() in cgroup offline / release path which is executed
> while holding cgroup_mutex.  Combined, this leads to very expensive
> and slow cache destruction operations which can easily keep running
> for half a day.
> 
> This also messes up /proc/slabinfo along with other cache iterating
> operations.  seq_file operates on 4k chunks and on each 4k boundary
> tries to seek to the last position in the list.  With a huge number of
> caches on the list, this becomes very slow and very prone to the list
> content changing underneath it leading to a lot of missing and/or
> duplicate entries.
> 
> This patchset addresses the scalability problem.
> 
> * Add root and per-memcg lists.  Update each user to use the
>   appropriate list.
> 
> * Replace rcu_barrier() and synchronize_sched() with call_rcu() and
>   call_rcu_sched().
> 
> * For dying empty slub caches, remove the sysfs files after
>   deactivation so that we don't end up with millions of sysfs files
>   without any useful information on them.

Could you confirm that your series solves the problem reported by
Doug?  It would be great if the result were mentioned in the patch
description.

https://bugzilla.kernel.org/show_bug.cgi?id=172991

Thanks.


Thread overview: 15+ messages
2017-01-14 18:48 Tejun Heo
2017-01-14 18:48 ` [PATCH 1/8] Revert "slub: move synchronize_sched out of slab_mutex on shrink" Tejun Heo
2017-01-14 18:48 ` [PATCH 2/8] slab: remove synchronous rcu_barrier() call in memcg cache release path Tejun Heo
2017-01-14 18:48 ` [PATCH 3/8] slab: reorganize memcg_cache_params Tejun Heo
2017-01-14 18:48 ` [PATCH 4/8] slab: link memcg kmem_caches on their associated memory cgroup Tejun Heo
2017-01-14 18:48 ` [PATCH 5/8] slab: implement slab_root_caches list Tejun Heo
2017-01-14 18:48 ` [PATCH 6/8] slab: introduce __kmemcg_cache_deactivate() Tejun Heo
2017-01-14 18:48 ` [PATCH 7/8] slab: remove synchronous synchronize_sched() from memcg cache deactivation path Tejun Heo
2017-01-17  0:26   ` Joonsoo Kim
2017-01-17 16:42     ` Tejun Heo
2017-01-14 18:48 ` [PATCH 8/8] slab: remove slub sysfs interface files early for empty memcg caches Tejun Heo
2017-01-17  0:12 ` Joonsoo Kim [this message]
2017-01-17 16:49   ` [PATCHSET v2] slab: make memcg slab destruction scalable Tejun Heo
2017-01-18  7:54     ` Joonsoo Kim
2017-01-18 21:01       ` Tejun Heo
