From: Shakeel Butt <shakeelb@google.com>
To: Waiman Long <longman@redhat.com>
Cc: Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Michal Hocko <mhocko@kernel.org>, Roman Gushchin <guro@fb.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>
Subject: Re: [PATCH v2] mm, memcg: Add a memcg_slabinfo debugfs file
Date: Wed, 19 Jun 2019 16:48:09 -0700 [thread overview]
Message-ID: <CALvZod7pdOx0a1v4oX5-7ZfCykM8iwRwPkW-+gbO1B4+j1SXqw@mail.gmail.com> (raw)
In-Reply-To: <20190619171621.26209-1-longman@redhat.com>
Hi Waiman,
On Wed, Jun 19, 2019 at 10:16 AM Waiman Long <longman@redhat.com> wrote:
>
> There are concerns about memory leaks from extensive use of memory
> cgroups as each memory cgroup creates its own set of kmem caches. There
> is a possibility that the memcg kmem caches may remain even after the
> memory cgroups have been offlined. Therefore, it will be useful to show
> the status of each of memcg kmem caches.
>
> This patch introduces a new <debugfs>/memcg_slabinfo file which is
> somewhat similar to /proc/slabinfo in format, but lists only information
> about kmem caches that have child memcg kmem caches. Information
> available in /proc/slabinfo is not repeated in memcg_slabinfo.
>
> A portion of a sample output of the file was:
>
> # <name> <css_id[:dead]> <active_objs> <num_objs> <active_slabs> <num_slabs>
> rpc_inode_cache root 13 51 1 1
> rpc_inode_cache 48 0 0 0 0
> fat_inode_cache root 1 45 1 1
> fat_inode_cache 41 2 45 1 1
> xfs_inode root 770 816 24 24
> xfs_inode 92 22 34 1 1
> xfs_inode 88:dead 1 34 1 1
> xfs_inode 89:dead 23 34 1 1
> xfs_inode 85 4 34 1 1
> xfs_inode 84 9 34 1 1
>
> The css id of the memcg is also listed. If a memcg is not online,
> the tag ":dead" will be attached as shown above.
>
> Suggested-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> mm/slab_common.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 57 insertions(+)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 58251ba63e4a..2bca1558a722 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -17,6 +17,7 @@
> #include <linux/uaccess.h>
> #include <linux/seq_file.h>
> #include <linux/proc_fs.h>
> +#include <linux/debugfs.h>
> #include <asm/cacheflush.h>
> #include <asm/tlbflush.h>
> #include <asm/page.h>
> @@ -1498,6 +1499,62 @@ static int __init slab_proc_init(void)
> return 0;
> }
> module_init(slab_proc_init);
> +
> +#if defined(CONFIG_DEBUG_FS) && defined(CONFIG_MEMCG_KMEM)
> +/*
> + * Display information about kmem caches that have child memcg caches.
> + */
> +static int memcg_slabinfo_show(struct seq_file *m, void *unused)
> +{
> + struct kmem_cache *s, *c;
> + struct slabinfo sinfo;
> +
> + mutex_lock(&slab_mutex);
On a large machine there can be thousands of memcgs, and each memcg can
potentially have hundreds of kmem caches, so slab_mutex may end up being
held here for a very long time.

Our internal implementation instead traverses the memcg tree and walks
'memcg->kmem_caches' under slab_mutex separately for each memcg, with a
cond_resched() after each unlock.
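Roughly something like the below (an untested sketch; mem_cgroup_iter()
and the memcg->kmem_caches / memcg_params.kmem_caches_node names are
written from memory and may need adjusting, and root caches would still
need a separate pass):

static int memcg_slabinfo_show(struct seq_file *m, void *unused)
{
	struct mem_cgroup *memcg;
	struct kmem_cache *c;
	struct slabinfo sinfo;

	/*
	 * Walk the memcg tree and only hold slab_mutex while dumping
	 * one memcg's kmem caches, rescheduling in between.
	 */
	for (memcg = mem_cgroup_iter(NULL, NULL, NULL); memcg;
	     memcg = mem_cgroup_iter(NULL, memcg, NULL)) {
		mutex_lock(&slab_mutex);
		list_for_each_entry(c, &memcg->kmem_caches,
				    memcg_params.kmem_caches_node) {
			memset(&sinfo, 0, sizeof(sinfo));
			get_slabinfo(c, &sinfo);
			seq_printf(m, "%-17s %4d %6lu %6lu %6lu %6lu\n",
				   cache_name(c), mem_cgroup_css(memcg)->id,
				   sinfo.active_objs, sinfo.num_objs,
				   sinfo.active_slabs, sinfo.num_slabs);
		}
		mutex_unlock(&slab_mutex);
		cond_resched();
	}
	return 0;
}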
> + seq_puts(m, "# <name> <css_id[:dead]> <active_objs> <num_objs>");
> + seq_puts(m, " <active_slabs> <num_slabs>\n");
> + list_for_each_entry(s, &slab_root_caches, root_caches_node) {
> + /*
> + * Skip kmem caches that don't have any memcg children.
> + */
> + if (list_empty(&s->memcg_params.children))
> + continue;
> +
> + memset(&sinfo, 0, sizeof(sinfo));
> + get_slabinfo(s, &sinfo);
> + seq_printf(m, "%-17s root %6lu %6lu %6lu %6lu\n",
> + cache_name(s), sinfo.active_objs, sinfo.num_objs,
> + sinfo.active_slabs, sinfo.num_slabs);
> +
> + for_each_memcg_cache(c, s) {
> + struct cgroup_subsys_state *css;
> + char *dead = "";
> +
> + css = &c->memcg_params.memcg->css;
> + if (!(css->flags & CSS_ONLINE))
> + dead = ":dead";
Please note that Roman's kmem cache reparenting patch series has made
kmem caches of zombie memcgs a bit tricky. On memcg offlining, the
memcg's kmem caches are reparented and the css->id can get recycled.
So we want to know both that a kmem cache has been reparented and
which memcg it originally belonged to. To determine whether a kmem
cache is reparented we could store a flag on the kmem cache, and for
the previous memcg we could use an fhandle. However, to keep this
simple for now, it is enough to report that the kmem cache was
reparented, i.e. that it belongs to an offlined memcg.
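Purely as an illustration (SLAB_REPARENTED below is a made-up flag
name, nothing like it exists in the tree today), the per-cache status
tag could then look something like:

	for_each_memcg_cache(c, s) {
		struct cgroup_subsys_state *css;
		char *status = "";

		css = &c->memcg_params.memcg->css;
		/* hypothetical flag, set when the cache is reparented */
		if (c->flags & SLAB_REPARENTED)
			status = ":reparented";
		else if (!(css->flags & CSS_ONLINE))
			status = ":dead";
		...
	}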
> +
> + memset(&sinfo, 0, sizeof(sinfo));
> + get_slabinfo(c, &sinfo);
> + seq_printf(m, "%-17s %4d%5s %6lu %6lu %6lu %6lu\n",
> + cache_name(c), css->id, dead,
> + sinfo.active_objs, sinfo.num_objs,
> + sinfo.active_slabs, sinfo.num_slabs);
> + }
> + }
> + mutex_unlock(&slab_mutex);
> + return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(memcg_slabinfo);
> +
> +static int __init memcg_slabinfo_init(void)
> +{
> + debugfs_create_file("memcg_slabinfo", S_IFREG | S_IRUGO,
> + NULL, NULL, &memcg_slabinfo_fops);
> + return 0;
> +}
> +
> +late_initcall(memcg_slabinfo_init);
> +#endif /* CONFIG_DEBUG_FS && CONFIG_MEMCG_KMEM */
> #endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */
>
> static __always_inline void *__do_krealloc(const void *p, size_t new_size,
> --
> 2.18.1
>