linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y
@ 2024-08-04 21:28 Pedro Falcato
  2024-08-05 10:24 ` Pedro Falcato
  0 siblings, 1 reply; 4+ messages in thread
From: Pedro Falcato @ 2024-08-04 21:28 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka
  Cc: Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel, Pedro Falcato

Duplicate slab cache names can create havoc for userspace tooling that
expects slab cache names to be unique. This is a reasonable expectation.

Link: https://lore.kernel.org/linux-fsdevel/2d1d053da1cafb3e7940c4f25952da4f0af34e38.1722293276.git.osandov@fb.com/
Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
---
 mm/slab_common.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 40b582a014b..c4f31c887f9 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -88,6 +88,19 @@ unsigned int kmem_cache_size(struct kmem_cache *s)
 EXPORT_SYMBOL(kmem_cache_size);
 
 #ifdef CONFIG_DEBUG_VM
+
+static bool kmem_cache_is_duplicate_name(const char *name)
+{
+	struct kmem_cache *s;
+
+	list_for_each_entry(s, &slab_caches, list) {
+		if (!strcmp(s->name, name))
+			return true;
+	}
+
+	return false;
+}
+
 static int kmem_cache_sanity_check(const char *name, unsigned int size)
 {
 	if (!name || in_interrupt() || size > KMALLOC_MAX_SIZE) {
@@ -95,6 +108,12 @@ static int kmem_cache_sanity_check(const char *name, unsigned int size)
 		return -EINVAL;
 	}
 
+	if (kmem_cache_is_duplicate_name(name)) {
+		/* Duplicate names will confuse slabtop, et al */
+		pr_err("%s: name %s already exists as a cache\n", __func__, name);
+		return -EINVAL;
+	}
+
 	WARN_ON(strchr(name, ' '));	/* It confuses parsers */
 	return 0;
 }
-- 
2.46.0



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y
  2024-08-04 21:28 [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y Pedro Falcato
@ 2024-08-05 10:24 ` Pedro Falcato
  2024-08-05 10:38   ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Pedro Falcato @ 2024-08-05 10:24 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka
  Cc: Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel

On Sun, Aug 4, 2024 at 10:28 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> Duplicate slab cache names can create havoc for userspace tooling that
> expects slab cache names to be unique. This is a reasonable expectation.

For completeness, I just had a look at duplicate cache names around
the kernel using
git grep -Eoh "kmem_cache_create.*\"," | grep -Eo \".*\" | uniq -d
(which seems to be correct)

which results in the following patch (on top of torvalds/linux.git
master, so file_lock_cache hasn't been fixed yet)

This patch being so small is what leads me to believe that erroring
out here is safe. Of course, no one knows what the out-of-tree modules
do.

diff --git a/drivers/scsi/snic/snic_main.c b/drivers/scsi/snic/snic_main.c
index cc824dcfe7d..abc78320c66 100644
--- a/drivers/scsi/snic/snic_main.c
+++ b/drivers/scsi/snic/snic_main.c
@@ -873,7 +873,7 @@ snic_global_data_init(void)
       snic_glob->req_cache[SNIC_REQ_CACHE_MAX_SGL] = cachep;

       len = sizeof(struct snic_host_req);
-       cachep = kmem_cache_create("snic_req_maxsgl", len, SNIC_SG_DESC_ALIGN,
+       cachep = kmem_cache_create("snic_req_tm", len, SNIC_SG_DESC_ALIGN,
                                  SLAB_HWCACHE_ALIGN, NULL);
       if (!cachep) {
               SNIC_ERR("Failed to create snic tm req slab\n");
diff --git a/fs/locks.c b/fs/locks.c
index 9afb16e0683..e45cad40f8b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2984,7 +2984,7 @@ static int __init filelock_init(void)
       filelock_cache = kmem_cache_create("file_lock_cache",
                       sizeof(struct file_lock), 0, SLAB_PANIC, NULL);

-       filelease_cache = kmem_cache_create("file_lock_cache",
+       filelease_cache = kmem_cache_create("file_lease_cache",
                       sizeof(struct file_lease), 0, SLAB_PANIC, NULL);

       for_each_possible_cpu(i) {


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y
  2024-08-05 10:24 ` Pedro Falcato
@ 2024-08-05 10:38   ` Vlastimil Babka
  2024-08-05 11:24     ` Mateusz Guzik
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2024-08-05 10:38 UTC (permalink / raw)
  To: Pedro Falcato, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton
  Cc: Roman Gushchin, Hyeonggon Yoo, linux-mm, linux-kernel

On 8/5/24 12:24, Pedro Falcato wrote:
> On Sun, Aug 4, 2024 at 10:28 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>>
>> Duplicate slab cache names can create havoc for userspace tooling that
>> expects slab cache names to be unique. This is a reasonable expectation.
> 
> For completeness, I just had a look at duplicate cache names around
> the kernel using
> git grep -Eoh "kmem_cache_create.*\"," | grep -Eo \".*\" | uniq -d
> (which seems to be correct)
> 
> which results in the following patch (on top of torvalds/linux.git
> master, so file_lock_cache hasn't been fixed yet)
> 
> This patch being so small is what leads me to believe that erroring
> out here is safe. Of course, no one knows what the out-of-tree modules
> do.

What about module unload/reload with a SLAB_TYPESAFE_BY_RCU cache that will
delay its freeing. Soon also if there are kfree_rcu()'s in flight. And the
zombie cache can stay also permamently around if it fails to be destroy
because some objects were not freed.

In all these cases the cache's refcount should be 0 at that point, so
minimally the check should ignore those. But I would also rather make it be
a warning anyway, at least for a few releases.

> diff --git a/drivers/scsi/snic/snic_main.c b/drivers/scsi/snic/snic_main.c
> index cc824dcfe7d..abc78320c66 100644
> --- a/drivers/scsi/snic/snic_main.c
> +++ b/drivers/scsi/snic/snic_main.c
> @@ -873,7 +873,7 @@ snic_global_data_init(void)
>        snic_glob->req_cache[SNIC_REQ_CACHE_MAX_SGL] = cachep;
> 
>        len = sizeof(struct snic_host_req);
> -       cachep = kmem_cache_create("snic_req_maxsgl", len, SNIC_SG_DESC_ALIGN,
> +       cachep = kmem_cache_create("snic_req_tm", len, SNIC_SG_DESC_ALIGN,
>                                   SLAB_HWCACHE_ALIGN, NULL);
>        if (!cachep) {
>                SNIC_ERR("Failed to create snic tm req slab\n");
> diff --git a/fs/locks.c b/fs/locks.c
> index 9afb16e0683..e45cad40f8b 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2984,7 +2984,7 @@ static int __init filelock_init(void)
>        filelock_cache = kmem_cache_create("file_lock_cache",
>                        sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
> 
> -       filelease_cache = kmem_cache_create("file_lock_cache",
> +       filelease_cache = kmem_cache_create("file_lease_cache",
>                        sizeof(struct file_lease), 0, SLAB_PANIC, NULL);
> 
>        for_each_possible_cpu(i) {



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y
  2024-08-05 10:38   ` Vlastimil Babka
@ 2024-08-05 11:24     ` Mateusz Guzik
  0 siblings, 0 replies; 4+ messages in thread
From: Mateusz Guzik @ 2024-08-05 11:24 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Pedro Falcato, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	linux-mm, linux-kernel

On Mon, Aug 05, 2024 at 12:38:29PM +0200, Vlastimil Babka wrote:
> What about module unload/reload with a SLAB_TYPESAFE_BY_RCU cache that will
> delay its freeing. Soon also if there are kfree_rcu()'s in flight. And the
> zombie cache can stay also permamently around if it fails to be destroy
> because some objects were not freed.
> 

It should be an invariant that the cache is fully whacked by the time
kmem_cache_destroy returns, at worst with the exception of when leaked
items are encountered (but even then it should be renamed to something
indicating it is defunct).

Suppose a cache could not have been destroyed and was left as is, then
the offending module was loaded again -- now you got two with the same
name, which is not that great either.

I find myself quite surprised that kmem_cache_destroy can return even if
cache destruction is still pending.

This was added in 657dc2f97220 ("slab: remove synchronous rcu_barrier()
call in memcg cache release path"), citing batching benefits for
frequent kmem cache creation/destruction.

I believe the very notion of doing that *frequently* is b0rked and any
code doing it should be patched to stop regardless.

Even so, if there are any benefits to the committed patch, it perhaps
can be augmented so that the kmem_cache_destroy caller can wait for the
entire thing to finish (e.g., with a completion).


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-08-05 11:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-04 21:28 [PATCH] slab: Error out on duplicate cache names when DEBUG_VM=y Pedro Falcato
2024-08-05 10:24 ` Pedro Falcato
2024-08-05 10:38   ` Vlastimil Babka
2024-08-05 11:24     ` Mateusz Guzik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox