From: Vlastimil Babka <vbabka@suse.cz>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, Jinjie Ruan <ruanjinjie@huawei.com>,
Liu Shixin <liushixin2@huawei.com>
Subject: Re: [PATCH v1] mm/slab: Allow cache creation to proceed even if sysfs registration fails
Date: Sat, 2 Nov 2024 09:53:11 +0100 [thread overview]
Message-ID: <195242b4-5471-419a-a350-08fd246973f0@suse.cz> (raw)
In-Reply-To: <CAB=+i9QE7JAwKzjh0w9nyj2w71uOJNE2vBsFgA=Gd3LuybFndA@mail.gmail.com>
On 11/2/24 8:18 AM, Hyeonggon Yoo wrote:
>>
>> Also here, and simplified to "if (sysfs_slab_add(s)) ... " to avoid dealing
>> with err.
>
> Oh no. err is initialized to -EINVAL, so that will not work as intended.
> It is causing the following list corruption.
Ooops, right, thanks a lot. Wrongly assumed that a test boot in
virtme-ng would catch silly mistakes like that. Looks like all caches
were created with SLAB_STATE < FULL.
Fixed by setting err = 0 before trying sysfs add.
> [ 0.607833] __kmem_cache_create_args(fscrypt_inode_info) failed
> with error -22
> [ 0.608518] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc2+ #63
> [ 0.609181] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.16.2-1.fc38 04/01/2014
> [ 0.610000] Call Trace:
> [ 0.610233] <TASK>
> [ 0.610433] dump_stack_lvl+0x64/0x80
> [ 0.610806] __kmem_cache_create_args+0x1eb/0x280
> [ 0.611253] ? __pfx_fscrypt_init+0x10/0x10
> [ 0.611647] fscrypt_init+0x88/0xf0
> [ 0.611980] ? __pfx_fscrypt_init+0x10/0x10
> [ 0.612373] do_one_initcall+0x5b/0x320
> [ 0.612736] kernel_init_freeable+0x351/0x510
> [ 0.613150] ? __pfx_kernel_init+0x10/0x10
> [ 0.613536] kernel_init+0x1a/0x1d0
> [ 0.613865] ret_from_fork+0x34/0x50
> [ 0.614207] ? __pfx_kernel_init+0x10/0x10
> [ 0.614591] ret_from_fork_asm+0x1a/0x30
> [ 0.614968] </TASK>
> [ 0.615203] list_add corruption. prev->next should be next
> (ffff986bc2bb9aa0), but was ffff986bc2b6.
> [ 0.616308] ------------[ cut here ]------------
> [ 0.616746] kernel BUG at lib/list_debug.c:32!
> [ 0.617173] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 0.617709] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-rc2+ #63
> [ 0.618372] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.16.2-1.fc38 04/01/2014
> [ 0.619192] RIP: 0010:__list_add_valid_or_report+0x78/0xa0
> [ 0.619714] Code: 8b ff 0f 0b 48 89 c1 48 c7 c7 50 ca c0 9a e8 2f
> f4 8b ff 0f 0b 48 89 d1 48 89 c6 b
> [ 0.621473] RSP: 0018:ffffa47380013c98 EFLAGS: 00010246
> [ 0.621969] RAX: 0000000000000075 RBX: ffff986bc2b63238 RCX: ffffffff9b5646a8
> [ 0.622638] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
> [ 0.623313] RBP: ffff986bc2bb9ab8 R08: 0000000000000000 R09: 205d333032353136
> [ 0.623996] R10: 74707572726f6320 R11: 6464615f7473696c R12: ffff986bc2bb9aa0
> [ 0.624673] R13: ffff986bc2b63240 R14: ffff986bc2b63240 R15: ffff986bc2b631c0
> [ 0.625355] FS: 0000000000000000(0000) GS:ffff986bdf400000(0000)
> knlGS:0000000000000000
> [ 0.626111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.626653] CR2: ffff986bd3801000 CR3: 000000001242a000 CR4: 00000000000006f0
> [ 0.627325] Call Trace:
> [ 0.627559] <TASK>
> [ 0.627760] ? die+0x36/0x90
> [ 0.628039] ? do_trap+0xdd/0x100
> [ 0.628354] ? __list_add_valid_or_report+0x78/0xa0
> [ 0.628814] ? do_error_trap+0x6a/0x90
> [ 0.629172] ? __list_add_valid_or_report+0x78/0xa0
> [ 0.629633] ? exc_invalid_op+0x50/0x70
> [ 0.630001] ? __list_add_valid_or_report+0x78/0xa0
> [ 0.630463] ? asm_exc_invalid_op+0x1a/0x20
> [ 0.630861] ? __list_add_valid_or_report+0x78/0xa0
> [ 0.631324] ? __list_add_valid_or_report+0x78/0xa0
> [ 0.631784] kobject_add_internal+0x78/0x2a0
> [ 0.632192] kobject_init_and_add+0x8c/0xd0
> [ 0.632589] ? kernfs_find_ns+0x35/0xc0
> [ 0.632957] sysfs_slab_add+0x193/0x1e0
> [ 0.633318] do_kmem_cache_create+0x455/0x630
> [ 0.633727] __kmem_cache_create_args+0x157/0x280
> [ 0.634176] ? __pfx_fsverity_init+0x10/0x10
> [ 0.634586] fsverity_init_info_cache+0x66/0x90
> [ 0.635022] fsverity_init+0x13/0x40
> [ 0.635365] do_one_initcall+0x5b/0x320
> [ 0.635734] kernel_init_freeable+0x351/0x510
> [ 0.636154] ? __pfx_kernel_init+0x10/0x10
> [ 0.636547] kernel_init+0x1a/0x1d0
> [ 0.636881] ret_from_fork+0x34/0x50
> [ 0.637227] ? __pfx_kernel_init+0x10/0x10
> [ 0.637619] ret_from_fork_asm+0x1a/0x30
> [ 0.637998] </TASK>
> [ 0.638211] Modules linked in:
> [ 0.638512] ---[ end trace 0000000000000000 ]---
> [ 0.638962] RIP: 0010:__list_add_valid_or_report+0x78/0xa0
> [ 0.639483] Code: 8b ff 0f 0b 48 89 c1 48 c7 c7 50 ca c0 9a e8 2f
> f4 8b ff 0f 0b 48 89 d1 48 89 c6 b
> [ 0.641253] RSP: 0018:ffffa47380013c98 EFLAGS: 00010246
> [ 0.641753] RAX: 0000000000000075 RBX: ffff986bc2b63238 RCX: ffffffff9b5646a8
> [ 0.642431] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
> [ 0.643118] RBP: ffff986bc2bb9ab8 R08: 0000000000000000 R09: 205d333032353136
> [ 0.643795] R10: 74707572726f6320 R11: 6464615f7473696c R12: ffff986bc2bb9aa0
> [ 0.644471] R13: ffff986bc2b63240 R14: ffff986bc2b63240 R15: ffff986bc2b631c0
> [ 0.645152] FS: 0000000000000000(0000) GS:ffff986bdf400000(0000)
> knlGS:0000000000000000
> [ 0.645923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.646470] CR2: ffff986bd3801000 CR3: 000000001242a000 CR4: 00000000000006f0
> [ 0.647156] note: swapper/0[1] exited with preempt_count 1
> [ 0.647686] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b
> [ 0.648446] Kernel Offset: 0x18000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000)
> [ 0.649466] ---[ end Kernel panic - not syncing: Attempted to kill
> init! exitcode=0x0000000b ]---
>
>
>>>> + }
>>>>
>>>> if (s->flags & SLAB_STORE_USER)
>>>> debugfs_slab_add(s);
>>>> @@ -7276,7 +7283,8 @@ static int sysfs_slab_add(struct kmem_cache *s)
>>>>
>>>> void sysfs_slab_unlink(struct kmem_cache *s)
>>>> {
>>>> - kobject_del(&s->kobj);
>>>> + if (s->kobj.state_in_sysfs)
>>>> + kobject_del(&s->kobj);
>>>> }
>>>>
>>>> void sysfs_slab_release(struct kmem_cache *s)
>>>> @@ -7305,6 +7313,11 @@ static int sysfs_slab_alias(struct kmem_cache *s, const char *name)
>>>> * If we have a leftover link then remove it.
>>>> */
>>>> sysfs_remove_link(&slab_kset->kobj, name);
>>>> + /*
>>>> + * The original cache may have failed to generate sysfs file.
>>>> + * In that case, sysfs_create_link() returns -ENOENT and
>>>> + * symbolic link creation is skipped.
>>>> + */
>>>> return sysfs_create_link(&slab_kset->kobj, &s->kobj, name);
>>>> }
>>>>
>>>> --
>>>> 2.45.0
>>>>
>>
next prev parent reply other threads:[~2024-11-02 8:50 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-01 13:08 Hyeonggon Yoo
2024-11-01 13:15 ` Hyeonggon Yoo
2024-11-01 13:58 ` Vlastimil Babka
2024-11-02 7:18 ` Hyeonggon Yoo
2024-11-02 8:53 ` Vlastimil Babka [this message]
2024-11-02 10:07 ` Hyeonggon Yoo
2024-11-02 10:16 ` Hyeonggon Yoo
2024-11-02 14:51 ` Hyeonggon Yoo
2024-11-04 9:28 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=195242b4-5471-419a-a350-08fd246973f0@suse.cz \
--to=vbabka@suse.cz \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-mm@kvack.org \
--cc=liushixin2@huawei.com \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=ruanjinjie@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox