Re: [PATCH] mm/slub: fix panic in slab_alloc_node()

linux-mm.kvack.org archive mirror

From: Laurent Dufour <ldufour@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	nathanl@linux.ibm.com, cheloha@linux.ibm.com,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	stable@vger.kernel.org, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm/slub: fix panic in slab_alloc_node()
Date: Tue, 27 Oct 2020 15:39:46 +0100
Message-ID: <11bdd295-3ef8-fbeb-2c76-2a109fa26f19@linux.ibm.com>
In-Reply-To: <20201027142421.GW20500@dhcp22.suse.cz>

On 27/10/2020 at 15:24, Michal Hocko wrote:
> [Cc Vlastimil]
> 
> On Tue 27-10-20 15:09:26, Laurent Dufour wrote:
>> While performing a memory hot-unplug operation on a PowerPC VM running
>> 1024 CPUs with 11TB of RAM, I hit the following panic:
>>
>> BUG: Kernel NULL pointer dereference on read at 0x00000007
>> Faulting instruction address: 0xc000000000456048
>> Oops: Kernel access of bad area, sig: 11 [#2]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in: rpadlpar_io rpaphp
>> CPU: 160 PID: 1 Comm: systemd Tainted: G      D           5.9.0 #1
>> NIP:  c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
>> REGS: c00006028d1b77a0 TRAP: 0300   Tainted: G      D            (5.9.0)
>> MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24004228  XER: 00000000
>> CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
>> GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
>> GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
>> GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
>> GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
>> GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
>> GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
>> GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
>> GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
>> NIP [c000000000456048] __kmalloc_node+0x108/0x790
>> LR [c000000000455fd4] __kmalloc_node+0x94/0x790
>> Call Trace:
>> [c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
>> [c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
>> [c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
>> [c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
>> [c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
>> [c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
>> [c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
>> [c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
>> [c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
>> [c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
>> [c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
>> Instruction dump:
>> e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
>> 2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
>>
>> This points to the following code:
>>
>> mm/slub.c:2851
>>          if (unlikely(!object || !node_match(page, node))) {
>> c000000000456038:       00 00 bc 2f     cmpdi   cr7,r28,0
>> c00000000045603c:       18 00 9e 41     beq     cr7,c000000000456054 <__kmalloc_node+0x114>
>> node_match():
>> mm/slub.c:2491
>>          if (node != NUMA_NO_NODE && page_to_nid(page) != node)
>> c000000000456040:       30 02 92 41     beq     cr4,c000000000456270 <__kmalloc_node+0x330>
>> page_to_nid():
>> include/linux/mm.h:1294
>> c000000000456044:       10 00 27 e9     ld      r9,16(r7)
>> c000000000456048:       07 00 29 89     lbz     r9,7(r9)	<<<< r9 = NULL
>> node_match():
>> mm/slub.c:2491
>> c00000000045604c:       00 48 99 7f     cmpw    cr7,r25,r9
>> c000000000456050:       20 02 9e 41     beq     cr7,c000000000456270 <__kmalloc_node+0x330>
>>
>> The panic occurred in slab_alloc_node() when checking for the page's node:
>> 	object = c->freelist;
>> 	page = c->page;
>> 	if (unlikely(!object || !node_match(page, node))) {
>> 		object = __slab_alloc(s, gfpflags, node, addr, c);
>> 		stat(s, ALLOC_SLOWPATH);
>>
>> The issue is that object is not NULL while page is NULL, which is odd but
>> may happen if the cache flush happened after loading object but before
>> loading page. Thus the page pointer must be checked too.
> 
> Could you be more specific? I am especially confused how the memory
> hotplug is involved here. What kind of flush are we talking about?

This happens when flush_cpu_slab() is called because a memory block is about
to be offlined: see slab_mem_going_offline_callback(), which is called from
the MEM_GOING_OFFLINE callback triggered by offline_pages().
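
To make the window concrete, here is a minimal userspace sketch of the
interleaving. It is not kernel code: cpu_slab_sim, page_sim, flush_sim() and
alloc_sim() are hypothetical stand-ins that model only the two loads at the
top of slab_alloc_node() and the effect of the flush on the per-CPU state,
assuming the flush can land between those two loads.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the real kernel definitions. */
struct page_sim { int nid; };

struct cpu_slab_sim {
	void *freelist;          /* next free object, NULL when empty */
	struct page_sim *page;   /* slab the freelist belongs to */
};

#define NO_NODE_SIM (-1)         /* plays the role of NUMA_NO_NODE */

enum path { FAST_PATH, SLOW_PATH, WOULD_OOPS };

/* Models what the flush leaves behind in the per-CPU state. */
static void flush_sim(struct cpu_slab_sim *c)
{
	c->page = NULL;
	c->freelist = NULL;
}

/*
 * Models the two loads in slab_alloc_node(). If flush_between is set,
 * the flush runs after freelist is loaded and before page is loaded.
 * check_page selects whether the extra !page test is applied.
 */
static enum path alloc_sim(struct cpu_slab_sim *c, int node,
			   int flush_between, int check_page)
{
	void *object = c->freelist;

	if (flush_between)
		flush_sim(c);

	struct page_sim *page = c->page;

	if (!object)
		return SLOW_PATH;
	if (check_page && !page)
		return SLOW_PATH;        /* the test added by the patch */
	if (node != NO_NODE_SIM) {
		if (!page)
			return WOULD_OOPS; /* node_match() would read through NULL */
		if (page->nid != node)
			return SLOW_PATH;
	}
	return FAST_PATH;
}
```

With a populated cpu_slab_sim and a specific node requested,
alloc_sim(&c, 1, 1, 0) returns WOULD_OOPS, while the same call with
check_page set returns SLOW_PATH, which is the behaviour the patch aims for.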

> 
>> In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
>> node_match()") the check on the page pointer was removed on the
>> assumption that page is always valid when node_match() is called. That
>> turns out not to be true in this particular case, so check page before
>> calling node_match() here.
>>
>> Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Pekka Enberg <penberg@kernel.org>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: stable@vger.kernel.org
>> ---
>>   mm/slub.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 8f66de8a5ab3..7dc5c6aaf4b7 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2852,7 +2852,7 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>>   
>>   	object = c->freelist;
>>   	page = c->page;
>> -	if (unlikely(!object || !node_match(page, node))) {
>> +	if (unlikely(!object || !page || !node_match(page, node))) {
>>   		object = __slab_alloc(s, gfpflags, node, addr, c);
>>   	} else {
>>   		void *next_object = get_freepointer_safe(s, object);
>> -- 
>> 2.29.1
>>
>
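
As a side note on reading the oops: the sketch below shows how the reported
fault address 0x7 lines up with the annotated disassembly (ld r9,16(r7)
followed by lbz r9,7(r9) with r9 = NULL). It is an illustration under
assumptions, not kernel code. cpu_slab_layout is a hypothetical 64-bit
stand-in for the per-CPU slab structure, and the idea that the node id sits
in the top byte of page->flags is an assumption inferred from the byte load,
not something stated in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 64-bit layout with freelist, tid and page in that order.
 * Fixed-width fields are used so the offsets do not depend on the host.
 */
struct cpu_slab_layout {
	uint64_t freelist;
	uint64_t tid;
	uint64_t page;
};

/* Effective address of a load: base register plus displacement. */
static uint64_t fault_address(uint64_t base, uint64_t displacement)
{
	return base + displacement;
}

/*
 * ASSUMPTION: node id stored in the most significant byte of the flags
 * word. On a little-endian machine that byte lives at byte offset 7,
 * which is what a "load byte at 7(base)" instruction would read.
 */
static unsigned int nid_from_flags_sim(uint64_t flags)
{
	return (unsigned int)(flags >> 56);
}
```

Under these assumptions, offsetof(struct cpu_slab_layout, page) is 16, which
matches the displacement of the first load, and fault_address(0, 7) gives
0x7, which matches the address reported in the BUG line and in DAR.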

Thread overview: 9+ messages
2020-10-27 14:09 Laurent Dufour
2020-10-27 14:24 ` Michal Hocko
2020-10-27 14:39   ` Laurent Dufour [this message]
2020-10-27 15:03     ` Michal Hocko
2020-10-27 15:12       ` Laurent Dufour
2020-10-27 16:31         ` Vlastimil Babka
2020-10-27 19:04           ` [PATCH v2] " Laurent Dufour
2020-10-28 11:11             ` Christopher Lameter
2020-10-28 13:49               ` Laurent Dufour
