linux-mm.kvack.org archive mirror
* [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug
@ 2007-09-18 12:33 Yasunori Goto
  2007-09-18 19:05 ` Christoph Lameter
  0 siblings, 1 reply; 5+ messages in thread
From: Yasunori Goto @ 2007-09-18 12:33 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm

Hi Christoph-san.

I found that a panic still occurs after memory hot-add on 2.6.23-rc6-mm1.

Its cause is a null pointer access to SLUB's kmem_cache_node in
discard_slab().
As I understand it, a kmem_cache_node should be created for every slab
cache after a memoryless node (or a new node) gets new memory, but
current -mm doesn't do that. This patch fixes it.

In this patch, the kmem_cache_node is created after new_slab() has
allocated a page from newly onlined memory.
If kmem_cache_node were created in online_pages() during memory hot-add,
it would have to be done before build_zonelist to avoid a race condition.
But that means the kmem_cache_node must be allocated on one of the old
nodes, because initialization of the new node is not complete yet.
I think this "delay creation" fix is a better way than that.

I know that the failure case of kmem_cache_alloc_node() still has to be
handled and that the prototype of init_kmem_cache_node() here is not good.
I would just like to confirm that I am not overlooking something about SLUB.

Bye.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---
 mm/slub.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Index: current/mm/slub.c
===================================================================
--- current.orig/mm/slub.c	2007-09-18 19:46:33.000000000 +0900
+++ current/mm/slub.c	2007-09-18 19:46:59.000000000 +0900
@@ -1081,6 +1081,7 @@ static void setup_object(struct kmem_cac
 		s->ctor(s, object);
 }
 
+static void init_kmem_cache_node(struct kmem_cache_node *n);
 static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
@@ -1089,6 +1090,7 @@ static struct page *new_slab(struct kmem
 	void *end;
 	void *last;
 	void *p;
+	int page_nid;
 
 	BUG_ON(flags & GFP_SLAB_BUG_MASK);
 
@@ -1097,9 +1099,20 @@ static struct page *new_slab(struct kmem
 	if (!page)
 		goto out;
 
-	n = get_node(s, page_to_nid(page));
+	page_nid = page_to_nid(page);
+	n = get_node(s, page_nid);
 	if (n)
 		atomic_long_inc(&n->nr_slabs);
+	else if (node_state(page_nid, N_HIGH_MEMORY) && s != kmalloc_caches) {
+		/*
+		 * If new memory is onlined on new(or memory less) node,
+		 * this will happen. (Second comparison is to avoid eternal
+		 * recursion.)
+		 */
+		n = kmem_cache_alloc_node(kmalloc_caches, GFP_KERNEL, page_nid);
+		init_kmem_cache_node(n);
+		s->node[page_nid] = n;
+	}
 	page->slab = s;
 	page->flags |= 1 << PG_slab;
 	if (s->flags & (SLAB_DEBUG_FREE | SLAB_RED_ZONE | SLAB_POISON |

-- 
Yasunori Goto 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug
  2007-09-18 12:33 [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug Yasunori Goto
@ 2007-09-18 19:05 ` Christoph Lameter
  2007-09-19  2:12   ` Yasunori Goto
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Lameter @ 2007-09-18 19:05 UTC (permalink / raw)
  To: Yasunori Goto; +Cc: linux-mm

On Tue, 18 Sep 2007, Yasunori Goto wrote:

> Its cause is a null pointer access to SLUB's kmem_cache_node in
> discard_slab().
> As I understand it, a kmem_cache_node should be created for every slab
> cache after a memoryless node (or a new node) gets new memory, but
> current -mm doesn't do that. This patch fixes it.

Right. Isn't there a notifier chain that can be used to create the missing
node structure?

> If kmem_cache_node were created in online_pages() during memory hot-add,
> it would have to be done before build_zonelist to avoid a race condition.
> But that means the kmem_cache_node must be allocated on one of the old
> nodes, because initialization of the new node is not complete yet.

Why before build_zonelist? The regular slab bootstrap occurs after
zonelist creation.

> I think this "delay creation" fix is a better way than that.

Looks like this is a way to do on-demand node structure creation?

> I know that the failure case of kmem_cache_alloc_node() still has to be
> handled and that the prototype of init_kmem_cache_node() here is not good.
> I would just like to confirm that I am not overlooking something about SLUB.

Could be okay. I would feel better if we always had a per node structure 
for each available node on the node that it covers.

> +	else if (node_state(page_nid, N_HIGH_MEMORY) && s != kmalloc_caches) {
> +		/*
> +		 * If new memory is onlined on new(or memory less) node,
> +		 * this will happen. (Second comparison is to avoid eternal
> +		 * recursion.)
> +		 */

For memoryless nodes this function will return NULL which will cause 
fallback. It looks like we are not going into this branch because in that 
case N_HIGH_MEMORY will not be set for the node.


* Re: [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug
  2007-09-18 19:05 ` Christoph Lameter
@ 2007-09-19  2:12   ` Yasunori Goto
  2007-09-19 17:23     ` Christoph Lameter
  0 siblings, 1 reply; 5+ messages in thread
From: Yasunori Goto @ 2007-09-19  2:12 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm

> On Tue, 18 Sep 2007, Yasunori Goto wrote:
> 
> > Its cause is a null pointer access to SLUB's kmem_cache_node in
> > discard_slab().
> > As I understand it, a kmem_cache_node should be created for every slab
> > cache after a memoryless node (or a new node) gets new memory, but
> > current -mm doesn't do that. This patch fixes it.
> 
> Right. Isn't there a notifier chain that can be used to create the missing
> node structure?

Yes, there is. Though nothing uses it so far....


> > If kmem_cache_node were created in online_pages() during memory hot-add,
> > it would have to be done before build_zonelist to avoid a race condition.
> > But that means the kmem_cache_node must be allocated on one of the old
> > nodes, because initialization of the new node is not complete yet.
> 
> Why before build_zonelist? The regular slab bootstrap occurs after
> zonelist creation.

build_zonelist() is called at a very early stage of bootstrap, but at
the final stage of hot-add.
Once build_zonelist() has been called at hot-add, every kernel module can
use the new memory of the node. So I am afraid of the following worst case.

   build_zonelist()              
        :                     new_nodes_page = new_slab();
        :                         :
        :                         :
        :                     discard_slab(new_nodes_page)
        :                         (access kmem_cache_node)
        :
   kmem_cache_node setting,


> > I think this "delay creation" fix is a better way than that.
> 
> Looks like this is a way to do on-demand node structure creation?

Yes.

> > I know that the failure case of kmem_cache_alloc_node() still has to be
> > handled and that the prototype of init_kmem_cache_node() here is not good.
> > I would just like to confirm that I am not overlooking something about SLUB.
> 
> Could be okay. I would feel better if we always had a per node structure 
> for each available node on the node that it covers.
> 
> > +	else if (node_state(page_nid, N_HIGH_MEMORY) && s != kmalloc_caches) {
> > +		/*
> > +		 * If new memory is onlined on new(or memory less) node,
> > +		 * this will happen. (Second comparison is to avoid eternal
> > +		 * recursion.)
> > +		 */
> 
> For memoryless nodes this function will return NULL which will cause 
> fallback. It looks like we are not going into this branch because in that 
> case N_HIGH_MEMORY will not be set for the node.

Probably the comment was wrong.
When a memoryless node gets new memory by hot-add,
N_HIGH_MEMORY is set in online_pages() (this is included in
2.6.23-rc6-mm1). The first comparison is there to detect that case.



Thanks.

-- 
Yasunori Goto 



* Re: [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug
  2007-09-19  2:12   ` Yasunori Goto
@ 2007-09-19 17:23     ` Christoph Lameter
  2007-09-20  2:06       ` Yasunori Goto
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Lameter @ 2007-09-19 17:23 UTC (permalink / raw)
  To: Yasunori Goto; +Cc: linux-mm

On Wed, 19 Sep 2007, Yasunori Goto wrote:

> build_zonelist() is called at a very early stage of bootstrap, but at
> the final stage of hot-add.
> Once build_zonelist() has been called at hot-add, every kernel module can
> use the new memory of the node. So I am afraid of the following worst case.
> 
>    build_zonelist()              
>         :                     new_nodes_page = new_slab();
>         :                         :
>         :                         :
>         :                     discard_slab(new_nodes_page)
>         :                         (access kmem_cache_node)
>         :
>    kmem_cache_node setting,

So we cannot do this without holding off other kernel accesses since it is 
not serialized like bootstrap. Sigh.
 
> > > I think this "delay creation" fix is a better way than that.
> > 
> > Looks like this is a way to do on-demand node structure creation?
> 
> Yes.

Could be useful in general if you can make that work reliably. We can just 
start out with a single per node structure for the boot node and then add 
others on demand?


* Re: [RFC/Patch](memory hotplug) fix null pointer access of kmem_cache_node after memory hotplug
  2007-09-19 17:23     ` Christoph Lameter
@ 2007-09-20  2:06       ` Yasunori Goto
  0 siblings, 0 replies; 5+ messages in thread
From: Yasunori Goto @ 2007-09-20  2:06 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm

> On Wed, 19 Sep 2007, Yasunori Goto wrote:
> 
> > build_zonelist() is called at a very early stage of bootstrap, but at
> > the final stage of hot-add.
> > Once build_zonelist() has been called at hot-add, every kernel module can
> > use the new memory of the node. So I am afraid of the following worst case.
> > 
> >    build_zonelist()              
> >         :                     new_nodes_page = new_slab();
> >         :                         :
> >         :                         :
> >         :                     discard_slab(new_nodes_page)
> >         :                         (access kmem_cache_node)
> >         :
> >    kmem_cache_node setting,
> 
> So we cannot do this without holding off other kernel accesses since it is 
> not serialized like bootstrap. Sigh.
>
> > > > I think this "delay creation" fix is a better way than that.
> > > 
> > > Looks like this is a way to do on-demand node structure creation?
> > 
> > Yes.
> 
> Could be useful in general if you can make that work reliably. We can just 
> start out with a single per node structure for the boot node and then add 
> others on demand?

Hmmmmm. I don't think on-demand node creation can be made generic.
I would just like to fix the panic.
OK, I'll make a patch which sets kmem_cache_node before
build_zonelist() to fix the panic for now,
and I will reconsider the allocation placement issue later.

Thanks for your comments.

-- 
Yasunori Goto 


