linux-mm.kvack.org archive mirror
* [PATCH] Allocate memory cgroup structures in local nodes
@ 2011-05-04 18:17 Andi Kleen
  2011-05-04 19:17 ` David Rientjes
  2011-05-04 19:36 ` Balbir Singh
  0 siblings, 2 replies; 6+ messages in thread
From: Andi Kleen @ 2011-05-04 18:17 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, Andi Kleen, Michal Hocko, Dave Hansen,
	Balbir Singh, Johannes Weiner

From: Andi Kleen <ak@linux.intel.com>

[Andrew: since this is a regression and a very simple fix
could you still consider it for .39? Thanks]

Commit dde79e005a769 introduced a regression: the memory cgroup data
structures all end up on node 0, because the first attempt at allocating
them does not pass in a node hint. Since the initialization runs on CPU #0,
everything is allocated on node 0. This is a problem on large-memory
systems, where node 0 loses a lot of memory.

Change the alloc_pages_exact to alloc_pages_exact_node. This will
still fall back to other nodes if not enough memory is available.

[RED-PEN: right now it would fall back first before trying
vmalloc_node. Probably not the best strategy ... But I left it like
that for now.]

Reported-by: Doug Nelson
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 mm/page_cgroup.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 9905501..1f4e20f 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -134,7 +134,7 @@ static void *__init_refok alloc_page_cgroup(size_t size, int nid)
 {
 	void *addr = NULL;
 
-	addr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
+	addr = alloc_pages_exact_node(nid, size, GFP_KERNEL | __GFP_NOWARN);
 	if (addr)
 		return addr;
 
-- 
1.7.4.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] Allocate memory cgroup structures in local nodes
  2011-05-04 18:17 [PATCH] Allocate memory cgroup structures in local nodes Andi Kleen
@ 2011-05-04 19:17 ` David Rientjes
  2011-05-04 20:04   ` Andi Kleen
  2011-05-04 19:36 ` Balbir Singh
  1 sibling, 1 reply; 6+ messages in thread
From: David Rientjes @ 2011-05-04 19:17 UTC (permalink / raw)
  To: Andi Kleen
  Cc: akpm, linux-kernel, linux-mm, Andi Kleen, Michal Hocko,
	Dave Hansen, Balbir Singh, Johannes Weiner

On Wed, 4 May 2011, Andi Kleen wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> [Andrew: since this is a regression and a very simple fix
> could you still consider it for .39? Thanks]
> 

Before that's considered, the order of the arguments to 
alloc_pages_exact_node() needs to be fixed.
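
To make the mismatch concrete: assuming the prototypes of that kernel
generation, alloc_pages_exact() takes (size, gfp_mask) and returns a
virtual address, while alloc_pages_exact_node() takes
(nid, gfp_mask, order) and returns a struct page *. Under that assumption
the posted call puts the byte count in the gfp_mask slot and the GFP flags
in the order slot. The userspace sketch below is not kernel code; it only
illustrates how a byte size relates to an allocation order.
size_to_order() and order_to_bytes() are hypothetical helpers (the former
loosely mirrors what the kernel's get_order() computes), and the 4 KiB
page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the real value is architecture dependent. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical userspace stand-in for the kernel's get_order():
 * returns the smallest order such that (page size << order) >= size.
 */
static unsigned int size_to_order(size_t size)
{
	unsigned int order = 0;

	/* Keep the shift below from overflowing size_t. */
	assert(size > 0 && size <= SIZE_MAX / 2);

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Number of bytes covered by one allocation of the given order. */
static size_t order_to_bytes(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}
```

With these helpers, size_to_order(1 << 20) evaluates to 8, i.e. a 1 MiB
table needs 256 contiguous pages. An order is an exponent, not a byte
count, so a size cannot simply be passed where an order is expected.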

> dde79e005a769 added a regression that the memory cgroup data structures
> all end up in node 0 because the first attempt at allocating them
> would not pass in a node hint. Since the initialization runs on CPU #0
> it would all end up node 0. This is a problem on large memory systems,
> where node 0 would lose a lot of memory.
> 
> Change the alloc_pages_exact to alloc_pages_exact_node. This will
> still fall back to other nodes if not enough memory is available.
> 

The vmalloc_node() calls ensure that the nid is actually set in
N_HIGH_MEMORY and fail otherwise (we don't fall back to using vmalloc()),
so it looks like the failure behavior of alloc_pages_exact_node() and
vmalloc_node() would be different?  Why do we want to fall back for one and
not the other?

> [RED-PEN: right now it would fall back first before trying
> vmalloc_node. Probably not the best strategy ... But I left it like
> that for now.]
> 
> Reported-by: Doug Nelson
> CC: Michal Hocko <mhocko@suse.cz>
> Cc: Dave Hansen <dave@linux.vnet.ibm.com>
> Cc: Balbir Singh <balbir@in.ibm.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  mm/page_cgroup.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 9905501..1f4e20f 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -134,7 +134,7 @@ static void *__init_refok alloc_page_cgroup(size_t size, int nid)
>  {
>  	void *addr = NULL;
>  
> -	addr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
> +	addr = alloc_pages_exact_node(nid, size, GFP_KERNEL | __GFP_NOWARN);
>  	if (addr)
>  		return addr;
>  


* Re: [PATCH] Allocate memory cgroup structures in local nodes
  2011-05-04 18:17 [PATCH] Allocate memory cgroup structures in local nodes Andi Kleen
  2011-05-04 19:17 ` David Rientjes
@ 2011-05-04 19:36 ` Balbir Singh
  1 sibling, 0 replies; 6+ messages in thread
From: Balbir Singh @ 2011-05-04 19:36 UTC (permalink / raw)
  To: Andi Kleen
  Cc: akpm, linux-kernel, linux-mm, Andi Kleen, Michal Hocko,
	Dave Hansen, Johannes Weiner

* Andi Kleen <andi@firstfloor.org> [2011-05-04 11:17:38]:

> From: Andi Kleen <ak@linux.intel.com>
> 
> [Andrew: since this is a regression and a very simple fix
> could you still consider it for .39? Thanks]
> 
> dde79e005a769 added a regression that the memory cgroup data structures
> all end up in node 0 because the first attempt at allocating them
> would not pass in a node hint. Since the initialization runs on CPU #0
> it would all end up node 0. This is a problem on large memory systems,
> where node 0 would lose a lot of memory.
> 
> Change the alloc_pages_exact to alloc_pages_exact_node. This will
> still fall back to other nodes if not enough memory is available.
> 
> [RED-PEN: right now it would fall back first before trying
> vmalloc_node. Probably not the best strategy ... But I left it like
> that for now.]
> 
> Reported-by: Doug Nelson
> CC: Michal Hocko <mhocko@suse.cz>
> Cc: Dave Hansen <dave@linux.vnet.ibm.com>
> Cc: Balbir Singh <balbir@in.ibm.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  mm/page_cgroup.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 9905501..1f4e20f 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -134,7 +134,7 @@ static void *__init_refok alloc_page_cgroup(size_t size, int nid)
>  {
>  	void *addr = NULL;
> 
> -	addr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
> +	addr = alloc_pages_exact_node(nid, size, GFP_KERNEL | __GFP_NOWARN);

Excellent catch! My eyes might be cheating me, I see
alloc_pages_exact_node doing what you expect it to do, I think the
size is interpreted as order.

>  	if (addr)
>  		return addr;
> 
> -- 
> 1.7.4.4
> 

-- 
	Three Cheers,
	Balbir


* Re: [PATCH] Allocate memory cgroup structures in local nodes
  2011-05-04 19:17 ` David Rientjes
@ 2011-05-04 20:04   ` Andi Kleen
  2011-05-04 20:10     ` David Rientjes
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2011-05-04 20:04 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andi Kleen, akpm, linux-kernel, linux-mm, Michal Hocko,
	Dave Hansen, Balbir Singh, Johannes Weiner


> Before that's considered, the order of the arguments to
> alloc_pages_exact_node() needs to be fixed.

Good point. I'll send another one.

This is really misleading BTW. Grumble. Maybe it would actually be
better to change the prototype too.


> The vmalloc_node() calls ensure that the nid is actually set in
> N_HIGH_MEMORY and fails otherwise (we don't fallback to using vmalloc()),
> so it looks like the failures for alloc_pages_exact_node() and
> vmalloc_node() would be different?  Why do we want to fallback for one and
> not the other?

The right order would be to try everything (alloc_pages + vmalloc)
to get it node local, before trying everything else. Right now that's
not how it's done.
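
The ordering described above can be sketched in userspace C. This is only
an illustration of the proposed strategy, not the code in
mm/page_cgroup.c: alloc_prefer_local(), the alloc_fn callback type,
ANY_NODE and the mock allocators are all hypothetical names. The mocks
stand in for a physically contiguous page allocation and a vmalloc-style
allocation, and record the order in which they are tried.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator callback: returns NULL on failure. */
typedef void *(*alloc_fn)(size_t size, int nid);

/* Hypothetical marker meaning "no node restriction". */
#define ANY_NODE (-1)

/*
 * Try both allocators on the preferred node first; only if both fail,
 * retry both without a node restriction.
 */
static void *alloc_prefer_local(size_t size, int nid,
				alloc_fn contiguous, alloc_fn virt)
{
	const int nodes[] = { nid, ANY_NODE };
	void *addr;
	size_t i;

	assert(contiguous != NULL && virt != NULL);

	for (i = 0; i < sizeof(nodes) / sizeof(nodes[0]); i++) {
		addr = contiguous(size, nodes[i]);
		if (addr)
			return addr;
		addr = virt(size, nodes[i]);
		if (addr)
			return addr;
	}
	return NULL;
}

/* Mock allocators for demonstration only: they log each attempt. */
static char attempt_log[16];
static size_t attempt_count;
static int local_pages_ok, local_virt_ok;
static char backing[1];

static void *record(char tag, int ok)
{
	if (attempt_count < sizeof(attempt_log) - 1)
		attempt_log[attempt_count++] = tag;
	attempt_log[attempt_count] = '\0';
	return ok ? backing : NULL;
}

/* 'p' = node-local page attempt, 'P' = page attempt on any node. */
static void *mock_pages(size_t size, int nid)
{
	(void)size;
	return nid == ANY_NODE ? record('P', 1) : record('p', local_pages_ok);
}

/* 'v' = node-local vmalloc-style attempt, 'V' = same on any node. */
static void *mock_virt(size_t size, int nid)
{
	(void)size;
	return nid == ANY_NODE ? record('V', 1) : record('v', local_virt_ok);
}
```

If the node-local page attempt fails but the node-local vmalloc-style
attempt succeeds, the log reads "pv" and no remote node is touched. Only
when both local attempts fail does the log continue with "P".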

-Andi




* Re: [PATCH] Allocate memory cgroup structures in local nodes
  2011-05-04 20:04   ` Andi Kleen
@ 2011-05-04 20:10     ` David Rientjes
  2011-05-04 20:18       ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: David Rientjes @ 2011-05-04 20:10 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, akpm, linux-kernel, linux-mm, Michal Hocko,
	Dave Hansen, Balbir Singh, Johannes Weiner

On Wed, 4 May 2011, Andi Kleen wrote:

> >  The vmalloc_node() calls ensure that the nid is actually set in
> > N_HIGH_MEMORY and fails otherwise (we don't fallback to using vmalloc()),
> > so it looks like the failures for alloc_pages_exact_node() and
> > vmalloc_node() would be different?  Why do we want to fallback for one and
> > not the other?
> 
> The right order would be to try everything (alloc_pages + vmalloc)
> to get it node local, before trying everything else. Right now that's
> not how it's done.
> 

Completely agreed, I think that's how it should be patched instead of only 
touching the alloc_pages() allocation; we care much more about local node 
than whether we're using vmalloc.


* Re: [PATCH] Allocate memory cgroup structures in local nodes
  2011-05-04 20:10     ` David Rientjes
@ 2011-05-04 20:18       ` Andi Kleen
  0 siblings, 0 replies; 6+ messages in thread
From: Andi Kleen @ 2011-05-04 20:18 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andi Kleen, akpm, linux-kernel, linux-mm, Michal Hocko,
	Dave Hansen, Balbir Singh, Johannes Weiner


> Completely agreed, I think that's how it should be patched instead of only
> touching the alloc_pages() allocation; we care much more about local node
> than whether we're using vmalloc.

Right now the problem is that you always end up in node 0 and then run out
of memory on it later on a large system. That's the problem I'm trying to
solve ASAP.

The rest is much less important.


-Andi


Thread overview: 6+ messages
2011-05-04 18:17 [PATCH] Allocate memory cgroup structures in local nodes Andi Kleen
2011-05-04 19:17 ` David Rientjes
2011-05-04 20:04   ` Andi Kleen
2011-05-04 20:10     ` David Rientjes
2011-05-04 20:18       ` Andi Kleen
2011-05-04 19:36 ` Balbir Singh
