[PATCH] add __GFP_ZERP to GFP_LEVEL

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
@ 2007-07-23 10:03 Peter Zijlstra
  2007-07-23 11:21 ` Mel Gorman
  2007-07-23 18:37 ` [PATCH] add __GFP_ZERP " Andrew Morton
  0 siblings, 2 replies; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-23 10:03 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, linux-kernel
  Cc: Christoph Lameter, Daniel Phillips, linux-mm

Daniel recently spotted that __GFP_ZERO is not (and has never been)
part of GFP_LEVEL_MASK. I could not find a reason for this in the
original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
tree.

This of course is in stark contradiction with the comment accompanying
GFP_LEVEL_MASK.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/gfp.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-2/include/linux/gfp.h
===================================================================
--- linux-2.6-2.orig/include/linux/gfp.h
+++ linux-2.6-2/include/linux/gfp.h
@@ -56,7 +56,7 @@ struct vm_area_struct;
 /* if you forget to add the bitmask here kernel will crash, period */
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
-			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
+			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO \
 			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
 			__GFP_MOVABLE)
 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 10:03 [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK Peter Zijlstra
@ 2007-07-23 11:21 ` Mel Gorman
  2007-07-23 11:38   ` [PATCH] add __GFP_ZERO " Peter Zijlstra
  2007-07-23 18:37 ` [PATCH] add __GFP_ZERP " Andrew Morton
  1 sibling, 1 reply; 28+ messages in thread
From: Mel Gorman @ 2007-07-23 11:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, Christoph Lameter,
	Daniel Phillips, linux-mm

On (23/07/07 12:03), Peter Zijlstra didst pronounce:
> 
> Daniel recently spotted that __GFP_ZERO is not (and has never been)
> part of GFP_LEVEL_MASK. I could not find a reason for this in the
> original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> tree.
> 
> This of course is in stark contradiction with the comment accompanying
> GFP_LEVEL_MASK.

This probably never showed up as a problem because slab is not using
__GFP_ZERO. If a flag is used with slab that is not in the level mask,
it goes *bang* as described in the comment.

Does this patch compile though?

> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

> ---
>  include/linux/gfp.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6-2/include/linux/gfp.h
> ===================================================================
> --- linux-2.6-2.orig/include/linux/gfp.h
> +++ linux-2.6-2/include/linux/gfp.h
> @@ -56,7 +56,7 @@ struct vm_area_struct;
>  /* if you forget to add the bitmask here kernel will crash, period */
>  #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
>  			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
> -			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
> +			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO \

Should there be a | at the end there?

>  			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
>  			__GFP_MOVABLE)
>  
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-23 11:21 ` Mel Gorman
@ 2007-07-23 11:38   ` Peter Zijlstra
  2007-07-23 12:30     ` Mel Gorman
  2007-07-23 23:17     ` Christoph Lameter
  0 siblings, 2 replies; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-23 11:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, Christoph Lameter,
	Daniel Phillips, linux-mm

On Mon, 2007-07-23 at 12:21 +0100, Mel Gorman wrote:

> Does this patch compile though?

Ugh, the fix landed in another patch :-(

updated patch below.

---
Daniel recently spotted that __GFP_ZERO is not (and has never been)
part of GFP_LEVEL_MASK. I could not find a reason for this in the
original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
tree.

This of course is in stark contradiction with the comment accompanying
GFP_LEVEL_MASK.
---
 include/linux/gfp.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-2/include/linux/gfp.h
===================================================================
--- linux-2.6-2.orig/include/linux/gfp.h
+++ linux-2.6-2/include/linux/gfp.h
@@ -56,7 +56,7 @@ struct vm_area_struct;
 /* if you forget to add the bitmask here kernel will crash, period */
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
-			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
+			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO| \
 			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
 			__GFP_MOVABLE)
 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-23 11:38   ` [PATCH] add __GFP_ZERO " Peter Zijlstra
@ 2007-07-23 12:30     ` Mel Gorman
  2007-07-23 23:17     ` Christoph Lameter
  1 sibling, 0 replies; 28+ messages in thread
From: Mel Gorman @ 2007-07-23 12:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, Christoph Lameter,
	Daniel Phillips, linux-mm

On (23/07/07 13:38), Peter Zijlstra didst pronounce:
> On Mon, 2007-07-23 at 12:21 +0100, Mel Gorman wrote:
> 
> > Does this patch compile though?
> 
> Ugh, the fix landed in another patch :-(
> 
> updated patch below.
> 
> ---
> Daniel recently spotted that __GFP_ZERO is not (and has never been)
> part of GFP_LEVEL_MASK. I could not find a reason for this in the
> original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> tree.
> 

Missing sign-off but anyway

Acked-by: Mel Gorman <mel@csn.ul.ie>

> This of course is in stark contradiction with the comment accompanying
> GFP_LEVEL_MASK.
> ---
>  include/linux/gfp.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6-2/include/linux/gfp.h
> ===================================================================
> --- linux-2.6-2.orig/include/linux/gfp.h
> +++ linux-2.6-2/include/linux/gfp.h
> @@ -56,7 +56,7 @@ struct vm_area_struct;
>  /* if you forget to add the bitmask here kernel will crash, period */
>  #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
>  			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
> -			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
> +			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO| \
>  			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
>  			__GFP_MOVABLE)
>  
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 10:03 [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK Peter Zijlstra
  2007-07-23 11:21 ` Mel Gorman
@ 2007-07-23 18:37 ` Andrew Morton
  2007-07-23 18:40   ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2007-07-23 18:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, linux-kernel, Christoph Lameter, Daniel Phillips,
	linux-mm

On Mon, 23 Jul 2007 12:03:40 +0200 Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> Daniel recently spotted that __GFP_ZERO is not (and has never been)
> part of GFP_LEVEL_MASK. I could not find a reason for this in the
> original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> tree.

It doesn't make a lot of sense to be passing __GFP_ZERO into slab
allocation functions.  It's not really for the caller to be telling slab
how it should arrange for its new memory to get zeroed.

And the caller of slab functions will need to zero the memory anyway,
because you don't know whether your new object came direct from the page
allocator or if it is recycled memory from a partial slab.

I have a feeling that we did support passing __GFP_ZERO into the slab
allocation functions for a while, but took it out.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 18:37 ` [PATCH] add __GFP_ZERP " Andrew Morton
@ 2007-07-23 18:40   ` Peter Zijlstra
  2007-07-23 21:43     ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-23 18:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, linux-kernel, Christoph Lameter, Daniel Phillips,
	linux-mm

On Mon, 2007-07-23 at 11:37 -0700, Andrew Morton wrote:
> On Mon, 23 Jul 2007 12:03:40 +0200 Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> 
> > Daniel recently spotted that __GFP_ZERO is not (and has never been)
> > part of GFP_LEVEL_MASK. I could not find a reason for this in the
> > original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> > tree.
> 
> It doesn't make a lot of sense to be passing __GFP_ZERO into slab
> allocation functions.  It's not really for the caller to be telling slab
> how it should arrange for its new memory to get zeroed.
> 
> And the caller of slab functions will need to zero the memory anyway,
> because you don't know whether your new object came direct from the page
> allocator or if it is recycled memory from a partial slab.
> 
> I have a feeling that we did support passing __GFP_ZERO into the slab
> allocation functions for a while, but took it out.

Didn't we just reinstate doing that?

/me goes look at .23-rc1

# grep __GFP_ZERO mm/sl[uoa]b.c
mm/slab.c:      BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));
mm/slab.c:      if (unlikely((flags & __GFP_ZERO) && ptr))
mm/slab.c:      if (unlikely((flags & __GFP_ZERO) && objp))
mm/slob.c:      if (unlikely((gfp & __GFP_ZERO) && b))
mm/slub.c:      BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));
mm/slub.c:      if (unlikely((gfpflags & __GFP_ZERO) && object))


seems to suggest we do.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 18:40   ` Peter Zijlstra
@ 2007-07-23 21:43     ` Christoph Lameter
  2007-07-23 22:13       ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-23 21:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Linus Torvalds, linux-kernel, Daniel Phillips, linux-mm

__GFP_ZERO is implemented by the slab allocators (the page allocator
has no knowledge about the length of the object to be zeroed). The slab
allocators do not pass __GFP_ZERO to the page allocator.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 21:43     ` Christoph Lameter
@ 2007-07-23 22:13       ` Andrew Morton
  2007-07-23 22:41         ` Linus Torvalds
  2007-07-23 22:50         ` Christoph Lameter
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2007-07-23 22:13 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Linus Torvalds, linux-kernel, Daniel Phillips, linux-mm

On Mon, 23 Jul 2007 14:43:23 -0700
Christoph Lameter <clameter@sgi.com> wrote:

> __GFP_ZERO is implemented by the slab allocators (the page allocator
> has no knowledge about the length of the object to be zeroed). The slab
> allocators do not pass __GFP_ZERO to the page allocator.

OK, well that was weird.  So

	kmalloc(42, GFP_KERNEL|__GFP_ZERO);

duplicates

	kzalloc(42, GFP_KERNEL);


Why do it both ways?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 22:13       ` Andrew Morton
@ 2007-07-23 22:41         ` Linus Torvalds
  2007-07-23 22:56           ` Andrew Morton
  2007-07-23 22:50         ` Christoph Lameter
  1 sibling, 1 reply; 28+ messages in thread
From: Linus Torvalds @ 2007-07-23 22:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Peter Zijlstra, linux-kernel, Daniel Phillips,
	linux-mm


On Mon, 23 Jul 2007, Andrew Morton wrote:
> 
> OK, well that was weird.  So
> 
> 	kmalloc(42, GFP_KERNEL|__GFP_ZERO);
> 
> duplicates
> 
> 	kzalloc(42, GFP_KERNEL);
> 
> Why do it both ways?

Both ways? The latter *is* the former. That's how kzalloc() is implemented 
these days.

Andrew - all these patches came through you. You didn't realize?

		Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 22:13       ` Andrew Morton
  2007-07-23 22:41         ` Linus Torvalds
@ 2007-07-23 22:50         ` Christoph Lameter
  1 sibling, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2007-07-23 22:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Linus Torvalds, linux-kernel, Daniel Phillips, linux-mm

On Mon, 23 Jul 2007, Andrew Morton wrote:

> On Mon, 23 Jul 2007 14:43:23 -0700
> Christoph Lameter <clameter@sgi.com> wrote:
> 
> > __GFP_ZERO is implemented by the slab allocators (the page allocator
> > has no knowledge about the length of the object to be zeroed). The slab
> > allocators do not pass __GFP_ZERO to the page allocator.
> 
> OK, well that was weird.  So
> 
> 	kmalloc(42, GFP_KERNEL|__GFP_ZERO);
> 
> duplicates
> 
> 	kzalloc(42, GFP_KERNEL);

Correct.

> Why do it both ways?

There are now ~2.5k uses of kzalloc in the kernel.

kzalloc and friends implies that we need to duplicate all allocator 
functions. That was never done for NUMA allocators there was never a 
kzalloc_node f.e..

kzalloc also cannot be used in a parameterized way in a derived allocator 
where you would simply pass on the gfp flags.

__GFP_ZERO support allows all allocators to work in a consistent way. 
There is no need to special case slab zeroing allocation.

kzalloc could be removed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 22:41         ` Linus Torvalds
@ 2007-07-23 22:56           ` Andrew Morton
  2007-07-23 23:00             ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2007-07-23 22:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Lameter, Peter Zijlstra, linux-kernel, Daniel Phillips,
	linux-mm

On Mon, 23 Jul 2007 15:41:36 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Mon, 23 Jul 2007, Andrew Morton wrote:
> > 
> > OK, well that was weird.  So
> > 
> > 	kmalloc(42, GFP_KERNEL|__GFP_ZERO);
> > 
> > duplicates
> > 
> > 	kzalloc(42, GFP_KERNEL);
> > 
> > Why do it both ways?
> 
> Both ways? The latter *is* the former. That's how kzalloc() is implemented 
> these days.

<looks>

So this:

	/*
	 * Be lazy and only check for valid flags here,  keeping it out of the
	 * critical path in kmem_cache_alloc().
	 */
	BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));

would no longer need the __GFP_ZERO.  Ditto in slob's new_slab().


> Andrew - all these patches came through you. You didn't realize?

Well.  I didn't memorise the past few months' 250-odd slab/slob/slub
patches..

My point is, I don't think we want some code doing
kmalloc(42, GFP_KERNEL|__GFP_ZERO) and some other code doing
kzalloc(42, GFP_KERNEL).  But this patch does nothing to increase the
chances of that happening, so I'm happy.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK
  2007-07-23 22:56           ` Andrew Morton
@ 2007-07-23 23:00             ` Christoph Lameter
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2007-07-23 23:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Peter Zijlstra, linux-kernel, Daniel Phillips, linux-mm

On Mon, 23 Jul 2007, Andrew Morton wrote:

> So this:
> 
> 	/*
> 	 * Be lazy and only check for valid flags here,  keeping it out of the
> 	 * critical path in kmem_cache_alloc().
> 	 */
> 	BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));
> 
> would no longer need the __GFP_ZERO.  Ditto in slob's new_slab().

That __GFP_ZERO is needed to avoid triggering the BUG_ON. The next line

	local_flags = (flags & GFP_LEVEL_MASK);

filters out the __GFP_ZERO before calling the page allocator.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-23 11:38   ` [PATCH] add __GFP_ZERO " Peter Zijlstra
  2007-07-23 12:30     ` Mel Gorman
@ 2007-07-23 23:17     ` Christoph Lameter
  2007-07-24  6:01       ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-23 23:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Mon, 23 Jul 2007, Peter Zijlstra wrote:

> ---
> Daniel recently spotted that __GFP_ZERO is not (and has never been)
> part of GFP_LEVEL_MASK. I could not find a reason for this in the
> original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> tree.
> 
> This of course is in stark contradiction with the comment accompanying
> GFP_LEVEL_MASK.

NACK.

The effect that this patch will have is that __GFP_ZERO is passed through 
to the page allocator which will needlessly zero pages. GFP_LEVEL_MASK is 
used to filter out the flags that are to be passed to the page allocator. 
__GFP_ZERO is not passed on but handled by the slab allocators.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-23 23:17     ` Christoph Lameter
@ 2007-07-24  6:01       ` Peter Zijlstra
  2007-07-24  6:48         ` Peter Zijlstra
  2007-07-24  7:09         ` Christoph Lameter
  0 siblings, 2 replies; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-24  6:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Mon, 2007-07-23 at 16:17 -0700, Christoph Lameter wrote:
> On Mon, 23 Jul 2007, Peter Zijlstra wrote:
> 
> > ---
> > Daniel recently spotted that __GFP_ZERO is not (and has never been)
> > part of GFP_LEVEL_MASK. I could not find a reason for this in the
> > original patch: 3977971c7f09ce08ed1b8d7a67b2098eb732e4cd in the -bk
> > tree.
> > 
> > This of course is in stark contradiction with the comment accompanying
> > GFP_LEVEL_MASK.
> 
> NACK.
> 
> The effect that this patch will have is that __GFP_ZERO is passed through 
> to the page allocator which will needlessly zero pages. GFP_LEVEL_MASK is 
> used to filter out the flags that are to be passed to the page allocator. 
> __GFP_ZERO is not passed on but handled by the slab allocators.

Then we can either fixup the slab allocators to mask out __GFP_ZERO, or
do something like the below.

Personally I like the consistency of adding __GFP_ZERO here (removes
this odd exception) and just masking it in the sl[aou]b thingies.

Anybody else got a preference?

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/gfp.h |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6-2/include/linux/gfp.h
===================================================================
--- linux-2.6-2.orig/include/linux/gfp.h
+++ linux-2.6-2/include/linux/gfp.h
@@ -53,7 +53,14 @@ struct vm_area_struct;
 #define __GFP_BITS_SHIFT 20	/* Room for 20 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
-/* if you forget to add the bitmask here kernel will crash, period */
+/*
+ * If you forget to add the bitmask here kernel will crash, period!
+ *
+ * GFP_LEVEL_MASK is used to filter out the flags that are to be passed to the
+ * page allocator.
+ *
+ * __GFP_ZERO is not passed on but handled by the slab allocators.
+ */
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
 			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  6:01       ` Peter Zijlstra
@ 2007-07-24  6:48         ` Peter Zijlstra
  2007-07-24  7:14           ` Christoph Lameter
  2007-07-24  7:09         ` Christoph Lameter
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-24  6:48 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 2007-07-24 at 08:01 +0200, Peter Zijlstra wrote:

> Then we can either fixup the slab allocators to mask out __GFP_ZERO, or
> do something like the below.
> 
> Personally I like the consistency of adding __GFP_ZERO here (removes
> this odd exception) and just masking it in the sl[aou]b thingies.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/gfp.h |    2 +-
 mm/slab.c           |    4 +++-
 mm/slob.c           |    2 ++
 mm/slub.c           |    4 +++-
 4 files changed, 9 insertions(+), 3 deletions(-)

Index: linux-2.6-2/include/linux/gfp.h
===================================================================
--- linux-2.6-2.orig/include/linux/gfp.h
+++ linux-2.6-2/include/linux/gfp.h
@@ -56,7 +56,7 @@ struct vm_area_struct;
 /* if you forget to add the bitmask here kernel will crash, period */
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
-			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
+			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO| \
 			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
 			__GFP_MOVABLE)
 
Index: linux-2.6-2/mm/slab.c
===================================================================
--- linux-2.6-2.orig/mm/slab.c
+++ linux-2.6-2/mm/slab.c
@@ -2739,11 +2739,13 @@ static int cache_grow(struct kmem_cache 
 	gfp_t local_flags;
 	struct kmem_list3 *l3;
 
+	flags &= ~__GFP_ZERO; /* slab has its own object zeroing */
+
 	/*
 	 * Be lazy and only check for valid flags here,  keeping it out of the
 	 * critical path in kmem_cache_alloc().
 	 */
-	BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));
+	BUG_ON(flags & ~(GFP_DMA | GFP_LEVEL_MASK));
 
 	local_flags = (flags & GFP_LEVEL_MASK);
 	/* Take the l3 list lock to change the colour_next on this node */
Index: linux-2.6-2/mm/slob.c
===================================================================
--- linux-2.6-2.orig/mm/slob.c
+++ linux-2.6-2/mm/slob.c
@@ -223,6 +223,8 @@ static void *slob_new_page(gfp_t gfp, in
 {
 	void *page;
 
+	gfp &= ~__GFP_ZERO; /* slob has its own object zeroing */
+
 #ifdef CONFIG_NUMA
 	if (node != -1)
 		page = alloc_pages_node(node, gfp, order);
Index: linux-2.6-2/mm/slub.c
===================================================================
--- linux-2.6-2.orig/mm/slub.c
+++ linux-2.6-2/mm/slub.c
@@ -1078,7 +1078,9 @@ static struct page *new_slab(struct kmem
 	void *last;
 	void *p;
 
-	BUG_ON(flags & ~(GFP_DMA | __GFP_ZERO | GFP_LEVEL_MASK));
+	flags &= ~__GFP_ZERO; /* slab has its own object zeroing */
+
+	BUG_ON(flags & ~(GFP_DMA | GFP_LEVEL_MASK));
 
 	if (flags & __GFP_WAIT)
 		local_irq_enable();



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  6:01       ` Peter Zijlstra
  2007-07-24  6:48         ` Peter Zijlstra
@ 2007-07-24  7:09         ` Christoph Lameter
  2007-07-24  7:24           ` Peter Zijlstra
  1 sibling, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24  7:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Peter Zijlstra wrote:

> Then we can either fixup the slab allocators to mask out __GFP_ZERO, or
> do something like the below.
> 
> Personally I like the consistency of adding __GFP_ZERO here (removes
> this odd exception) and just masking it in the sl[aou]b thingies.

There is another exception for __GFP_DMA.

> Anybody else got a preference?

>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>  
> -/* if you forget to add the bitmask here kernel will crash, period */
> +/*
> + * If you forget to add the bitmask here kernel will crash, period!
> + *
> + * GFP_LEVEL_MASK is used to filter out the flags that are to be passed to the
> + * page allocator.
> + *

GFP_LEVEL_MASK is also used in mm/vmalloc.c. We need a definition that 
goes beyond slab allocators.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  6:48         ` Peter Zijlstra
@ 2007-07-24  7:14           ` Christoph Lameter
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24  7:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Peter Zijlstra wrote:

> > Personally I like the consistency of adding __GFP_ZERO here (removes
> > this odd exception) and just masking it in the sl[aou]b thingies.

Adds more code. GFP_LEVEL_MASK are the flags passed through to the 
page allocator. Neither __GFP_ZERO nor __GFP_DMA are passed through and 
therefore they are not part of the GFP_LEVEL_MASK.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  7:09         ` Christoph Lameter
@ 2007-07-24  7:24           ` Peter Zijlstra
  2007-07-24  7:35             ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2007-07-24  7:24 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 2007-07-24 at 00:09 -0700, Christoph Lameter wrote:
> On Tue, 24 Jul 2007, Peter Zijlstra wrote:
> 
> > Then we can either fixup the slab allocators to mask out __GFP_ZERO, or
> > do something like the below.
> > 
> > Personally I like the consistency of adding __GFP_ZERO here (removes
> > this odd exception) and just masking it in the sl[aou]b thingies.
> 
> There is another exception for __GFP_DMA.

non of the zone specifiers are

> > Anybody else got a preference?
> 
> >  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
> >  
> > -/* if you forget to add the bitmask here kernel will crash, period */
> > +/*
> > + * If you forget to add the bitmask here kernel will crash, period!
> > + *
> > + * GFP_LEVEL_MASK is used to filter out the flags that are to be passed to the
> > + * page allocator.
> > + *
> 
> GFP_LEVEL_MASK is also used in mm/vmalloc.c. We need a definition that 
> goes beyond slab allocators.

Right, bugger.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  7:24           ` Peter Zijlstra
@ 2007-07-24  7:35             ` Christoph Lameter
  2007-07-24 19:07               ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24  7:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mel Gorman, Andrew Morton, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Peter Zijlstra wrote:

> > There is another exception for __GFP_DMA.
> 
> non of the zone specifiers are

__GFP_DMA is handled in a similar way to __GFP_ZERO though. Its explicitly 
listed in BUG_ON() because it can be specified in the gfpflags to kmalloc 
but also set by having created a slab with SLAB_DMA. It is also cleared 
by the & GFP_LEVEL_MASK.

> > > Anybody else got a preference?
> > 
> > >  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
> > >  
> > > -/* if you forget to add the bitmask here kernel will crash, period */
> > > +/*
> > > + * If you forget to add the bitmask here kernel will crash, period!
> > > + *
> > > + * GFP_LEVEL_MASK is used to filter out the flags that are to be passed to the
> > > + * page allocator.
> > > + *
> > 
> > GFP_LEVEL_MASK is also used in mm/vmalloc.c. We need a definition that 
> > goes beyond slab allocators.
> 
> Right, bugger.

Lets get rid of the cryptic sentence there and explain it in a better way. 
GFP_LEVEL_MASK contains the flags that are passed to the page allocator
by derived allocators (such as slab allocators and vmalloc, maybe the 
uncached allocator may use it in the future?).

__get_vm_area_node also relies on GFP_LEVEL_MASK to clear the __GFP_ZERO 
flag. Otherwise the kmalloc_node there would needlessly return zeroed 
memory (or have failed in the past).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24  7:35             ` Christoph Lameter
@ 2007-07-24 19:07               ` Christoph Lameter
  2007-07-24 19:25                 ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24 19:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Andrew Morton, Linus Torvalds,
	linux-kernel, Daniel Phillips, linux-mm

GFP_LEVEL_MASK is used to allow the pass through of page allocator
flags. Currently these are

#define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
                        __GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
                        __GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
                        __GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|
			__GFP_MOVABLE)

Some of these flags control page allocator reclaim and fallback
behavior. If they are specified for a slab alloc operation then they
are effective if a new slab has to be allocated. These are

1. Reclaim control

__GFP_WAIT
__GFP_IO
__GFP_FS
__GFP_NOWARN
__GFP_REPEAT
__GFP_NOFAIL
__GFP_NORETRY

2. Reserve control

__GFP_HIGH
__GFP_NOMEMALLOC

2. Fallback control

__GFP_HARDWALL	(cpuset contraints)
__GFP_THISNODE (handled by SLAB on its own, SLUB/SLOB pass through)

AFAIK these make sense.

Then there are some other flags. I am wondering why they are in
GFP_LEVEL_MASK?

__GFP_COLD	Does not make sense for slab allocators since we have
		to touch the page immediately.

__GFP_COMP	No effect. Added by the page allocator on their own
		if a higher order allocs are used for a slab.

__GFP_MOVABLE	The movability of a slab is determined by the
		options specified at kmem_cache_create time. If this is
		specified at kmalloc time then we will have some random
		slabs movable and others not. 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 19:07               ` Christoph Lameter
@ 2007-07-24 19:25                 ` Andrew Morton
  2007-07-24 19:36                   ` Christoph Lameter
  2007-07-25 13:06                   ` Mel Gorman
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2007-07-24 19:25 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007 12:07:51 -0700
Christoph Lameter <clameter@sgi.com> wrote:

> Then there are some other flags. I am wondering why they are in
> GFP_LEVEL_MASK?
> 
> __GFP_COLD	Does not make sense for slab allocators since we have
> 		to touch the page immediately.
> 
> __GFP_COMP	No effect. Added by the page allocator on their own
> 		if a higher order allocs are used for a slab.
> 
> __GFP_MOVABLE	The movability of a slab is determined by the
> 		options specified at kmem_cache_create time. If this is
> 		specified at kmalloc time then we will have some random
> 		slabs movable and others not. 

Yes, they seem inappropriate.  Especially the first two.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 19:25                 ` Andrew Morton
@ 2007-07-24 19:36                   ` Christoph Lameter
  2007-07-24 22:10                     ` Andrew Morton
  2007-07-25 13:06                   ` Mel Gorman
  1 sibling, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24 19:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Andrew Morton wrote:

> > __GFP_MOVABLE	The movability of a slab is determined by the
> > 		options specified at kmem_cache_create time. If this is
> > 		specified at kmalloc time then we will have some random
> > 		slabs movable and others not. 
> 
> Yes, they seem inappropriate.  Especially the first two.

The third one would randomize __GFP_MOVABLE allocs from the page allocator 
since one __GFP_MOVABLE alloc may allocate a slab that is then used for 
!__GFP_MOVABLE allocs.

Maybe something like this? Note that we may get into some churn here 
since slab allocations that any of these flags will BUG.



GFP_LEVEL_MASK: Remove __GFP_COLD, __GFP_COMP and __GFPMOVABLE

Add an explanation for the GFP_LEVEL_MASK and remove the flags
that should not be passed through derived allocators.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6/include/linux/gfp.h
===================================================================
--- linux-2.6.orig/include/linux/gfp.h	2007-07-24 12:31:04.000000000 -0700
+++ linux-2.6/include/linux/gfp.h	2007-07-24 12:32:50.000000000 -0700
@@ -53,12 +53,15 @@ struct vm_area_struct;
 #define __GFP_BITS_SHIFT 20	/* Room for 20 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
-/* if you forget to add the bitmask here kernel will crash, period */
+/*
+ * GFP_LEVEL_MASK is used to filter out flags to be passed on to the
+ * page allocator in derived allocators such as slab allocators and
+ * vmalloc. It should not contain flags that are to be handled by the
+ * derived allocators themselves.
+ */
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
-			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
-			__GFP_NOFAIL|__GFP_NORETRY|__GFP_COMP| \
-			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
-			__GFP_MOVABLE)
+		__GFP_NOWARN|__GFP_REPEAT|__GFP_NOFAIL|__GFP_NORETRY| \
+		__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| __GFP_MOVABLE)
 
 /* This equals 0, but use constants in case they ever change */
 #define GFP_NOWAIT	(GFP_ATOMIC & ~__GFP_HIGH)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 19:36                   ` Christoph Lameter
@ 2007-07-24 22:10                     ` Andrew Morton
  2007-07-24 23:00                       ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2007-07-24 22:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007 12:36:59 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > > __GFP_MOVABLE	The movability of a slab is determined by the
> > > 		options specified at kmem_cache_create time. If this is
> > > 		specified at kmalloc time then we will have some random
> > > 		slabs movable and others not. 
> > 
> > Yes, they seem inappropriate.  Especially the first two.
> 
> The third one would randomize __GFP_MOVABLE allocs from the page allocator 
> since one __GFP_MOVABLE alloc may allocate a slab that is then used for 
> !__GFP_MOVABLE allocs.
> 
> Maybe something like this? Note that we may get into some churn here 
> since slab allocations that any of these flags will BUG.
> 
> 
> 
> GFP_LEVEL_MASK: Remove __GFP_COLD, __GFP_COMP and __GFPMOVABLE
> 
> Add an explanation for the GFP_LEVEL_MASK and remove the flags
> that should not be passed through derived allocators.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

I think I'll duck this for now.  Otherwise I have a suspicion that I'll
be the first person to run it and I'm too old for such excitement.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 22:10                     ` Andrew Morton
@ 2007-07-24 23:00                       ` Christoph Lameter
  2007-07-24 23:12                         ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24 23:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Andrew Morton wrote:

> I think I'll duck this for now.  Otherwise I have a suspicion that I'll
> be the first person to run it and I'm too old for such excitement.

I always had the suspicion that you have some magical script 
which will immediately tell you that a patch is not working ;-)

Works fine on x86_64 (on top of the ctor cleanup patchset) and passes the 
kernel build test but then there may be creatively designed drivers and 
such that pass these flags to the slab allocators which will now BUG.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 23:00                       ` Christoph Lameter
@ 2007-07-24 23:12                         ` Andrew Morton
  2007-07-24 23:58                           ` Christoph Lameter
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2007-07-24 23:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007 16:00:32 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > I think I'll duck this for now.  Otherwise I have a suspicion that I'll
> > be the first person to run it and I'm too old for such excitement.
> 
> I always had the suspicion that you have some magical script 
> which will immediately tell you that a patch is not working ;-)

sort of a defensive crouch.

> Works fine on x86_64 (on top of the ctor cleanup patchset) and passes the 
> kernel build test but then there may be creatively designed drivers and 
> such that pass these flags to the slab allocators which will now BUG.

__GFP_COLD looks OK.

__GFP_COMP I'm not so sure about. 
drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like infiniband)
pass it into dma_alloc_coherent() which some architectures implement via slab.  umm,
arch/arm/mm/consistent.c is one such.

__GFP_MOVABLE looks OK.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 23:12                         ` Andrew Morton
@ 2007-07-24 23:58                           ` Christoph Lameter
  2007-07-25  0:06                             ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Lameter @ 2007-07-24 23:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007, Andrew Morton wrote:

> __GFP_COMP I'm not so sure about. 
> drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like infiniband)
> pass it into dma_alloc_coherent() which some architectures implement via slab.  umm,
> arch/arm/mm/consistent.c is one such.

Should  drm_pci_alloc really aright in setting __GFP_COMP? 
dma_alloc_coherent does not set __GFP_COMP for other higher order allocs 
and expects to be able to operate on the page structs indepedently. That 
is not the case for a compound page.

Creates a really interesting case for SLAB. Slab did not use __GFP_COMP in 
order to be able to allow the use page->private (No longer an issue since 
the 2.6.22 cleanups and avoiding the use of page->private for the compound 
head).

Now the __GFP_COMP flag is passed through for any higher order page alloc 
(such as a kmalloc allocation > PAGE_SIZE). Then we may have allocated one 
slab that is a compound page amoung others higher order pages allocated 
without __GFP_COMP. May have caused rare and strange failures in 2.6.21 
and earlier because of the concurrent page->private use in compound head 
pages and arch pages.

SLUB will always use __GFP_COMP so the pages are consistent regardless if 
__GFP_COMP is passed in or not.

The strange scenarios come about by expecting a page allocation when 
sometimes we just substitute a slab alloc.

We could filter __GFP_COMP out to avoid the BUG()? Or deal with it on a 
case by case basis?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 23:58                           ` Christoph Lameter
@ 2007-07-25  0:06                             ` Andrew Morton
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Morton @ 2007-07-25  0:06 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Peter Zijlstra, Mel Gorman, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On Tue, 24 Jul 2007 16:58:51 -0700 (PDT) Christoph Lameter <clameter@sgi.com> wrote:

> On Tue, 24 Jul 2007, Andrew Morton wrote:
> 
> > __GFP_COMP I'm not so sure about. 
> > drivers/char/drm/drm_pci.c:drm_pci_alloc() (and other places like infiniband)
> > pass it into dma_alloc_coherent() which some architectures implement via slab.  umm,
> > arch/arm/mm/consistent.c is one such.
> 
> Should  drm_pci_alloc really aright in setting __GFP_COMP? 

I don't see what's special about that dma_alloc_coherent() call.

> dma_alloc_coherent does not set __GFP_COMP for other higher order allocs 
> and expects to be able to operate on the page structs indepedently. That 
> is not the case for a compound page.
> 
> Creates a really interesting case for SLAB. Slab did not use __GFP_COMP in 
> order to be able to allow the use page->private (No longer an issue since 
> the 2.6.22 cleanups and avoiding the use of page->private for the compound 
> head).
> 
> Now the __GFP_COMP flag is passed through for any higher order page alloc 
> (such as a kmalloc allocation > PAGE_SIZE). Then we may have allocated one 
> slab that is a compound page amoung others higher order pages allocated 
> without __GFP_COMP. May have caused rare and strange failures in 2.6.21 
> and earlier because of the concurrent page->private use in compound head 
> pages and arch pages.
> 
> SLUB will always use __GFP_COMP so the pages are consistent regardless if 
> __GFP_COMP is passed in or not.
> 
> The strange scenarios come about by expecting a page allocation when 
> sometimes we just substitute a slab alloc.
> 
> We could filter __GFP_COMP out to avoid the BUG()? Or deal with it on a 
> case by case basis?

Fix callers, I'd suggest.  There are a number of fishy-looking open-coded
usages of __GFP_COMP around the place.

It's a bit sad that some architectures are using slab for dma_alloc_coherent()
while others go to alloc_pages().

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] add __GFP_ZERO to GFP_LEVEL_MASK
  2007-07-24 19:25                 ` Andrew Morton
  2007-07-24 19:36                   ` Christoph Lameter
@ 2007-07-25 13:06                   ` Mel Gorman
  1 sibling, 0 replies; 28+ messages in thread
From: Mel Gorman @ 2007-07-25 13:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Peter Zijlstra, Linus Torvalds, linux-kernel,
	Daniel Phillips, linux-mm

On (24/07/07 12:25), Andrew Morton didst pronounce:
> On Tue, 24 Jul 2007 12:07:51 -0700
> Christoph Lameter <clameter@sgi.com> wrote:
> 
> > Then there are some other flags. I am wondering why they are in
> > GFP_LEVEL_MASK?
> > 
> > __GFP_COLD	Does not make sense for slab allocators since we have
> > 		to touch the page immediately.
> > 
> > __GFP_COMP	No effect. Added by the page allocator on their own
> > 		if a higher order allocs are used for a slab.
> > 
> > __GFP_MOVABLE	The movability of a slab is determined by the
> > 		options specified at kmem_cache_create time. If this is
> > 		specified at kmalloc time then we will have some random
> > 		slabs movable and others not. 
> 
> Yes, they seem inappropriate.  Especially the first two.

And the third one is also inappropriate by the definition of GFP_LEVEL_MASK
Christoph is using. If GFP_LEVEL_MASK is to be used to filter out flags
that are unsuitable for higher allocators such as slab and vmalloc, then
they shouldn't be using __GFP_MOVABLE because they are unlikely to do the
correct thing with the pages.

When the flags were added, I was treating GFP_LEVEL_MASK as a set of
allowed flags to the allocator.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2007-07-25 13:06 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-23 10:03 [PATCH] add __GFP_ZERP to GFP_LEVEL_MASK Peter Zijlstra
2007-07-23 11:21 ` Mel Gorman
2007-07-23 11:38   ` [PATCH] add __GFP_ZERO " Peter Zijlstra
2007-07-23 12:30     ` Mel Gorman
2007-07-23 23:17     ` Christoph Lameter
2007-07-24  6:01       ` Peter Zijlstra
2007-07-24  6:48         ` Peter Zijlstra
2007-07-24  7:14           ` Christoph Lameter
2007-07-24  7:09         ` Christoph Lameter
2007-07-24  7:24           ` Peter Zijlstra
2007-07-24  7:35             ` Christoph Lameter
2007-07-24 19:07               ` Christoph Lameter
2007-07-24 19:25                 ` Andrew Morton
2007-07-24 19:36                   ` Christoph Lameter
2007-07-24 22:10                     ` Andrew Morton
2007-07-24 23:00                       ` Christoph Lameter
2007-07-24 23:12                         ` Andrew Morton
2007-07-24 23:58                           ` Christoph Lameter
2007-07-25  0:06                             ` Andrew Morton
2007-07-25 13:06                   ` Mel Gorman
2007-07-23 18:37 ` [PATCH] add __GFP_ZERP " Andrew Morton
2007-07-23 18:40   ` Peter Zijlstra
2007-07-23 21:43     ` Christoph Lameter
2007-07-23 22:13       ` Andrew Morton
2007-07-23 22:41         ` Linus Torvalds
2007-07-23 22:56           ` Andrew Morton
2007-07-23 23:00             ` Christoph Lameter
2007-07-23 22:50         ` Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox