2.4: why is NR_GFPINDEX so large?


* 2.4: why is NR_GFPINDEX so large?
@ 2000-06-21 19:48 Timur Tabi
  2000-06-21 19:56 ` Kanoj Sarcar
  0 siblings, 1 reply; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 19:48 UTC (permalink / raw)
  To: Linux MM mailing list

In mmzone.h, NR_GFPINDEX is set to 0x100.  This means that the node_zonelists
array in pg_data_t has 256 elements.  However, then we have code like this in
mm.h:

static inline struct page * alloc_pages(int gfp_mask, unsigned long order)
{
[snip]
	return __alloc_pages(contig_page_data.node_zonelists+(gfp_mask), order);
}

gfp_mask is any combination of any of these flags (from mm.h):

#define __GFP_WAIT	0x01
#define __GFP_HIGH	0x02
#define __GFP_IO	0x04
#define __GFP_DMA	0x08
#define __GFP_HIGHMEM	0x10

Which means theoretically, the largest value is 0x1F, or 31.  This means that
elements 32-255 of array node_zonelists are never accessed.  Can someone explain
this to me?



--
Timur Tabi - ttabi@interactivesi.com
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then I'll just get two copies of the same message.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 19:48 2.4: why is NR_GFPINDEX so large? Timur Tabi
@ 2000-06-21 19:56 ` Kanoj Sarcar
  2000-06-21 19:57   ` Timur Tabi
  0 siblings, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-21 19:56 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

> 
> In mmzone.h, NR_GFPINDEX is set to 0x100.  This means that the node_zonelists
> array in pg_data_t has 256 elements.  However, then we have code like this in
> mm.h:
> 
> static inline struct page * alloc_pages(int gfp_mask, unsigned long order)
> {
> [snip]
> 	return __alloc_pages(contig_page_data.node_zonelists+(gfp_mask), order);
> }
> 
> gfp_mask is any combination of any of these flags (from mm.h):
> 
> #define __GFP_WAIT	0x01
> #define __GFP_HIGH	0x02
> #define __GFP_IO	0x04
> #define __GFP_DMA	0x08
> #define __GFP_HIGHMEM	0x10
> 
> Which means theoretically, the largest value is 0x1F, or 31.  This means that
> elements 32-255 of array node_zonelists are never accessed.  Can someone explain
> this to me?
> 

This is a left over from the days when we had a few more __GFP_ flags,
but that has been cleaned up now, so NR_GFPINDEX can go down. Be aware 
of any cache footprint issues though.

Kanoj

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 19:56 ` Kanoj Sarcar
@ 2000-06-21 19:57   ` Timur Tabi
  2000-06-21 20:23     ` Puppetmaster
  2000-06-21 20:37     ` Kanoj Sarcar
  0 siblings, 2 replies; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 19:57 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 12:56:12 -0700 (PDT)


> This is a left over from the days when we had a few more __GFP_ flags,
> but that has been cleaned up now, so NR_GFPINDEX can go down. 

Cool.  I'm glad to see that my question wasn't stupid :-)

>Be aware 
> of any cache footprint issues though.

Ok, you just lost me.  What's a "cache footprint"?





* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 19:57   ` Timur Tabi
@ 2000-06-21 20:23     ` Puppetmaster
  2000-06-21 20:37       ` Timur Tabi
  2000-06-21 20:37     ` Kanoj Sarcar
  1 sibling, 1 reply; 24+ messages in thread
From: Puppetmaster @ 2000-06-21 20:23 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

On Wed, 21 Jun 2000, Timur Tabi wrote:

> ** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
> Jun 2000 12:56:12 -0700 (PDT)
> 
> 
> > This is a left over from the days when we had a few more __GFP_ flags,
> > but that has been cleaned up now, so NR_GFPINDEX can go down. 
> 
> Cool.  I'm glad to see that my question wasn't stupid :-)
> 
> >Be aware 
> > of any cache footprint issues though.
> 
> Ok, you just lost me.  What's a "cache footprint"?

Cache footprint refers to the amount of space code/data take up in the
cache. This is important for code that is frequently executed, as it is
very good performance-wise to have the necessary data and code entirely
in the L1 cache.

-- 
Master! Master! Where's the dreams that I've been after?
           Master! Master! Promised only lies!         
  Laughter! Laughter! All I hear and see is laughter,
       Laughter! Laughter! Laughing at my cries!


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:23     ` Puppetmaster
@ 2000-06-21 20:37       ` Timur Tabi
  0 siblings, 0 replies; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 20:37 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Puppetmaster <akhripin@mbhs.edu> on Wed, 21 Jun 2000
16:23:30 -0400 (EDT)


> > >Be aware 
> > > of any cache footprint issues though.
> > 
> > Ok, you just lost me.  What's a "cache footprint"?
> Cache footprint refers to the amount of space code/data take up in the
> cache. This is important for code that is frequently executed, as it is
> very good performance-wise to have the necessary data and code entirely
> in the L1 cache.

So what does that have to do with NR_GFPINDEX?




* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 19:57   ` Timur Tabi
  2000-06-21 20:23     ` Puppetmaster
@ 2000-06-21 20:37     ` Kanoj Sarcar
  2000-06-21 20:41       ` Timur Tabi
  1 sibling, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-21 20:37 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

> 
> ** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
> Jun 2000 12:56:12 -0700 (PDT)
> 
> 
> > This is a left over from the days when we had a few more __GFP_ flags,
> > but that has been cleaned up now, so NR_GFPINDEX can go down. 
> 
> Cool.  I'm glad to see that my question wasn't stupid :-)

Best way to verify this is change NR_GFPINDEX to whatever you think
is right, then see whether the resulting kernel comes up fine in
multiuser mode with networking and X.

> 
> >Be aware 
> > of any cache footprint issues though.
> 
> Ok, you just lost me.  What's a "cache footprint"?
>

Even though there is unused space, that might be padding out certain
data structures to cache line aligned sizes, causing fewer cache
line evictions etc., at the cost of a few more bytes of unused space. For
certain applications, this can cause a noticeable improvement.

Kanoj 

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:37     ` Kanoj Sarcar
@ 2000-06-21 20:41       ` Timur Tabi
  2000-06-21 20:49         ` Kanoj Sarcar
  0 siblings, 1 reply; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 20:41 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 13:37:14 -0700 (PDT)

> Even though there is unused space, that might be padding out certain
> data structures to cache line aligned sizes, causing fewer cache
> line evictions etc., at the cost of a few more bytes of unused space. For
> certain applications, this can cause a noticeable improvement.

Oh, that has to do with this comment in mmzone.h:

 * Right now a zonelist takes up less than a cacheline. We never
 * modify it apart from boot-up, and only a few indices are used,
 * so despite the zonelist table being relatively big, the cache
 * footprint of this construct is very small.

But isn't that talking about the individual zonelist_t structures, not the
entire node_zonelists array?  I mean, we're talking about 224 UNUSED array
elements, which is much bigger than any cache line.  And since the stuff is
never used, it's never cached either.


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:41       ` Timur Tabi
@ 2000-06-21 20:49         ` Kanoj Sarcar
  2000-06-21 20:59           ` Timur Tabi
  0 siblings, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-21 20:49 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

> 
>  * Right now a zonelist takes up less than a cacheline. We never
>  * modify it apart from boot-up, and only a few indices are used,
>  * so despite the zonelist table being relatively big, the cache
>  * footprint of this construct is very small.
> 
> But isn't that talking about the individual zonelist_t structures, not the
> entire node_zonelists array?  I mean, we're talking about 224 UNUSED array
> elements, which is much bigger than any cache line.  And since the stuff is
> never used, it's never cached either.
>

Yes, this is saying that although we waste physical memory (which few
people care about any more), some of the unused space is never cached,
since it is not accessed (although hardware processor prefetches might
change this assumption a little bit). So, valuable cache space is not 
wasted that can be used to hold data/code that is actually used.

What I was warning you about is that if you shrink the array to the
exact size, there might be other data that comes on the same cacheline,
which might cause all kinds of interesting behavior (I think they call
this false cache sharing or some such thing).

Kanoj

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:49         ` Kanoj Sarcar
@ 2000-06-21 20:59           ` Timur Tabi
  2000-06-21 21:10             ` Kanoj Sarcar
                               ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 20:59 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 13:49:56 -0700 (PDT)


> Yes, this is saying that although we waste physical memory (which few
> people care about any more), some of the unused space is never cached,
> since it is not accessed (although hardware processor prefetches might
> change this assumption a little bit). So, valuable cache space is not 
> wasted that can be used to hold data/code that is actually used.
> 
> What I was warning you about is that if you shrink the array to the
> exact size, there might be other data that comes on the same cacheline,
> which might cause all kinds of interesting behavior (I think they call
> this false cache sharing or some such thing).

Ok, I understand your explanation, but I have a hard time seeing how false
cache sharing can be a bad thing.

If the cache sucks up a bunch of zeros that are never used, that's definitely
wasted cache space.  How can that be any better than sucking up some real data
that can be used?




* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:59           ` Timur Tabi
@ 2000-06-21 21:10             ` Kanoj Sarcar
  2000-06-21 21:28               ` Timur Tabi
  2000-06-21 21:22             ` James Manning
  2000-06-21 21:24             ` Juan J. Quintela
  2 siblings, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-21 21:10 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

> 
> > Yes, this is saying that although we waste physical memory (which few
> > people care about any more), some of the unused space is never cached,
> > since it is not accessed (although hardware processor prefetches might
> > change this assumption a little bit). So, valuable cache space is not 
> > wasted that can be used to hold data/code that is actually used.
> > 
> > What I was warning you about is that if you shrink the array to the
> > exact size, there might be other data that comes on the same cacheline,
> > which might cause all kinds of interesting behavior (I think they call
> > this false cache sharing or some such thing).
> 
> Ok, I understand your explanation, but I have a hard time seeing how false
> cache sharing can be a bad thing.
> 
> If the cache sucks up a bunch of zeros that are never used, that's definitely
> wasted cache space.  How can that be any better than sucking up some real data
> that can be used?
>

Okay, I will shut up since I will have to pull out old notes and books
to convince you, but basically, here's a simple example. Say a L2 cache 
line is 128 bytes, and each array element is 16 bytes, giving 8 array 
elements per cache line. Say you decide to eliminate the last element,
maybe because it is not used. So, in that space, two global integers/
spinlocks etc are packed in after the deletion. Further assume these
two integers are frequently updated. Looking at an SMP system that uses
the exclusive write cache update protocol, the cache line will probably
bounce between the different L2 caches, which is quite bad, assuming 
that the original 8 element array was readonly, and was probably 
coresident in all the caches.

Till now, this has been a completely academic discussion. I suggest that if
you are serious, you go ahead and make the change, then try running a few
simple benchmarks (kernel compiles, possibly), and if you see no
performance regression, post the patch and send it to Alan and Linus.

Kanoj

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 21:10             ` Kanoj Sarcar
@ 2000-06-21 21:28               ` Timur Tabi
  2000-06-21 21:41                 ` Kanoj Sarcar
  2000-06-22 19:26                 ` Andrea Arcangeli
  0 siblings, 2 replies; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 21:28 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 14:10:17 -0700 (PDT)


> Okay, I will shut up since I will have to pull out old notes and books
> to convince you, but basically, here's a simple example. Say a L2 cache 
> line is 128 bytes, and each array element is 16 bytes, giving 8 array 
> elements per cache line. Say you decide to eliminate the last element,
> maybe because it is not used. So, in that space, two global integers/
> spinlocks etc are packed in after the deletion. Further assume these
> two integers are frequently updated. Looking at an SMP system that uses
> the exclusive write cache update protocol, the cache line will probably
> bounce between the different L2 caches, which is quite bad, assuming 
> that the original 8 element array was readonly, and was probably 
> coresident in all the caches.

Fascinating.  I really appreciate your taking the time to explain this to me.  

So I suppose the best way to optimize this is to make sure that "NR_GFPINDEX *
sizeof(zonelist_t)" is a multiple of the cache line size?




* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 21:28               ` Timur Tabi
@ 2000-06-21 21:41                 ` Kanoj Sarcar
  2000-06-21 21:43                   ` Timur Tabi
  2000-06-22 19:26                 ` Andrea Arcangeli
  1 sibling, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-21 21:41 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

> 
> ** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
> Jun 2000 14:10:17 -0700 (PDT)
> 
> 
> > Okay, I will shut up since I will have to pull out old notes and books
> > to convince you, but basically, here's a simple example. Say a L2 cache 
> > line is 128 bytes, and each array element is 16 bytes, giving 8 array 
> > elements per cache line. Say you decide to eliminate the last element,
> > maybe because it is not used. So, in that space, two global integers/
> > spinlocks etc are packed in after the deletion. Further assume these
> > two integers are frequently updated. Looking at an SMP system that uses
> > the exclusive write cache update protocol, the cache line will probably
> > bounce between the different L2 caches, which is quite bad, assuming 
> > that the original 8 element array was readonly, and was probably 
> > coresident in all the caches.
> 
> Fascinating.  I really appreciate your taking the time to explain this to me.  
> 
> So I suppose the best way to optimize this is to make sure that "NR_GFPINDEX *
> sizeof(zonelist_t)" is a multiple of the cache line size?
>

Which is hard to do with all the various architectures with varying
cache line sizes out there. The asm header files can conveniently use
__attribute__((aligned(128))) etc, but I think the generic header files
use something like __attribute__((__aligned__(SMP_CACHE_BYTES))).
Note that SMP_CACHE_BYTES is equated to the >> L1 << cache line size for
most architectures, which probably has a different effect than 
aligning on L2 cache lines.

Kanoj

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 21:41                 ` Kanoj Sarcar
@ 2000-06-21 21:43                   ` Timur Tabi
  0 siblings, 0 replies; 24+ messages in thread
From: Timur Tabi @ 2000-06-21 21:43 UTC (permalink / raw)
  To: Linux MM mailing list

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 14:41:16 -0700 (PDT)


> Which is hard to do with all the various architectures with varying
> cache line sizes out there. The asm header files can conveniently use
> __attribute__((aligned(128))) etc, but I think the generic header files
> use something like __attribute__((__aligned__(SMP_CACHE_BYTES))).
> Note that SMP_CACHE_BYTES is equated to the >> L1 << cache line size for
> most architectures, which probably has a different effect than 
> aligning on L2 cache lines.

Is the majority of the kernel cache-line aligned like this, or is this an area
where the kernel needs a lot of work?  





* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 21:28               ` Timur Tabi
  2000-06-21 21:41                 ` Kanoj Sarcar
@ 2000-06-22 19:26                 ` Andrea Arcangeli
  2000-06-22 19:51                   ` Jamie Lokier
  2000-06-22 20:22                   ` Kanoj Sarcar
  1 sibling, 2 replies; 24+ messages in thread
From: Andrea Arcangeli @ 2000-06-22 19:26 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

On Wed, 21 Jun 2000, Timur Tabi wrote:

>So I suppose the best way to optimize this is to make sure that
>"NR_GFPINDEX * sizeof(zonelist_t)" is a multiple of the cache line size?

Yes but only in SMP. On a UP compile you can save space. For this purpose
in ac22-class there's a ____cacheline_aligned_in_smp macro that you can
use for things like that (it relies on the compiler entirely).

Andrea


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-22 19:26                 ` Andrea Arcangeli
@ 2000-06-22 19:51                   ` Jamie Lokier
  2000-06-23 17:41                     ` Andrea Arcangeli
  2000-06-22 20:22                   ` Kanoj Sarcar
  1 sibling, 1 reply; 24+ messages in thread
From: Jamie Lokier @ 2000-06-22 19:51 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Timur Tabi, Linux MM mailing list

Andrea Arcangeli wrote:
> >So I suppose the best way to optimize this is to make sure that
> >"NR_GFPINDEX * sizeof(zonelist_t)" is a multiple of the cache line size?
> 
> Yes but only in SMP. On a UP compile you can save space. For this purpose
> in ac22-class there's a ____cacheline_aligned_in_smp macro that you can
> use for things like that (it relies on the compiler entirely).

Does ____cacheline_aligned_in_smp guarantee the _size_ of the object is
aligned, or merely its address?

You can always make an array of one element containing an aligned object
I suppose.

Longer term some variation of the per-CPU data area patch should be used.
If only it can be made nice :-)

-- jamie

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-22 19:51                   ` Jamie Lokier
@ 2000-06-23 17:41                     ` Andrea Arcangeli
  2000-06-23 17:52                       ` Jamie Lokier
  0 siblings, 1 reply; 24+ messages in thread
From: Andrea Arcangeli @ 2000-06-23 17:41 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Timur Tabi, Linux MM mailing list

On Thu, 22 Jun 2000, Jamie Lokier wrote:

>Does ____cacheline_aligned_in_smp guarantee the _size_ of the object is
>aligned, or merely its address?

Only its address. It uses the attribute aligned of gcc.

Andrea


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-23 17:41                     ` Andrea Arcangeli
@ 2000-06-23 17:52                       ` Jamie Lokier
  2000-06-23 18:02                         ` Andrea Arcangeli
  0 siblings, 1 reply; 24+ messages in thread
From: Jamie Lokier @ 2000-06-23 17:52 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Timur Tabi, Linux MM mailing list

Andrea Arcangeli wrote:
> On Thu, 22 Jun 2000, Jamie Lokier wrote:
> 
> >Does ____cacheline_aligned_in_smp guarantee the _size_ of the object is
> >aligned, or merely its address?
> 
> Only its address. It uses the attribute aligned of gcc.

Quite.  So __cacheline_aligned_in_smp is not sufficient to ensure the
array doesn't share cache lines with another variable.

-- Jamie

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-23 17:52                       ` Jamie Lokier
@ 2000-06-23 18:02                         ` Andrea Arcangeli
  2000-06-23 18:03                           ` Andrea Arcangeli
  0 siblings, 1 reply; 24+ messages in thread
From: Andrea Arcangeli @ 2000-06-23 18:02 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Timur Tabi, Linux MM mailing list

On Fri, 23 Jun 2000, Jamie Lokier wrote:

>Quite.  So __cacheline_aligned_in_smp is not sufficient to ensure the
>array doesn't share cache lines with another variable.

Of course, it does only half of the work, you need it here:

	gfpmask_zone_t node_gfpmask_zone[NR_GFPINDEX] ____cacheline_aligned_in_smp;

Andrea



* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-23 18:02                         ` Andrea Arcangeli
@ 2000-06-23 18:03                           ` Andrea Arcangeli
  0 siblings, 0 replies; 24+ messages in thread
From: Andrea Arcangeli @ 2000-06-23 18:03 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Timur Tabi, Linux MM mailing list

On Fri, 23 Jun 2000, Andrea Arcangeli wrote:

>On Fri, 23 Jun 2000, Jamie Lokier wrote:
>
>>Quite.  So __cacheline_aligned_in_smp is not sufficient to ensure the
>>array doesn't share cache lines with another variable.
>
>Of course, it does only half of the work, you need it here:
>
>	gfpmask_zone_t node_gfpmask_zone[NR_GFPINDEX] ____cacheline_aligned_in_smp;

btw, I'm not suggesting to add the above (I only wanted to show its
usage).

Andrea


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-22 19:26                 ` Andrea Arcangeli
  2000-06-22 19:51                   ` Jamie Lokier
@ 2000-06-22 20:22                   ` Kanoj Sarcar
  2000-06-23 18:11                     ` Andrea Arcangeli
  1 sibling, 1 reply; 24+ messages in thread
From: Kanoj Sarcar @ 2000-06-22 20:22 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Timur Tabi, Linux MM mailing list

> 
> On Wed, 21 Jun 2000, Timur Tabi wrote:
> 
> >So I suppose the best way to optimize this is to make sure that
> >"NR_GFPINDEX * sizeof(zonelist_t)" is a multiple of the cache line size?
> 
> Yes but only in SMP. On a UP compile you can save space. For this purpose
> in ac22-class there's a ____cacheline_aligned_in_smp macro that you can
> use for things like that (it relies on the compiler entirely).
> 
> Andrea

Umm, careful. If you happen to share a cacheline between a readonly array 
and a frequently updated variable, it might be better not to delete
unused elements from an array - that way, you might be able to bunch up
all the frequently updated variables into their own cacheline, and save
the memory write back of an extra cacheline.

BTW, this is all of course nitpicking.

Kanoj

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-22 20:22                   ` Kanoj Sarcar
@ 2000-06-23 18:11                     ` Andrea Arcangeli
  0 siblings, 0 replies; 24+ messages in thread
From: Andrea Arcangeli @ 2000-06-23 18:11 UTC (permalink / raw)
  To: Kanoj Sarcar; +Cc: Timur Tabi, Linux MM mailing list

On Thu, 22 Jun 2000, Kanoj Sarcar wrote:

>Umm, careful. If you happen to share a cacheline between a readonly array 
>and a frequently updated variable, it might be better not to delete
>unused elements from an array - that way, you might be able to bunch up
>all the frequently updated variables into their own cacheline, and save
>the memory write back of an extra cacheline.

I think on UP we shouldn't protect read-only memory against other code
that isn't optimized. If there is a set of frequently updated
variables, _they_ should take care to live in the same cacheline (and they
could of course also share that cacheline with other stuff). It
shouldn't be the gfpmask_zone array (which is read-only) that takes care
not to include other stuff just because something else might not be well
optimized for cacheline flushes.

>BTW, this is all of course nitpicking.

Oh indeed but it's fun ;-)

Andrea


* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:59           ` Timur Tabi
  2000-06-21 21:10             ` Kanoj Sarcar
@ 2000-06-21 21:22             ` James Manning
  2000-06-21 21:24             ` Juan J. Quintela
  2 siblings, 0 replies; 24+ messages in thread
From: James Manning @ 2000-06-21 21:22 UTC (permalink / raw)
  To: Linux MM mailing list

[Timur Tabi]
> ** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
> Jun 2000 13:49:56 -0700 (PDT)
> > What I was warning you about is that if you shrink the array to the
> > exact size, there might be other data that comes on the same cacheline,
> > which might cause all kinds of interesting behavior (I think they call
> > this false cache sharing or some such thing).
> 
> Ok, I understand your explanation, but I have a hard time seeing how false
> cache sharing can be a bad thing.
> 
> If the cache sucks up a bunch of zeros that are never used, that's definitely
> wasted cache space.  How can that be any better than sucking up some real data
> that can be used?

The (possible) problem is that by decreasing the size of the array,
you're shifting data structures in memory and therefore shifting
their placement in caches.  Since caches exist as sets of cache lines
(an N-way associative cache having N members of each of these sets),
we may have shifted some high-traffic cachelines into the same set as
this structure.  We also may have made the situation better, but it's
hard to tell without real data on cache behavior (something I'm working
on now, but it's going slowly).

Of course, since gcc is the blessed compiler we can specify alignments
of structures to try and help the situation, and page coloring may
help the situation later down the road as well.
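
As a rough illustration of the set placement described above, the sketch
below computes which set an address falls into. The geometry (32-byte
lines, 128 sets) and the addresses are assumptions chosen for the example,
and the function name is hypothetical; real caches may index on physical
rather than virtual addresses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Set index for an address in a set-associative cache:
 * drop the offset within the line, then wrap around the number of sets.
 */
static unsigned int cache_set_index(uintptr_t addr, unsigned int line_size,
				    unsigned int num_sets)
{
	assert(line_size != 0 && num_sets != 0);
	return (unsigned int)((addr / line_size) % num_sets);
}
```

With those assumed numbers, a structure at 0x1400 maps to set 32. If an
array in front of it shrinks by 896 bytes, the structure moves to 0x1080
and maps to set 4, so it now competes with whatever else lives in set 4.
Whether that is better or worse depends on the traffic in each set.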

James

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.4: why is NR_GFPINDEX so large?
  2000-06-21 20:59           ` Timur Tabi
  2000-06-21 21:10             ` Kanoj Sarcar
  2000-06-21 21:22             ` James Manning
@ 2000-06-21 21:24             ` Juan J. Quintela
  2 siblings, 0 replies; 24+ messages in thread
From: Juan J. Quintela @ 2000-06-21 21:24 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

>>>>> "timur" == Timur Tabi <ttabi@interactivesi.com> writes:

timur> ** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
timur> Jun 2000 13:49:56 -0700 (PDT)


>> Yes, this is saying that although we waste physical memory (which few
>> people care about any more), some of the unused space is never cached,
>> since it is not accessed (although hardware processor prefetches might
>> change this assumption a little bit). So, valuable cache space is not 
>> wasted that can be used to hold data/code that is actually used.
>> 
>> What I was warning you about is that if you shrink the array to the
>> exact size, there might be other data that comes on the same cacheline,
>> which might cause all kinds of interesting behavior (I think they call
>> this false cache sharing or some such thing).

timur> Ok, I understand your explanation, but I have a hard time seeing how false
timur> cache sharing can be a bad thing.

timur> If the cache sucks up a bunch of zeros that are never used, that's definitely
timur> wasted cache space.  How can that be any better than sucking up some real data
timur> that can be used?



If you put a variable there that is written frequently, the cache line
holding that array will ping-pong from one CPU to the other.  As things
stand, since the array is read-only data, it can sit in both caches at
the same time.  The more CPUs you have, the worse the problem becomes.

Later, Juan.

-- 
In theory, practice and theory are the same, but in practice they 
are different -- Larry McVoy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 2.4: why is NR_GFPINDEX so large?
@ 2000-06-21 21:15 frankeh
  0 siblings, 0 replies; 24+ messages in thread
From: frankeh @ 2000-06-21 21:15 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Linux MM mailing list

Timur...

If [A] is located on the same cacheline as frequently accessed read-only
data, and [A] is written frequently on other processors (e.g. a frequently
used lock), then in order to write to the cacheline, write access must be
obtained, which leads to a global cacheline invalidate. If [A] is now
accessed again, then read permission has to be obtained and the
cacheline has to be transferred back from another processor. These
interprocessor cacheline transfers can be expensive operations. On NUMA
machines the cost is even higher.
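
The sequence above can be sketched with a toy two-CPU model. This is not
a real coherence protocol and none of the names come from the kernel; it
only counts how often a line has to be fetched when CPU 0 keeps reading
the read-only data while CPU 1 keeps writing [A].

```c
#include <assert.h>

/* Toy per-CPU state of one cache line. */
enum line_state { INVALID, SHARED, MODIFIED };

struct toy_line {
	enum line_state state[2];	/* state of the line in CPU 0 and CPU 1 */
	unsigned long transfers;	/* fetches from memory or the other CPU */
};

/* A read hits unless this CPU's copy was invalidated (or never loaded). */
static void toy_read(struct toy_line *l, int cpu)
{
	int other = 1 - cpu;

	assert(cpu == 0 || cpu == 1);
	if (l->state[cpu] != INVALID)
		return;
	l->transfers++;
	if (l->state[other] == MODIFIED)
		l->state[other] = SHARED;
	l->state[cpu] = SHARED;
}

/* A write needs exclusive access, so it invalidates the other CPU's copy. */
static void toy_write(struct toy_line *l, int cpu)
{
	int other = 1 - cpu;

	assert(cpu == 0 || cpu == 1);
	if (l->state[cpu] == MODIFIED)
		return;
	if (l->state[cpu] == INVALID)
		l->transfers++;
	l->state[other] = INVALID;
	l->state[cpu] = MODIFIED;
}

/*
 * CPU 0 reads the read-only data and CPU 1 writes [A], once per round.
 * If shared_line is nonzero, both live on the same cache line.
 * Returns the total number of line transfers.
 */
static unsigned long toy_run(int shared_line, unsigned long rounds)
{
	struct toy_line ro = { { INVALID, INVALID }, 0 };
	struct toy_line a = { { INVALID, INVALID }, 0 };
	struct toy_line *a_line = shared_line ? &ro : &a;
	unsigned long i;

	for (i = 0; i < rounds; i++) {
		toy_read(&ro, 0);
		toy_write(a_line, 1);
	}
	return ro.transfers + a.transfers;
}
```

In this model, 1000 rounds cost 2 transfers when the two live on separate
lines and 1001 when they share one, because every write by CPU 1 forces
CPU 0 to fetch the line again.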

Hope this helps, otherwise keep pounding..

-- Hubertus

Timur Tabi <ttabi@interactivesi.com>@kvack.org on 06/21/2000 04:59:51 PM

Sent by:  owner-linux-mm@kvack.org

To:   Linux MM mailing list <linux-mm@kvack.org>
cc:
Subject:  Re: 2.4: why is NR_GFPINDEX so large?

** Reply to message from Kanoj Sarcar <kanoj@google.engr.sgi.com> on Wed, 21
Jun 2000 13:49:56 -0700 (PDT)

> Yes, this is saying that although we waste physical memory (which few
> people care about any more), some of the unused space is never cached,
> since it is not accessed (although hardware processor prefetches might
> change this assumption a little bit). So, valuable cache space is not
> wasted that can be used to hold data/code that is actually used.
>
> What I was warning you about is that if you shrink the array to the
> exact size, there might be other data that comes on the same cacheline,
> which might cause all kinds of interesting behavior (I think they call
> this false cache sharing or some such thing).

Ok, I understand your explanation, but I have a hard time seeing how false
cache sharing can be a bad thing.

If the cache sucks up a bunch of zeros that are never used, that's definitely
wasted cache space.  How can that be any better than sucking up some real data
that can be used?

--
Timur Tabi - ttabi@interactivesi.com
Interactive Silicon - http://www.interactivesi.com

When replying to a mailing-list message, please don't cc: me, because then
I'll just get two copies of the same message.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2000-06-23 18:11 UTC | newest]

Thread overview: 24+ messages
2000-06-21 19:48 2.4: why is NR_GFPINDEX so large? Timur Tabi
2000-06-21 19:56 ` Kanoj Sarcar
2000-06-21 19:57   ` Timur Tabi
2000-06-21 20:23     ` Puppetmaster
2000-06-21 20:37       ` Timur Tabi
2000-06-21 20:37     ` Kanoj Sarcar
2000-06-21 20:41       ` Timur Tabi
2000-06-21 20:49         ` Kanoj Sarcar
2000-06-21 20:59           ` Timur Tabi
2000-06-21 21:10             ` Kanoj Sarcar
2000-06-21 21:28               ` Timur Tabi
2000-06-21 21:41                 ` Kanoj Sarcar
2000-06-21 21:43                   ` Timur Tabi
2000-06-22 19:26                 ` Andrea Arcangeli
2000-06-22 19:51                   ` Jamie Lokier
2000-06-23 17:41                     ` Andrea Arcangeli
2000-06-23 17:52                       ` Jamie Lokier
2000-06-23 18:02                         ` Andrea Arcangeli
2000-06-23 18:03                           ` Andrea Arcangeli
2000-06-22 20:22                   ` Kanoj Sarcar
2000-06-23 18:11                     ` Andrea Arcangeli
2000-06-21 21:22             ` James Manning
2000-06-21 21:24             ` Juan J. Quintela
2000-06-21 21:15 frankeh
