* Question: new bind_zonelist uses only one zone type
@ 2006-01-17 8:46 KAMEZAWA Hiroyuki
2006-01-17 14:29 ` Andi Kleen
0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-01-17 8:46 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
In -mm4 (in linus.patch):
==
static struct zonelist *bind_zonelist(nodemask_t *nodes)
{
	struct zonelist *zl;
	int num, max, nd;

	max = 1 + MAX_NR_ZONES * nodes_weight(*nodes);
	zl = kmalloc(sizeof(void *) * max, GFP_KERNEL);
	if (!zl)
		return NULL;
	num = 0;
	for_each_node_mask(nd, *nodes)
		zl->zones[num++] = &NODE_DATA(nd)->node_zones[policy_zone];
	zl->zones[num] = NULL;
	return zl;
}
==
policy_zone is ZONE_DMA, ZONE_NORMAL, or ZONE_HIGHMEM, depending on the system.
If policy_zone is ZONE_NORMAL, the returned zonelist will be
{Node(0)'s NORMAL, Node(1)'s NORMAL, Node(2)'s NORMAL, ...}
If node0 has only DMA/DMA32 and Node1-NodeX have NORMAL, node0's memory will be
ignored and the zonelist will include an unpopulated zone.
Is this intended?
-- Kame
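The behaviour described in this message can be reproduced outside the kernel with a small model. The sketch below is only an illustration: `struct zone`, `struct node`, `model_bind_zonelist()`, `count_empty()` and `reachable_pages()` are simplified, hypothetical stand-ins, not the kernel's definitions, and the nodemask is reduced to "all nodes 0..nr_nodes-1".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types, not the real definitions. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone { unsigned long present_pages; };
struct node { struct zone node_zones[MAX_NR_ZONES]; };

/*
 * Model of the quoted loop: store exactly one zone per node, the one at
 * index policy_zone, without checking whether it holds any pages.
 * out must have room for nr_nodes + 1 pointers.
 */
static int model_bind_zonelist(struct node *nodes, int nr_nodes,
			       int policy_zone, struct zone **out)
{
	int num = 0;

	assert(policy_zone >= 0 && policy_zone < MAX_NR_ZONES);
	for (int nd = 0; nd < nr_nodes; nd++)
		out[num++] = &nodes[nd].node_zones[policy_zone];
	out[num] = NULL;
	return num;
}

/* Number of entries in a NULL-terminated zonelist that have no pages. */
static int count_empty(struct zone **zl)
{
	int empty = 0;

	for (; *zl; zl++)
		if ((*zl)->present_pages == 0)
			empty++;
	return empty;
}

/* Total pages reachable through a NULL-terminated zonelist. */
static unsigned long reachable_pages(struct zone **zl)
{
	unsigned long pages = 0;

	for (; *zl; zl++)
		pages += (*zl)->present_pages;
	return pages;
}
```

With node 0 populated only in ZONE_DMA and node 1 only in ZONE_NORMAL, and policy_zone set to ZONE_NORMAL, the model returns two entries, one of which is empty, and node 0's pages cannot be reached through the list.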
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Question: new bind_zonelist uses only one zone type
2006-01-17 8:46 Question: new bind_zonelist uses only one zone type KAMEZAWA Hiroyuki
@ 2006-01-17 14:29 ` Andi Kleen
2006-01-17 23:47 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-01-17 14:29 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Christoph Lameter, linux-mm
On Tuesday 17 January 2006 09:46, KAMEZAWA Hiroyuki wrote:
> policy_zone is ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, depends on system.
>
> If policy_zone is ZONE_NORMAL, returned zonelist will be
> {Node(0)'s NORMAL, Node(1)'s NORMAL, Node(2)'s Normal.....}
>
> If node0 has only DMA/DMA32 and Node1-NodeX has Normal, node0 will be ignored
> and zonelist will include not-populated zone.
>
> Is this intended ?
I was wondering when someone else would notice. Congratulations, you
are the first ;-)
It was originally intended. Back then, IA64 NUMA systems didn't have a
ZONE_DMA, on x86-64 it was only 16MB, and for i386 NUMA it was considered
acceptable. It also made the code simpler and policies use less memory.
But it is now considered a bug because of the introduction of ZONE_DMA32
on x86-64, and I gather from your report that your platform has NUMA and
a 4GB ZONE_DMA too?
It is on my todo list to fix, but I haven't gotten around to it yet.
Fixing it will unfortunately increase the footprint of the policy structures,
so the number of zone types would likely only go up to two. If someone beats
me to a patch, that would be OK too.
-Andi
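One possible shape for the fix discussed here, as a user-space sketch only: walk the zone types from policy_zone downward, limited to a small number of types, and add only zones that actually have pages. This is not a patch from the thread. `struct zone`, `struct node`, `model_bind_zonelist_populated()` and the `nr_types` parameter are hypothetical names, and the nodemask is simplified to "all nodes 0..nr_nodes-1".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types, not the real definitions. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone { unsigned long present_pages; };
struct node { struct zone node_zones[MAX_NR_ZONES]; };

/*
 * Build a NULL-terminated zonelist from the top nr_types zone types at or
 * below policy_zone, skipping zones with no pages. Higher zone types come
 * first, and within a type the nodes appear in index order.
 * out must have room for 1 + MAX_NR_ZONES * nr_nodes pointers, which is the
 * same bound as "max" in the quoted bind_zonelist().
 */
static int model_bind_zonelist_populated(const struct node *nodes,
					 int nr_nodes, int policy_zone,
					 int nr_types,
					 const struct zone **out)
{
	int num = 0;

	assert(policy_zone >= 0 && policy_zone < MAX_NR_ZONES);
	assert(nr_types >= 1);
	for (int k = policy_zone; k >= 0 && k > policy_zone - nr_types; k--) {
		for (int nd = 0; nd < nr_nodes; nd++) {
			const struct zone *z = &nodes[nd].node_zones[k];

			if (z->present_pages > 0)
				out[num++] = z;
		}
	}
	out[num] = NULL;
	return num;
}
```

With `nr_types` set to 2, a node that has memory only in the zone type just below policy_zone still ends up in the list, at the cost of up to twice as many entries per policy.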
* Re: Question: new bind_zonelist uses only one zone type
2006-01-17 14:29 ` Andi Kleen
@ 2006-01-17 23:47 ` KAMEZAWA Hiroyuki
2006-01-18 3:00 ` Andi Kleen
0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-01-17 23:47 UTC (permalink / raw)
To: Andi Kleen; +Cc: Christoph Lameter, linux-mm
Andi Kleen wrote:
> It was originally intended - back then either IA64 NUMA systems didn't
> have a ZONE_DMA and on x86-64 it was only 16MB and for i386 NUMA
> it was considered acceptable - and it made the code simpler and policies
> use less memory. But is now considered a bug because of the introduction
> of ZONE_DMA32 on x86-64 and I gather from your report your platform
> has NUMA and a 4GB ZONE_DMA too?
>
On ia64, the 0-4G area is ZONE_DMA.
> It is on my todo list to fix, but I haven't gotten around to it yet.
>
> Fixing it will unfortunately increase the footprint of the policy structures,
> so likely it would only increase to two. If someone beats me to a patch
> that would be ok too.
>
I don't have a real problem now. It just looks curious.
And anyway, mbind's list doesn't guarantee fair allocation among nodes.
Thanks,
-- Kame
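The remark about fairness can be illustrated with a first-fit model: if the allocator walks the zonelist in order and takes a page from the first zone that has one free, earlier entries fill up before later ones are touched. The sketch below is a deliberate simplification (no watermarks, reclaim, or cpuset checks), and `struct zone`, `free_pages` and `alloc_page_from()` are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: a zone is just a counter of free pages. */
struct zone { unsigned long free_pages; };

/*
 * Take one page from the first zone in the NULL-terminated list that has a
 * free page. Returns the index of the zone used, or -1 if all are exhausted.
 */
static int alloc_page_from(struct zone **zl)
{
	assert(zl != NULL);
	for (int i = 0; zl[i]; i++) {
		if (zl[i]->free_pages > 0) {
			zl[i]->free_pages--;
			return i;
		}
	}
	return -1;
}
```

In this model, allocations are served by the second zone only after the first one is empty, so the nodes in the list are not used evenly.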
* Re: Question: new bind_zonelist uses only one zone type
2006-01-17 23:47 ` KAMEZAWA Hiroyuki
@ 2006-01-18 3:00 ` Andi Kleen
2006-01-18 3:26 ` KAMEZAWA Hiroyuki
2006-01-18 3:40 ` Jack Steiner
0 siblings, 2 replies; 7+ messages in thread
From: Andi Kleen @ 2006-01-18 3:00 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Christoph Lameter, linux-mm
On Wednesday 18 January 2006 00:47, KAMEZAWA Hiroyuki wrote:
> Andi Kleen wrote:
> > It was originally intended - back then either IA64 NUMA systems didn't
> > have a ZONE_DMA and on x86-64 it was only 16MB and for i386 NUMA
> > it was considered acceptable - and it made the code simpler and policies
> > use less memory. But is now considered a bug because of the introduction
> > of ZONE_DMA32 on x86-64 and I gather from your report your platform
> > has NUMA and a 4GB ZONE_DMA too?
>
> on ia64, 0-4G area is ZONE_DMA.
On IA64/SN2 ZONE_DMA is empty, and at least in the past SGI was the only
IA64 vendor actively interested in NUMA policy.
I assume you have a NUMA platform too. Do your machines have a contiguous
memory map where there could be one or more nodes which only
have ZONE_DMA?
> > It is on my todo list to fix, but I haven't gotten around to it yet.
> >
> > Fixing it will unfortunately increase the footprint of the policy
> > structures, so likely it would only increase to two. If someone beats me
> > to a patch that would be ok too.
>
> I don't have real problem now. It just looks curious.
Well, it's a bit nasty to not be able to apply policy to 4GB of your memory.
Maybe if you have a few TB of it you won't care, but on smaller machines
it will likely make a difference.
-Andi
* Re: Question: new bind_zonelist uses only one zone type
2006-01-18 3:00 ` Andi Kleen
@ 2006-01-18 3:26 ` KAMEZAWA Hiroyuki
2006-01-18 3:40 ` Jack Steiner
1 sibling, 0 replies; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-01-18 3:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: Christoph Lameter, linux-mm
Andi Kleen wrote:
> On IA64/SN2 ZONE_DMA is empty and at least in the part SGI was the only
> IA64 vendor actively interested in NUMA policy.
>
> I assume you have a NUMA platform too. Do your machines have a contiguous
> memory map where there could be one or more nodes which only
> have ZONE_DMA?
>
Fujitsu's PrimeQuest is NUMA and has memory in the 0-4G area on node 0.
Whether node 0 contains only ZONE_DMA depends on the installed memory.
When SPARSEMEM is used, the NUMA config is used. This means one-node NUMA.
So, if I use SPARSEMEM on an ia64 SMP machine with 5 Gbytes of memory,
I can allocate just 1 Gbyte in an mbind area.
(*) Maybe using mempolicy on one-node NUMA makes no sense.
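The 5 Gbyte example works out as follows. This is a sketch assuming a contiguous physical memory map with no holes and the 4G ZONE_DMA boundary mentioned in the thread. `DMA_LIMIT`, `zone_dma_bytes()`, `zone_normal_bytes()` and `bindable_bytes()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define GIB       (UINT64_C(1) << 30)
#define DMA_LIMIT (4 * GIB)	/* assumed: ia64 ZONE_DMA covers 0-4G */

/* Bytes of a single node's memory that fall below the DMA limit. */
static uint64_t zone_dma_bytes(uint64_t total)
{
	return total < DMA_LIMIT ? total : DMA_LIMIT;
}

/* Bytes of a single node's memory that fall above the DMA limit. */
static uint64_t zone_normal_bytes(uint64_t total)
{
	return total > DMA_LIMIT ? total - DMA_LIMIT : 0;
}

/*
 * With a one-zone-per-node zonelist, only the highest populated zone is
 * usable by a bound policy: ZONE_NORMAL if any memory lies above the limit,
 * otherwise ZONE_DMA.
 */
static uint64_t bindable_bytes(uint64_t total)
{
	uint64_t normal = zone_normal_bytes(total);

	assert(zone_dma_bytes(total) + normal == total);
	return normal ? normal : zone_dma_bytes(total);
}
```

For a single node with 5 Gbytes, 4 Gbytes land in ZONE_DMA and 1 Gbyte in ZONE_NORMAL, so `bindable_bytes(5 * GIB)` is 1 Gbyte. With 3 Gbytes installed, everything is in ZONE_DMA and all of it remains bindable.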
>>> It is on my todo list to fix, but I haven't gotten around to it yet.
>>>
>>> Fixing it will unfortunately increase the footprint of the policy
>>> structures, so likely it would only increase to two. If someone beats me
>>> to a patch that would be ok too.
>> I don't have real problem now. It just looks curious.
>
> Well it's a bit nasty to not be able to policy 4GB of your memory. Maybe
> if you have a few TB of it you won't care, but on smaller machines
> it likely will make a difference.
>
It makes a difference in my *numa emulation* environment now ;)
But it's just emulation.
-- Kame.
* Re: Question: new bind_zonelist uses only one zone type
2006-01-18 3:00 ` Andi Kleen
2006-01-18 3:26 ` KAMEZAWA Hiroyuki
@ 2006-01-18 3:40 ` Jack Steiner
2006-01-18 3:49 ` Andi Kleen
1 sibling, 1 reply; 7+ messages in thread
From: Jack Steiner @ 2006-01-18 3:40 UTC (permalink / raw)
To: Andi Kleen; +Cc: KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm
On Wed, Jan 18, 2006 at 04:00:41AM +0100, Andi Kleen wrote:
> On Wednesday 18 January 2006 00:47, KAMEZAWA Hiroyuki wrote:
> > Andi Kleen wrote:
> > > It was originally intended - back then either IA64 NUMA systems didn't
> > > have a ZONE_DMA and on x86-64 it was only 16MB and for i386 NUMA
> > > it was considered acceptable - and it made the code simpler and policies
> > > use less memory. But is now considered a bug because of the introduction
> > > of ZONE_DMA32 on x86-64 and I gather from your report your platform
> > > has NUMA and a 4GB ZONE_DMA too?
> >
> > on ia64, 0-4G area is ZONE_DMA.
>
> On IA64/SN2 ZONE_DMA is empty and at least in the part SGI was the only
> IA64 vendor actively interested in NUMA policy.
>
On the SN systems, ALL memory is in the DMA zone. The other zones are empty.
I think this is SN-specific - other IA64 platforms may be different & have memory
in multiple zones.
--
Jack
* Re: Question: new bind_zonelist uses only one zone type
2006-01-18 3:40 ` Jack Steiner
@ 2006-01-18 3:49 ` Andi Kleen
0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2006-01-18 3:49 UTC (permalink / raw)
To: Jack Steiner; +Cc: KAMEZAWA Hiroyuki, Christoph Lameter, linux-mm
On Wednesday 18 January 2006 04:40, Jack Steiner wrote:
> On Wed, Jan 18, 2006 at 04:00:41AM +0100, Andi Kleen wrote:
> > On Wednesday 18 January 2006 00:47, KAMEZAWA Hiroyuki wrote:
> > > Andi Kleen wrote:
> > > > It was originally intended - back then either IA64 NUMA systems didn't
> > > > have a ZONE_DMA and on x86-64 it was only 16MB and for i386 NUMA
> > > > it was considered acceptable - and it made the code simpler and policies
> > > > use less memory. But is now considered a bug because of the introduction
> > > > of ZONE_DMA32 on x86-64 and I gather from your report your platform
> > > > has NUMA and a 4GB ZONE_DMA too?
> > >
> > > on ia64, 0-4G area is ZONE_DMA.
> >
> > On IA64/SN2 ZONE_DMA is empty and at least in the part SGI was the only
> > IA64 vendor actively interested in NUMA policy.
> >
>
> On the SN systems, ALL memory is in the DMA zone. The other zones are empty.
Ah, sorry Jack, I got that wrong. Thanks for the correction. For the
NUMA policy it makes no difference though, because only the highest zone
with pages populated counts.
> I think this is SN-specific - other IA64 platforms may be different & have memory
> in multiple zones.
Yes, it is.
-Andi
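The point about "only the highest zone with pages populated counts" can be sketched as a scan over all nodes for the highest zone index that has pages. This is a user-space illustration only. `struct zone`, `struct node` and `highest_populated_zone()` are hypothetical stand-ins, and how the kernel actually derives policy_zone is not shown in this thread.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel types, not the real definitions. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone { unsigned long present_pages; };
struct node { struct zone node_zones[MAX_NR_ZONES]; };

/*
 * Return the highest zone index that has pages on any node. If no zone has
 * pages, ZONE_DMA is returned as the lowest possible value.
 */
static int highest_populated_zone(const struct node *nodes, int nr_nodes)
{
	int highest = ZONE_DMA;

	assert(nr_nodes > 0);
	for (int nd = 0; nd < nr_nodes; nd++)
		for (int k = 0; k < MAX_NR_ZONES; k++)
			if (nodes[nd].node_zones[k].present_pages > 0 &&
			    k > highest)
				highest = k;
	return highest;
}
```

On a system like the SN machines described above, where all memory is in the DMA zone, the result is ZONE_DMA, so a one-zone-per-node list already covers every page. The problem case is a mix, where the result is ZONE_NORMAL but some node has pages only in ZONE_DMA.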
end of thread, other threads:[~2006-01-18 3:49 UTC | newest]
2006-01-17 8:46 Question: new bind_zonelist uses only one zone type KAMEZAWA Hiroyuki
2006-01-17 14:29 ` Andi Kleen
2006-01-17 23:47 ` KAMEZAWA Hiroyuki
2006-01-18 3:00 ` Andi Kleen
2006-01-18 3:26 ` KAMEZAWA Hiroyuki
2006-01-18 3:40 ` Jack Steiner
2006-01-18 3:49 ` Andi Kleen