* [PATCH 2/3] add dev_to_node()
@ 2006-10-30 14:15 Christoph Hellwig
2006-10-30 22:33 ` David Miller, Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2006-10-30 14:15 UTC (permalink / raw)
To: linux-kernel, netdev, linux-mm
Davem suggested to get the node-affinity information directly from
struct device instead of having the caller extreact it from the
pci_dev. This patch adds dev_to_node() to the topology API for that.
The implementation is rather ugly as we need to compare the bus
operations which we can't do inline in a header without pulling all
kinds of mess in.
Thus provide an out of line dev_to_node for ppc and let everyone else
use the dummy variant in asm-generic.h for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: linux-2.6/include/asm-generic/topology.h
===================================================================
--- linux-2.6.orig/include/asm-generic/topology.h 2006-10-10 14:53:52.000000000 +0200
+++ linux-2.6/include/asm-generic/topology.h 2006-10-30 13:42:22.000000000 +0100
@@ -45,11 +45,14 @@
#define pcibus_to_node(node) (-1)
#endif
+#ifndef dev_to_node
+#define dev_to_node(dev) (-1)
+#endif
+
#ifndef pcibus_to_cpumask
#define pcibus_to_cpumask(bus) (pcibus_to_node(bus) == -1 ? \
CPU_MASK_ALL : \
node_to_cpumask(pcibus_to_node(bus)) \
)
#endif
-
#endif /* _ASM_GENERIC_TOPOLOGY_H */
Index: linux-2.6/include/asm-powerpc/topology.h
===================================================================
--- linux-2.6.orig/include/asm-powerpc/topology.h 2006-10-10 14:53:52.000000000 +0200
+++ linux-2.6/include/asm-powerpc/topology.h 2006-10-30 14:03:44.000000000 +0100
@@ -5,6 +5,7 @@
struct sys_device;
struct device_node;
+struct device;
#ifdef CONFIG_NUMA
@@ -33,6 +34,7 @@
struct pci_bus;
extern int pcibus_to_node(struct pci_bus *bus);
+int dev_to_node(struct device *dev);
#define pcibus_to_cpumask(bus) (pcibus_to_node(bus) == -1 ? \
CPU_MASK_ALL : \
Index: linux-2.6/arch/powerpc/kernel/pci_64.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/pci_64.c 2006-10-23 17:21:43.000000000 +0200
+++ linux-2.6/arch/powerpc/kernel/pci_64.c 2006-10-30 14:02:40.000000000 +0100
@@ -1424,4 +1424,12 @@
return phb->node;
}
EXPORT_SYMBOL(pcibus_to_node);
+
+int dev_to_node(struct device *dev)
+{
+ if (dev->bus == &pci_bus_type)
+ return pcibus_to_node(to_pci_dev(dev)->bus);
+ return -1;
+}
+EXPORT_SYMBOL(dev_to_node);
#endif
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 14:15 [PATCH 2/3] add dev_to_node() Christoph Hellwig
@ 2006-10-30 22:33 ` David Miller, Christoph Hellwig
2006-11-01 0:10 ` Christoph Lameter
2006-11-04 22:56 ` Christoph Hellwig
0 siblings, 2 replies; 17+ messages in thread
From: David Miller, Christoph Hellwig @ 2006-10-30 22:33 UTC (permalink / raw)
To: hch; +Cc: linux-kernel, netdev, linux-mm
> Davem suggested to get the node-affinity information directly from
> struct device instead of having the caller extreact it from the
> pci_dev. This patch adds dev_to_node() to the topology API for that.
> The implementation is rather ugly as we need to compare the bus
> operations which we can't do inline in a header without pulling all
> kinds of mess in.
>
> Thus provide an out of line dev_to_node for ppc and let everyone else
> use the dummy variant in asm-generic.h for now.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
It may be a bit much to be calling all the way through up to the PCI
layer just to pluck out a simple integer, don't you think? The PCI
bus pointer comparison is just a symptom of how silly this is.
Especially since this will be used for every packet allocation a
device makes.
So, please add some sanity to this situation and just put the node
into the generic struct device. :-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 22:33 ` David Miller, Christoph Hellwig
@ 2006-11-01 0:10 ` Christoph Lameter
2006-11-01 0:53 ` David Miller, Christoph Lameter
2006-11-04 22:56 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2006-11-01 0:10 UTC (permalink / raw)
To: David Miller, Christoph Hellwig; +Cc: linux-kernel, netdev, linux-mm
On Mon, 30 Oct 2006, David Miller wrote:
> So, please add some sanity to this situation and just put the node
> into the generic struct device. :-)
Good. Then we can remove the node from the pci structure and get rid of
pcibus_to_node?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-01 0:10 ` Christoph Lameter
@ 2006-11-01 0:53 ` David Miller, Christoph Lameter
2006-11-01 1:58 ` Christoph Lameter
0 siblings, 1 reply; 17+ messages in thread
From: David Miller, Christoph Lameter @ 2006-11-01 0:53 UTC (permalink / raw)
To: clameter; +Cc: hch, linux-kernel, netdev, linux-mm
> On Mon, 30 Oct 2006, David Miller wrote:
>
> > So, please add some sanity to this situation and just put the node
> > into the generic struct device. :-)
>
> Good. Then we can remove the node from the pci structure and get rid of
> pcibus_to_node?
Yes, that's possible, because the idea is that the arch specific
bus layer code would initialize the node value. Therefore, there
would be no need for things like pcibus_to_node() any longer.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-01 0:53 ` David Miller, Christoph Lameter
@ 2006-11-01 1:58 ` Christoph Lameter
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2006-11-01 1:58 UTC (permalink / raw)
To: David Miller; +Cc: hch, linux-kernel, netdev, linux-mm
On Tue, 31 Oct 2006, David Miller wrote:
> Yes, that's possible, because the idea is that the arch specific
> bus layer code would initialize the node value. Therefore, there
> would be no need for things like pcibus_to_node() any longer.
Then lets rename pcibus_to_node to dev_to_node() throughout the kernel.
Provide a -1 default. Then other device layers that are not based on pci
will also be able to exploit NUMA locality.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 22:33 ` David Miller, Christoph Hellwig
2006-11-01 0:10 ` Christoph Lameter
@ 2006-11-04 22:56 ` Christoph Hellwig
2006-11-04 23:06 ` Dave Jones
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
1 sibling, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 22:56 UTC (permalink / raw)
To: David Miller; +Cc: hch, linux-kernel, netdev, linux-mm
On Mon, Oct 30, 2006 at 02:33:57PM -0800, David Miller wrote:
> It may be a bit much to be calling all the way through up to the PCI
> layer just to pluck out a simple integer, don't you think? The PCI
> bus pointer comparison is just a symptom of how silly this is.
>
> Especially since this will be used for every packet allocation a
> device makes.
>
> So, please add some sanity to this situation and just put the node
> into the generic struct device. :-)
I was concerned about growing struct device, on smaller system it
already eats up a lot of memory. But we can make the node member
conditional on CONFIG_NUMA, as I did in the patch below.
This directly replaces PATCH 2/2 (the one we're replying to), all
others remain unmodified.
Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h 2006-10-29 16:02:38.000000000 +0100
+++ linux-2.6/include/linux/device.h 2006-11-02 12:47:17.000000000 +0100
@@ -347,6 +347,9 @@
BIOS data),reserved for device core*/
struct dev_pm_info power;
+#ifdef CONFIG_NUMA
+ int numa_node; /* NUMA node this device is close to */
+#endif
u64 *dma_mask; /* dma mask (if dma'able device) */
u64 coherent_dma_mask;/* Like dma_mask, but for
alloc_coherent mappings as
@@ -368,6 +371,12 @@
void (*release)(struct device * dev);
};
+#ifdef CONFIG_NUMA
+#define dev_to_node(dev) ((dev)->numa_node)
+#else
+#define dev_to_node(dev) (-1)
+#endif
+
static inline void *
dev_get_drvdata (struct device *dev)
{
Index: linux-2.6/drivers/base/core.c
===================================================================
--- linux-2.6.orig/drivers/base/core.c 2006-10-23 17:21:44.000000000 +0200
+++ linux-2.6/drivers/base/core.c 2006-11-02 12:48:12.000000000 +0100
@@ -381,6 +381,7 @@
INIT_LIST_HEAD(&dev->node);
init_MUTEX(&dev->sem);
device_init_wakeup(dev, 0);
+ dev->numa_node = -1;
}
/**
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c 2006-10-23 17:21:46.000000000 +0200
+++ linux-2.6/drivers/pci/probe.c 2006-11-02 12:47:35.000000000 +0100
@@ -846,6 +846,7 @@
dev->dev.release = pci_release_dev;
pci_dev_get(dev);
+ dev->dev.numa_node = pcibus_to_node(bus);
dev->dev.dma_mask = &dev->dma_mask;
dev->dev.coherent_dma_mask = 0xffffffffull;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 22:56 ` Christoph Hellwig
@ 2006-11-04 23:06 ` Dave Jones
2006-11-04 23:09 ` Christoph Hellwig
2006-11-04 23:53 ` Christoph Hellwig
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
1 sibling, 2 replies; 17+ messages in thread
From: Dave Jones @ 2006-11-04 23:06 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Miller, linux-kernel, netdev, linux-mm
On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
This will break the compile for !NUMA if someone ends up doing a bisect
and lands here as a bisect point.
You introduce this nice wrapper..
> +#ifdef CONFIG_NUMA
> +#define dev_to_node(dev) ((dev)->numa_node)
> +#else
> +#define dev_to_node(dev) (-1)
> +#endif
> +
> static inline void *
> dev_get_drvdata (struct device *dev)
> {
And then don't use it here..
> Index: linux-2.6/drivers/base/core.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/core.c 2006-10-23 17:21:44.000000000 +0200
> +++ linux-2.6/drivers/base/core.c 2006-11-02 12:48:12.000000000 +0100
> @@ -381,6 +381,7 @@
> INIT_LIST_HEAD(&dev->node);
> init_MUTEX(&dev->sem);
> device_init_wakeup(dev, 0);
> + dev->numa_node = -1;
> }
>
> /**
and here.
> Index: linux-2.6/drivers/pci/probe.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/probe.c 2006-10-23 17:21:46.000000000 +0200
> +++ linux-2.6/drivers/pci/probe.c 2006-11-02 12:47:35.000000000 +0100
> @@ -846,6 +846,7 @@
> dev->dev.release = pci_release_dev;
> pci_dev_get(dev);
>
> + dev->dev.numa_node = pcibus_to_node(bus);
> dev->dev.dma_mask = &dev->dma_mask;
> dev->dev.coherent_dma_mask = 0xffffffffull;
Dave
--
http://www.codemonkey.org.uk
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:06 ` Dave Jones
@ 2006-11-04 23:09 ` Christoph Hellwig
2006-11-04 23:53 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 23:09 UTC (permalink / raw)
To: Dave Jones, Christoph Hellwig, David Miller, linux-kernel,
netdev, linux-mm
On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
>
> This will break the compile for !NUMA if someone ends up doing a bisect
> and lands here as a bisect point.
>
> You introduce this nice wrapper..
Yes, I'm stupid :) Updated version will follow ASAP.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:06 ` Dave Jones
2006-11-04 23:09 ` Christoph Hellwig
@ 2006-11-04 23:53 ` Christoph Hellwig
2006-11-05 8:22 ` David Miller, Christoph Hellwig
2006-11-07 6:25 ` Ravikiran G Thirumalai
1 sibling, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 23:53 UTC (permalink / raw)
To: Dave Jones, Christoph Hellwig, David Miller, linux-kernel,
netdev, linux-mm
On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
>
> This will break the compile for !NUMA if someone ends up doing a bisect
> and lands here as a bisect point.
>
> You introduce this nice wrapper..
The dev_to_node wrapper is not enough as we can't assign to (-1) for
the non-NUMA case. So I added a second macro, set_dev_node for that.
The patch below compiles and works on numa and non-NUMA platforms.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/include/linux/device.h 2006-11-05 00:39:22.000000000 +0100
@@ -347,6 +347,9 @@
BIOS data),reserved for device core*/
struct dev_pm_info power;
+#ifdef CONFIG_NUMA
+ int numa_node; /* NUMA node this device is close to */
+#endif
u64 *dma_mask; /* dma mask (if dma'able device) */
u64 coherent_dma_mask;/* Like dma_mask, but for
alloc_coherent mappings as
@@ -368,6 +371,14 @@
void (*release)(struct device * dev);
};
+#ifdef CONFIG_NUMA
+#define dev_to_node(dev) ((dev)->numa_node)
+#define set_dev_node(dev, node) ((dev)->numa_node = node)
+#else
+#define dev_to_node(dev) (-1)
+#define set_dev_node(dev, node) do { } while (0)
+#endif
+
static inline void *
dev_get_drvdata (struct device *dev)
{
Index: linux-2.6/drivers/base/core.c
===================================================================
--- linux-2.6.orig/drivers/base/core.c 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/drivers/base/core.c 2006-11-05 00:40:01.000000000 +0100
@@ -381,6 +381,7 @@
INIT_LIST_HEAD(&dev->node);
init_MUTEX(&dev->sem);
device_init_wakeup(dev, 0);
+ set_dev_node(dev, -1);
}
/**
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/drivers/pci/probe.c 2006-11-05 00:39:55.000000000 +0100
@@ -846,6 +846,7 @@
dev->dev.release = pci_release_dev;
pci_dev_get(dev);
+ set_dev_node(&dev->dev, pcibus_to_node(bus));
dev->dev.dma_mask = &dev->dma_mask;
dev->dev.coherent_dma_mask = 0xffffffffull;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:53 ` Christoph Hellwig
@ 2006-11-05 8:22 ` David Miller, Christoph Hellwig
2006-11-06 23:39 ` Christoph Hellwig
2006-11-07 6:25 ` Ravikiran G Thirumalai
1 sibling, 1 reply; 17+ messages in thread
From: David Miller, Christoph Hellwig @ 2006-11-05 8:22 UTC (permalink / raw)
To: hch; +Cc: davej, linux-kernel, netdev, linux-mm
> On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> >
> > This will break the compile for !NUMA if someone ends up doing a bisect
> > and lands here as a bisect point.
> >
> > You introduce this nice wrapper..
>
> The dev_to_node wrapper is not enough as we can't assign to (-1) for
> the non-NUMA case. So I added a second macro, set_dev_node for that.
>
> The patch below compiles and works on numa and non-NUMA platforms.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good to me.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-05 8:22 ` David Miller, Christoph Hellwig
@ 2006-11-06 23:39 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-06 23:39 UTC (permalink / raw)
To: David Miller; +Cc: linux-kernel, netdev, linux-mm
On Sun, Nov 05, 2006 at 12:22:37AM -0800, David Miller wrote:
> Looks good to me.
So what's the right path to get this in? There's one patch touching
MM code, one adding something to the driver core and then finally a
networking patch depending on the previous two. Do you want to take
them all and send them in through the networking tree? Or should
we put the burden on Andrew?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:53 ` Christoph Hellwig
2006-11-05 8:22 ` David Miller, Christoph Hellwig
@ 2006-11-07 6:25 ` Ravikiran G Thirumalai
2006-11-07 10:15 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Ravikiran G Thirumalai @ 2006-11-07 6:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Jones, David Miller, linux-kernel, netdev, linux-mm,
Benzi Galili (Benzi@ScaleMP.com),
Shai Fultheim (Shai@scalex86.org)
On Sun, Nov 05, 2006 at 12:53:23AM +0100, Christoph Hellwig wrote:
> On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> >
> > This will break the compile for !NUMA if someone ends up doing a bisect
> > and lands here as a bisect point.
> >
> > You introduce this nice wrapper..
>
> The dev_to_node wrapper is not enough as we can't assign to (-1) for
> the non-NUMA case. So I added a second macro, set_dev_node for that.
>
> The patch below compiles and works on numa and non-NUMA platforms.
>
>
Hi Christoph,
dev_to_node does not work as expected on x86_64 (and i386). This is because
node value returned by pcibus_to_node is initialized after a struct device
is created with current x86_64 code.
We need the node value initialized before the call to pci_scan_bus_parented,
as the generic devices are allocated and initialized
off pci_scan_child_bus, which gets called from pci_scan_bus_parented
The following patch does that using "pci_sysdata" introduced by the PCI
domain patches in -mm.
Signed-off-by: Alok N Kataria <alok.kataria@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Index: linux-2.6.19-rc4mm2/arch/i386/pci/acpi.c
===================================================================
--- linux-2.6.19-rc4mm2.orig/arch/i386/pci/acpi.c 2006-11-06 11:03:50.000000000 -0800
+++ linux-2.6.19-rc4mm2/arch/i386/pci/acpi.c 2006-11-06 22:04:14.000000000 -0800
@@ -9,6 +9,7 @@ struct pci_bus * __devinit pci_acpi_scan
{
struct pci_bus *bus;
struct pci_sysdata *sd;
+ int pxm;
/* Allocate per-root-bus (not per bus) arch-specific data.
* TODO: leak; this memory is never freed.
@@ -30,15 +31,21 @@ struct pci_bus * __devinit pci_acpi_scan
}
#endif /* CONFIG_PCI_DOMAINS */
+ sd->node = -1;
+
+ pxm = acpi_get_pxm(device->handle);
+#ifdef CONFIG_ACPI_NUMA
+ if (pxm >= 0)
+ sd->node = pxm_to_node(pxm);
+#endif
+
bus = pci_scan_bus_parented(NULL, busnum, &pci_root_ops, sd);
if (!bus)
kfree(sd);
#ifdef CONFIG_ACPI_NUMA
if (bus != NULL) {
- int pxm = acpi_get_pxm(device->handle);
if (pxm >= 0) {
- sd->node = pxm_to_node(pxm);
printk("bus %d -> pxm %d -> node %d\n",
busnum, pxm, sd->node);
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-07 6:25 ` Ravikiran G Thirumalai
@ 2006-11-07 10:15 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-07 10:15 UTC (permalink / raw)
To: Ravikiran G Thirumalai
Cc: Christoph Hellwig, Dave Jones, David Miller, linux-kernel,
netdev, linux-mm, Benzi Galili (Benzi@ScaleMP.com),
Shai Fultheim (Shai@scalex86.org)
On Mon, Nov 06, 2006 at 10:25:36PM -0800, Ravikiran G Thirumalai wrote:
> On Sun, Nov 05, 2006 at 12:53:23AM +0100, Christoph Hellwig wrote:
> > On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> > >
> > > This will break the compile for !NUMA if someone ends up doing a bisect
> > > and lands here as a bisect point.
> > >
> > > You introduce this nice wrapper..
> >
> > The dev_to_node wrapper is not enough as we can't assign to (-1) for
> > the non-NUMA case. So I added a second macro, set_dev_node for that.
> >
> > The patch below compiles and works on numa and non-NUMA platforms.
> >
> >
>
> Hi Christoph,
> dev_to_node does not work as expected on x86_64 (and i386). This is because
> node value returned by pcibus_to_node is initialized after a struct device
> is created with current x86_64 code.
>
> We need the node value initialized before the call to pci_scan_bus_parented,
> as the generic devices are allocated and initialized
> off pci_scan_child_bus, which gets called from pci_scan_bus_parented
> The following patch does that using "pci_sysdata" introduced by the PCI
> domain patches in -mm.
A nice, that some non-cell folks actually care for this patch. As far
as my x86_64 pci code knowledge is concerned that patch look fine to me.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 22:56 ` Christoph Hellwig
2006-11-04 23:06 ` Dave Jones
@ 2006-11-08 2:40 ` KAMEZAWA Hiroyuki
2006-11-10 18:16 ` Christoph Lameter
1 sibling, 1 reply; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-11-08 2:40 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: davem, linux-kernel, netdev, linux-mm
Hi, I have a question.
On Sat, 4 Nov 2006 23:56:29 +0100
Christoph Hellwig <hch@lst.de> wrote:
> Index: linux-2.6/include/linux/device.h
> ===================================================================
> --- linux-2.6.orig/include/linux/device.h 2006-10-29 16:02:38.000000000 +0100
> +++ linux-2.6/include/linux/device.h 2006-11-02 12:47:17.000000000 +0100
> @@ -347,6 +347,9 @@
> BIOS data),reserved for device core*/
> struct dev_pm_info power;
>
> +#ifdef CONFIG_NUMA
> + int numa_node; /* NUMA node this device is close to */
> +#endif
> + dev->dev.numa_node = pcibus_to_node(bus);
Does this "node" is guaranteed to be online ?
if node is not online, NODE_DATA(node) is NULL or not initialized.
Then, alloc_pages_node() at el. will panic.
I wonder there are no code for creating NODE_DATA() for device-only-node.
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
@ 2006-11-10 18:16 ` Christoph Lameter
2006-11-10 18:28 ` Lee Schermerhorn
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2006-11-10 18:16 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Christoph Hellwig, davem, linux-kernel, netdev, linux-mm
On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
> I wonder there are no code for creating NODE_DATA() for device-only-node.
On IA64 we remap nodes with no memory / cpus to the nearest node with
memory. I think that is sufficient.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-10 18:16 ` Christoph Lameter
@ 2006-11-10 18:28 ` Lee Schermerhorn
2006-11-11 0:08 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 17+ messages in thread
From: Lee Schermerhorn @ 2006-11-10 18:28 UTC (permalink / raw)
To: Christoph Lameter
Cc: KAMEZAWA Hiroyuki, Christoph Hellwig, davem, linux-kernel,
netdev, linux-mm
On Fri, 2006-11-10 at 10:16 -0800, Christoph Lameter wrote:
> On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
>
> > I wonder there are no code for creating NODE_DATA() for device-only-node.
>
> On IA64 we remap nodes with no memory / cpus to the nearest node with
> memory. I think that is sufficient.
I don't think this happens anymore. Back in the ~2.6.5 days, when we
would configure our numa platforms with 100% of memory interleaved [in
hardware at cache line granularity], the cpus would move to the
interleaved "pseudo-node" and the memoryless nodes would be removed.
numactl --hardware would show something like this:
# uname -r
2.6.5-7.244-default
# numactl --hardware
available: 1 nodes (0-0)
node 0 size: 65443 MB
node 0 free: 64506 MB
I started seeing different behavior about the time SPARSEMEM went in.
Now, with a 2.6.16 base kernel [same platform, hardware interleaved
memory], I see:
# uname -r# numactl --hardware
available: 5 nodes (0-4)
node 0 size: 0 MB
node 0 free: 0 MB
node 1 size: 0 MB
node 1 free: 0 MB
node 2 size: 0 MB
node 2 free: 0 MB
node 3 size: 0 MB
node 3 free: 0 MB
node 4 size: 65439 MB
node 4 free: 64492 MB
node distances:
node 0 1 2 3 4
0: 10 17 17 17 14
1: 17 10 17 17 14
2: 17 17 10 17 14
3: 17 17 17 10 14
4: 14 14 14 14 10
2.6.16.21-0.8-default
[Aside: The firmware/SLIT says that the interleaved memory is closer to
all nodes that other nodes' memory. This has interesting implications
for the "overflow" zone lists...]
Lee
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/3] add dev_to_node()
2006-11-10 18:28 ` Lee Schermerhorn
@ 2006-11-11 0:08 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-11-11 0:08 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: clameter, hch, davem, linux-kernel, netdev, linux-mm
On Fri, 10 Nov 2006 13:28:25 -0500
Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> On Fri, 2006-11-10 at 10:16 -0800, Christoph Lameter wrote:
> > On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
> >
> > > I wonder there are no code for creating NODE_DATA() for device-only-node.
> >
> > On IA64 we remap nodes with no memory / cpus to the nearest node with
> > memory. I think that is sufficient.
>
> I don't think this happens anymore.
In my understanding , from drivers/acpi/numa.c,
a node is created by a pxm found in SRAT table at boot time.
the node-number for the pxm which was not found in SRAT at boot time is "-1".
please check how acpi_map_pxm_to_node() is used.
If pci's node-id is based on pxm, checking return vaule of pxm_to_node()
will be good.
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2006-11-11 0:08 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-10-30 14:15 [PATCH 2/3] add dev_to_node() Christoph Hellwig
2006-10-30 22:33 ` David Miller, Christoph Hellwig
2006-11-01 0:10 ` Christoph Lameter
2006-11-01 0:53 ` David Miller, Christoph Lameter
2006-11-01 1:58 ` Christoph Lameter
2006-11-04 22:56 ` Christoph Hellwig
2006-11-04 23:06 ` Dave Jones
2006-11-04 23:09 ` Christoph Hellwig
2006-11-04 23:53 ` Christoph Hellwig
2006-11-05 8:22 ` David Miller, Christoph Hellwig
2006-11-06 23:39 ` Christoph Hellwig
2006-11-07 6:25 ` Ravikiran G Thirumalai
2006-11-07 10:15 ` Christoph Hellwig
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
2006-11-10 18:16 ` Christoph Lameter
2006-11-10 18:28 ` Lee Schermerhorn
2006-11-11 0:08 ` KAMEZAWA Hiroyuki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox