* [PATCH 2/3] add dev_to_node()
@ 2006-10-30 14:15 Christoph Hellwig
2006-10-30 22:33 ` David Miller, Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2006-10-30 14:15 UTC (permalink / raw)
To: linux-kernel, netdev, linux-mm
Davem suggested getting the node-affinity information directly from
struct device instead of having the caller extract it from the
pci_dev. This patch adds dev_to_node() to the topology API for that.
The implementation is rather ugly as we need to compare the bus
operations which we can't do inline in a header without pulling all
kinds of mess in.
Thus provide an out of line dev_to_node for ppc and let everyone else
use the dummy variant in asm-generic/topology.h for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: linux-2.6/include/asm-generic/topology.h
===================================================================
--- linux-2.6.orig/include/asm-generic/topology.h 2006-10-10 14:53:52.000000000 +0200
+++ linux-2.6/include/asm-generic/topology.h 2006-10-30 13:42:22.000000000 +0100
@@ -45,11 +45,14 @@
#define pcibus_to_node(node) (-1)
#endif
+#ifndef dev_to_node
+#define dev_to_node(dev) (-1)
+#endif
+
#ifndef pcibus_to_cpumask
#define pcibus_to_cpumask(bus) (pcibus_to_node(bus) == -1 ? \
CPU_MASK_ALL : \
node_to_cpumask(pcibus_to_node(bus)) \
)
#endif
-
#endif /* _ASM_GENERIC_TOPOLOGY_H */
Index: linux-2.6/include/asm-powerpc/topology.h
===================================================================
--- linux-2.6.orig/include/asm-powerpc/topology.h 2006-10-10 14:53:52.000000000 +0200
+++ linux-2.6/include/asm-powerpc/topology.h 2006-10-30 14:03:44.000000000 +0100
@@ -5,6 +5,7 @@
struct sys_device;
struct device_node;
+struct device;
#ifdef CONFIG_NUMA
@@ -33,6 +34,7 @@
struct pci_bus;
extern int pcibus_to_node(struct pci_bus *bus);
+int dev_to_node(struct device *dev);
#define pcibus_to_cpumask(bus) (pcibus_to_node(bus) == -1 ? \
CPU_MASK_ALL : \
Index: linux-2.6/arch/powerpc/kernel/pci_64.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/pci_64.c 2006-10-23 17:21:43.000000000 +0200
+++ linux-2.6/arch/powerpc/kernel/pci_64.c 2006-10-30 14:02:40.000000000 +0100
@@ -1424,4 +1424,12 @@
return phb->node;
}
EXPORT_SYMBOL(pcibus_to_node);
+
+int dev_to_node(struct device *dev)
+{
+ if (dev->bus == &pci_bus_type)
+ return pcibus_to_node(to_pci_dev(dev)->bus);
+ return -1;
+}
+EXPORT_SYMBOL(dev_to_node);
#endif
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 14:15 [PATCH 2/3] add dev_to_node() Christoph Hellwig
@ 2006-10-30 22:33 ` David Miller, Christoph Hellwig
2006-11-01 0:10 ` Christoph Lameter
2006-11-04 22:56 ` Christoph Hellwig
0 siblings, 2 replies; 17+ messages in thread
From: David Miller, Christoph Hellwig @ 2006-10-30 22:33 UTC (permalink / raw)
To: hch; +Cc: linux-kernel, netdev, linux-mm
> Davem suggested to get the node-affinity information directly from
> struct device instead of having the caller extract it from the
> pci_dev. This patch adds dev_to_node() to the topology API for that.
> The implementation is rather ugly as we need to compare the bus
> operations which we can't do inline in a header without pulling all
> kinds of mess in.
>
> Thus provide an out of line dev_to_node for ppc and let everyone else
> use the dummy variant in asm-generic.h for now.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

It may be a bit much to be calling all the way through up to the PCI
layer just to pluck out a simple integer, don't you think? The PCI
bus pointer comparison is just a symptom of how silly this is.

Especially since this will be used for every packet allocation a
device makes.

So, please add some sanity to this situation and just put the node
into the generic struct device. :-)
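The trade-off raised in this reply can be illustrated with a small userspace sketch. It assumes simplified stand-in types; every `_sketch` name is hypothetical and none of this is kernel code. The first function mirrors the shape of the original patch (compare the bus type, then walk back to the enclosing PCI device, much as `to_pci_dev()` does via `container_of()`), while the second shows the suggested alternative of reading a node value cached in the generic device.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative only. */
struct bus_type_sketch { const char *name; };
struct pci_bus_sketch { int node; };

struct device_sketch {
    const struct bus_type_sketch *bus; /* bus the device sits on */
    int numa_node;                     /* cached node, -1 = no known affinity */
};

struct pci_dev_sketch {
    struct pci_bus_sketch *bus;
    struct device_sketch dev;          /* embedded generic device */
};

static const struct bus_type_sketch pci_bus_type_sketch = { "pci" };

/* Shape of the first patch: identify the bus, then walk back to the PCI bus. */
static int dev_to_node_via_bus(const struct device_sketch *dev)
{
    if (dev->bus == &pci_bus_type_sketch) {
        /* Recover the enclosing pci_dev_sketch from its embedded member. */
        const struct pci_dev_sketch *pdev = (const struct pci_dev_sketch *)
            ((const char *)dev - offsetof(struct pci_dev_sketch, dev));
        return pdev->bus->node;
    }
    return -1;
}

/* Shape of the suggestion: a single load from the generic device. */
static int dev_to_node_cached(const struct device_sketch *dev)
{
    return dev->numa_node;
}
```

The cached variant needs no knowledge of the bus type, which is why it can live in a generic header and be cheap enough for a per-packet allocation path.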
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 22:33 ` David Miller, Christoph Hellwig
@ 2006-11-01 0:10 ` Christoph Lameter
2006-11-01 0:53 ` David Miller, Christoph Lameter
2006-11-04 22:56 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2006-11-01 0:10 UTC (permalink / raw)
To: David Miller, Christoph Hellwig; +Cc: linux-kernel, netdev, linux-mm
On Mon, 30 Oct 2006, David Miller wrote:
> So, please add some sanity to this situation and just put the node
> into the generic struct device. :-)

Good. Then we can remove the node from the pci structure and get rid of
pcibus_to_node?
* Re: [PATCH 2/3] add dev_to_node()
2006-11-01 0:10 ` Christoph Lameter
@ 2006-11-01 0:53 ` David Miller, Christoph Lameter
2006-11-01 1:58 ` Christoph Lameter
0 siblings, 1 reply; 17+ messages in thread
From: David Miller, Christoph Lameter @ 2006-11-01 0:53 UTC (permalink / raw)
To: clameter; +Cc: hch, linux-kernel, netdev, linux-mm
> On Mon, 30 Oct 2006, David Miller wrote:
>
> > So, please add some sanity to this situation and just put the node
> > into the generic struct device. :-)
>
> Good. Then we can remove the node from the pci structure and get rid of
> pcibus_to_node?

Yes, that's possible, because the idea is that the arch specific
bus layer code would initialize the node value. Therefore, there
would be no need for things like pcibus_to_node() any longer.
* Re: [PATCH 2/3] add dev_to_node()
2006-11-01 0:53 ` David Miller, Christoph Lameter
@ 2006-11-01 1:58 ` Christoph Lameter
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2006-11-01 1:58 UTC (permalink / raw)
To: David Miller; +Cc: hch, linux-kernel, netdev, linux-mm
On Tue, 31 Oct 2006, David Miller wrote:
> Yes, that's possible, because the idea is that the arch specific
> bus layer code would initialize the node value. Therefore, there
> would be no need for things like pcibus_to_node() any longer.

Then let's rename pcibus_to_node to dev_to_node() throughout the kernel.
Provide a -1 default. Then other device layers that are not based on pci
will also be able to exploit NUMA locality.
* Re: [PATCH 2/3] add dev_to_node()
2006-10-30 22:33 ` David Miller, Christoph Hellwig
2006-11-01 0:10 ` Christoph Lameter
@ 2006-11-04 22:56 ` Christoph Hellwig
2006-11-04 23:06 ` Dave Jones
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
1 sibling, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 22:56 UTC (permalink / raw)
To: David Miller; +Cc: hch, linux-kernel, netdev, linux-mm
On Mon, Oct 30, 2006 at 02:33:57PM -0800, David Miller wrote:
> It may be a bit much to be calling all the way through up to the PCI
> layer just to pluck out a simple integer, don't you think? The PCI
> bus pointer comparison is just a symptom of how silly this is.
>
> Especially since this will be used for every packet allocation a
> device makes.
>
> So, please add some sanity to this situation and just put the node
> into the generic struct device. :-)

I was concerned about growing struct device, on smaller systems it already
eats up a lot of memory. But we can make the node member conditional on
CONFIG_NUMA, as I did in the patch below. This directly replaces PATCH 2/3
(the one we're replying to), all others remain unmodified.

Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h 2006-10-29 16:02:38.000000000 +0100
+++ linux-2.6/include/linux/device.h 2006-11-02 12:47:17.000000000 +0100
@@ -347,6 +347,9 @@
BIOS data),reserved for device core*/
struct dev_pm_info power;
+#ifdef CONFIG_NUMA
+ int numa_node; /* NUMA node this device is close to */
+#endif
u64 *dma_mask; /* dma mask (if dma'able device) */
u64 coherent_dma_mask;/* Like dma_mask, but for
alloc_coherent mappings as
@@ -368,6 +371,12 @@
void (*release)(struct device * dev);
};
+#ifdef CONFIG_NUMA
+#define dev_to_node(dev) ((dev)->numa_node)
+#else
+#define dev_to_node(dev) (-1)
+#endif
+
static inline void *
dev_get_drvdata (struct device *dev)
{
Index: linux-2.6/drivers/base/core.c
===================================================================
--- linux-2.6.orig/drivers/base/core.c 2006-10-23 17:21:44.000000000 +0200
+++ linux-2.6/drivers/base/core.c 2006-11-02 12:48:12.000000000 +0100
@@ -381,6 +381,7 @@
INIT_LIST_HEAD(&dev->node);
init_MUTEX(&dev->sem);
device_init_wakeup(dev, 0);
+ dev->numa_node = -1;
}
/**
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c 2006-10-23 17:21:46.000000000 +0200
+++ linux-2.6/drivers/pci/probe.c 2006-11-02 12:47:35.000000000 +0100
@@ -846,6 +846,7 @@
dev->dev.release = pci_release_dev;
pci_dev_get(dev);
+ dev->dev.numa_node = pcibus_to_node(bus);
dev->dev.dma_mask = &dev->dma_mask;
dev->dev.coherent_dma_mask = 0xffffffffull;
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 22:56 ` Christoph Hellwig
@ 2006-11-04 23:06 ` Dave Jones
2006-11-04 23:09 ` Christoph Hellwig
2006-11-04 23:53 ` Christoph Hellwig
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
1 sibling, 2 replies; 17+ messages in thread
From: Dave Jones @ 2006-11-04 23:06 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Miller, linux-kernel, netdev, linux-mm
On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:

This will break the compile for !NUMA if someone ends up doing a bisect
and lands here as a bisect point.

You introduce this nice wrapper..

> +#ifdef CONFIG_NUMA
> +#define dev_to_node(dev) ((dev)->numa_node)
> +#else
> +#define dev_to_node(dev) (-1)
> +#endif
> +
> static inline void *
> dev_get_drvdata (struct device *dev)
> {

And then don't use it here..

> Index: linux-2.6/drivers/base/core.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/core.c 2006-10-23 17:21:44.000000000 +0200
> +++ linux-2.6/drivers/base/core.c 2006-11-02 12:48:12.000000000 +0100
> @@ -381,6 +381,7 @@
> INIT_LIST_HEAD(&dev->node);
> init_MUTEX(&dev->sem);
> device_init_wakeup(dev, 0);
> + dev->numa_node = -1;
> }
>
> /**

and here.

> Index: linux-2.6/drivers/pci/probe.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/probe.c 2006-10-23 17:21:46.000000000 +0200
> +++ linux-2.6/drivers/pci/probe.c 2006-11-02 12:47:35.000000000 +0100
> @@ -846,6 +846,7 @@
> dev->dev.release = pci_release_dev;
> pci_dev_get(dev);
>
> + dev->dev.numa_node = pcibus_to_node(bus);
> dev->dev.dma_mask = &dev->dma_mask;
> dev->dev.coherent_dma_mask = 0xffffffffull;

Dave

--
http://www.codemonkey.org.uk
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:06 ` Dave Jones
@ 2006-11-04 23:09 ` Christoph Hellwig
2006-11-04 23:53 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 23:09 UTC (permalink / raw)
To: Dave Jones, Christoph Hellwig, David Miller, linux-kernel, netdev, linux-mm
On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
>
> This will break the compile for !NUMA if someone ends up doing a bisect
> and lands here as a bisect point.
>
> You introduce this nice wrapper..

Yes, I'm stupid :) Updated version will follow ASAP.
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:06 ` Dave Jones
2006-11-04 23:09 ` Christoph Hellwig
@ 2006-11-04 23:53 ` Christoph Hellwig
2006-11-05 8:22 ` David Miller, Christoph Hellwig
2006-11-07 6:25 ` Ravikiran G Thirumalai
1 sibling, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-04 23:53 UTC (permalink / raw)
To: Dave Jones, Christoph Hellwig, David Miller, linux-kernel, netdev, linux-mm
On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
>
> This will break the compile for !NUMA if someone ends up doing a bisect
> and lands here as a bisect point.
>
> You introduce this nice wrapper..

The dev_to_node wrapper is not enough as we can't assign to (-1) for
the non-NUMA case. So I added a second macro, set_dev_node for that.

The patch below compiles and works on numa and non-NUMA platforms.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/include/linux/device.h 2006-11-05 00:39:22.000000000 +0100
@@ -347,6 +347,9 @@
BIOS data),reserved for device core*/
struct dev_pm_info power;
+#ifdef CONFIG_NUMA
+ int numa_node; /* NUMA node this device is close to */
+#endif
u64 *dma_mask; /* dma mask (if dma'able device) */
u64 coherent_dma_mask;/* Like dma_mask, but for
alloc_coherent mappings as
@@ -368,6 +371,14 @@
void (*release)(struct device * dev);
};
+#ifdef CONFIG_NUMA
+#define dev_to_node(dev) ((dev)->numa_node)
+#define set_dev_node(dev, node) ((dev)->numa_node = node)
+#else
+#define dev_to_node(dev) (-1)
+#define set_dev_node(dev, node) do { } while (0)
+#endif
+
static inline void *
dev_get_drvdata (struct device *dev)
{
Index: linux-2.6/drivers/base/core.c
===================================================================
--- linux-2.6.orig/drivers/base/core.c 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/drivers/base/core.c 2006-11-05 00:40:01.000000000 +0100
@@ -381,6 +381,7 @@
INIT_LIST_HEAD(&dev->node);
init_MUTEX(&dev->sem);
device_init_wakeup(dev, 0);
+ set_dev_node(dev, -1);
}
/**
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c 2006-11-05 00:16:09.000000000 +0100
+++ linux-2.6/drivers/pci/probe.c 2006-11-05 00:39:55.000000000 +0100
@@ -846,6 +846,7 @@
dev->dev.release = pci_release_dev;
pci_dev_get(dev);
+ set_dev_node(&dev->dev, pcibus_to_node(bus));
dev->dev.dma_mask = &dev->dma_mask;
dev->dev.coherent_dma_mask = 0xffffffffull;
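The getter/setter pairing in this revision can be tried outside the kernel with a minimal sketch. It assumes a stand-in `struct device_sketch` and a made-up switch `CONFIG_NUMA_SKETCH` in place of the real `CONFIG_NUMA`; only the macro names `dev_to_node` and `set_dev_node` come from the patch. One deliberate difference from the patch text: the sketch parenthesizes the `node` argument, which is ordinary macro hygiene rather than something the thread asked for.

```c
/* Userspace sketch; comment out the next line to model a !NUMA build. */
#define CONFIG_NUMA_SKETCH 1

struct device_sketch {
    int refcount;      /* placeholder for the other members */
#ifdef CONFIG_NUMA_SKETCH
    int numa_node;     /* only present when NUMA is configured */
#endif
};

#ifdef CONFIG_NUMA_SKETCH
#define dev_to_node(dev)        ((dev)->numa_node)
#define set_dev_node(dev, node) ((dev)->numa_node = (node))
#else
/* No member exists, so reads yield -1 and writes compile to nothing. */
#define dev_to_node(dev)        (-1)
#define set_dev_node(dev, node) do { } while (0)
#endif

/* Callers touch the field only through the accessors, so this function
 * compiles unchanged in both configurations. */
static void device_initialize_sketch(struct device_sketch *dev)
{
    dev->refcount = 1;
    set_dev_node(dev, -1);
}
```

Because the stub setter is a `do { } while (0)` statement, `set_dev_node()` can only be used as a statement, not as an expression, if the code is meant to build in both configurations.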
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:53 ` Christoph Hellwig
@ 2006-11-05 8:22 ` David Miller, Christoph Hellwig
2006-11-06 23:39 ` Christoph Hellwig
2006-11-07 6:25 ` Ravikiran G Thirumalai
1 sibling, 1 reply; 17+ messages in thread
From: David Miller, Christoph Hellwig @ 2006-11-05 8:22 UTC (permalink / raw)
To: hch; +Cc: davej, linux-kernel, netdev, linux-mm
> On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> >
> > This will break the compile for !NUMA if someone ends up doing a bisect
> > and lands here as a bisect point.
> >
> > You introduce this nice wrapper..
>
> The dev_to_node wrapper is not enough as we can't assign to (-1) for
> the non-NUMA case. So I added a second macro, set_dev_node for that.
>
> The patch below compiles and works on numa and non-NUMA platforms.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good to me.
* Re: [PATCH 2/3] add dev_to_node()
2006-11-05 8:22 ` David Miller, Christoph Hellwig
@ 2006-11-06 23:39 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-06 23:39 UTC (permalink / raw)
To: David Miller; +Cc: linux-kernel, netdev, linux-mm
On Sun, Nov 05, 2006 at 12:22:37AM -0800, David Miller wrote:
> Looks good to me.

So what's the right path to get this in? There's one patch touching
MM code, one adding something to the driver core and then finally a
networking patch depending on the previous two. Do you want to take
them all and send them in through the networking tree? Or should we
put the burden on Andrew?
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 23:53 ` Christoph Hellwig
2006-11-05 8:22 ` David Miller, Christoph Hellwig
@ 2006-11-07 6:25 ` Ravikiran G Thirumalai
2006-11-07 10:15 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Ravikiran G Thirumalai @ 2006-11-07 6:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Dave Jones, David Miller, linux-kernel, netdev, linux-mm, Benzi Galili (Benzi@ScaleMP.com), Shai Fultheim (Shai@scalex86.org)
On Sun, Nov 05, 2006 at 12:53:23AM +0100, Christoph Hellwig wrote:
> On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> >
> > This will break the compile for !NUMA if someone ends up doing a bisect
> > and lands here as a bisect point.
> >
> > You introduce this nice wrapper..
>
> The dev_to_node wrapper is not enough as we can't assign to (-1) for
> the non-NUMA case. So I added a second macro, set_dev_node for that.
>
> The patch below compiles and works on numa and non-NUMA platforms.
>
>

Hi Christoph,
dev_to_node does not work as expected on x86_64 (and i386). This is because
node value returned by pcibus_to_node is initialized after a struct device
is created with current x86_64 code.

We need the node value initialized before the call to pci_scan_bus_parented,
as the generic devices are allocated and initialized
off pci_scan_child_bus, which gets called from pci_scan_bus_parented
The following patch does that using "pci_sysdata" introduced by the PCI
domain patches in -mm.

Signed-off-by: Alok N Kataria <alok.kataria@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>

Index: linux-2.6.19-rc4mm2/arch/i386/pci/acpi.c
===================================================================
--- linux-2.6.19-rc4mm2.orig/arch/i386/pci/acpi.c 2006-11-06 11:03:50.000000000 -0800
+++ linux-2.6.19-rc4mm2/arch/i386/pci/acpi.c 2006-11-06 22:04:14.000000000 -0800
@@ -9,6 +9,7 @@ struct pci_bus * __devinit pci_acpi_scan
{
struct pci_bus *bus;
struct pci_sysdata *sd;
+ int pxm;
/* Allocate per-root-bus (not per bus) arch-specific data.
* TODO: leak; this memory is never freed.
@@ -30,15 +31,21 @@ struct pci_bus * __devinit pci_acpi_scan
}
#endif /* CONFIG_PCI_DOMAINS */
+ sd->node = -1;
+
+ pxm = acpi_get_pxm(device->handle);
+#ifdef CONFIG_ACPI_NUMA
+ if (pxm >= 0)
+ sd->node = pxm_to_node(pxm);
+#endif
+
bus = pci_scan_bus_parented(NULL, busnum, &pci_root_ops, sd);
if (!bus)
kfree(sd);
#ifdef CONFIG_ACPI_NUMA
if (bus != NULL) {
- int pxm = acpi_get_pxm(device->handle);
if (pxm >= 0) {
- sd->node = pxm_to_node(pxm);
printk("bus %d -> pxm %d -> node %d\n",
busnum, pxm, sd->node);
}
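The ordering problem described in this message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the `_sketch` types and functions are hypothetical, and the "scan" simply copies the bus node into each device at creation time, which is the behaviour the message attributes to the device setup path.

```c
#define MAX_DEVS_SKETCH 4

struct sysdata_sketch { int node; };       /* per-root-bus data */
struct device_sketch  { int numa_node; };

struct bus_sketch {
    struct sysdata_sketch *sd;
    struct device_sketch devs[MAX_DEVS_SKETCH];
    int ndevs;
};

/* Models the bus scan: each device snapshots the bus node when created. */
static void scan_bus_sketch(struct bus_sketch *bus,
                            struct sysdata_sketch *sd, int ndevs)
{
    int i;

    bus->sd = sd;
    bus->ndevs = ndevs;
    for (i = 0; i < ndevs; i++)
        bus->devs[i].numa_node = sd->node;
}

/* Problematic order: scan first, resolve the node afterwards.
 * The devices have already copied the stale -1 by then. */
static void probe_node_after_scan(struct bus_sketch *bus,
                                  struct sysdata_sketch *sd, int node)
{
    sd->node = -1;
    scan_bus_sketch(bus, sd, 2);
    sd->node = node;
}

/* Order used by the patch above: resolve the node, then scan. */
static void probe_node_before_scan(struct bus_sketch *bus,
                                   struct sysdata_sketch *sd, int node)
{
    sd->node = node;
    scan_bus_sketch(bus, sd, 2);
}
```

In the first variant the bus-level lookup returns the right node while every device still reports -1, which matches the symptom reported for x86_64 and i386.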
* Re: [PATCH 2/3] add dev_to_node()
2006-11-07 6:25 ` Ravikiran G Thirumalai
@ 2006-11-07 10:15 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2006-11-07 10:15 UTC (permalink / raw)
To: Ravikiran G Thirumalai
Cc: Christoph Hellwig, Dave Jones, David Miller, linux-kernel, netdev, linux-mm, Benzi Galili (Benzi@ScaleMP.com), Shai Fultheim (Shai@scalex86.org)
On Mon, Nov 06, 2006 at 10:25:36PM -0800, Ravikiran G Thirumalai wrote:
> On Sun, Nov 05, 2006 at 12:53:23AM +0100, Christoph Hellwig wrote:
> > On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> > >
> > > This will break the compile for !NUMA if someone ends up doing a bisect
> > > and lands here as a bisect point.
> > >
> > > You introduce this nice wrapper..
> >
> > The dev_to_node wrapper is not enough as we can't assign to (-1) for
> > the non-NUMA case. So I added a second macro, set_dev_node for that.
> >
> > The patch below compiles and works on numa and non-NUMA platforms.
> >
> >
>
> Hi Christoph,
> dev_to_node does not work as expected on x86_64 (and i386). This is because
> node value returned by pcibus_to_node is initialized after a struct device
> is created with current x86_64 code.
>
> We need the node value initialized before the call to pci_scan_bus_parented,
> as the generic devices are allocated and initialized
> off pci_scan_child_bus, which gets called from pci_scan_bus_parented
> The following patch does that using "pci_sysdata" introduced by the PCI
> domain patches in -mm.

Ah nice, that some non-cell folks actually care for this patch. As far as
my x86_64 pci code knowledge is concerned that patch looks fine to me.
* Re: [PATCH 2/3] add dev_to_node()
2006-11-04 22:56 ` Christoph Hellwig
2006-11-04 23:06 ` Dave Jones
@ 2006-11-08 2:40 ` KAMEZAWA Hiroyuki
2006-11-10 18:16 ` Christoph Lameter
1 sibling, 1 reply; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-11-08 2:40 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: davem, linux-kernel, netdev, linux-mm
Hi, I have a question.

On Sat, 4 Nov 2006 23:56:29 +0100
Christoph Hellwig <hch@lst.de> wrote:
> Index: linux-2.6/include/linux/device.h
> ===================================================================
> --- linux-2.6.orig/include/linux/device.h 2006-10-29 16:02:38.000000000 +0100
> +++ linux-2.6/include/linux/device.h 2006-11-02 12:47:17.000000000 +0100
> @@ -347,6 +347,9 @@
> BIOS data),reserved for device core*/
> struct dev_pm_info power;
>
> +#ifdef CONFIG_NUMA
> + int numa_node; /* NUMA node this device is close to */
> +#endif
> + dev->dev.numa_node = pcibus_to_node(bus);

Is this "node" guaranteed to be online?

If the node is not online, NODE_DATA(node) is NULL or not initialized.
Then, alloc_pages_node() et al. will panic.

I wonder there are no code for creating NODE_DATA() for device-only-node.

-Kame
* Re: [PATCH 2/3] add dev_to_node()
2006-11-08 2:40 ` KAMEZAWA Hiroyuki
@ 2006-11-10 18:16 ` Christoph Lameter
2006-11-10 18:28 ` Lee Schermerhorn
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2006-11-10 18:16 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Christoph Hellwig, davem, linux-kernel, netdev, linux-mm
On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
> I wonder there are no code for creating NODE_DATA() for device-only-node.

On IA64 we remap nodes with no memory / cpus to the nearest node with
memory. I think that is sufficient.
* Re: [PATCH 2/3] add dev_to_node()
2006-11-10 18:16 ` Christoph Lameter
@ 2006-11-10 18:28 ` Lee Schermerhorn
2006-11-11 0:08 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 17+ messages in thread
From: Lee Schermerhorn @ 2006-11-10 18:28 UTC (permalink / raw)
To: Christoph Lameter
Cc: KAMEZAWA Hiroyuki, Christoph Hellwig, davem, linux-kernel, netdev, linux-mm
On Fri, 2006-11-10 at 10:16 -0800, Christoph Lameter wrote:
> On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
>
> > I wonder there are no code for creating NODE_DATA() for device-only-node.
>
> On IA64 we remap nodes with no memory / cpus to the nearest node with
> memory. I think that is sufficient.

I don't think this happens anymore. Back in the ~2.6.5 days, when we
would configure our numa platforms with 100% of memory interleaved [in
hardware at cache line granularity], the cpus would move to the
interleaved "pseudo-node" and the memoryless nodes would be removed.
numactl --hardware would show something like this:

# uname -r
2.6.5-7.244-default
# numactl --hardware
available: 1 nodes (0-0)
node 0 size: 65443 MB
node 0 free: 64506 MB

I started seeing different behavior about the time SPARSEMEM went in.
Now, with a 2.6.16 base kernel [same platform, hardware interleaved
memory], I see:

# uname -r
2.6.16.21-0.8-default
# numactl --hardware
available: 5 nodes (0-4)
node 0 size: 0 MB
node 0 free: 0 MB
node 1 size: 0 MB
node 1 free: 0 MB
node 2 size: 0 MB
node 2 free: 0 MB
node 3 size: 0 MB
node 3 free: 0 MB
node 4 size: 65439 MB
node 4 free: 64492 MB
node distances:
node   0   1   2   3   4
  0:  10  17  17  17  14
  1:  17  10  17  17  14
  2:  17  17  10  17  14
  3:  17  17  17  10  14
  4:  14  14  14  14  10

[Aside: The firmware/SLIT says that the interleaved memory is closer to
all nodes than other nodes' memory. This has interesting implications
for the "overflow" zone lists...]

Lee
* Re: [PATCH 2/3] add dev_to_node()
2006-11-10 18:28 ` Lee Schermerhorn
@ 2006-11-11 0:08 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 17+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-11-11 0:08 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: clameter, hch, davem, linux-kernel, netdev, linux-mm
On Fri, 10 Nov 2006 13:28:25 -0500
Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> On Fri, 2006-11-10 at 10:16 -0800, Christoph Lameter wrote:
> > On Wed, 8 Nov 2006, KAMEZAWA Hiroyuki wrote:
> >
> > > I wonder there are no code for creating NODE_DATA() for device-only-node.
> >
> > On IA64 we remap nodes with no memory / cpus to the nearest node with
> > memory. I think that is sufficient.
>
> I don't think this happens anymore.

In my understanding, from drivers/acpi/numa.c, a node is created for a pxm
found in the SRAT table at boot time. The node number for a pxm which was
not found in SRAT at boot time is "-1". Please check how
acpi_map_pxm_to_node() is used.

If pci's node-id is based on pxm, checking the return value of pxm_to_node()
will be good.

-Kame
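The concern in this sub-thread is that a device might be tagged with a node that has no usable memory state behind it. A hedged userspace sketch of the kind of guard being suggested follows; it assumes a plain bitmask as a stand-in for the kernel's node state, and all `_sketch` names are hypothetical. In the kernel the check would presumably be built on something like `node_online()` or on the return value of `pxm_to_node()`, but the thread does not settle on a specific form.

```c
#define MAX_NODES_SKETCH 32

/* Bit n set means node n is online; stands in for the kernel's node state. */
static unsigned long online_nodes_sketch = 0;

static int node_is_online_sketch(int node)
{
    if (node < 0 || node >= MAX_NODES_SKETCH)
        return 0;
    return (int)((online_nodes_sketch >> node) & 1UL);
}

/* Returns the node if it is usable, otherwise -1 so that a later
 * allocation treats the device as having no node affinity. */
static int usable_node_sketch(int node)
{
    return node_is_online_sketch(node) ? node : -1;
}
```

Applying such a filter at the point where the node is stored in the device would keep the fast path a single load while avoiding allocations aimed at an uninitialized node.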