* [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute
@ 2015-03-02 4:04 Sheng Yong
2015-03-02 4:05 ` [RFC PATCH 2/2] mem-hotplug: add description of " Sheng Yong
2015-03-02 9:17 ` [RFC PATCH 1/2] mem-hotplug: introduce " Naoya Horiguchi
0 siblings, 2 replies; 5+ messages in thread
From: Sheng Yong @ 2015-03-02 4:04 UTC
To: akpm, gregkh, nfont; +Cc: linux-mm, zhenzhang.zhang
There may be memory holes in a memory section, and because of that we cannot
know the real size of the section. In order to know the physical memory
areas used in one memory section, we walk through the iomem resources and
report the memory ranges in /sys/devices/system/memory/memoryX/range, like:
root@ivybridge:~# cat /sys/devices/system/memory/memory0/range
00001000-0008efff
00090000-0009ffff
00100000-07ffffff
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
---
drivers/base/memory.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 85be040..e72e5e4 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -21,6 +21,7 @@
#include <linux/mutex.h>
#include <linux/stat.h>
#include <linux/slab.h>
+#include <linux/ioport.h>
#include <linux/atomic.h>
#include <asm/uaccess.h>
@@ -373,6 +374,69 @@ static ssize_t show_phys_device(struct device *dev,
return sprintf(buf, "%d\n", mem->phys_device);
}
+static int get_range(u64 start, u64 end, void *arg)
+{
+ struct resource **head, *p, *tmp;
+
+ head = (struct resource **) arg;
+
+ if (!(*head)) {
+ *head = kmalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!(*head))
+ return -ENOMEM;
+ (*head)->start = start;
+ (*head)->end = end;
+ (*head)->sibling = NULL;
+ } else {
+ p = *head;
+ while (p->sibling != NULL)
+ p = p->sibling;
+ if (p->end == start - 1) {
+ p->end = end;
+ return 0;
+ }
+ tmp = kmalloc(sizeof(struct resource), GFP_KERNEL);
+ if (!tmp)
+ return -ENOMEM;
+ tmp->start = start;
+ tmp->end = end;
+ tmp->sibling = NULL;
+ p->sibling = tmp;
+ }
+
+ return 0;
+}
+
+static ssize_t show_mem_range(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct memory_block *mem = to_memory_block(dev);
+ unsigned long start_pfn, end_pfn, nr_pages;
+ struct resource *ranges = NULL, *p;
+ u64 start, end;
+ int cnt, err;
+
+ nr_pages = PAGES_PER_SECTION * sections_per_block;
+ start_pfn = section_nr_to_pfn(mem->start_section_nr);
+ end_pfn = start_pfn + nr_pages;
+
+ start = (u64) start_pfn << PAGE_SHIFT;
+ end = ((u64) end_pfn << PAGE_SHIFT) - 1;
+ err = walk_system_ram_res(start, end, &ranges, get_range);
+
+ cnt = 0;
+ while (ranges != NULL) {
+ p = ranges;
+ if (err == 0)
+ cnt += sprintf(buf + cnt, "%08llx-%08llx\n",
+ (u64) ranges->start, (u64) ranges->end);
+ ranges = ranges->sibling;
+ kfree(p);
+ }
+
+ return cnt;
+}
+
#ifdef CONFIG_MEMORY_HOTREMOVE
static ssize_t show_valid_zones(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -416,6 +480,7 @@ static DEVICE_ATTR(phys_index, 0444, show_mem_start_phys_index, NULL);
static DEVICE_ATTR(state, 0644, show_mem_state, store_mem_state);
static DEVICE_ATTR(phys_device, 0444, show_phys_device, NULL);
static DEVICE_ATTR(removable, 0444, show_mem_removable, NULL);
+static DEVICE_ATTR(range, 0444, show_mem_range, NULL);
/*
* Block size attribute stuff
@@ -565,6 +630,7 @@ static struct attribute *memory_memblk_attrs[] = {
#ifdef CONFIG_MEMORY_HOTREMOVE
&dev_attr_valid_zones.attr,
#endif
+ &dev_attr_range.attr,
NULL
};
--
1.7.9.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [RFC PATCH 2/2] mem-hotplug: add description of sysfs `range' attribute
2015-03-02 4:04 [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute Sheng Yong
@ 2015-03-02 4:05 ` Sheng Yong
2015-03-02 9:17 ` [RFC PATCH 1/2] mem-hotplug: introduce " Naoya Horiguchi
1 sibling, 0 replies; 5+ messages in thread
From: Sheng Yong @ 2015-03-02 4:05 UTC
To: akpm, gregkh, nfont; +Cc: linux-mm, zhenzhang.zhang
Add description of sysfs `range' attribute, which is designed to show the
memory holes in a memory section.
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
---
Documentation/ABI/testing/sysfs-devices-memory | 8 ++++++++
Documentation/memory-hotplug.txt | 12 ++++++++----
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index deef3b5..15629f5 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -69,6 +69,14 @@ Description:
read-only and is designed to show which zone this memory
block can be onlined to.
+What: /sys/devices/system/memory/memoryX/range
+Date: Feb 2015
+Contact: Sheng Yong <shengyong1@huawei.com>
+Description:
+ The file /sys/devices/system/memory/memoryX/range is
+ read-only and is designed to show memory holes in one
+ memory section.
+
What: /sys/devices/system/memoryX/nodeY
Date: October 2009
Contact: Linux Memory Management list <linux-mm@kvack.org>
diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ea03abf..d59724b 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -140,22 +140,22 @@ is described under /sys/devices/system/memory as
For the memory block covered by the sysfs directory. It is expected that all
memory sections in this range are present and no memory holes exist in the
-range. Currently there is no way to determine if there is a memory hole, but
-the existence of one should not affect the hotplug capabilities of the memory
-block.
+range. However, if there is a memory hole, the existence of one should not
+affect the hotplug capabilities of the memory block.
For example, assume 1GiB memory block size. A device for a memory starting at
0x100000000 is /sys/device/system/memory/memory4
(0x100000000 / 1Gib = 4)
This device covers address range [0x100000000 ... 0x140000000)
-Under each memory block, you can see 4 files:
+Under each memory block, you can see 6 files:
/sys/devices/system/memory/memoryXXX/phys_index
/sys/devices/system/memory/memoryXXX/phys_device
/sys/devices/system/memory/memoryXXX/state
/sys/devices/system/memory/memoryXXX/removable
/sys/devices/system/memory/memoryXXX/valid_zones
+/sys/devices/system/memory/memoryXXX/range
'phys_index' : read-only and contains memory block id, same as XXX.
'state' : read-write
@@ -180,6 +180,10 @@ Under each memory block, you can see 4 files:
"memory7/valid_zones: Movable Normal" shows this memoryblock
can be onlined to ZONE_MOVABLE by default and to ZONE_NORMAL
by online_kernel.
+'range' : read-only: designed to show memory holes in a memory
+ section.
+ Each line shows the start and end physical address of a
+ memory area.
NOTE:
These directories/files appear after physical memory hotplug phase.
--
1.7.9.5
* Re: [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute
2015-03-02 4:04 [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute Sheng Yong
2015-03-02 4:05 ` [RFC PATCH 2/2] mem-hotplug: add description of " Sheng Yong
@ 2015-03-02 9:17 ` Naoya Horiguchi
2015-03-02 12:29 ` shengyong
1 sibling, 1 reply; 5+ messages in thread
From: Naoya Horiguchi @ 2015-03-02 9:17 UTC
To: Sheng Yong
Cc: akpm, gregkh, nfont, linux-mm, zhenzhang.zhang, Dave Hansen,
David Rientjes
# Cc'ed some people who may be interested in this topic.
On Mon, Mar 02, 2015 at 04:04:59AM +0000, Sheng Yong wrote:
> There may be memory holes in a memory section, and because of that we cannot
> know the real size of the section. In order to know the physical memory
> areas used in one memory section, we walk through the iomem resources and
> report the memory ranges in /sys/devices/system/memory/memoryX/range, like:
>
> root@ivybridge:~# cat /sys/devices/system/memory/memory0/range
> 00001000-0008efff
> 00090000-0009ffff
> 00100000-07ffffff
>
> Signed-off-by: Sheng Yong <shengyong1@huawei.com>
About a year ago, there was a similar request from a library developer about
exporting the valid physical address range
(http://thread.gmane.org/gmane.linux.kernel.mm/115600).
We tried a few things back then, but they didn't make it in.
So if you try to solve this, please consider some points from that discussion:
- interface name: just 'range' might not be friendly; if the interface returns
a physical address range, something like 'phys_addr_range' looks better.
- prefix '0x': if you display the values in hex, prefixing them with '0x' might
be better, so that each parser does not have to add it itself.
- supporting node range: your patch currently covers only the memory block
interface, but some people (like me) are interested in exporting an easy
"phys_addr <=> node number" mapping, so if your approach is easily extensible
to the node interface, it would be very nice to include node interface support too.
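To illustrate the '0x' point, here is a minimal userspace sketch of a parser
for one line of the proposed file. The helper name parse_range_line() is
hypothetical and not part of the patch. It forces base 16, which accepts both
the current unprefixed output and a '0x'-prefixed variant. A parser that lets
strtoull() auto-detect the base (base 0) would instead read the unprefixed
"00001000" as octal, which is the kind of surprise the prefix avoids.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse one "start-end" line into two 64-bit values.
 * Base 16 is forced, so "00001000-0008efff" and "0x1000-0x8efff" both
 * parse; strtoull() accepts an optional "0x" prefix when the base is 16.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_range_line(const char *line, uint64_t *start, uint64_t *end)
{
	char *p;

	errno = 0;
	*start = strtoull(line, &p, 16);
	if (errno || p == line || *p != '-')
		return -1;

	line = p + 1;		/* skip the '-' separator */
	*end = strtoull(line, &p, 16);
	if (errno || p == line || (*p != '\n' && *p != '\0'))
		return -1;

	return *start <= *end ? 0 : -1;
}
```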
Thanks,
Naoya Horiguchi
* Re: [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute
2015-03-02 9:17 ` [RFC PATCH 1/2] mem-hotplug: introduce " Naoya Horiguchi
@ 2015-03-02 12:29 ` shengyong
2015-03-07 4:26 ` shengyong
0 siblings, 1 reply; 5+ messages in thread
From: shengyong @ 2015-03-02 12:29 UTC
To: Naoya Horiguchi
Cc: akpm, gregkh, nfont, linux-mm, zhenzhang.zhang, Dave Hansen,
David Rientjes
On 2015/3/2 17:17, Naoya Horiguchi wrote:
> # Cc'ed some people who may be interested in this topic.
>
> On Mon, Mar 02, 2015 at 04:04:59AM +0000, Sheng Yong wrote:
>> There may be memory holes in a memory section, and because of that we cannot
>> know the real size of the section. In order to know the physical memory
>> areas used in one memory section, we walk through the iomem resources and
>> report the memory ranges in /sys/devices/system/memory/memoryX/range, like:
>>
>> root@ivybridge:~# cat /sys/devices/system/memory/memory0/range
>> 00001000-0008efff
>> 00090000-0009ffff
>> 00100000-07ffffff
>>
>> Signed-off-by: Sheng Yong <shengyong1@huawei.com>
>
> About a year ago, there was a similar request from a library developer about
> exporting the valid physical address range
> (http://thread.gmane.org/gmane.linux.kernel.mm/115600).
> We tried a few things back then, but they didn't make it in.
Thanks for the information.
>
> So if you try to solve this, please consider some points from that discussion:
> - interface name: just 'range' might not be friendly; if the interface returns
> a physical address range, something like 'phys_addr_range' looks better.
> - prefix '0x': if you display the values in hex, prefixing them with '0x' might
> be better, so that each parser does not have to add it itself.
I agree with these two suggestions.
> - supporting node range: your patch currently covers only the memory block
> interface, but some people (like me) are interested in exporting an easy
> "phys_addr <=> node number" mapping, so if your approach is easily extensible
> to the node interface, it would be very nice to include node interface support too.
After reading the previous discussion, I think the content of the interface should
look like "<node id> <start-end>" to avoid overlap between memory nodes. Am I right?
Then we could use `memory_add_physaddr_to_nid(u64 start)' to translate a physical
address to a node id when the address is recorded in the ranges list in get_range().
The problem is that `struct resource' does not have an appropriate member to hold
the node id, so it is saved in resource->flags temporarily for testing.
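One possible way around borrowing resource->flags, sketched here in userspace
C rather than kernel code: keep the node id in a small dedicated record next
to the range. The names struct nid_range and format_nid_ranges() are
hypothetical, and the output layout (including the '0x' prefix) is only an
assumption based on the suggestions in this thread. In the kernel the nid
would presumably come from memory_add_physaddr_to_nid(); in this sketch the
caller supplies it.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: a physical range plus the node it belongs to. */
struct nid_range {
	int nid;
	uint64_t start;
	uint64_t end;		/* inclusive */
};

/*
 * Format records as "<node id> <start>-<end>\n" lines into buf.
 * Each line is appended at the current offset, so earlier lines are
 * never re-read or overwritten.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * buf is too small.
 */
static int format_nid_ranges(char *buf, size_t size,
			     const struct nid_range *r, size_t n)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		int len = snprintf(buf + off, size - off,
				   "%d 0x%08llx-0x%08llx\n", r[i].nid,
				   (unsigned long long)r[i].start,
				   (unsigned long long)r[i].end);

		if (len < 0 || (size_t)len >= size - off)
			return -1;
		off += (size_t)len;
	}

	return (int)off;
}
```

A kernel version would need the same care about the PAGE_SIZE limit of a
sysfs buffer that the size argument stands in for here.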
thanks,
Sheng
>
> Thanks,
> Naoya Horiguchi
> .
>
* Re: [RFC PATCH 1/2] mem-hotplug: introduce sysfs `range' attribute
2015-03-02 12:29 ` shengyong
@ 2015-03-07 4:26 ` shengyong
0 siblings, 0 replies; 5+ messages in thread
From: shengyong @ 2015-03-07 4:26 UTC
To: Naoya Horiguchi
Cc: akpm, gregkh, nfont, linux-mm, zhenzhang.zhang, Dave Hansen,
David Rientjes
Ping.
The original idea behind this interface was to get the real size of the section.
Then I thought it might be more useful if it gave the address ranges of the section,
so that we can know where the holes are. As Naoya said, I didn't consider the NUMA
situation. So if the interface helps, I could try to cover NUMA in it.
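As a rough illustration of both uses, a userspace sketch could derive the real
size and the holes from the reported ranges. The names struct phys_range,
ranges_total_size() and find_holes() are hypothetical. The sketch assumes the
ranges are sorted, do not overlap, and lie inside the block, and that the
block does not end at the top of the 64-bit address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one line of the `range' file. */
struct phys_range {
	uint64_t start;
	uint64_t end;		/* inclusive */
};

/* Real size: sum of the bytes covered by the reported ranges. */
static uint64_t ranges_total_size(const struct phys_range *r, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += r[i].end - r[i].start + 1;

	return total;
}

/*
 * Collect the holes inside [blk_start, blk_end] that no range covers.
 * Writes at most max entries to holes[] and returns how many were written.
 */
static size_t find_holes(const struct phys_range *r, size_t n,
			 uint64_t blk_start, uint64_t blk_end,
			 struct phys_range *holes, size_t max)
{
	uint64_t next = blk_start;	/* first address not yet accounted for */
	size_t cnt = 0;

	for (size_t i = 0; i < n && cnt < max; i++) {
		if (r[i].start > next) {
			holes[cnt].start = next;
			holes[cnt].end = r[i].start - 1;
			cnt++;
		}
		next = r[i].end + 1;
	}

	/* Trailing hole between the last range and the end of the block. */
	if (cnt < max && next <= blk_end) {
		holes[cnt].start = next;
		holes[cnt].end = blk_end;
		cnt++;
	}

	return cnt;
}
```

For the memory0 output in the original posting, and assuming the block spans
0x0 to 0x07ffffff, this would report three holes (0x0-0xfff, 0x8f000-0x8ffff,
0xa0000-0xfffff) and a real size of 0x7f9e000 bytes.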
thanks,
Sheng