* [PATCH 0/2] Report the pagesize backing VMAs in /proc
@ 2008-09-22 1:38 Mel Gorman
2008-09-22 1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
2008-09-22 1:38 ` [PATCH 2/2] Report the pagesize backing a VMA in /proc/pid/maps Mel Gorman
0 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 1:38 UTC (permalink / raw)
To: LKML; +Cc: Linux-MM, Mel Gorman
The following two patches add support for printing the size used for
hugepage-backed regions. This can be used by a user to verify that a
hugepage-aware application is using the expected page sizes.
The first patch should not be considered too contentious as it is highly
unlikely to break any parsers. There is a possibility that the second patch
will break parsers that arguably are already broken. More details are in
the patches themselves.
fs/proc/task_mmu.c | 29 +++++++++++++++++++++--------
include/linux/hugetlb.h | 13 +++++++++++++
2 files changed, 34 insertions(+), 8 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 1:38 [PATCH 0/2] Report the pagesize backing VMAs in /proc Mel Gorman
@ 2008-09-22 1:38 ` Mel Gorman
2008-09-22 8:30 ` Andrew Morton
2008-09-22 15:55 ` Dave Hansen
2008-09-22 1:38 ` [PATCH 2/2] Report the pagesize backing a VMA in /proc/pid/maps Mel Gorman
1 sibling, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 1:38 UTC (permalink / raw)
To: LKML; +Cc: Linux-MM, Mel Gorman
It is useful to verify that a hugepage-aware application is using the expected
pagesizes in each of its memory regions. This patch reports the pagesize
backing the VMA in /proc/pid/smaps. This should not break any sensible
parser as the file format is multi-line and it should skip information it
does not recognise.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
fs/proc/task_mmu.c | 6 ++++--
include/linux/hugetlb.h | 13 +++++++++++++
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 73d1891..81a3f91 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -394,7 +394,8 @@ static int show_smap(struct seq_file *m, void *v)
"Private_Clean: %8lu kB\n"
"Private_Dirty: %8lu kB\n"
"Referenced: %8lu kB\n"
- "Swap: %8lu kB\n",
+ "Swap: %8lu kB\n"
+ "PageSize: %8lu kB\n",
(vma->vm_end - vma->vm_start) >> 10,
mss.resident >> 10,
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)),
@@ -403,7 +404,8 @@ static int show_smap(struct seq_file *m, void *v)
mss.private_clean >> 10,
mss.private_dirty >> 10,
mss.referenced >> 10,
- mss.swap >> 10);
+ mss.swap >> 10,
+ vma_page_size(vma) >> 10);
return ret;
}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 32e0ef0..0c83445 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
return (unsigned long)PAGE_SIZE << h->order;
}
+static inline unsigned long vma_page_size(struct vm_area_struct *vma)
+{
+ struct hstate *hstate;
+
+ if (!is_vm_hugetlb_page(vma))
+ return PAGE_SIZE;
+
+ hstate = hstate_vma(vma);
+ VM_BUG_ON(!hstate);
+
+ return 1UL << (hstate->order + PAGE_SHIFT);
+}
+
static inline unsigned long huge_page_mask(struct hstate *h)
{
return h->mask;
--
1.5.6.5
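The size arithmetic in vma_page_size() can be checked outside the kernel. What
follows is only a userspace sketch, assuming a 4 KiB base page; the names
MODEL_PAGE_SHIFT, model_page_size() and bytes_to_kb() are hypothetical and are
not part of the patch or of any kernel API.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages (PAGE_SHIFT of 12). Not taken from the patch. */
#define MODEL_PAGE_SHIFT 12

/* Mirrors the expression in the patch: 1UL << (hstate->order + PAGE_SHIFT). */
static unsigned long model_page_size(unsigned int order)
{
    /* Guard against shifting past the width of unsigned long. */
    assert(order + MODEL_PAGE_SHIFT < sizeof(unsigned long) * 8);
    return 1UL << (order + MODEL_PAGE_SHIFT);
}

/* smaps reports sizes in kB, so the byte count is shifted right by 10. */
static unsigned long bytes_to_kb(unsigned long bytes)
{
    return bytes >> 10;
}
```

Under that assumption, order 0 gives 4 kB and order 10 gives 4096 kB. The
expression is arithmetically the same as the (unsigned long)PAGE_SIZE << h->order
already used by huge_page_size() in the same header.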
* [PATCH 2/2] Report the pagesize backing a VMA in /proc/pid/maps
2008-09-22 1:38 [PATCH 0/2] Report the pagesize backing VMAs in /proc Mel Gorman
2008-09-22 1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
@ 2008-09-22 1:38 ` Mel Gorman
1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 1:38 UTC (permalink / raw)
To: LKML; +Cc: Linux-MM, Mel Gorman
This patch adds a new field for hugepage-backed memory regions to show the
pagesize in /proc/pid/maps. While the information is available in smaps,
maps is more human-readable and does not incur the significant cost of
calculating Pss. An example of a /proc/self/maps output for an application
using hugepages with this patch applied is:
08048000-0804c000 r-xp 00000000 03:01 49135 /bin/cat
0804c000-0804d000 rw-p 00003000 03:01 49135 /bin/cat
08400000-08800000 rw-p 00000000 00:10 4055 /mnt/libhugetlbfs.tmp.QzPPTJ (deleted) (hpagesize=4096kB)
b7daa000-b7dab000 rw-p b7daa000 00:00 0
b7dab000-b7ed2000 r-xp 00000000 03:01 116846 /lib/tls/i686/cmov/libc-2.3.6.so
b7ed2000-b7ed7000 r--p 00127000 03:01 116846 /lib/tls/i686/cmov/libc-2.3.6.so
b7ed7000-b7ed9000 rw-p 0012c000 03:01 116846 /lib/tls/i686/cmov/libc-2.3.6.so
b7ed9000-b7edd000 rw-p b7ed9000 00:00 0
b7ee1000-b7ee8000 r-xp 00000000 03:01 49262 /root/libhugetlbfs-git/obj32/libhugetlbfs.so
b7ee8000-b7ee9000 rw-p 00006000 03:01 49262 /root/libhugetlbfs-git/obj32/libhugetlbfs.so
b7ee9000-b7eed000 rw-p b7ee9000 00:00 0
b7eed000-b7f02000 r-xp 00000000 03:01 119345 /lib/ld-2.3.6.so
b7f02000-b7f04000 rw-p 00014000 03:01 119345 /lib/ld-2.3.6.so
bf8ef000-bf903000 rwxp bffeb000 00:00 0 [stack]
bf903000-bf904000 rw-p bffff000 00:00 0
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
To be predictable for parsers, the patch adds the notion of reporting
VMA attributes by adding fields that look like "(attribute[=value])". This
already happens when a file is deleted and the user sees (deleted) after the
filename. The expectation is that existing parsers will not break, as those
that read the filename should be reading forward after the inode number
and stopping when they see something that is not part of the filename.
Parsers that assume everything after / is a filename will get confused by
(hpagesize=XkB) but are already broken due to (deleted).
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
fs/proc/task_mmu.c | 23 +++++++++++++++++------
1 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 81a3f91..80233e6 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -198,7 +198,7 @@ static int do_maps_open(struct inode *inode, struct file *file,
return ret;
}
-static int show_map(struct seq_file *m, void *v)
+static int __show_map(struct seq_file *m, void *v, int showattributes)
{
struct proc_maps_private *priv = m->private;
struct task_struct *task = priv->task;
@@ -233,8 +233,8 @@ static int show_map(struct seq_file *m, void *v)
* Print the dentry name for named mappings, and a
* special [heap] marker for the heap:
*/
+ pad_len_spaces(m, len);
if (file) {
- pad_len_spaces(m, len);
seq_path(m, &file->f_path, "\n");
} else {
const char *name = arch_vma_name(vma);
@@ -251,11 +251,17 @@ static int show_map(struct seq_file *m, void *v)
name = "[vdso]";
}
}
- if (name) {
- pad_len_spaces(m, len);
+ if (name)
seq_puts(m, name);
- }
}
+
+ /*
+ * Print additional attributes of the VMA of interest
+ * - hugepage size if hugepage-backed
+ */
+ if (showattributes && vma->vm_flags & VM_HUGETLB)
+ seq_printf(m, " (hpagesize=%lukB)", vma_page_size(vma) >> 10);
+
seq_putc(m, '\n');
if (m->count < m->size) /* vma is copied successfully */
@@ -263,6 +269,11 @@ static int show_map(struct seq_file *m, void *v)
return 0;
}
+static int show_map(struct seq_file *m, void *v)
+{
+ return __show_map(m, v, 1);
+}
+
static const struct seq_operations proc_pid_maps_op = {
.start = m_start,
.next = m_next,
@@ -381,7 +392,7 @@ static int show_smap(struct seq_file *m, void *v)
if (vma->vm_mm && !is_vm_hugetlb_page(vma))
walk_page_range(vma->vm_start, vma->vm_end, &smaps_walk);
- ret = show_map(m, v);
+ ret = __show_map(m, v, 0);
if (ret)
return ret;
--
1.5.6.5
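For parser authors, the "(attribute[=value])" convention described in patch 2/2
could be consumed roughly as follows. This is a sketch against the proposed
format only; maps_hpagesize_kb() is a hypothetical helper, and it assumes the
attribute text "(hpagesize=" does not also occur inside the filename itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the hugepage size in kB from a maps line carrying the proposed
 * "(hpagesize=NkB)" attribute, or 0 if the attribute is absent.
 */
static unsigned long maps_hpagesize_kb(const char *line)
{
    unsigned long kb = 0;
    const char *attr;

    assert(line != NULL);

    /* Other attributes such as "(deleted)" are simply skipped over. */
    attr = strstr(line, "(hpagesize=");
    if (attr == NULL)
        return 0;

    /* sscanf returns 1 once the number has been converted. */
    if (sscanf(attr, "(hpagesize=%lukB)", &kb) != 1)
        return 0;

    return kb;
}
```

Returning 0 for "no attribute" keeps the caller simple: a zero result means the
line describes a mapping that is not hugepage-backed, or a kernel without the
patch.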
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
@ 2008-09-22 8:30 ` Andrew Morton
2008-09-22 16:17 ` Mel Gorman
2008-09-22 15:55 ` Dave Hansen
1 sibling, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2008-09-22 8:30 UTC (permalink / raw)
To: Mel Gorman; +Cc: LKML, Linux-MM
On Mon, 22 Sep 2008 02:38:11 +0100 Mel Gorman <mel@csn.ul.ie> wrote:
> + vma_page_size(vma) >> 10);
>
> return ret;
> }
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 32e0ef0..0c83445 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
> return (unsigned long)PAGE_SIZE << h->order;
> }
>
> +static inline unsigned long vma_page_size(struct vm_area_struct *vma)
> +{
> + struct hstate *hstate;
> +
> + if (!is_vm_hugetlb_page(vma))
> + return PAGE_SIZE;
> +
> + hstate = hstate_vma(vma);
> + VM_BUG_ON(!hstate);
> +
> + return 1UL << (hstate->order + PAGE_SHIFT);
> +}
> +
CONFIG_HUGETLB_PAGE=n?
What did you hope to gain by inlining this?
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
2008-09-22 8:30 ` Andrew Morton
@ 2008-09-22 15:55 ` Dave Hansen
2008-09-22 16:21 ` Mel Gorman
1 sibling, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2008-09-22 15:55 UTC (permalink / raw)
To: Mel Gorman; +Cc: LKML, Linux-MM
On Mon, 2008-09-22 at 02:38 +0100, Mel Gorman wrote:
> It is useful to verify that a hugepage-aware application is using the expected
> pagesizes in each of its memory regions. This patch reports the pagesize
> backing the VMA in /proc/pid/smaps. This should not break any sensible
> parser as the file format is multi-line and it should skip information it
> does not recognise.
Time to play devil's advocate. :)
To be fair, this doesn't return the MMU pagesize backing the VMA. It
returns pagesize that hugetlb reports *or* the kernel's base PAGE_SIZE.
The ppc64 case where we have a 64k PAGE_SIZE, but no hardware 64k
support means that we'll have a 4k MMU pagesize that we're pretending is
a 64k MMU page. That might confuse someone seeing 16x the number of TLB
misses they expect.
This also doesn't work if, in the future, we get multiple page sizes
mapped under one VMA. But, I guess that all only matters if you worry
about how the kernel is treating the pages vs. the MMU hardware.
-- Dave
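The 16x figure in Dave's mail is the ratio of the two page sizes (64k / 4k).
A small arithmetic sketch makes that concrete; translations_needed() is a
hypothetical helper, and it counts distinct translations only. Real TLB miss
counts also depend on the access pattern and the TLB geometry.

```c
#include <assert.h>

/* Number of translations needed to map `bytes` using pages of `page_bytes`. */
static unsigned long translations_needed(unsigned long bytes,
                                         unsigned long page_bytes)
{
    assert(page_bytes != 0);
    return (bytes + page_bytes - 1) / page_bytes; /* round up */
}
```

For a 1 MiB region this gives 16 translations with 64k pages and 256 with 4k
pages, so a reader who believed the reported 64k would see 16 times as many
distinct translations as expected.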
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 8:30 ` Andrew Morton
@ 2008-09-22 16:17 ` Mel Gorman
0 siblings, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 16:17 UTC (permalink / raw)
To: Andrew Morton; +Cc: LKML, Linux-MM
On (22/09/08 01:30), Andrew Morton didst pronounce:
> On Mon, 22 Sep 2008 02:38:11 +0100 Mel Gorman <mel@csn.ul.ie> wrote:
>
> > + vma_page_size(vma) >> 10);
> >
> > return ret;
> > }
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index 32e0ef0..0c83445 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
> > return (unsigned long)PAGE_SIZE << h->order;
> > }
> >
> > +static inline unsigned long vma_page_size(struct vm_area_struct *vma)
> > +{
> > + struct hstate *hstate;
> > +
> > + if (!is_vm_hugetlb_page(vma))
> > + return PAGE_SIZE;
> > +
> > + hstate = hstate_vma(vma);
> > + VM_BUG_ON(!hstate);
> > +
> > + return 1UL << (hstate->order + PAGE_SHIFT);
> > +}
> > +
>
> CONFIG_HUGETLB_PAGE=n?
>
Fails miserably.
> What did you hope to gain by inlining this?
>
Inclusion with similar helper functions in the header, but it's the wrong thing
to do in this case, which is obvious when pointed out. It's too large and called
from multiple places. I'll revise the patch.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 15:55 ` Dave Hansen
@ 2008-09-22 16:21 ` Mel Gorman
2008-09-22 16:48 ` Dave Hansen
0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 16:21 UTC (permalink / raw)
To: Dave Hansen; +Cc: LKML, Linux-MM
On (22/09/08 08:55), Dave Hansen didst pronounce:
> On Mon, 2008-09-22 at 02:38 +0100, Mel Gorman wrote:
> > It is useful to verify that a hugepage-aware application is using the expected
> > pagesizes in each of its memory regions. This patch reports the pagesize
> > backing the VMA in /proc/pid/smaps. This should not break any sensible
> > parser as the file format is multi-line and it should skip information it
> > does not recognise.
>
> Time to play devil's advocate. :)
>
> To be fair, this doesn't return the MMU pagesize backing the VMA. It
> returns pagesize that hugetlb reports *or* the kernel's base PAGE_SIZE.
>
True. In the vast majority of cases, this is the MMU size with ppc64 on
pro
> The ppc64 case where we have a 64k PAGE_SIZE, but no hardware 64k
> support means that we'll have a 4k MMU pagesize that we're pretending is
> a 64k MMU page. That might confuse someone seeing 16x the number of TLB
> misses they expect.
The corollary is that someone running with a 64K base page kernel may be
surprised that the pagesize is always 4K. However I'll check if there is
a simple way of checking out if the MMU size differs from PAGE_SIZE.
> This also doesn't work if, in the future, we get multiple page sizes
> mapped under one VMA. But, I guess that all only matters if you worry
> about how the kernel is treating the pages vs. the MMU hardware.
>
Will deal with that problem if and when we encounter it. It may be a
case that VMAs split or that we could report how many pages of each MMU
size are in that VMA.
Thanks
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 16:21 ` Mel Gorman
@ 2008-09-22 16:48 ` Dave Hansen
2008-09-23 12:15 ` KOSAKI Motohiro
0 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2008-09-22 16:48 UTC (permalink / raw)
To: Mel Gorman; +Cc: LKML, Linux-MM
On Mon, 2008-09-22 at 17:21 +0100, Mel Gorman wrote:
> The corollary is that someone running with a 64K base page kernel may be
> surprised that the pagesize is always 4K. However I'll check if there is
> a simple way of checking out if the MMU size differs from PAGE_SIZE.
Sure. If it isn't easy, the best thing to do is probably just to
document the "interesting" behavior.
-- Dave
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-22 16:48 ` Dave Hansen
@ 2008-09-23 12:15 ` KOSAKI Motohiro
2008-09-23 19:46 ` Mel Gorman
0 siblings, 1 reply; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-09-23 12:15 UTC (permalink / raw)
To: Dave Hansen; +Cc: kosaki.motohiro, Mel Gorman, LKML, Linux-MM
> > The corollary is that someone running with a 64K base page kernel may be
> > surprised that the pagesize is always 4K. However I'll check if there is
> > a simple way of checking out if the MMU size differs from PAGE_SIZE.
>
> Sure. If it isn't easy, the best thing to do is probably just to
> document the "interesting" behavior.
Dave, please let me know whether getpagesize() returns 4k or 64k on ppc64.
I think the PageSize line of /proc/pid/smaps and the getpagesize() result should match.
Otherwise, end users may be confused.
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-23 12:15 ` KOSAKI Motohiro
@ 2008-09-23 19:46 ` Mel Gorman
2008-09-24 12:32 ` KOSAKI Motohiro
0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-09-23 19:46 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: Dave Hansen, LKML, Linux-MM
On (23/09/08 21:15), KOSAKI Motohiro didst pronounce:
> > > The corollary is that someone running with a 64K base page kernel may be
> > > surprised that the pagesize is always 4K. However I'll check if there is
> > > a simple way of checking out if the MMU size differs from PAGE_SIZE.
> >
> > Sure. If it isn't easy, the best thing to do is probably just to
> > document the "interesting" behavior.
>
> Dave, please let me know whether getpagesize() returns 4k or 64k on ppc64.
> I think the PageSize line of /proc/pid/smaps and the getpagesize() result should match.
>
> Otherwise, end users may be confused.
>
To distinguish between the two, I now report the kernel pagesize and the
mmu pagesize like so
KernelPageSize: 64 kB
MMUPageSize: 4 kB
This is running a kernel with a 64K base pagesize on a PPC970MP which
does not support 64K hardware pagesizes.
Does this make sense?
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
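A consumer of the two fields Mel shows above could read them with a small
line-oriented helper. This is a sketch only: smaps_field_kb() is a hypothetical
name, and it assumes each field appears on its own line in the
"Name: <value> kB" layout used by the other smaps fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * If `line` is an smaps field named `key` (for example "KernelPageSize"),
 * store its value in kB through `kb` and return 1; otherwise return 0.
 */
static int smaps_field_kb(const char *line, const char *key, unsigned long *kb)
{
    size_t len;

    assert(line != NULL && key != NULL && kb != NULL);
    len = strlen(key);

    /* The key must be followed directly by ':' to avoid prefix matches. */
    if (strncmp(line, key, len) != 0 || line[len] != ':')
        return 0;

    /* %lu skips the padding spaces before the number. */
    return sscanf(line + len + 1, "%lu kB", kb) == 1;
}
```

A caller would feed it each line of /proc/pid/smaps in turn and compare the
KernelPageSize and MMUPageSize values per mapping; lines with unknown field
names are ignored, which is the behaviour patch 1/2 expects of a sensible
parser.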
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-23 19:46 ` Mel Gorman
@ 2008-09-24 12:32 ` KOSAKI Motohiro
2008-09-24 15:41 ` Mel Gorman
0 siblings, 1 reply; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-09-24 12:32 UTC (permalink / raw)
To: Mel Gorman; +Cc: kosaki.motohiro, Dave Hansen, LKML, Linux-MM
> > Dave, please let me know whether getpagesize() returns 4k or 64k on ppc64.
> > I think the PageSize line of /proc/pid/smaps and the getpagesize() result should match.
> >
> > Otherwise, end users may be confused.
> >
>
> To distinguish between the two, I now report the kernel pagesize and the
> mmu pagesize like so
>
> KernelPageSize: 64 kB
> MMUPageSize: 4 kB
>
> This is running a kernel with a 64K base pagesize on a PPC970MP which
> does not support 64K hardware pagesizes.
>
> Does this make sense?
Hmmm, who wants this information?
I agree that
- An administrator wants to know whether these pages are normal or huge.
- An administrator wants to know the hugepage size.
(e.g. x86_64 has two hugepage sizes (2M and 1G))
but the ppc64 case above seems to be deeply implementation-dependent information and
nobody wants to know it.
It seems a bottleneck for future enhancement.
So I disagree with
- showing both KernelPageSize and MMUPageSize for normal pages.
I like the following two choices:
1) For normal pages, show PAGE_SIZE,
because any userland application works with pagesize==PAGE_SIZE
on the current powerpc architecture,
because
fs/binfmt_elf.c
------------------------------
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
unsigned long load_addr, unsigned long interp_load_addr)
{
(snip)
NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE); /* pass ELF_EXEC_PAGESIZE to libc */
include/asm-powerpc/elf.h
-----------------------------
#define ELF_EXEC_PAGESIZE PAGE_SIZE
2) For normal pages, display no page size at all;
display the page size only in the hugepage case,
because an administrator wants to know the hugepage size only. (AFAICS)
Thoughts?
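The AT_PAGESZ entry quoted above can be observed from userspace. The sketch
below assumes Linux with a C library that provides getauxval() (glibc 2.16 or
later); page_size_from_auxv() and page_size_from_sysconf() are hypothetical
wrapper names used only for illustration.

```c
#include <assert.h>
#include <sys/auxv.h>  /* getauxval(), AT_PAGESZ */
#include <unistd.h>    /* sysconf(), _SC_PAGESIZE */

/* Page size the kernel placed in the ELF auxiliary vector at exec time. */
static unsigned long page_size_from_auxv(void)
{
    return getauxval(AT_PAGESZ);
}

/* Page size as reported by the C library. */
static long page_size_from_sysconf(void)
{
    long sz = sysconf(_SC_PAGESIZE);

    assert(sz > 0);
    return sz;
}
```

On a Linux system the two values are expected to agree, which is the basis for
the argument that userland sees PAGE_SIZE regardless of the MMU page size
actually in use.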
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 12:32 ` KOSAKI Motohiro
@ 2008-09-24 15:41 ` Mel Gorman
2008-09-24 16:06 ` Dave Hansen
2008-09-25 12:23 ` KOSAKI Motohiro
0 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 15:41 UTC (permalink / raw)
To: KOSAKI Motohiro; +Cc: Dave Hansen, LKML, Linux-MM
On (24/09/08 21:32), KOSAKI Motohiro didst pronounce:
> > > Dave, please let me know whether getpagesize() returns 4k or 64k on ppc64.
> > > I think the PageSize line of /proc/pid/smaps and the getpagesize() result should match.
> > >
> > > Otherwise, end users may be confused.
> > >
> >
> > To distinguish between the two, I now report the kernel pagesize and the
> > mmu pagesize like so
> >
> > KernelPageSize: 64 kB
> > MMUPageSize: 4 kB
> >
> > This is running a kernel with a 64K base pagesize on a PPC970MP which
> > does not support 64K hardware pagesizes.
> >
> > Does this make sense?
>
> Hmmm, who wants this information?
>
Someone doing performance analysis on POWER may want it. If they switched to
a large base page size without using hugetlbfs at all and saw the same number
of TLB misses, it could be explained by the lower MMU pagesize. Admittedly,
they should have known the hardware didn't support that pagesize.
> I agree that
> - An administrator wants to know whether these pages are normal or huge.
> - An administrator wants to know the hugepage size.
> (e.g. x86_64 has two hugepage sizes (2M and 1G))
>
> but the ppc64 case above seems to be deeply implementation-dependent information and
> nobody wants to know it.
>
I admit it's ppc64-specific. In the latest patch series, I made this a
separate patch so that it could be readily dropped again for this reason.
Maybe an alternative would be to display MMUPageSize *only* where it differs
from KernelPageSize. Would that be better or similarly confusing?
> It seems a bottleneck for future enhancement.
>
I'm not sure what you mean by it being a bottleneck
> So I disagree with
> - showing both KernelPageSize and MMUPageSize for normal pages.
>
>
> I like the following two choices:
>
>
> 1) For normal pages, show PAGE_SIZE,
>
> because any userland application works with pagesize==PAGE_SIZE
> on the current powerpc architecture,
>
> because
>
> fs/binfmt_elf.c
> ------------------------------
> static int
> create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> unsigned long load_addr, unsigned long interp_load_addr)
> {
> (snip)
> NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
> NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE); /* pass ELF_EXEC_PAGESIZE to libc */
>
> include/asm-powerpc/elf.h
> -----------------------------
> #define ELF_EXEC_PAGESIZE PAGE_SIZE
>
I'm ok with this option and dropping the MMUPageSize patch as the user
should already be able to identify that the hardware does not support 64K
base pagesizes. I will leave the name as KernelPageSize so that it is still
difficult to confuse it with MMU page size.
>
> 2) For normal pages, display no page size at all;
> display the page size only in the hugepage case,
>
> because an administrator wants to know the hugepage size only. (AFAICS)
>
I prefer option 1 as it's easier to parse the presence of information
than infer from the absence of it.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 15:41 ` Mel Gorman
@ 2008-09-24 16:06 ` Dave Hansen
2008-09-24 17:10 ` Mel Gorman
2008-09-25 12:23 ` KOSAKI Motohiro
1 sibling, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2008-09-24 16:06 UTC (permalink / raw)
To: Mel Gorman; +Cc: KOSAKI Motohiro, LKML, Linux-MM
On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> I admit it's ppc64-specific. In the latest patch series, I made this a
> separate patch so that it could be readily dropped again for this reason.
> Maybe an alternative would be to display MMUPageSize *only* where it differs
> from KernelPageSize. Would that be better or similarly confusing?
I would also think that any arch implementing fallback from large to
small pages in a hugetlbfs area (Adam needs to post his patches :) would
also use this.
-- Dave
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 16:06 ` Dave Hansen
@ 2008-09-24 17:10 ` Mel Gorman
2008-09-24 18:59 ` Dave Hansen
0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 17:10 UTC (permalink / raw)
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On (24/09/08 09:06), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > I admit it's ppc64-specific. In the latest patch series, I made this a
> > separate patch so that it could be readily dropped again for this reason.
> > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > from KernelPageSize. Would that be better or similarly confusing?
>
> I would also think that any arch implementing fallback from large to
> small pages in a hugetlbfs area (Adam needs to post his patches :) would
> also use this.
>
Fair point. Maybe the thing to do is backburner this patch for the moment and
reintroduce it when/if an architecture supports demotion? The KernelPageSize
reporting in smaps and the hpagesize in maps are still useful though,
I believe. Any comment?
(future stuff from here on)
In the future if demotion does happen then the MMUPageSize information may
be genuinely useful instead of just a curious oddity on ppc64. As you point
out, Adam (added to cc) has worked on this area (starting with x86 demotion)
in the past but it's a while before it'll be considered for merging I believe.
That aside, more would need to be done with the page size reporting then
anyway. For example, it may indicate how much of each pagesize is in a VMA
or indicate that KernelPageSize is what is being requested but in reality
it is mixed, like:
KernelPageSize: 2048 kB (mixed)
or
KernelPageSize: 2048 kB * 5, 4096 kB * 20
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 17:10 ` Mel Gorman
@ 2008-09-24 18:59 ` Dave Hansen
2008-09-24 19:11 ` Mel Gorman
0 siblings, 1 reply; 24+ messages in thread
From: Dave Hansen @ 2008-09-24 18:59 UTC (permalink / raw)
To: Mel Gorman; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On Wed, 2008-09-24 at 18:10 +0100, Mel Gorman wrote:
> On (24/09/08 09:06), Dave Hansen didst pronounce:
> > On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > > I admit it's ppc64-specific. In the latest patch series, I made this a
> > > separate patch so that it could be readily dropped again for this reason.
> > > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > > from KernelPageSize. Would that be better or similarly confusing?
> >
> > I would also think that any arch implementing fallback from large to
> > small pages in a hugetlbfs area (Adam needs to post his patches :) would
> > also use this.
> >
>
> Fair point. Maybe the thing to do is backburner this patch for the moment and
> reintroduce it when/if an architecture supports demotion? The KernelPageSize
> reporting in smaps and the hpagesize in maps are still useful though,
> I believe. Any comment?
I'd kinda prefer to see it normalized into a single place rather than
sprinkle it in each smaps file. We should be able to figure out which
mount the file is from and, from there, maybe we need some per-mount
information exported.
> (future stuff from here on)
>
> In the future if demotion does happen then the MMUPageSize information may
> be genuinely useful instead of just a curious oddity on ppc64. As you point
> out, Adam (added to cc) has worked on this area (starting with x86 demotion)
> in the past but it's a while before it'll be considered for merging I believe.
>
> That aside, more would need to be done with the page size reporting then
> anyway. For example, it may indicate how much of each pagesize is in a VMA
> or indicate that KernelPageSize is what is being requested but in reality
> it is mixed, like:
>
> KernelPageSize: 2048 kB (mixed)
>
> or
>
> KernelPageSize: 2048 kB * 5, 4096 kB * 20
Looks a bit verbose, but I agree with the sentiment.
-- Dave
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 18:59 ` Dave Hansen
@ 2008-09-24 19:11 ` Mel Gorman
2008-09-24 19:23 ` Dave Hansen
0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 19:11 UTC (permalink / raw)
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On (24/09/08 11:59), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 18:10 +0100, Mel Gorman wrote:
> > On (24/09/08 09:06), Dave Hansen didst pronounce:
> > > On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > > > I admit it's ppc64-specific. In the latest patch series, I made this a
> > > > separate patch so that it could be readily dropped again for this reason.
> > > > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > > > from KernelPageSize. Would that be better or similarly confusing?
> > >
> > > I would also think that any arch implementing fallback from large to
> > > small pages in a hugetlbfs area (Adam needs to post his patches :) would
> > > also use this.
> > >
> >
> > Fair point. Maybe the thing to do is backburner this patch for the moment and
> > reintroduce it when/if an architecture supports demotion? The KernelPageSize
> > reporting in smaps and the hpagesize in maps are still useful though
> > I believe. Any comment?
>
> I'd kinda prefer to see it normalized into a single place rather than
> sprinkle it in each smaps file.
I don't get what you mean by it being sprinkled in each smaps file. How
would you present the data?
> We should be able to figure out which
> mount the file is from and, from there, maybe we need some per-mount
> information exported.
>
Per-mount information is already exported and you can infer the data about
huge pagesizes. For example, if you know the default huge pagesize (from
/proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
and there is no pagesize= mount option (mounts again), you could guess what the
hugepage that is backing a VMA is. Shared memory segments are a little harder
but again, you can infer the information if you look around for long enough.
However, this is awkward and not very user-friendly. With the patches (minus
MMUPageSize as I think we've agreed to postpone that), it's easy to see what
pagesize is being used at a glance. Without it, you need to know a fair bit
about how hugepages are implemented in Linux to infer the information correctly.
> > (future stuff from here on)
> >
> > In the future if demotion does happen then the MMUPageSize information may
> > be genuinely useful instead of just a curious oddity on ppc64. As you point
> > out, Adam (added to cc) has worked on this area (starting with x86 demotion)
> > in the past but it's a while before it'll be considered for merging I believe.
> >
> > That aside, more would need to be done with the page size reporting then
> > anyway. For example, it could indicate how much of each pagesize is in a VMA
> > or indicate that KernelPageSize is what is being requested but in reality
> > it is mixed like;
> >
> > KernelPageSize: 2048 kB (mixed)
> >
> > or
> >
> > KernelPageSize: 2048 kB * 5, 4096 kB * 20
>
> Looks a bit verbose, but I agree with the sentiment.
>
Grand, I'll keep note of this to revisit it in the future when/if
pagesizes get mixed in a VMA. Thanks
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 19:11 ` Mel Gorman
@ 2008-09-24 19:23 ` Dave Hansen
2008-09-24 23:39 ` Mel Gorman
2008-09-24 23:42 ` Mel Gorman
0 siblings, 2 replies; 24+ messages in thread
From: Dave Hansen @ 2008-09-24 19:23 UTC (permalink / raw)
To: Mel Gorman; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote:
> I don't get what you mean by it being sprinkled in each smaps file. How
> would you present the data?
1. figure out what the file path is from smaps
2. look up the mount
3. look up the page sizes from the mount's information
> > We should be able to figure out which
> > mount the file is from and, from there, maybe we need some per-mount
> > information exported.
>
> Per-mount information is already exported and you can infer the data about
> huge pagesizes. For example, if you know the default huge pagesize (from
> /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
> and there is no pagesize= mount option (mounts again), you could guess what the
> hugepage that is backing a VMA is. Shared memory segments are a little harder
> but again, you can infer the information if you look around for long enough.
>
> However, this is awkward and not very user-friendly. With the patches (minus
> MMUPageSize as I think we've agreed to postpone that), it's easy to see what
> pagesize is being used at a glance. Without it, you need to know a fair bit
> about how hugepages are implemented in Linux to infer the information correctly.
I agree completely. But, if we consider this a user ABI thing, then
we're stuck with it for a long time, and we better make it flexible
enough to at least contain the gunk we're planning on adding in a small
number of years, like the fallback. We don't want to be adding this
stuff if it isn't going to be stable.
-- Dave
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 19:23 ` Dave Hansen
@ 2008-09-24 23:39 ` Mel Gorman
2008-09-24 23:42 ` Mel Gorman
1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 23:39 UTC (permalink / raw)
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On (24/09/08 12:23), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote:
> > I don't get what you mean by it being sprinkled in each smaps file. How
> > would you present the data?
>
> 1. figure out what the file path is from smaps
> 2. look up the mount
> 3. look up the page sizes from the mount's information
>
You should be able to do that today but it's not a particularly friendly
task. I expect without decent knowledge of how hugepages work that you'll get
it wrong. A userspace tool could do this of course and likely would use stat
on the file to get the blocksize if it was hugetlbfs instead of consulting
mounts. It's just not as user-friendly. Consider "cat smaps" as opposed to
"download this tool, run it and it'll give you an smaps-like output".
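A userspace check along the lines described above might look like the following minimal sketch. It is an illustration, not code from this thread: the helper name backing_hugepage_size is hypothetical, and it uses statfs(2) rather than stat(2) because statfs also reports the filesystem type, which is an assumption of this sketch about how such a tool would be written.

```c
#include <stdio.h>
#include <sys/vfs.h>

/*
 * Magic number of hugetlbfs as defined in <linux/magic.h>; repeated here
 * so the sketch does not depend on kernel headers being installed.
 */
#define SKETCH_HUGETLBFS_MAGIC 0x958458f6U

/*
 * Hypothetical helper: return the huge page size in bytes if path lives
 * on hugetlbfs, 0 if it lives on another filesystem, -1 on error.
 */
long backing_hugepage_size(const char *path)
{
	struct statfs sfs;

	if (statfs(path, &sfs) != 0)
		return -1;

	/* Compare as 32-bit values so the sign of f_type does not matter. */
	if ((unsigned int)sfs.f_type != SKETCH_HUGETLBFS_MAGIC)
		return 0;

	/* On hugetlbfs the reported block size is the huge page size. */
	return (long)sfs.f_bsize;
}
```

A tool would still have to extract each file path from maps or smaps before calling this, and it does not cover shared memory segments, which is part of why it is less convenient than reading the value directly.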
> > > We should be able to figure out which
> > > mount the file is from and, from there, maybe we need some per-mount
> > > information exported.
> >
> > Per-mount information is already exported and you can infer the data about
> > huge pagesizes. For example, if you know the default huge pagesize (from
> > /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
> > and there is no pagesize= mount option (mounts again), you could guess what the
> > hugepage that is backing a VMA is. Shared memory segments are a little harder
> > but again, you can infer the information if you look around for long enough.
> >
> > However, this is awkward and not very user-friendly. With the patches (minus
> > MMUPageSize as I think we've agreed to postpone that), it's easy to see what
> > pagesize is being used at a glance. Without it, you need to know a fair bit
> > about how hugepages are implemented in Linux to infer the information correctly.
>
> I agree completely. But, if we consider this a user ABI thing, then
> we're stuck with it for a long time, and we better make it flexible
> enough to at least contain the gunk we're planning on adding in a small
> number of years, like the fallback. We don't want to be adding this
> stuff if it isn't going to be stable.
>
What's wrong with
KernelPageSize: X kB
now which a parser can easily handle and later
KernelPageSize: X kB * nX Y kB * nY
where X is a pagesize and nX is the number of pages of that size in a VMA?
The second format should not break a naive parser.
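To illustrate the claim about naive parsers, here is a minimal sketch of one. The function name parse_kernel_pagesize_kb is hypothetical, and the extended line in the usage note is only the format proposed above, not something the kernel emits.

```c
#include <stdio.h>

/*
 * Hypothetical naive parser: extract the first page size, in kB, from a
 * single smaps line. Returns 1 on success, 0 if the line is not a
 * KernelPageSize line.
 */
int parse_kernel_pagesize_kb(const char *line, unsigned long *kb)
{
	/* Only the first number is converted; the rest of the line is ignored. */
	return sscanf(line, "KernelPageSize: %lu kB", kb) == 1;
}
```

Given "KernelPageSize: 2048 kB" or the proposed "KernelPageSize: 2048 kB * 5 4096 kB * 20", this sketch stores 2048 in both cases, because sscanf stops once the format string is exhausted. A parser that insisted on the line ending right after "kB" would not be as forgiving.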
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 19:23 ` Dave Hansen
2008-09-24 23:39 ` Mel Gorman
@ 2008-09-24 23:42 ` Mel Gorman
1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 23:42 UTC (permalink / raw)
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM
On (24/09/08 12:23), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote:
> > I don't get what you mean by it being sprinkled in each smaps file. How
> > would you present the data?
>
> 1. figure out what the file path is from smaps
> 2. look up the mount
> 3. look up the page sizes from the mount's information
>
> > > We should be able to figure out which
> > > mount the file is from and, from there, maybe we need some per-mount
> > > information exported.
> >
> > Per-mount information is already exported and you can infer the data about
> > huge pagesizes. For example, if you know the default huge pagesize (from
> > /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
> > and there is no pagesize= mount option (mounts again), you could guess what the
> > hugepage that is backing a VMA is. Shared memory segments are a little harder
> > but again, you can infer the information if you look around for long enough.
> >
> > However, this is awkward and not very user-friendly. With the patches (minus
> > MMUPageSize as I think we've agreed to postpone that), it's easy to see what
> > pagesize is being used at a glance. Without it, you need to know a fair bit
> > about how hugepages are implemented in Linux to infer the information correctly.
>
> I agree completely. But, if we consider this a user ABI thing, then
> we're stuck with it for a long time, and we better make it flexible
> enough to at least contain the gunk we're planning on adding in a small
> number of years, like the fallback. We don't want to be adding this
> stuff if it isn't going to be stable.
>
This could also be done as
KernelPageSize == Kernel page size that is ideally used in this VMA
and later
MixedPageSize == Breakdown of the pagesizes that are used in the VMA
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-09-24 15:41 ` Mel Gorman
2008-09-24 16:06 ` Dave Hansen
@ 2008-09-25 12:23 ` KOSAKI Motohiro
1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-09-25 12:23 UTC (permalink / raw)
To: Mel Gorman; +Cc: kosaki.motohiro, Dave Hansen, LKML, Linux-MM
Hi!
> > 1) in normal page, show PAGE_SIZE
> >
> > because any userland application works as if pagesize==PAGE_SIZE
> > on current powerpc architecture.
> >
> > because
> >
> > fs/binfmt_elf.c
> > ------------------------------
> > static int
> > create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> > unsigned long load_addr, unsigned long interp_load_addr)
> > {
> > (snip)
> > NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
> > NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE); /* pass ELF_EXEC_PAGESIZE to libc */
> >
> > include/asm-powerpc/elf.h
> > -----------------------------
> > #define ELF_EXEC_PAGESIZE PAGE_SIZE
> >
>
> I'm ok with this option and dropping the MMUPageSize patch as the user
> should already be able to identify that the hardware does not support 64K
> base pagesizes. I will leave the name as KernelPageSize so that it is still
> difficult to confuse it with MMU page size.
>
> >
> > 2) for normal pages, display no page size;
> > display the page size only in the hugepage case.
> >
> > because an administrator wants to know the hugepage size only. (AFAICS)
> >
>
> I prefer option 1 as it's easier to parse the presence of information
> than infer from the absence of it.
OK.
I'll review and test your latest patch without the MMUPageSize part.
(maybe around midnight today or tomorrow)
Thanks!
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-10-08 21:38 ` Alexey Dobriyan
2008-10-09 2:16 ` KOSAKI Motohiro
@ 2008-10-09 10:24 ` Mel Gorman
1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-10-09 10:24 UTC (permalink / raw)
To: Alexey Dobriyan; +Cc: akpm, kosaki.motohiro, dave, linux-mm, linux-kernel
On (09/10/08 01:38), Alexey Dobriyan didst pronounce:
> On Fri, Oct 03, 2008 at 05:46:54PM +0100, Mel Gorman wrote:
> > It is useful to verify a hugepage-aware application is using the expected
> > pagesizes for its memory regions. This patch creates an entry called
> > KernelPageSize in /proc/pid/smaps that is the size of page used by the
> > kernel to back a VMA. The entry is not called PageSize as it is possible
> > the MMU uses a different size. This extension should not break any sensible
> > parser that skips lines containing unrecognised information.
>
> > + "KernelPageSize: %8lu kB\n",
>
> > +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> > +{
> > + struct hstate *hstate;
> > +
> > + if (!is_vm_hugetlb_page(vma))
> > + return PAGE_SIZE;
> > +
> > + hstate = hstate_vma(vma);
> > + VM_BUG_ON(!hstate);
> > +
> > + return 1UL << (hstate->order + PAGE_SHIFT);
> ^^^^
> VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.
>
Ok, will drop it. I used the VM_BUG_ON so if the situation was triggered,
it would come with line numbers but it'll be an obvious oops so I guess it
is redundant.
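For reference, the arithmetic in the quoted function can be checked in userspace with a small sketch. SKETCH_PAGE_SHIFT and both function names are hypothetical, and the value 12 (4 kB base pages) is an assumption for illustration; the real PAGE_SHIFT depends on the architecture and kernel configuration.

```c
/* Assumed value for illustration: 4 kB base pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Mirrors the arithmetic in the quoted function: bytes per page for a
 * given hstate order.
 */
unsigned long sketch_pagesize_bytes(unsigned int order)
{
	return 1UL << (order + SKETCH_PAGE_SHIFT);
}

/* smaps reports the value in kB, i.e. the byte count shifted right by 10. */
unsigned long sketch_pagesize_kb(unsigned int order)
{
	return sketch_pagesize_bytes(order) >> 10;
}
```

Under that assumption, order 0 gives 4 kB and order 9 gives 2048 kB, which is the kind of value the KernelPageSize line would show for a 2 MB huge page.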
> Also, in /proc/*/maps it's printed only for hugetlb vmas and called
> hpagesize,
Well yes... because it's a huge pagesize for that VMA. The name reflects
what is being described there.
> in smaps it's printed for every vma and called
> KernelPageSize. All of this is inconsistent.
>
In smaps, we are printing for every VMA because it's easier for parsers to
deal with the presence of information than its absence. The name KernelPageSize
there is an accurate description.
I don't feel it is inconsistent.
> And app will verify once that hugepages are of right size, so Pss cost
> argument for changing /proc/*/maps seems weak to me.
>
Let's say someone wanted to monitor an application to see what its use of
hugepages was over time. They would have to constantly incur the PSS
cost to do that, which seems a bit unfair.
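As a rough sketch of such a monitoring step, the function below totals the size of VMAs whose KernelPageSize exceeds the base page size, reading smaps-formatted text. The name hugepage_backed_kb is hypothetical, and the sketch assumes the "Size:" line precedes "KernelPageSize:" within each VMA block, as in the patch under discussion.

```c
#include <stdio.h>

/*
 * Hypothetical monitor step: read smaps-formatted text from fp and return
 * the total size, in kB, of VMAs whose KernelPageSize exceeds base_kb.
 */
unsigned long hugepage_backed_kb(FILE *fp, unsigned long base_kb)
{
	char line[512];
	unsigned long size_kb = 0, page_kb = 0, total = 0;

	while (fgets(line, sizeof(line), fp)) {
		/* "Size:" appears once per VMA; remember the latest value. */
		if (sscanf(line, "Size: %lu kB", &size_kb) == 1)
			continue;
		/* Count the VMA only if it is backed by larger pages. */
		if (sscanf(line, "KernelPageSize: %lu kB", &page_kb) == 1 &&
		    page_kb > base_kb)
			total += size_kb;
	}
	return total;
}
```

Calling this repeatedly on a freshly opened /proc/pid/smaps makes the kernel regenerate the full per-VMA statistics on every read, which is the recurring cost referred to above.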
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-10-08 21:38 ` Alexey Dobriyan
@ 2008-10-09 2:16 ` KOSAKI Motohiro
2008-10-09 10:24 ` Mel Gorman
1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-10-09 2:16 UTC (permalink / raw)
To: Alexey Dobriyan
Cc: kosaki.motohiro, Mel Gorman, akpm, dave, linux-mm, linux-kernel
Hi
> > It is useful to verify a hugepage-aware application is using the expected
> > pagesizes for its memory regions. This patch creates an entry called
> > KernelPageSize in /proc/pid/smaps that is the size of page used by the
> > kernel to back a VMA. The entry is not called PageSize as it is possible
> > the MMU uses a different size. This extension should not break any sensible
> > parser that skips lines containing unrecognised information.
>
> > + "KernelPageSize: %8lu kB\n",
>
> > +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> > +{
> > + struct hstate *hstate;
> > +
> > + if (!is_vm_hugetlb_page(vma))
> > + return PAGE_SIZE;
> > +
> > + hstate = hstate_vma(vma);
> > + VM_BUG_ON(!hstate);
> > +
> > + return 1UL << (hstate->order + PAGE_SHIFT);
> ^^^^
> VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.
yup.
> Also, in /proc/*/maps it's printed only for hugetlb vmas and called
> hpagesize, in smaps it's printed for every vma and called
> KernelPageSize. All of this is inconsistent.
Is this a problem?
/proc/*/maps and /proc/*/smaps are files with different purposes.
/proc/*/maps: summary, condensed information, easily readable
/proc/*/smaps: verbose output
Some information is already output only in smaps.
> And app will verify once that hugepages are of right size, so Pss cost
> argument for changing /proc/*/maps seems weak to me.
Sorry, I don't understand yet.
Why would the Pss cost change?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-10-03 16:46 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
@ 2008-10-08 21:38 ` Alexey Dobriyan
2008-10-09 2:16 ` KOSAKI Motohiro
2008-10-09 10:24 ` Mel Gorman
0 siblings, 2 replies; 24+ messages in thread
From: Alexey Dobriyan @ 2008-10-08 21:38 UTC (permalink / raw)
To: Mel Gorman; +Cc: akpm, kosaki.motohiro, dave, linux-mm, linux-kernel
On Fri, Oct 03, 2008 at 05:46:54PM +0100, Mel Gorman wrote:
> It is useful to verify a hugepage-aware application is using the expected
> pagesizes for its memory regions. This patch creates an entry called
> KernelPageSize in /proc/pid/smaps that is the size of page used by the
> kernel to back a VMA. The entry is not called PageSize as it is possible
> the MMU uses a different size. This extension should not break any sensible
> parser that skips lines containing unrecognised information.
> + "KernelPageSize: %8lu kB\n",
> +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> +{
> + struct hstate *hstate;
> +
> + if (!is_vm_hugetlb_page(vma))
> + return PAGE_SIZE;
> +
> + hstate = hstate_vma(vma);
> + VM_BUG_ON(!hstate);
> +
> + return 1UL << (hstate->order + PAGE_SHIFT);
^^^^
VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.
Also, in /proc/*/maps it's printed only for hugetlb vmas and called
hpagesize, in smaps it's printed for every vma and called
KernelPageSize. All of this is inconsistent.
And app will verify once that hugepages are of right size, so Pss cost
argument for changing /proc/*/maps seems weak to me.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
2008-10-03 16:46 [PATCH 0/2] Report the size of pages backing VMAs in /proc V3 Mel Gorman
@ 2008-10-03 16:46 ` Mel Gorman
2008-10-08 21:38 ` Alexey Dobriyan
0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-10-03 16:46 UTC (permalink / raw)
To: akpm; +Cc: Mel Gorman, kosaki.motohiro, dave, linux-mm, linux-kernel
It is useful to verify a hugepage-aware application is using the expected
pagesizes for its memory regions. This patch creates an entry called
KernelPageSize in /proc/pid/smaps that is the size of page used by the
kernel to back a VMA. The entry is not called PageSize as it is possible
the MMU uses a different size. This extension should not break any sensible
parser that skips lines containing unrecognised information.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
fs/proc/task_mmu.c | 6 ++++--
include/linux/hugetlb.h | 3 +++
mm/hugetlb.c | 17 +++++++++++++++++
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f6add87..beb884d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -402,7 +402,8 @@ static int show_smap(struct seq_file *m, void *v)
"Private_Clean: %8lu kB\n"
"Private_Dirty: %8lu kB\n"
"Referenced: %8lu kB\n"
- "Swap: %8lu kB\n",
+ "Swap: %8lu kB\n"
+ "KernelPageSize: %8lu kB\n",
(vma->vm_end - vma->vm_start) >> 10,
mss.resident >> 10,
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)),
@@ -411,7 +412,8 @@ static int show_smap(struct seq_file *m, void *v)
mss.private_clean >> 10,
mss.private_dirty >> 10,
mss.referenced >> 10,
- mss.swap >> 10);
+ mss.swap >> 10,
+ vma_kernel_pagesize(vma) >> 10);
if (m->count < m->size) /* vma is copied successfully */
m->version = (vma != get_gate_vma(task)) ? vma->vm_start : 0;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 32e0ef0..ace04a7 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -231,6 +231,8 @@ static inline unsigned long huge_page_size(struct hstate *h)
return (unsigned long)PAGE_SIZE << h->order;
}
+extern unsigned long vma_kernel_pagesize(struct vm_area_struct *vma);
+
static inline unsigned long huge_page_mask(struct hstate *h)
{
return h->mask;
@@ -271,6 +273,7 @@ struct hstate {};
#define hstate_inode(i) NULL
#define huge_page_size(h) PAGE_SIZE
#define huge_page_mask(h) PAGE_MASK
+#define vma_kernel_pagesize(v) PAGE_SIZE
#define huge_page_order(h) 0
#define huge_page_shift(h) PAGE_SHIFT
static inline unsigned int pages_per_huge_page(struct hstate *h)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index adf3568..856949c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -219,6 +219,23 @@ static pgoff_t vma_hugecache_offset(struct hstate *h,
}
/*
+ * Return the size of the pages allocated when backing a VMA. In the majority
+ * cases this will be same size as used by the page table entries.
+ */
+unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
+{
+ struct hstate *hstate;
+
+ if (!is_vm_hugetlb_page(vma))
+ return PAGE_SIZE;
+
+ hstate = hstate_vma(vma);
+ VM_BUG_ON(!hstate);
+
+ return 1UL << (hstate->order + PAGE_SHIFT);
+}
+
+/*
* Flags for MAP_PRIVATE reservations. These are stored in the bottom
* bits of the reservation map pointer, which are always clear due to
* alignment.
--
1.5.6.5
^ permalink raw reply [flat|nested] 24+ messages in thread