* [PATCH 0/2] Report the pagesize backing VMAs in /proc
From: Mel Gorman @ 2008-09-22 1:38 UTC
To: LKML; +Cc: Linux-MM, Mel Gorman

The following two patches add support for printing the page size used by
hugepage-backed regions. A user can use this to verify that a hugepage-aware
application is using the expected page sizes.

The first patch should not be considered too contentious as it is highly
unlikely to break any parsers. There is a possibility that the second patch
will break parsers that are arguably already broken. More details are in the
patches themselves.

 fs/proc/task_mmu.c      |   29 +++++++++++++++++++++--------
 include/linux/hugetlb.h |   13 +++++++++++++
 2 files changed, 34 insertions(+), 8 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-22 1:38 UTC
To: LKML; +Cc: Linux-MM, Mel Gorman

It is useful to verify that a hugepage-aware application is using the expected
pagesizes in each of its memory regions. This patch reports the pagesize
backing the VMA in /proc/pid/smaps. This should not break any sensible
parser as the file format is multi-line and it should skip information it
does not recognise.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 fs/proc/task_mmu.c      |    6 ++++--
 include/linux/hugetlb.h |   13 +++++++++++++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 73d1891..81a3f91 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -394,7 +394,8 @@ static int show_smap(struct seq_file *m, void *v)
 		   "Private_Clean: %8lu kB\n"
 		   "Private_Dirty: %8lu kB\n"
 		   "Referenced: %8lu kB\n"
-		   "Swap: %8lu kB\n",
+		   "Swap: %8lu kB\n"
+		   "PageSize: %8lu kB\n",
 		   (vma->vm_end - vma->vm_start) >> 10,
 		   mss.resident >> 10,
 		   (unsigned long)(mss.pss >> (10 + PSS_SHIFT)),
@@ -403,7 +404,8 @@ static int show_smap(struct seq_file *m, void *v)
 		   mss.private_clean >> 10,
 		   mss.private_dirty >> 10,
 		   mss.referenced >> 10,
-		   mss.swap >> 10);
+		   mss.swap >> 10,
+		   vma_page_size(vma) >> 10);
 
 	return ret;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 32e0ef0..0c83445 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
 	return (unsigned long)PAGE_SIZE << h->order;
 }
 
+static inline unsigned long vma_page_size(struct vm_area_struct *vma)
+{
+	struct hstate *hstate;
+
+	if (!is_vm_hugetlb_page(vma))
+		return PAGE_SIZE;
+
+	hstate = hstate_vma(vma);
+	VM_BUG_ON(!hstate);
+
+	return 1UL << (hstate->order + PAGE_SHIFT);
+}
+
 static inline unsigned long huge_page_mask(struct hstate *h)
 {
 	return h->mask;
-- 
1.5.6.5

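The size arithmetic in vma_page_size() can be tried out in userspace. What
follows is a minimal sketch, not kernel code: SKETCH_PAGE_SHIFT and the
sketch_* function names are hypothetical stand-ins, and a 4 kB base page
(shift of 12) is assumed. It shows that 1UL << (order + PAGE_SHIFT) gives the
size in bytes and that the ">> 10" in show_smap() turns that into the kB value
printed on the PageSize line.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's PAGE_SHIFT: 12 means a 4 kB base page. */
#define SKETCH_PAGE_SHIFT 12

/* Size in bytes of one page of the given order, using the same expression
 * as the patch: 1UL << (order + PAGE_SHIFT). */
static unsigned long sketch_page_size(unsigned int order)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(order + SKETCH_PAGE_SHIFT < sizeof(unsigned long) * 8);
	return 1UL << (order + SKETCH_PAGE_SHIFT);
}

/* Value that would appear on the PageSize line: bytes >> 10, i.e. kB. */
static unsigned long sketch_page_size_kb(unsigned int order)
{
	return sketch_page_size(order) >> 10;
}
```

Under these assumptions, order 0 is the 4 kB base page and order 9 is a 2048 kB
huge page, which is the same value that PAGE_SIZE << order would give.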
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Andrew Morton @ 2008-09-22 8:30 UTC
To: Mel Gorman; +Cc: LKML, Linux-MM

On Mon, 22 Sep 2008 02:38:11 +0100 Mel Gorman <mel@csn.ul.ie> wrote:

> +		   vma_page_size(vma) >> 10);
>
>  	return ret;
>  }
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 32e0ef0..0c83445 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
>  	return (unsigned long)PAGE_SIZE << h->order;
>  }
>
> +static inline unsigned long vma_page_size(struct vm_area_struct *vma)
> +{
> +	struct hstate *hstate;
> +
> +	if (!is_vm_hugetlb_page(vma))
> +		return PAGE_SIZE;
> +
> +	hstate = hstate_vma(vma);
> +	VM_BUG_ON(!hstate);
> +
> +	return 1UL << (hstate->order + PAGE_SHIFT);
> +}
> +

CONFIG_HUGETLB_PAGE=n?

What did you hope to gain by inlining this?

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-22 16:17 UTC
To: Andrew Morton; +Cc: LKML, Linux-MM

On (22/09/08 01:30), Andrew Morton didst pronounce:
> On Mon, 22 Sep 2008 02:38:11 +0100 Mel Gorman <mel@csn.ul.ie> wrote:
>
> > +		   vma_page_size(vma) >> 10);
> >
> >  	return ret;
> >  }
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index 32e0ef0..0c83445 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -231,6 +231,19 @@ static inline unsigned long huge_page_size(struct hstate *h)
> >  	return (unsigned long)PAGE_SIZE << h->order;
> >  }
> >
> > +static inline unsigned long vma_page_size(struct vm_area_struct *vma)
> > +{
> > +	struct hstate *hstate;
> > +
> > +	if (!is_vm_hugetlb_page(vma))
> > +		return PAGE_SIZE;
> > +
> > +	hstate = hstate_vma(vma);
> > +	VM_BUG_ON(!hstate);
> > +
> > +	return 1UL << (hstate->order + PAGE_SHIFT);
> > +}
> > +
>
> CONFIG_HUGETLB_PAGE=n?
>

Fails miserably.

> What did you hope to gain by inlining this?
>

Inclusion with similar helper functions in the header but it's the wrong
thing to do in this case, obvious when pointed out. It's too large and
called from multiple places. I'll revise the patch

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

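One common way to address both review points is to declare the helper in the
header, define it out of line, and provide a trivial stub for builds without
hugetlb support. The sketch below imitates that pattern in userspace. It is an
assumption about the general shape only and does not show the revised patch:
SKETCH_CONFIG_HUGETLB_PAGE, SKETCH_PAGE_SIZE and the sketch_* types are
hypothetical stand-ins for the kernel's config symbol, PAGE_SIZE and
structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for CONFIG_HUGETLB_PAGE.
 * Remove this define to build the stub variant instead. */
#define SKETCH_CONFIG_HUGETLB_PAGE 1

#define SKETCH_PAGE_SIZE 4096UL	/* assumed base page size */

/* Simplified, hypothetical stand-ins for struct hstate and vm_area_struct. */
struct sketch_hstate { unsigned int order; };
struct sketch_vma { struct sketch_hstate *hstate; /* NULL: not hugetlb-backed */ };

#ifdef SKETCH_CONFIG_HUGETLB_PAGE
/* What a header would carry: only the declaration. */
unsigned long sketch_vma_page_size(const struct sketch_vma *vma);
#else
/* hugetlb compiled out: a trivial stub so that callers still build. */
static inline unsigned long sketch_vma_page_size(const struct sketch_vma *vma)
{
	(void)vma;
	return SKETCH_PAGE_SIZE;
}
#endif

#ifdef SKETCH_CONFIG_HUGETLB_PAGE
/* Out-of-line definition, as it would live in a .c file. */
unsigned long sketch_vma_page_size(const struct sketch_vma *vma)
{
	assert(vma != NULL);
	if (vma->hstate == NULL)
		return SKETCH_PAGE_SIZE;
	return SKETCH_PAGE_SIZE << vma->hstate->order;
}
#endif
```

With the switch defined, a VMA without an hstate reports the base page size and
one with an order-9 hstate reports 2 MB; with the switch removed, every VMA
reports the base page size.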
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Dave Hansen @ 2008-09-22 15:55 UTC
To: Mel Gorman; +Cc: LKML, Linux-MM

On Mon, 2008-09-22 at 02:38 +0100, Mel Gorman wrote:
> It is useful to verify that a hugepage-aware application is using the expected
> pagesizes in each of its memory regions. This patch reports the pagesize
> backing the VMA in /proc/pid/smaps. This should not break any sensible
> parser as the file format is multi-line and it should skip information it
> does not recognise.

Time to play devil's advocate. :)

To be fair, this doesn't return the MMU pagesize backing the VMA. It
returns pagesize that hugetlb reports *or* the kernel's base PAGE_SIZE.

The ppc64 case where we have a 64k PAGE_SIZE, but no hardware 64k
support means that we'll have a 4k MMU pagesize that we're pretending is
a 64k MMU page. That might confuse someone seeing 16x the number of TLB
misses they expect.

This also doesn't work if, in the future, we get multiple page sizes
mapped under one VMA. But, I guess that all only matters if you worry
about how the kernel is treating the pages vs. the MMU hardware.

-- Dave

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-22 16:21 UTC
To: Dave Hansen; +Cc: LKML, Linux-MM

On (22/09/08 08:55), Dave Hansen didst pronounce:
> On Mon, 2008-09-22 at 02:38 +0100, Mel Gorman wrote:
> > It is useful to verify that a hugepage-aware application is using the expected
> > pagesizes in each of its memory regions. This patch reports the pagesize
> > backing the VMA in /proc/pid/smaps. This should not break any sensible
> > parser as the file format is multi-line and it should skip information it
> > does not recognise.
>
> Time to play devil's advocate. :)
>
> To be fair, this doesn't return the MMU pagesize backing the VMA. It
> returns pagesize that hugetlb reports *or* the kernel's base PAGE_SIZE.
>

True. In the vast majority of cases, this is the MMU size with ppc64 on pro

> The ppc64 case where we have a 64k PAGE_SIZE, but no hardware 64k
> support means that we'll have a 4k MMU pagesize that we're pretending is
> a 64k MMU page. That might confuse someone seeing 16x the number of TLB
> misses they expect.

The corollary is that someone running with a 64K base page kernel may be
surprised that the pagesize is always 4K. However I'll check if there is
a simple way of checking out if the MMU size differs from PAGE_SIZE.

> This also doesn't work if, in the future, we get multiple page sizes
> mapped under one VMA. But, I guess that all only matters if you worry
> about how the kernel is treating the pages vs. the MMU hardware.
>

Will deal with that problem if and when we encounter it. It may be a case
that VMAs split or that we could report how many pages of each MMU size are
in that VMA.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Dave Hansen @ 2008-09-22 16:48 UTC
To: Mel Gorman; +Cc: LKML, Linux-MM

On Mon, 2008-09-22 at 17:21 +0100, Mel Gorman wrote:
> The corollary is that someone running with a 64K base page kernel may be
> surprised that the pagesize is always 4K. However I'll check if there is
> a simple way of checking out if the MMU size differs from PAGE_SIZE.

Sure. If it isn't easy, the best thing to do is probably just to
document the "interesting" behavior.

-- Dave

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: KOSAKI Motohiro @ 2008-09-23 12:15 UTC
To: Dave Hansen; +Cc: kosaki.motohiro, Mel Gorman, LKML, Linux-MM

> > The corollary is that someone running with a 64K base page kernel may be
> > surprised that the pagesize is always 4K. However I'll check if there is
> > a simple way of checking out if the MMU size differs from PAGE_SIZE.
>
> Sure. If it isn't easy, the best thing to do is probably just to
> document the "interesting" behavior.

Dave, please let me know whether the getpagesize() function returns 4k or
64k on ppc64.
I think the PageSize line of /proc/pid/smaps and the getpagesize() result
should match.

Otherwise, end users may be confused.

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-23 19:46 UTC
To: KOSAKI Motohiro; +Cc: Dave Hansen, LKML, Linux-MM

On (23/09/08 21:15), KOSAKI Motohiro didst pronounce:
> > > The corollary is that someone running with a 64K base page kernel may be
> > > surprised that the pagesize is always 4K. However I'll check if there is
> > > a simple way of checking out if the MMU size differs from PAGE_SIZE.
> >
> > Sure. If it isn't easy, the best thing to do is probably just to
> > document the "interesting" behavior.
>
> Dave, please let me know whether the getpagesize() function returns 4k or
> 64k on ppc64.
> I think the PageSize line of /proc/pid/smaps and the getpagesize() result
> should match.
>
> Otherwise, end users may be confused.
>

To distinguish between the two, I now report the kernel pagesize and the
mmu pagesize like so

KernelPageSize: 64 kB
MMUPageSize: 4 kB

This is running a kernel with a 64K base pagesize on a PPC970MP which
does not support 64K hardware pagesizes.

Does this make sense?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

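A parser that wants either value only needs to match the key at the start of
a line and read the number in front of "kB". The following is a minimal
userspace sketch of that, assuming the "Key: value kB" layout shown in the
message above; sketch_parse_kb is a hypothetical helper name and is not part
of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse a line such as "KernelPageSize: 64 kB".
 * Returns 1 when the line starts with the given key followed by ':' and
 * carries a value with a "kB" unit, and 0 otherwise. *kb is only meaningful
 * when 1 is returned. */
static int sketch_parse_kb(const char *line, const char *key, unsigned long *kb)
{
	size_t len;
	char unit[3];

	assert(line != NULL && key != NULL && kb != NULL);
	len = strlen(key);
	if (strncmp(line, key, len) != 0 || line[len] != ':')
		return 0;
	/* %lu skips the padding after the colon; %2s reads the unit. */
	if (sscanf(line + len + 1, "%lu %2s", kb, unit) != 2)
		return 0;
	return strcmp(unit, "kB") == 0;
}
```

Lines with other keys simply return 0, which is how a multi-line format like
smaps lets a parser skip fields it does not recognise.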
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: KOSAKI Motohiro @ 2008-09-24 12:32 UTC
To: Mel Gorman; +Cc: kosaki.motohiro, Dave Hansen, LKML, Linux-MM

> > Dave, please let me know whether the getpagesize() function returns 4k or
> > 64k on ppc64.
> > I think the PageSize line of /proc/pid/smaps and the getpagesize() result
> > should match.
> >
> > Otherwise, end users may be confused.
> >
>
> To distinguish between the two, I now report the kernel pagesize and the
> mmu pagesize like so
>
> KernelPageSize: 64 kB
> MMUPageSize: 4 kB
>
> This is running a kernel with a 64K base pagesize on a PPC970MP which
> does not support 64K hardware pagesizes.
>
> Does this make sense?

Hmmm, who wants this information?

I agree that
  - An administrator wants to know whether these pages are normal or huge.
  - An administrator wants to know the hugepage size.
    (e.g. x86_64 has two hugepage sizes (2M and 1G))

but the ppc64 case above seems to be deeply implementation-dependent
information that nobody wants to know.

It seems to be a bottleneck for future enhancement.

So I disagree with
  - showing both KernelPageSize and MMUPageSize for normal pages.


I like the following two choices


1) for normal pages, show PAGE_SIZE

   because any userland application works as if pagesize == PAGE_SIZE
   on the current powerpc architecture.

   because

   fs/binfmt_elf.c
   ------------------------------
   static int
   create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
                   unsigned long load_addr, unsigned long interp_load_addr)
   {
   (snip)
           NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
           NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);  /* pass ELF_EXEC_PAGESIZE to libc */

   include/asm-powerpc/elf.h
   -----------------------------
   #define ELF_EXEC_PAGESIZE       PAGE_SIZE


2) for normal pages, do not display any page size;
   display the page size only in the hugepage case.

   because an administrator wants the hugepage size only. (AFAICS)


Thoughts?

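The AT_PAGESZ entry quoted from fs/binfmt_elf.c is visible from userspace, so
the claim that applications see pagesize == PAGE_SIZE can be checked directly.
This is a minimal sketch, assuming a Linux system whose libc provides
getauxval() (glibc 2.16 or later); the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <sys/auxv.h>	/* getauxval(), AT_PAGESZ */
#include <unistd.h>	/* sysconf(), _SC_PAGESIZE */

/* Page size the kernel passed to this process in the ELF auxiliary vector. */
static unsigned long sketch_auxv_page_size(void)
{
	return getauxval(AT_PAGESZ);
}

/* Returns 1 when libc reports the same page size as the auxiliary vector. */
static int sketch_page_sizes_agree(void)
{
	long libc_size = sysconf(_SC_PAGESIZE);

	assert(libc_size > 0);
	return sketch_auxv_page_size() == (unsigned long)libc_size;
}
```

On a kernel built with a 64K base page both values would be expected to be
64K regardless of what the MMU supports, which is the mismatch with
MMUPageSize discussed in this thread.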
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-24 15:41 UTC
To: KOSAKI Motohiro; +Cc: Dave Hansen, LKML, Linux-MM

On (24/09/08 21:32), KOSAKI Motohiro didst pronounce:
> > > Dave, please let me know whether the getpagesize() function returns 4k or
> > > 64k on ppc64.
> > > I think the PageSize line of /proc/pid/smaps and the getpagesize() result
> > > should match.
> > >
> > > Otherwise, end users may be confused.
> > >
> >
> > To distinguish between the two, I now report the kernel pagesize and the
> > mmu pagesize like so
> >
> > KernelPageSize: 64 kB
> > MMUPageSize: 4 kB
> >
> > This is running a kernel with a 64K base pagesize on a PPC970MP which
> > does not support 64K hardware pagesizes.
> >
> > Does this make sense?
>
> Hmmm, who wants this information?
>

Someone doing performance analysis on POWER may want it. If they switched
to a large base page size without using hugetlbfs at all and saw the same
number of TLB misses, it could be explained by the lower MMU pagesize.
Admittedly, they should have known the hardware didn't support that pagesize.

> I agree that
>   - An administrator wants to know whether these pages are normal or huge.
>   - An administrator wants to know the hugepage size.
>     (e.g. x86_64 has two hugepage sizes (2M and 1G))
>
> but the ppc64 case above seems to be deeply implementation-dependent
> information that nobody wants to know.
>

I admit it's ppc64-specific. In the latest patch series, I made this a
separate patch so that it could be readily dropped again for this reason.
Maybe an alternative would be to display MMUPageSize *only* where it differs
from KernelPageSize. Would that be better or similarly confusing?

> It seems to be a bottleneck for future enhancement.
>

I'm not sure what you mean by it being a bottleneck

> So I disagree with
>   - showing both KernelPageSize and MMUPageSize for normal pages.
>
>
> I like the following two choices
>
>
> 1) for normal pages, show PAGE_SIZE
>
>    because any userland application works as if pagesize == PAGE_SIZE
>    on the current powerpc architecture.
>
>    because
>
>    fs/binfmt_elf.c
>    ------------------------------
>    static int
>    create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
>                    unsigned long load_addr, unsigned long interp_load_addr)
>    {
>    (snip)
>            NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
>            NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);  /* pass ELF_EXEC_PAGESIZE to libc */
>
>    include/asm-powerpc/elf.h
>    -----------------------------
>    #define ELF_EXEC_PAGESIZE       PAGE_SIZE
>

I'm ok with this option and dropping the MMUPageSize patch as the user
should already be able to identify that the hardware does not support 64K
base pagesizes. I will leave the name as KernelPageSize so that it is still
difficult to confuse it with MMU page size.

>
> 2) for normal pages, do not display any page size;
>    display the page size only in the hugepage case.
>
>    because an administrator wants the hugepage size only. (AFAICS)
>

I prefer option 1 as it's easier to parse the presence of information than
infer from the absence of it.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Dave Hansen @ 2008-09-24 16:06 UTC
To: Mel Gorman; +Cc: KOSAKI Motohiro, LKML, Linux-MM

On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> I admit it's ppc64-specific. In the latest patch series, I made this a
> separate patch so that it could be readily dropped again for this reason.
> Maybe an alternative would be to display MMUPageSize *only* where it differs
> from KernelPageSize. Would that be better or similarly confusing?

I would also think that any arch implementing fallback from large to
small pages in a hugetlbfs area (Adam needs to post his patches :) would
also use this.

-- Dave

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-24 17:10 UTC
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM

On (24/09/08 09:06), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > I admit it's ppc64-specific. In the latest patch series, I made this a
> > separate patch so that it could be readily dropped again for this reason.
> > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > from KernelPageSize. Would that be better or similarly confusing?
>
> I would also think that any arch implementing fallback from large to
> small pages in a hugetlbfs area (Adam needs to post his patches :) would
> also use this.
>

Fair point. Maybe the thing to do is backburner this patch for the moment and
reintroduce it when/if an architecture supports demotion? The KernelPageSize
reporting in smaps and the hpagesize in maps are still useful though,
I believe. Any comment?

(future stuff from here on)

In the future if demotion does happen then the MMUPageSize information may
be genuinely useful instead of just a curious oddity on ppc64. As you point
out, Adam (added to cc) has worked on this area (starting with x86 demotion)
in the past but it's a while before it'll be considered for merging I believe.

That aside, more would need to be done with the page size reporting then
anyway. For example, it may indicate how much of each pagesize is in a VMA
or indicate that KernelPageSize is what is being requested but in reality
it is mixed like;

KernelPageSize: 2048 kB (mixed)

or

KernelPageSize: 2048 kB * 5, 4096 kB * 20

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Dave Hansen @ 2008-09-24 18:59 UTC
To: Mel Gorman; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM

On Wed, 2008-09-24 at 18:10 +0100, Mel Gorman wrote:
> On (24/09/08 09:06), Dave Hansen didst pronounce:
> > On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > > I admit it's ppc64-specific. In the latest patch series, I made this a
> > > separate patch so that it could be readily dropped again for this reason.
> > > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > > from KernelPageSize. Would that be better or similarly confusing?
> >
> > I would also think that any arch implementing fallback from large to
> > small pages in a hugetlbfs area (Adam needs to post his patches :) would
> > also use this.
> >
>
> Fair point. Maybe the thing to do is backburner this patch for the moment and
> reintroduce it when/if an architecture supports demotion? The KernelPageSize
> reporting in smaps and the hpagesize in maps are still useful though,
> I believe. Any comment?

I'd kinda prefer to see it normalized into a single place rather than
sprinkle it in each smaps file. We should be able to figure out which
mount the file is from and, from there, maybe we need some per-mount
information exported.

> (future stuff from here on)
>
> In the future if demotion does happen then the MMUPageSize information may
> be genuinely useful instead of just a curious oddity on ppc64. As you point
> out, Adam (added to cc) has worked on this area (starting with x86 demotion)
> in the past but it's a while before it'll be considered for merging I believe.
>
> That aside, more would need to be done with the page size reporting then
> anyway. For example, it may indicate how much of each pagesize is in a VMA
> or indicate that KernelPageSize is what is being requested but in reality
> it is mixed like;
>
> KernelPageSize: 2048 kB (mixed)
>
> or
>
> KernelPageSize: 2048 kB * 5, 4096 kB * 20

Looks a bit verbose, but I agree with the sentiment.

-- Dave

* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
From: Mel Gorman @ 2008-09-24 19:11 UTC
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM

On (24/09/08 11:59), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 18:10 +0100, Mel Gorman wrote:
> > On (24/09/08 09:06), Dave Hansen didst pronounce:
> > > On Wed, 2008-09-24 at 16:41 +0100, Mel Gorman wrote:
> > > > I admit it's ppc64-specific. In the latest patch series, I made this a
> > > > separate patch so that it could be readily dropped again for this reason.
> > > > Maybe an alternative would be to display MMUPageSize *only* where it differs
> > > > from KernelPageSize. Would that be better or similarly confusing?
> > >
> > > I would also think that any arch implementing fallback from large to
> > > small pages in a hugetlbfs area (Adam needs to post his patches :) would
> > > also use this.
> > >
> >
> > Fair point. Maybe the thing to do is backburner this patch for the moment and
> > reintroduce it when/if an architecture supports demotion? The KernelPageSize
> > reporting in smaps and the hpagesize in maps are still useful though,
> > I believe. Any comment?
>
> I'd kinda prefer to see it normalized into a single place rather than
> sprinkle it in each smaps file.

I don't get what you mean by it being sprinkled in each smaps file. How
would you present the data?

> We should be able to figure out which
> mount the file is from and, from there, maybe we need some per-mount
> information exported.
>

Per-mount information is already exported and you can infer the data about
huge pagesizes. For example, if you know the default huge pagesize (from
/proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
and there is no pagesize= mount option (mounts again), you could guess what
hugepage size is backing a VMA. Shared memory segments are a little harder
but again, you can infer the information if you look around for long enough.

However, this is awkward and not very user-friendly. With the patches (minus
MMUPageSize as I think we've agreed to postpone that), it's easy to see what
pagesize is being used at a glance. Without it, you need to know a fair bit
about how hugepages are implemented in Linux to infer the information correctly.

> > (future stuff from here on)
> >
> > In the future if demotion does happen then the MMUPageSize information may
> > be genuinely useful instead of just a curious oddity on ppc64. As you point
> > out, Adam (added to cc) has worked on this area (starting with x86 demotion)
> > in the past but it's a while before it'll be considered for merging I believe.
> >
> > That aside, more would need to be done with the page size reporting then
> > anyway. For example, it may indicate how much of each pagesize is in a VMA
> > or indicate that KernelPageSize is what is being requested but in reality
> > it is mixed like;
> >
> > KernelPageSize: 2048 kB (mixed)
> >
> > or
> >
> > KernelPageSize: 2048 kB * 5, 4096 kB * 20
>
> Looks a bit verbose, but I agree with the sentiment.
>

Grand, I'll keep note of this to revisit it in the future when/if pagesizes
get mixed in a VMA.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

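The /proc/mounts part of that inference could look roughly like the sketch
below. It is illustrative only: it assumes the usual "device mountpoint fstype
options ..." line layout and a pagesize= option in the options field, and
sketch_hugetlbfs_pagesize is a hypothetical helper name. It covers neither the
/proc/meminfo default nor shared memory segments, and it leaves the option
value uninterpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Inspect one line of /proc/mounts.
 * Returns -1 if the line is not a well-formed hugetlbfs mount line, 0 if it
 * is hugetlbfs with no pagesize= option (so the default huge page size would
 * apply), and 1 if a pagesize= option was found, in which case its raw text
 * (for example "2M") is copied into out. */
static int sketch_hugetlbfs_pagesize(const char *line, char *out, size_t outlen)
{
	char fstype[32], opts[256];
	char *opt;

	assert(line != NULL && out != NULL && outlen > 0);
	/* Skip device and mount point, keep filesystem type and options. */
	if (sscanf(line, "%*s %*s %31s %255s", fstype, opts) != 2)
		return -1;
	if (strcmp(fstype, "hugetlbfs") != 0)
		return -1;
	/* Walk the comma-separated option list. */
	for (opt = strtok(opts, ","); opt != NULL; opt = strtok(NULL, ",")) {
		if (strncmp(opt, "pagesize=", 9) == 0) {
			snprintf(out, outlen, "%s", opt + 9);
			return 1;
		}
	}
	return 0;
}
```

Even with such a helper, a tool still has to match the VMA's file to the right
mount and fall back to the default size, which illustrates why reading a
single smaps line is friendlier.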
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps 2008-09-24 19:11 ` Mel Gorman @ 2008-09-24 19:23 ` Dave Hansen 2008-09-24 23:39 ` Mel Gorman 2008-09-24 23:42 ` Mel Gorman 0 siblings, 2 replies; 24+ messages in thread From: Dave Hansen @ 2008-09-24 19:23 UTC (permalink / raw) To: Mel Gorman; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote: > I don't get what you mean by it being sprinkled in each smaps file. How > would you present the data? 1. figure out what the file path is from smaps 2. look up the mount 3. look up the page sizes from the mount's information > > We should be able to figure out which > > mount the file is from and, from there, maybe we need some per-mount > > information exported. > > Per-mount information is already exported and you can infer the data about > huge pagesizes. For example, if you know the default huge pagesize (from > /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts) > and there is no pagesize= mount option (mounts again), you could guess what the > hugepage that is backing a VMA is. Shared memory segments are a little harder > but again, you can infer the information if you look around for long enough. > > However, this is awkward and not very user-friendly. With the patches (minus > MMUPageSize as I think we've agreed to postpone that), it's easy to see what > pagesize is being used at a glance. Without it, you need to know a fair bit > about hugepages are implemented in Linux to infer the information correctly. I agree completely. But, if we consider this a user ABI thing, then we're stuck with it for a long time, and we better make it flexible enough to at least contain the gunk we're planning on adding in a small number of years, like the fallback. We don't want to be adding this stuff if it isn't going to be stable. -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
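The mount lookup in steps 2 and 3 above can be sketched in userspace. This is a minimal illustration, not code from the thread: it inspects a single line in /proc/mounts format and reports whether the filesystem is hugetlbfs and whether a pagesize= option is present. The function name hugetlbfs_mount_pagesize() is hypothetical, and the sketch assumes that a pagesize= option, when one was given, shows up in the options field in the form it was passed to mount.

```c
#include <stdio.h>
#include <string.h>

/*
 * Inspect one line in /proc/mounts format:
 *   <device> <mountpoint> <fstype> <options> <dump> <pass>
 *
 * Returns 1 if the filesystem type is hugetlbfs, 0 if it is something
 * else, and -1 if the line cannot be parsed. When a "pagesize=" option
 * is present its value is copied to `pagesize` unchanged (for example
 * "2M"). Otherwise `pagesize` is set to the empty string, meaning the
 * default huge page size from /proc/meminfo would apply.
 *
 * Hypothetical helper for illustration only.
 */
int hugetlbfs_mount_pagesize(const char *line, char *pagesize, size_t len)
{
	char fstype[64], opts[256];
	const char *p;

	if (len == 0)
		return -1;
	pagesize[0] = '\0';

	/* Skip device and mount point, keep fstype and options. */
	if (sscanf(line, "%*s %*s %63s %255s", fstype, opts) != 2)
		return -1;
	if (strcmp(fstype, "hugetlbfs") != 0)
		return 0;

	/* Walk the comma-separated option list. */
	for (p = opts; p && *p; ) {
		if (strncmp(p, "pagesize=", 9) == 0) {
			size_t n = strcspn(p + 9, ",");

			if (n >= len)
				n = len - 1;
			memcpy(pagesize, p + 9, n);
			pagesize[n] = '\0';
			break;
		}
		p = strchr(p, ',');
		if (p)
			p++;
	}
	return 1;
}
```

A caller would feed it each line read from /proc/mounts and match the mount point against the file path taken from smaps. Even then, shared memory segments are not covered, which is part of why the thread calls this approach awkward.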
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-09-24 19:23 ` Dave Hansen
@ 2008-09-24 23:39 ` Mel Gorman
  2008-09-24 23:42 ` Mel Gorman
  1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 23:39 UTC
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM

On (24/09/08 12:23), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote:
> > I don't get what you mean by it being sprinkled in each smaps file. How
> > would you present the data?
>
> 1. figure out what the file path is from smaps
> 2. look up the mount
> 3. look up the page sizes from the mount's information
>

You should be able to do that today but it's not a particularly friendly
task. I expect without decent knowledge of how hugepages work that you'll
get it wrong. A userspace tool could do this of course and likely would use
stat on the file to get the blocksize if it was hugetlbfs instead of
consulting mounts. It's just not as user-friendly. Consider "cat smaps" as
opposed to "download this tool, run it and it'll give you an smaps-like
output".

> > > We should be able to figure out which
> > > mount the file is from and, from there, maybe we need some per-mount
> > > information exported.
> >
> > Per-mount information is already exported and you can infer the data about
> > huge pagesizes. For example, if you know the default huge pagesize (from
> > /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
> > and there is no pagesize= mount option (mounts again), you could guess the
> > size of the hugepage that is backing a VMA. Shared memory segments are a
> > little harder but again, you can infer the information if you look around
> > for long enough.
> >
> > However, this is awkward and not very user-friendly. With the patches (minus
> > MMUPageSize as I think we've agreed to postpone that), it's easy to see what
> > pagesize is being used at a glance.
> > Without it, you need to know a fair bit
> > about how hugepages are implemented in Linux to infer the information correctly.
>
> I agree completely.  But, if we consider this a user ABI thing, then
> we're stuck with it for a long time, and we better make it flexible
> enough to at least contain the gunk we're planning on adding in a small
> number of years, like the fallback.  We don't want to be adding this
> stuff if it isn't going to be stable.
>

What's wrong with

KernelPageSize: X kB

now, which a parser can easily handle, and later

KernelPageSize: X kB * nX Y kB * nY

where X is a pagesize and nX is the number of pages of that size in a VMA?
The second format should not break a naive parser.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
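To make the "naive parser" claim concrete, here is a minimal sketch of such a parser. It is an illustration only and not code from the thread; the function name parse_kernel_pagesize() is hypothetical. It reads the first size after the KernelPageSize: key and ignores whatever follows, so it accepts both the single-value line and the longer line with per-size counts suggested above.

```c
#include <stdio.h>

/*
 * Parse the first size from an smaps line such as
 *   "KernelPageSize:     2048 kB"
 *
 * Returns 1 and stores the size in kB in *kb when the line is a
 * KernelPageSize entry, 0 for any other line. Everything after the
 * first number is ignored, so an extended value such as
 * "2048 kB * 5 4096 kB * 20" still yields 2048.
 *
 * Hypothetical helper for illustration only.
 */
int parse_kernel_pagesize(const char *line, unsigned long *kb)
{
	/*
	 * sscanf() stops at the first mismatch. The return value counts
	 * only the %lu conversion, so trailing text cannot cause failure.
	 */
	return sscanf(line, "KernelPageSize: %lu kB", kb) == 1;
}
```

A parser written this way would keep working if the value were later extended, though it would only ever report the first page size on the line.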
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-09-24 19:23 ` Dave Hansen
  2008-09-24 23:39 ` Mel Gorman
@ 2008-09-24 23:42 ` Mel Gorman
  1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-24 23:42 UTC
To: Dave Hansen; +Cc: KOSAKI Motohiro, agl, LKML, Linux-MM

On (24/09/08 12:23), Dave Hansen didst pronounce:
> On Wed, 2008-09-24 at 20:11 +0100, Mel Gorman wrote:
> > I don't get what you mean by it being sprinkled in each smaps file. How
> > would you present the data?
>
> 1. figure out what the file path is from smaps
> 2. look up the mount
> 3. look up the page sizes from the mount's information
>
> > > We should be able to figure out which
> > > mount the file is from and, from there, maybe we need some per-mount
> > > information exported.
> >
> > Per-mount information is already exported and you can infer the data about
> > huge pagesizes. For example, if you know the default huge pagesize (from
> > /proc/meminfo), and the file is on hugetlbfs (read maps, then /proc/mounts)
> > and there is no pagesize= mount option (mounts again), you could guess the
> > size of the hugepage that is backing a VMA. Shared memory segments are a
> > little harder but again, you can infer the information if you look around
> > for long enough.
> >
> > However, this is awkward and not very user-friendly. With the patches (minus
> > MMUPageSize as I think we've agreed to postpone that), it's easy to see what
> > pagesize is being used at a glance. Without it, you need to know a fair bit
> > about how hugepages are implemented in Linux to infer the information correctly.
>
> I agree completely.  But, if we consider this a user ABI thing, then
> we're stuck with it for a long time, and we better make it flexible
> enough to at least contain the gunk we're planning on adding in a small
> number of years, like the fallback.  We don't want to be adding this
> stuff if it isn't going to be stable.
>

This could also be done as

KernelPageSize == Kernel page size that is ideally used in this VMA

and later

MixedPageSize == Breakdown of the pagesizes that are used in the VMA

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-09-24 15:41 ` Mel Gorman
  2008-09-24 16:06 ` Dave Hansen
@ 2008-09-25 12:23 ` KOSAKI Motohiro
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-09-25 12:23 UTC
To: Mel Gorman; +Cc: kosaki.motohiro, Dave Hansen, LKML, Linux-MM

Hi!

> > 1) in normal page, show PAGE_SIZE
> >
> > because any userland application works as pagesize==PAGE_SIZE
> > on current powerpc architecture.
> >
> > because
> >
> > fs/binfmt_elf.c
> > ------------------------------
> > static int
> > create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
> >                 unsigned long load_addr, unsigned long interp_load_addr)
> > {
> > (snip)
> >         NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
> >         NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE); /* pass ELF_EXEC_PAGESIZE to libc */
> >
> > include/asm-powerpc/elf.h
> > -----------------------------
> > #define ELF_EXEC_PAGESIZE PAGE_SIZE
> >
>
> I'm ok with this option and dropping the MMUPageSize patch as the user
> should already be able to identify that the hardware does not support 64K
> base pagesizes. I will leave the name as KernelPageSize so that it is still
> difficult to confuse it with MMU page size.
>
> >
> > 2) in normal page, do not display any page size.
> > only in the hugepage case, display page size.
> >
> > because an administrator wants the hugepage size only. (AFAICS)
> >
>
> I prefer option 1 as it's easier to parse the presence of information
> than infer from the absence of it.

OK.
I'll review and test your latest patch without the MMUPageSize part.
(maybe today's midnight or tomorrow)

Thanks!
* [PATCH 2/2] Report the pagesize backing a VMA in /proc/pid/maps
  2008-09-22  1:38 [PATCH 0/2] Report the pagesize backing VMAs in /proc Mel Gorman
  2008-09-22  1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
@ 2008-09-22  1:38 ` Mel Gorman
  1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-09-22 1:38 UTC
To: LKML; +Cc: Linux-MM, Mel Gorman

This patch adds a new field for hugepage-backed memory regions to show the
pagesize in /proc/pid/maps. While the information is available in smaps,
maps is more human-readable and does not incur the significant cost of
calculating Pss. An example of a /proc/self/maps output for an application
using hugepages with this patch applied is;

08048000-0804c000 r-xp 00000000 03:01 49135      /bin/cat
0804c000-0804d000 rw-p 00003000 03:01 49135      /bin/cat
08400000-08800000 rw-p 00000000 00:10 4055       /mnt/libhugetlbfs.tmp.QzPPTJ (deleted) (hpagesize=4096kB)
b7daa000-b7dab000 rw-p b7daa000 00:00 0
b7dab000-b7ed2000 r-xp 00000000 03:01 116846     /lib/tls/i686/cmov/libc-2.3.6.so
b7ed2000-b7ed7000 r--p 00127000 03:01 116846     /lib/tls/i686/cmov/libc-2.3.6.so
b7ed7000-b7ed9000 rw-p 0012c000 03:01 116846     /lib/tls/i686/cmov/libc-2.3.6.so
b7ed9000-b7edd000 rw-p b7ed9000 00:00 0
b7ee1000-b7ee8000 r-xp 00000000 03:01 49262      /root/libhugetlbfs-git/obj32/libhugetlbfs.so
b7ee8000-b7ee9000 rw-p 00006000 03:01 49262      /root/libhugetlbfs-git/obj32/libhugetlbfs.so
b7ee9000-b7eed000 rw-p b7ee9000 00:00 0
b7eed000-b7f02000 r-xp 00000000 03:01 119345     /lib/ld-2.3.6.so
b7f02000-b7f04000 rw-p 00014000 03:01 119345     /lib/ld-2.3.6.so
bf8ef000-bf903000 rwxp bffeb000 00:00 0          [stack]
bf903000-bf904000 rw-p bffff000 00:00 0
ffffe000-fffff000 r-xp 00000000 00:00 0          [vdso]

To be predictable for parsers, the patch adds the notion of reporting VMA
attributes by adding fields that look like "(attribute[=value])". This
already happens when a file is deleted and the user sees (deleted) after the
filename.
The expectation is that existing parsers will not break as those
that read the filename should be reading forward after the inode number
and stopping when they see something that is not part of the filename.
Parsers that assume everything after / is a filename will get confused by
(hpagesize=XkB) but are already broken due to (deleted).

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 fs/proc/task_mmu.c |   23 +++++++++++++++++------
 1 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 81a3f91..80233e6 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -198,7 +198,7 @@ static int do_maps_open(struct inode *inode, struct file *file,
 	return ret;
 }
 
-static int show_map(struct seq_file *m, void *v)
+static int __show_map(struct seq_file *m, void *v, int showattributes)
 {
 	struct proc_maps_private *priv = m->private;
 	struct task_struct *task = priv->task;
@@ -233,8 +233,8 @@ static int show_map(struct seq_file *m, void *v)
 	 * Print the dentry name for named mappings, and a
 	 * special [heap] marker for the heap:
 	 */
+	pad_len_spaces(m, len);
 	if (file) {
-		pad_len_spaces(m, len);
 		seq_path(m, &file->f_path, "\n");
 	} else {
 		const char *name = arch_vma_name(vma);
@@ -251,11 +251,17 @@ static int show_map(struct seq_file *m, void *v)
 				name = "[vdso]";
 			}
 		}
-		if (name) {
-			pad_len_spaces(m, len);
+		if (name)
 			seq_puts(m, name);
-		}
 	}
+
+	/*
+	 * Print additional attributes of the VMA of interest
+	 * - hugepage size if hugepage-backed
+	 */
+	if (showattributes && vma->vm_flags & VM_HUGETLB)
+		seq_printf(m, " (hpagesize=%lukB)", vma_page_size(vma) >> 10);
+
 	seq_putc(m, '\n');
 
 	if (m->count < m->size)  /* vma is copied successfully */
@@ -263,6 +269,11 @@ static int show_map(struct seq_file *m, void *v)
 	return 0;
 }
 
+static int show_map(struct seq_file *m, void *v)
+{
+	return __show_map(m, v, 1);
+}
+
 static const struct seq_operations proc_pid_maps_op = {
 	.start	= m_start,
 	.next	= m_next,
@@ -381,7 +392,7 @@ static int
show_smap(struct seq_file *m, void *v)
 	if (vma->vm_mm && !is_vm_hugetlb_page(vma))
 		walk_page_range(vma->vm_start, vma->vm_end, &smaps_walk);
 
-	ret = show_map(m, v);
+	ret = __show_map(m, v, 0);
 	if (ret)
 		return ret;
 
-- 
1.5.6.5
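For readers wondering how a consumer might pick up the "(attribute[=value])" field, the following is a minimal userspace sketch. It is not part of the patch; the function name maps_hpagesize_kb() is hypothetical, and it assumes the exact " (hpagesize=<n>kB)" spelling proposed in this posting, with the attribute being the last parenthesised field on the line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look for a trailing "(hpagesize=<n>kB)" attribute on one line of
 * /proc/pid/maps output, in the format proposed by this patch.
 *
 * Returns the huge page size in kB, or 0 when the attribute is absent
 * or malformed. Only the last parenthesised field is examined, so a
 * preceding "(deleted)" marker does not interfere.
 *
 * Hypothetical helper for illustration only.
 */
unsigned long maps_hpagesize_kb(const char *line)
{
	const char *attr = strrchr(line, '(');
	unsigned long kb;
	char close;

	if (!attr)
		return 0;
	/* Require the closing parenthesis directly after "kB". */
	if (sscanf(attr, "(hpagesize=%lukB%c", &kb, &close) != 2 || close != ')')
		return 0;
	return kb;
}
```

As the commit message notes, a file whose name itself ends in something that looks like an attribute would still fool this kind of parser, just as a name ending in "(deleted)" already can.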
* [PATCH 0/2] Report the size of pages backing VMAs in /proc V3
@ 2008-10-03 16:46 Mel Gorman
  2008-10-03 16:46 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
  0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-10-03 16:46 UTC
To: akpm; +Cc: Mel Gorman, kosaki.motohiro, dave, linux-mm, linux-kernel

The following two patches add support for printing the size of pages used by
the kernel to back VMAs in maps and smaps. This can be used by a user to
verify that a hugepage-aware application is using the expected page sizes.
In one case the pagesize used by the MMU differs from the size used by the
kernel. This is on PPC64 using 64K as a base page size running on a processor
that does not support 64K in the MMU. In this case, the kernel uses 64K pages
but the MMU is still using 4K.

The first patch prints the size of page used by the kernel when allocating
pages for a VMA in /proc/pid/smaps and should not be considered too
contentious as it is highly unlikely to break any parsers. The second patch
reports the size of page used by hugetlbfs regions in /proc/pid/maps. There
is a possibility that the final patch will break parsers but they are
arguably already broken. More details are in the patches themselves.

Thanks to KOSAKI Motohiro for rebasing the patches onto mmotm, reviewing
and testing.

Changelog since V2
  o Drop printing of MMUPageSize (mel)
  o Rebase onto mmotm (KOSAKI Motohiro)

Changelog since V1
  o Fix build failure on !CONFIG_HUGETLB_PAGE
  o Uninline helper functions
  o Distinguish between base pagesize and MMU pagesize

 fs/proc/task_mmu.c      |   27 ++++++++++++++++++---------
 include/linux/hugetlb.h |    3 +++
 mm/hugetlb.c            |   17 +++++++++++++++++
 3 files changed, 38 insertions(+), 9 deletions(-)
* [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-10-03 16:46 [PATCH 0/2] Report the size of pages backing VMAs in /proc V3 Mel Gorman
@ 2008-10-03 16:46 ` Mel Gorman
  2008-10-08 21:38 ` Alexey Dobriyan
  0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2008-10-03 16:46 UTC
To: akpm; +Cc: Mel Gorman, kosaki.motohiro, dave, linux-mm, linux-kernel

It is useful to verify a hugepage-aware application is using the expected
pagesizes for its memory regions. This patch creates an entry called
KernelPageSize in /proc/pid/smaps that is the size of page used by the
kernel to back a VMA. The entry is not called PageSize as it is possible
the MMU uses a different size. This extension should not break any sensible
parser that skips lines containing unrecognised information.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 fs/proc/task_mmu.c      |    6 ++++--
 include/linux/hugetlb.h |    3 +++
 mm/hugetlb.c            |   17 +++++++++++++++++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f6add87..beb884d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -402,7 +402,8 @@ static int show_smap(struct seq_file *m, void *v)
 		   "Private_Clean: %8lu kB\n"
 		   "Private_Dirty: %8lu kB\n"
 		   "Referenced: %8lu kB\n"
-		   "Swap: %8lu kB\n",
+		   "Swap: %8lu kB\n"
+		   "KernelPageSize: %8lu kB\n",
 		   (vma->vm_end - vma->vm_start) >> 10,
 		   mss.resident >> 10,
 		   (unsigned long)(mss.pss >> (10 + PSS_SHIFT)),
@@ -411,7 +412,8 @@ static int show_smap(struct seq_file *m, void *v)
 		   mss.private_clean >> 10,
 		   mss.private_dirty >> 10,
 		   mss.referenced >> 10,
-		   mss.swap >> 10);
+		   mss.swap >> 10,
+		   vma_kernel_pagesize(vma) >> 10);
 
 	if (m->count < m->size)  /* vma is copied successfully */
 		m->version = (vma != get_gate_vma(task)) ?
 			vma->vm_start : 0;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 32e0ef0..ace04a7 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -231,6 +231,8 @@ static inline unsigned long huge_page_size(struct hstate *h)
 	return (unsigned long)PAGE_SIZE << h->order;
 }
 
+extern unsigned long vma_kernel_pagesize(struct vm_area_struct *vma);
+
 static inline unsigned long huge_page_mask(struct hstate *h)
 {
 	return h->mask;
@@ -271,6 +273,7 @@ struct hstate {};
 #define hstate_inode(i) NULL
 #define huge_page_size(h) PAGE_SIZE
 #define huge_page_mask(h) PAGE_MASK
+#define vma_kernel_pagesize(v) PAGE_SIZE
 #define huge_page_order(h) 0
 #define huge_page_shift(h) PAGE_SHIFT
 static inline unsigned int pages_per_huge_page(struct hstate *h)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index adf3568..856949c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -219,6 +219,23 @@ static pgoff_t vma_hugecache_offset(struct hstate *h,
 }
 
 /*
+ * Return the size of the pages allocated when backing a VMA. In the majority
+ * cases this will be same size as used by the page table entries.
+ */
+unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
+{
+	struct hstate *hstate;
+
+	if (!is_vm_hugetlb_page(vma))
+		return PAGE_SIZE;
+
+	hstate = hstate_vma(vma);
+	VM_BUG_ON(!hstate);
+
+	return 1UL << (hstate->order + PAGE_SHIFT);
+}
+
+/*
  * Flags for MAP_PRIVATE reservations. These are stored in the bottom
  * bits of the reservation map pointer, which are always clear due to
  * alignment.
-- 
1.5.6.5
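The arithmetic behind the reported value can be checked in isolation. The sketch below is a userspace illustration, not kernel code: EXAMPLE_PAGE_SHIFT is an assumed stand-in for PAGE_SHIFT on a system with 4 kB base pages, and both function names are hypothetical. It mirrors the `1UL << (order + PAGE_SHIFT)` expression in vma_kernel_pagesize() and the `>> 10` conversion to kB applied in show_smap().

```c
/* Assumed stand-in for the kernel's PAGE_SHIFT: 4 kB base pages. */
#define EXAMPLE_PAGE_SHIFT 12

/*
 * Size in bytes of a page of the given order, using the same
 * expression as vma_kernel_pagesize() in the patch.
 * Hypothetical helper for illustration only.
 */
unsigned long pagesize_from_order(unsigned int order)
{
	return 1UL << (order + EXAMPLE_PAGE_SHIFT);
}

/* The value smaps would print: the byte count shifted down to kB. */
unsigned long pagesize_kb(unsigned int order)
{
	return pagesize_from_order(order) >> 10;
}
```

Under that assumption, order 0 gives 4 kB, order 9 gives 2048 kB and order 10 gives 4096 kB, the last of which matches the hpagesize=4096kB shown in the maps example earlier in the thread. The expression is equivalent to `PAGE_SIZE << order`, the form huge_page_size() uses.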
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-10-03 16:46 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
@ 2008-10-08 21:38 ` Alexey Dobriyan
  2008-10-09  2:16 ` KOSAKI Motohiro
  2008-10-09 10:24 ` Mel Gorman
  0 siblings, 2 replies; 24+ messages in thread
From: Alexey Dobriyan @ 2008-10-08 21:38 UTC
To: Mel Gorman; +Cc: akpm, kosaki.motohiro, dave, linux-mm, linux-kernel

On Fri, Oct 03, 2008 at 05:46:54PM +0100, Mel Gorman wrote:
> It is useful to verify a hugepage-aware application is using the expected
> pagesizes for its memory regions. This patch creates an entry called
> KernelPageSize in /proc/pid/smaps that is the size of page used by the
> kernel to back a VMA. The entry is not called PageSize as it is possible
> the MMU uses a different size. This extension should not break any sensible
> parser that skips lines containing unrecognised information.

> +		   "KernelPageSize: %8lu kB\n",

> +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> +{
> +	struct hstate *hstate;
> +
> +	if (!is_vm_hugetlb_page(vma))
> +		return PAGE_SIZE;
> +
> +	hstate = hstate_vma(vma);
> +	VM_BUG_ON(!hstate);
> +
> +	return 1UL << (hstate->order + PAGE_SHIFT);
			^^^^
VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.

Also, in /proc/*/maps it's printed only for hugetlb vmas and called
hpagesize, in smaps it's printed for every vma and called
KernelPageSize. All of this is inconsistent.

And app will verify once that hugepages are of right size, so Pss cost
argument for changing /proc/*/maps seems weak to me.
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-10-08 21:38 ` Alexey Dobriyan
@ 2008-10-09  2:16 ` KOSAKI Motohiro
  2008-10-09 10:24 ` Mel Gorman
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2008-10-09 2:16 UTC
To: Alexey Dobriyan
Cc: kosaki.motohiro, Mel Gorman, akpm, dave, linux-mm, linux-kernel

Hi

> > It is useful to verify a hugepage-aware application is using the expected
> > pagesizes for its memory regions. This patch creates an entry called
> > KernelPageSize in /proc/pid/smaps that is the size of page used by the
> > kernel to back a VMA. The entry is not called PageSize as it is possible
> > the MMU uses a different size. This extension should not break any sensible
> > parser that skips lines containing unrecognised information.
>
> > +		   "KernelPageSize: %8lu kB\n",
>
> > +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> > +{
> > +	struct hstate *hstate;
> > +
> > +	if (!is_vm_hugetlb_page(vma))
> > +		return PAGE_SIZE;
> > +
> > +	hstate = hstate_vma(vma);
> > +	VM_BUG_ON(!hstate);
> > +
> > +	return 1UL << (hstate->order + PAGE_SHIFT);
> 			^^^^
> VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.

yup.

> Also, in /proc/*/maps it's printed only for hugetlb vmas and called
> hpagesize, in smaps it's printed for every vma and called
> KernelPageSize. All of this is inconsistent.

Is this a problem?

/proc/*/maps and /proc/*/smaps serve different purposes.

/proc/*/maps:  summarised, condensed information that is easy to read
/proc/*/smaps: verbose output

Some information is already output only in smaps.

> And app will verify once that hugepages are of right size, so Pss cost
> argument for changing /proc/*/maps seems weak to me.

Sorry, I don't understand yet.
Why would the Pss cost change?
* Re: [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps
  2008-10-08 21:38 ` Alexey Dobriyan
  2008-10-09  2:16 ` KOSAKI Motohiro
@ 2008-10-09 10:24 ` Mel Gorman
  1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2008-10-09 10:24 UTC
To: Alexey Dobriyan; +Cc: akpm, kosaki.motohiro, dave, linux-mm, linux-kernel

On (09/10/08 01:38), Alexey Dobriyan didst pronounce:
> On Fri, Oct 03, 2008 at 05:46:54PM +0100, Mel Gorman wrote:
> > It is useful to verify a hugepage-aware application is using the expected
> > pagesizes for its memory regions. This patch creates an entry called
> > KernelPageSize in /proc/pid/smaps that is the size of page used by the
> > kernel to back a VMA. The entry is not called PageSize as it is possible
> > the MMU uses a different size. This extension should not break any sensible
> > parser that skips lines containing unrecognised information.
>
> > +		   "KernelPageSize: %8lu kB\n",
>
> > +unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
> > +{
> > +	struct hstate *hstate;
> > +
> > +	if (!is_vm_hugetlb_page(vma))
> > +		return PAGE_SIZE;
> > +
> > +	hstate = hstate_vma(vma);
> > +	VM_BUG_ON(!hstate);
> > +
> > +	return 1UL << (hstate->order + PAGE_SHIFT);
> 			^^^^
> VM_BUG_ON is unneeded because kernel will oops here if hstate is NULL.
>

Ok, will drop it. I used the VM_BUG_ON so if the situation was triggered,
it would come with line numbers but it'll be an obvious oops so I guess it
is redundant.

> Also, in /proc/*/maps it's printed only for hugetlb vmas and called
> hpagesize,

Well yes... because it's a huge pagesize for that VMA. The name reflects
what is being described there.

> in smaps it's printed for every vma and called
> KernelPageSize. All of this is inconsistent.
>

In smaps, we are printing for every VMA because it's easier for parsers
to deal with the presence of information than its absence. The name
KernelPageSize there is an accurate description. I don't feel it is
inconsistent.

> And app will verify once that hugepages are of right size, so Pss cost
> argument for changing /proc/*/maps seems weak to me.
>

Let's say someone wanted to monitor an application to see what its use of
hugepages was over time. They would have to constantly incur the Pss cost
to do that, which seems a bit unfair.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
end of thread, other threads:[~2008-10-09 10:24 UTC | newest]

Thread overview: 24+ messages
2008-09-22  1:38 [PATCH 0/2] Report the pagesize backing VMAs in /proc Mel Gorman
2008-09-22  1:38 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
2008-09-22  8:30 ` Andrew Morton
2008-09-22 16:17 ` Mel Gorman
2008-09-22 15:55 ` Dave Hansen
2008-09-22 16:21 ` Mel Gorman
2008-09-22 16:48 ` Dave Hansen
2008-09-23 12:15 ` KOSAKI Motohiro
2008-09-23 19:46 ` Mel Gorman
2008-09-24 12:32 ` KOSAKI Motohiro
2008-09-24 15:41 ` Mel Gorman
2008-09-24 16:06 ` Dave Hansen
2008-09-24 17:10 ` Mel Gorman
2008-09-24 18:59 ` Dave Hansen
2008-09-24 19:11 ` Mel Gorman
2008-09-24 19:23 ` Dave Hansen
2008-09-24 23:39 ` Mel Gorman
2008-09-24 23:42 ` Mel Gorman
2008-09-25 12:23 ` KOSAKI Motohiro
2008-09-22  1:38 ` [PATCH 2/2] Report the pagesize backing a VMA in /proc/pid/maps Mel Gorman
2008-10-03 16:46 [PATCH 0/2] Report the size of pages backing VMAs in /proc V3 Mel Gorman
2008-10-03 16:46 ` [PATCH 1/2] Report the pagesize backing a VMA in /proc/pid/smaps Mel Gorman
2008-10-08 21:38 ` Alexey Dobriyan
2008-10-09  2:16 ` KOSAKI Motohiro
2008-10-09 10:24 ` Mel Gorman