* Re: [PATCH]: Adding a counter in vma to indicate the number of physical_pages_backing it
@ 2006-06-13 5:53 Albert Cahalan
2006-06-13 5:56 ` Andi Kleen
0 siblings, 1 reply; 23+ messages in thread
From: Albert Cahalan @ 2006-06-13 5:53 UTC (permalink / raw)
To: linux-kernel, ak, rohitseth, akpm, Linux-mm, arjan, jengelh
Quoting two different people:
> BTW, what is smaps used for (who uses it), anyway?
...
> smaps is only a debugging kludge anyways and it's
> not a good idea to bloat core data structures for it.
I'd be using it in procps for the pmap command if it
were not so horribly nasty. I may eventually get around
to using it, but maybe it's just too gross to tolerate.
That mess should never have slipped into the kernel.
Just take a look at /proc/self/smaps some time. Wow.
A month or two ago I supplied a patch to replace smaps
with something sanely parsable. I was essentially told
that we already have this lovely smaps dungheap that I
should just use, but a couple people were eager to see
the patch go in.
Anyway, I need smaps stuff plus info about locked memory
and page sizes. Solaris provides this. People seem to
like it. I guess it's for performance tuning of app code or
maybe for scalability predictions.
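For reference, this is roughly the kind of parsing a tool has to do today
just to pull Rss figures out of smaps - a minimal user-space sketch, not
actual procps code, assuming the "Rss: <n> kB" lines that 2.6.17's smaps
prints:

#include <stdio.h>

/* Sum the "Rss: <n> kB" lines of /proc/<pid>/smaps (default: self). */
int main(int argc, char **argv)
{
	char path[64], line[256];
	unsigned long kb, total = 0;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%s/smaps",
		 argc > 1 ? argv[1] : "self");
	f = fopen(path, "r");
	if (!f) {
		perror(path);
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "Rss: %lu kB", &kb) == 1)
			total += kb;
	}
	fclose(f);
	printf("total resident: %lu kB\n", total);
	return 0;
}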
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical_pages_backing it
2006-06-13 5:53 [PATCH]: Adding a counter in vma to indicate the number of physical_pages_backing it Albert Cahalan
@ 2006-06-13 5:56 ` Andi Kleen
2006-06-13 17:10 ` Rohit Seth
0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-06-13 5:56 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel, rohitseth, akpm, Linux-mm, arjan, jengelh
On Tuesday 13 June 2006 07:53, Albert Cahalan wrote:
> Quoting two different people:
>
> > BTW, what is smaps used for (who uses it), anyway?
> ...
> > smaps is only a debugging kludge anyways and it's
> > not a good idea to bloat core data structures for it.
>
> I'd be using it in procps for the pmap command if it
> were not so horribly nasty. I may eventually get around
> to using it, but maybe it's just too gross to tolerate.
I agree it's pretty ugly.
But pmap I would consider a debugging kludge too - it should
work when someone needs it, but it doesn't need to be particularly
fast.
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical_pages_backing it
2006-06-13 5:56 ` Andi Kleen
@ 2006-06-13 17:10 ` Rohit Seth
2006-06-13 17:18 ` Andi Kleen
0 siblings, 1 reply; 23+ messages in thread
From: Rohit Seth @ 2006-06-13 17:10 UTC (permalink / raw)
To: Andi Kleen; +Cc: Albert Cahalan, linux-kernel, akpm, Linux-mm, arjan, jengelh
On Tue, 2006-06-13 at 07:56 +0200, Andi Kleen wrote:
> On Tuesday 13 June 2006 07:53, Albert Cahalan wrote:
> > Quoting two different people:
> >
> > > BTW, what is smaps used for (who uses it), anyway?
> > ...
> > > smaps is only a debugging kludge anyways and it's
> > > not a good idea to bloat core data structures for it.
> >
> > I'd be using it in procps for the pmap command if it
> > were not so horribly nasty. I may eventually get around
> > to using it, but maybe it's just too gross to tolerate.
>
> I agree it's pretty ugly.
>
> But pmap I would consider a debugging kludge too - it should
> work when someone needs it, but it doesn't need to be particularly
> fast.
>
Providing useful information about memory consumption is hardly a
debugging kludge. Unfortunately, the RSS column is filled with dashes at
present. This vma-based counter will allow Albert to print useful
information without having to traverse the whole set of page tables.
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical_pages_backing it
2006-06-13 17:10 ` Rohit Seth
@ 2006-06-13 17:18 ` Andi Kleen
0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2006-06-13 17:18 UTC (permalink / raw)
To: rohitseth; +Cc: Albert Cahalan, linux-kernel, akpm, Linux-mm, arjan, jengelh
> Providing useful information about memory consumption is hardly a
> debugging kludge.
I strongly believe anything that shows virtual addresses is for debugging
only. If your monitoring system needs to look at VMAs, it is doing
something very wrong or trying to do something that shouldn't be
in user space.
-Andi
* [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
@ 2006-06-10 1:33 Rohit Seth
2006-06-10 2:42 ` Andrew Morton
` (3 more replies)
0 siblings, 4 replies; 23+ messages in thread
From: Rohit Seth @ 2006-06-10 1:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux-mm, Linux-kernel
Below is a patch that adds the number of physical pages that each vma in
a process is using, exporting this information to user space through
the /proc/<pid>/maps interface.
There is currently /proc/<pid>/smaps, which prints detailed information
about the usage of physical pages, but that is a very expensive operation
as it traverses all the page tables (for someone who is just interested
in getting that data for each vma).
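For illustration, the new per-vma count is printed (zero-padded, %08lu)
between the file offset and the device fields of each maps line; the
addresses, count and file shown here are made up:

  before: 08048000-0804c000 r-xp 00000000 03:01 12345     /bin/cat
  after:  08048000-0804c000 r-xp 00000000 00000003 03:01 12345     /bin/cat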
Signed-off-by: Rohit Seth <rohitseth@google.com>
fs/exec.c | 1 +
fs/proc/task_mmu.c | 4 ++--
include/linux/mm.h | 1 +
mm/fremap.c | 2 ++
mm/hugetlb.c | 3 +++
mm/memory.c | 5 +++++
mm/migrate.c | 1 +
mm/rmap.c | 2 ++
mm/swapfile.c | 1 +
9 files changed, 18 insertions(+), 2 deletions(-)
--- linux-2.6.17-rc5-mm3.org/fs/exec.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/fs/exec.c 2006-06-05 15:56:42.000000000 -0700
@@ -326,6 +326,7 @@
set_pte_at(mm, address, pte, pte_mkdirty(pte_mkwrite(mk_pte(
page, vma->vm_page_prot))));
page_add_new_anon_rmap(page, vma, address);
+ vma->nphys++;
pte_unmap_unlock(pte, ptl);
/* no need for flush_tlb */
--- linux-2.6.17-rc5-mm3.org/fs/proc/task_mmu.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/fs/proc/task_mmu.c 2006-06-06 14:23:48.000000000 -0700
@@ -145,14 +145,14 @@
ino = inode->i_ino;
}
- seq_printf(m, "%08lx-%08lx %c%c%c%c %08lx %02x:%02x %lu %n",
+ seq_printf(m, "%08lx-%08lx %c%c%c%c %08lx %08lu %02x:%02x %lu %n",
vma->vm_start,
vma->vm_end,
flags & VM_READ ? 'r' : '-',
flags & VM_WRITE ? 'w' : '-',
flags & VM_EXEC ? 'x' : '-',
flags & VM_MAYSHARE ? 's' : 'p',
- vma->vm_pgoff << PAGE_SHIFT,
+ vma->vm_pgoff << PAGE_SHIFT, vma->nphys,
MAJOR(dev), MINOR(dev), ino, &len);
/*
--- linux-2.6.17-rc5-mm3.org/include/linux/mm.h 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/include/linux/mm.h 2006-06-05 16:27:05.000000000 -0700
@@ -111,6 +111,7 @@
#ifdef CONFIG_NUMA
struct mempolicy *vm_policy; /* NUMA policy for the VMA */
#endif
+ unsigned long nphys; /* Num phys pages backing this vma */
};
/*
--- linux-2.6.17-rc5-mm3.org/mm/migrate.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/migrate.c 2006-06-09 17:17:31.000000000 -0700
@@ -181,6 +181,7 @@
/* No need to invalidate - it was non-present before */
update_mmu_cache(vma, addr, pte);
lazy_mmu_prot_update(pte);
+ vma->nphys++;
out:
pte_unmap_unlock(ptep, ptl);
--- linux-2.6.17-rc5-mm3.org/mm/swapfile.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/swapfile.c 2006-06-09 17:24:24.000000000 -0700
@@ -500,6 +500,7 @@
* immediately swapped out again after swapon.
*/
activate_page(page);
+ vma->nphys++;
}
static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
--- linux-2.6.17-rc5-mm3.org/mm/rmap.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/rmap.c 2006-06-09 17:22:59.000000000 -0700
@@ -620,6 +620,7 @@
dec_mm_counter(mm, file_rss);
+ vma->nphys--;
page_remove_rmap(page);
page_cache_release(page);
@@ -710,6 +711,7 @@
if (pte_dirty(pteval))
set_page_dirty(page);
+ vma->nphys--;
page_remove_rmap(page);
page_cache_release(page);
dec_mm_counter(mm, file_rss);
--- linux-2.6.17-rc5-mm3.org/mm/memory.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/memory.c 2006-06-05 15:57:16.000000000 -0700
@@ -677,6 +677,7 @@
mark_page_accessed(page);
file_rss--;
}
+ vma->nphys--;
page_remove_rmap(page);
tlb_remove_page(tlb, page);
continue;
@@ -2001,6 +2002,7 @@
/* No need to invalidate - it was non-present before */
update_mmu_cache(vma, address, pte);
lazy_mmu_prot_update(pte);
+ vma->nphys++;
unlock:
pte_unmap_unlock(page_table, ptl);
out:
@@ -2063,6 +2065,7 @@
/* No need to invalidate - it was non-present before */
update_mmu_cache(vma, address, entry);
lazy_mmu_prot_update(entry);
+ vma->nphys++;
unlock:
pte_unmap_unlock(page_table, ptl);
return VM_FAULT_MINOR;
@@ -2201,6 +2204,7 @@
/* no need to invalidate: a not-present page shouldn't be cached */
update_mmu_cache(vma, address, entry);
lazy_mmu_prot_update(entry);
+ vma->nphys++;
unlock:
pte_unmap_unlock(page_table, ptl);
return ret;
@@ -2480,6 +2484,7 @@
gate_vma.vm_end = FIXADDR_USER_END;
gate_vma.vm_page_prot = PAGE_READONLY;
gate_vma.vm_flags = 0;
+ gate_vma.nphys = 1;
return 0;
}
__initcall(gate_vma_init);
--- linux-2.6.17-rc5-mm3.org/mm/fremap.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/fremap.c 2006-06-08 15:00:11.000000000 -0700
@@ -35,6 +35,7 @@
set_page_dirty(page);
page_remove_rmap(page);
page_cache_release(page);
+ vma->nphys--;
}
} else {
if (!pte_file(pte))
@@ -84,6 +85,7 @@
pte_val = *pte;
update_mmu_cache(vma, addr, pte_val);
lazy_mmu_prot_update(pte_val);
+ vma->nphys++;
err = 0;
unlock:
pte_unmap_unlock(pte, ptl);
--- linux-2.6.17-rc5-mm3.org/mm/hugetlb.c 2006-06-05 11:08:40.000000000 -0700
+++ linux-2.6.17-rc5-mm3/mm/hugetlb.c 2006-06-09 18:23:56.000000000 -0700
@@ -346,6 +346,7 @@
get_page(ptepage);
add_mm_counter(dst, file_rss, HPAGE_SIZE / PAGE_SIZE);
set_huge_pte_at(dst, addr, dst_pte, entry);
+ vma->nphys += (HPAGE_SIZE / PAGE_SIZE);
}
spin_unlock(&src->page_table_lock);
spin_unlock(&dst->page_table_lock);
@@ -386,6 +387,7 @@
page = pte_page(pte);
put_page(page);
add_mm_counter(mm, file_rss, (int) -(HPAGE_SIZE / PAGE_SIZE));
+ vma->nphys -= (HPAGE_SIZE / PAGE_SIZE);
}
spin_unlock(&mm->page_table_lock);
@@ -493,6 +495,7 @@
&& (vma->vm_flags & VM_SHARED)));
set_huge_pte_at(mm, address, ptep, new_pte);
+ vma->nphys += (HPAGE_SIZE / PAGE_SIZE);
if (write_access && !(vma->vm_flags & VM_SHARED)) {
/* Optimization, do the COW without a second fault */
ret = hugetlb_cow(mm, vma, address, ptep, new_pte);
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 1:33 [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it Rohit Seth
@ 2006-06-10 2:42 ` Andrew Morton
2006-06-12 17:49 ` Rohit Seth
2006-06-10 7:35 ` Nick Piggin
` (2 subsequent siblings)
3 siblings, 1 reply; 23+ messages in thread
From: Andrew Morton @ 2006-06-10 2:42 UTC (permalink / raw)
To: rohitseth; +Cc: Linux-mm, Linux-kernel
On Fri, 09 Jun 2006 18:33:55 -0700
Rohit Seth <rohitseth@google.com> wrote:
> Below is a patch that adds the number of physical pages that each vma in
> a process is using, exporting this information to user space through
> the /proc/<pid>/maps interface.
Ouch, that's an awful lot of open-coded incs and decs. Isn't there some
more centralised place we can do this?
What locking protects vma.nphys (can we call this nr_present or something?)
Will this patch do the right thing with weird vmas such as the gate vma and
mmaps of device memory, etc?
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 2:42 ` Andrew Morton
@ 2006-06-12 17:49 ` Rohit Seth
0 siblings, 0 replies; 23+ messages in thread
From: Rohit Seth @ 2006-06-12 17:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux-mm, Linux-kernel
On Fri, 2006-06-09 at 19:42 -0700, Andrew Morton wrote:
> On Fri, 09 Jun 2006 18:33:55 -0700
> Rohit Seth <rohitseth@google.com> wrote:
>
> > Below is a patch that adds the number of physical pages that each vma
> > in a process is using, exporting this information to user space through
> > the /proc/<pid>/maps interface.
>
> Ouch, that's an awful lot of open-coded incs and decs. Isn't there some
> more centralised place we can do this?
>
I'll look into this. Possibly combining it with mm counters.
> What locking protects vma.nphys (can we call this nr_present or something?)
>
I'll need to use the same atomic counters as mm. And yes, nr_present is
a better name.
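A rough sketch of what that could look like (hypothetical, not the posted
patch; loosely modelled on the mm counter helpers):

/* Hypothetical helpers, assuming the field becomes atomic_long_t
 * nr_present in struct vm_area_struct (it is a plain unsigned long
 * named nphys in the posted patch). */
static inline void vma_add_present(struct vm_area_struct *vma, long value)
{
	atomic_long_add(value, &vma->nr_present);
}

static inline void vma_inc_present(struct vm_area_struct *vma)
{
	vma_add_present(vma, 1);
}

static inline void vma_dec_present(struct vm_area_struct *vma)
{
	vma_add_present(vma, -1);
}

The open-coded vma->nphys++ sites would then become vma_inc_present(vma),
giving one central place to adjust the locking later.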
> Will this patch do the right thing with weird vmas such as the gate vma and
> mmaps of device memory, etc?
>
I think so (though strictly speaking those special vmas are less
interesting). But the final solution (if we do decide to implement this
counter) will address that.
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 1:33 [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it Rohit Seth
2006-06-10 2:42 ` Andrew Morton
@ 2006-06-10 7:35 ` Nick Piggin
2006-06-11 10:15 ` Jan Engelhardt
2006-06-12 17:36 ` Rohit Seth
2006-06-11 16:09 ` Arjan van de Ven
2006-06-12 16:43 ` Christoph Lameter
3 siblings, 2 replies; 23+ messages in thread
From: Nick Piggin @ 2006-06-10 7:35 UTC (permalink / raw)
To: rohitseth; +Cc: Andrew Morton, Linux-mm, Linux-kernel
Rohit Seth wrote:
> Below is a patch that adds the number of physical pages that each vma in
> a process is using, exporting this information to user space through
> the /proc/<pid>/maps interface.
>
> There is currently /proc/<pid>/smaps, which prints detailed information
> about the usage of physical pages, but that is a very expensive operation
> as it traverses all the page tables (for someone who is just interested
> in getting that data for each vma).
Yet more cacheline footprint in the page fault and unmap paths...
What is this used for and why do we want it? Could you do some
smaps-like interface that can work on ranges of memory, and
continue to walk pagetables instead?
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 7:35 ` Nick Piggin
@ 2006-06-11 10:15 ` Jan Engelhardt
2006-06-12 17:36 ` Rohit Seth
1 sibling, 0 replies; 23+ messages in thread
From: Jan Engelhardt @ 2006-06-11 10:15 UTC (permalink / raw)
To: Nick Piggin; +Cc: rohitseth, Andrew Morton, Linux-mm, Linux-kernel
>> There is currently /proc/<pid>/smaps, which prints detailed information
>> about the usage of physical pages, but that is a very expensive operation
>> as it traverses all the page tables (for someone who is just interested
>> in getting that data for each vma).
>
> Yet more cacheline footprint in the page fault and unmap paths...
>
> What is this used for and why do we want it? Could you do some
> smaps-like interface that can work on ranges of memory, and
> continue to walk pagetables instead?
>
BTW, what is smaps used for (who uses it), anyway?
Jan Engelhardt
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 7:35 ` Nick Piggin
2006-06-11 10:15 ` Jan Engelhardt
@ 2006-06-12 17:36 ` Rohit Seth
2006-06-12 17:58 ` Andi Kleen
1 sibling, 1 reply; 23+ messages in thread
From: Rohit Seth @ 2006-06-12 17:36 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, Linux-mm, Linux-kernel
On Sat, 2006-06-10 at 17:35 +1000, Nick Piggin wrote:
> Rohit Seth wrote:
> > Below is a patch that adds the number of physical pages that each vma
> > in a process is using, exporting this information to user space through
> > the /proc/<pid>/maps interface.
> >
> > There is currently /proc/<pid>/smaps, which prints detailed information
> > about the usage of physical pages, but that is a very expensive operation
> > as it traverses all the page tables (for someone who is just interested
> > in getting that data for each vma).
>
> Yet more cacheline footprint in the page fault and unmap paths...
>
Not necessarily. If I'm doing the calculation right, vm_area_struct is
currently 176 bytes (without the addition of nphys) on my x86_64 box, so
in this case the addition would not result in a bigger cache footprint
for page faults. Also, two adjacent vmas currently share a cache line, so
there is already that much cache-line ping-pong going on.
Though I agree that we should try not to extend this size beyond what is
absolutely necessary.
> What is this used for and why do we want it? Could you do some
> smaps-like interface that can work on ranges of memory, and
> continue to walk pagetables instead?
>
It is just the price of those walks that makes smaps not an attractive
solution for monitoring purposes.
I'm wondering whether it is possible to extend the current interfaces
(possibly with a new system call) so that a user-land process can give
hints/preferences to the kernel, in terms of <pid, virtual_range>, about
what to remove/inactivate. This would keep the current kernel behavior
for vmscans while providing a little bit of asymmetry for user-land
applications. Thoughts?
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-12 17:36 ` Rohit Seth
@ 2006-06-12 17:58 ` Andi Kleen
2006-06-12 19:42 ` Rohit Seth
0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-06-12 17:58 UTC (permalink / raw)
To: rohitseth; +Cc: Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
> It is just the price of those walks that makes smaps not an attractive
> solution for monitoring purposes.
It just shouldn't be used for that. It's a debugging hack and not really
suitable for monitoring even with optimizations.
For monitoring if the current numa statistics are not good enough
you should probably propose new counters.
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-12 17:58 ` Andi Kleen
@ 2006-06-12 19:42 ` Rohit Seth
2006-06-13 3:51 ` Andi Kleen
0 siblings, 1 reply; 23+ messages in thread
From: Rohit Seth @ 2006-06-12 19:42 UTC (permalink / raw)
To: Andi Kleen; +Cc: Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
On Mon, 2006-06-12 at 19:58 +0200, Andi Kleen wrote:
> > It is just the price of those walks that makes smaps not an attractive
> > solution for monitoring purposes.
>
> It just shouldn't be used for that. It's a debugging hack and not really
> suitable for monitoring even with optimizations.
>
> For monitoring if the current numa statistics are not good enough
> you should probably propose new counters.
The NUMA stats give different data. The proposed vma->nr_phys is a new
counter that can provide detailed information about physical memory usage
at the level of each virtual memory segment. I think having this
information in each vma keeps the impact (of adding a new counter) very
low.
The second question is how to advertise this value to user space. Please
let me know what suits best among /proc, /sys, or a system call (or if
there is any other mechanism, let me know) for per-process, per-segment
information.
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-12 19:42 ` Rohit Seth
@ 2006-06-13 3:51 ` Andi Kleen
2006-06-13 4:27 ` Nick Piggin
2006-06-13 16:59 ` Rohit Seth
0 siblings, 2 replies; 23+ messages in thread
From: Andi Kleen @ 2006-06-13 3:51 UTC (permalink / raw)
To: rohitseth; +Cc: Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
On Monday 12 June 2006 21:42, Rohit Seth wrote:
> On Mon, 2006-06-12 at 19:58 +0200, Andi Kleen wrote:
> > > It is just the price of those walks that makes smaps not an attractive
> > > solution for monitoring purposes.
> >
> > It just shouldn't be used for that. It's a debugging hack and not really
> > suitable for monitoring even with optimizations.
> >
> > For monitoring if the current numa statistics are not good enough
> > you should probably propose new counters.
>
>
> The NUMA stats give different data. The proposed vma->nr_phys is a new
> counter that can provide detailed information about physical memory usage
> at the level of each virtual memory segment.
And for what do you need that?
It's somewhat useful to debug the NUMA tuning of your app (although
there are other ways to do that too) but do you
really need it for normal runtime monitoring?
> I think having this information in each vma keeps the impact (of adding
> a new counter) very low.
>
> The second question is how to advertise this value to user space. Please
> let me know what suits best among /proc, /sys, or a system call (or if
> there is any other mechanism, let me know) for per-process, per-segment
> information.
I think we first need to identify the basic need.
Don't see why we even need per VMA information so far.
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-13 3:51 ` Andi Kleen
@ 2006-06-13 4:27 ` Nick Piggin
2006-06-13 16:59 ` Rohit Seth
1 sibling, 0 replies; 23+ messages in thread
From: Nick Piggin @ 2006-06-13 4:27 UTC (permalink / raw)
To: Andi Kleen; +Cc: rohitseth, Andrew Morton, Linux-mm, Linux-kernel
Andi Kleen wrote:
>On Monday 12 June 2006 21:42, Rohit Seth wrote:
>
>>I think having this information in each vma keeps the impact (of adding
>>a new counter) very low.
>>
>>The second question is how to advertise this value to user space. Please
>>let me know what suits best among /proc, /sys, or a system call (or if
>>there is any other mechanism, let me know) for per-process, per-segment
>>information.
>>
>
>I think we first need to identify the basic need.
>Don't see why we even need per VMA information so far.
>
Exactly. There is no question that walking page tables will be slower
than having a counter like your patch does; the question is why we
need it.
--
Send instant messages to your online friends http://au.messenger.yahoo.com
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-13 3:51 ` Andi Kleen
2006-06-13 4:27 ` Nick Piggin
@ 2006-06-13 16:59 ` Rohit Seth
2006-06-13 17:28 ` Hugh Dickins
2006-06-13 17:31 ` Andi Kleen
1 sibling, 2 replies; 23+ messages in thread
From: Rohit Seth @ 2006-06-13 16:59 UTC (permalink / raw)
To: Andi Kleen; +Cc: Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
On Tue, 2006-06-13 at 05:51 +0200, Andi Kleen wrote:
> On Monday 12 June 2006 21:42, Rohit Seth wrote:
>
> > I think having this information in each vma keeps the impact (of adding
> > a new counter) very low.
> >
> > The second question is how to advertise this value to user space. Please
> > let me know what suits best among /proc, /sys, or a system call (or if
> > there is any other mechanism, let me know) for per-process, per-segment
> > information.
>
> I think we first need to identify the basic need.
> Don't see why we even need per VMA information so far.
This information is for user-land applications to know which virtual
ranges are being actively used and which are not.
This information can then be fed into a new system call,
sys_change_page_activation(pid, start_va, len, flag). The purpose of
this system call would be to give hints to the kernel that certain
physical pages are okay to inactivate (or vice versa).
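Roughly the shape being proposed, as a sketch only - the call does not
exist and the flag names here are made up:

/* Hypothetical prototype for the proposed hint interface. */
#define PAGE_ACT_INACTIVATE	0	/* these pages are okay to reclaim first */
#define PAGE_ACT_ACTIVATE	1	/* try to keep these pages resident */

long sys_change_page_activation(pid_t pid, unsigned long start_va,
				size_t len, int flag);

/* A monitoring daemon that spots a cold range in /proc/<pid>/maps could
 * then hint:
 *	sys_change_page_activation(pid, range_start, range_len,
 *				   PAGE_ACT_INACTIVATE);
 */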
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-13 16:59 ` Rohit Seth
@ 2006-06-13 17:28 ` Hugh Dickins
2006-06-13 18:09 ` Rohit Seth
2006-06-13 17:31 ` Andi Kleen
1 sibling, 1 reply; 23+ messages in thread
From: Hugh Dickins @ 2006-06-13 17:28 UTC (permalink / raw)
To: Rohit Seth; +Cc: Andi Kleen, Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
On Tue, 13 Jun 2006, Rohit Seth wrote:
> On Tue, 2006-06-13 at 05:51 +0200, Andi Kleen wrote:
> >
> > I think we first need to identify the basic need.
> > Don't see why we even need per VMA information so far.
>
> > This information is for user-land applications to know which virtual
> > ranges are being actively used and which are not.
> > This information can then be fed into a new system call,
> > sys_change_page_activation(pid, start_va, len, flag). The purpose of
> > this system call would be to give hints to the kernel that certain
> > physical pages are okay to inactivate (or vice versa).
Then perhaps you want a sys_report_page_activation(pid, start_va, len, ...)
which would examine and report on the range in question, instead of adding
your count to so many vmas on which this will never be used.
Though your syscall sounds like pid_madvise: perhaps the call name
should be less specific and left to the flags (come, gentle syscall
multiplexing flames, and warm me).
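Illustratively - neither call exists, and the names are placeholders:

/* A pid_madvise-style call: the behaviour (report on the range, or hint
 * that it may be inactivated) is selected by the advice/flags argument. */
long sys_pid_madvise(pid_t pid, unsigned long start_va, size_t len,
		     int advice);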
Looking through the existing fields of a vma, it seems a vm_area_struct
would commonly be on clean cachelines: your count making one of them
now commonly and bouncily dirty.
Hugh
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-13 17:28 ` Hugh Dickins
@ 2006-06-13 18:09 ` Rohit Seth
0 siblings, 0 replies; 23+ messages in thread
From: Rohit Seth @ 2006-06-13 18:09 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andi Kleen, Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
On Tue, 2006-06-13 at 18:28 +0100, Hugh Dickins wrote:
> On Tue, 13 Jun 2006, Rohit Seth wrote:
> > On Tue, 2006-06-13 at 05:51 +0200, Andi Kleen wrote:
> > >
> > > I think we first need to identify the basic need.
> > > Don't see why we even need per VMA information so far.
> >
> > This information is for user-land applications to know which virtual
> > ranges are being actively used and which are not.
> > This information can then be fed into a new system call,
> > sys_change_page_activation(pid, start_va, len, flag). The purpose of
> > this system call would be to give hints to the kernel that certain
> > physical pages are okay to inactivate (or vice versa).
>
> Then perhaps you want a sys_report_page_activation(pid, start_va, len, ...)
> which would examine and report on the range in question, instead of adding
> your count to so many vmas on which this will never be used.
>
That will reduce the cost, since we would not be traversing the whole
process address space. But for a given length we will still need to
traverse the page tables.
On the positive side, this interface can give the user more specific
information in terms of page attributes. I'm fine with this interface if
others are okay. Andi?
> Though your syscall sounds like pid_madvise: perhaps the call name
> should be less specific and left to the flags (come, gentle syscall
> multiplexing flames, and warm me).
>
Agreed.
> Looking through the existing fields of a vma, it seems a vm_area_struct
> would commonly be on clean cachelines: your count making one of them
> now commonly and bouncily dirty.
The additional cost of this counter will be a long's worth of extra
memory per segment, an atomic operation where the ptl is taken, and
dirtying an additional cache line. I overlooked the last two cost
factors earlier.
-rohit
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-13 16:59 ` Rohit Seth
2006-06-13 17:28 ` Hugh Dickins
@ 2006-06-13 17:31 ` Andi Kleen
1 sibling, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2006-06-13 17:31 UTC (permalink / raw)
To: rohitseth; +Cc: Nick Piggin, Andrew Morton, Linux-mm, Linux-kernel
> This information is for user-land applications to know which virtual
> ranges are being actively used and which are not.
If you think the kernel needs better information on that, wouldn't it be
better to use the hardware's page accessed bits more aggressively?
Before giving up and adding hacks like the one you're proposing, it would
be better to explore fully automatic mechanisms thoroughly.
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 1:33 [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it Rohit Seth
2006-06-10 2:42 ` Andrew Morton
2006-06-10 7:35 ` Nick Piggin
@ 2006-06-11 16:09 ` Arjan van de Ven
2006-06-12 11:17 ` Andi Kleen
2006-06-12 16:43 ` Christoph Lameter
3 siblings, 1 reply; 23+ messages in thread
From: Arjan van de Ven @ 2006-06-11 16:09 UTC (permalink / raw)
To: rohitseth; +Cc: Andrew Morton, Linux-mm, Linux-kernel
On Fri, 2006-06-09 at 18:33 -0700, Rohit Seth wrote:
> Below is a patch that adds the number of physical pages that each vma in
> a process is using, exporting this information to user space through
> the /proc/<pid>/maps interface.
Is it really worth bloating the vma struct for this? There are quite a
few workloads that have a gazillion vmas, and this patch adds both
memory usage and cache pressure to those workloads...
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-11 16:09 ` Arjan van de Ven
@ 2006-06-12 11:17 ` Andi Kleen
2006-06-12 12:49 ` Jan Engelhardt
0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-06-12 11:17 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: rohitseth, Andrew Morton, Linux-mm, Linux-kernel
On Sunday 11 June 2006 18:09, Arjan van de Ven wrote:
> On Fri, 2006-06-09 at 18:33 -0700, Rohit Seth wrote:
> > Below is a patch that adds the number of physical pages that each vma
> > in a process is using, exporting this information to user space through
> > the /proc/<pid>/maps interface.
>
> Is it really worth bloating the vma struct for this? There are quite a
> few workloads that have a gazillion vmas, and this patch adds both
> memory usage and cache pressure to those workloads...
I agree it's a bad idea. smaps is only a debugging kludge anyways
and it's not a good idea to bloat core data structures for it.
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-12 11:17 ` Andi Kleen
@ 2006-06-12 12:49 ` Jan Engelhardt
2006-06-12 12:54 ` Andi Kleen
0 siblings, 1 reply; 23+ messages in thread
From: Jan Engelhardt @ 2006-06-12 12:49 UTC (permalink / raw)
To: Andi Kleen
Cc: Arjan van de Ven, rohitseth, Andrew Morton, Linux-mm, Linux-kernel
>
>I agree it's a bad idea. smaps is only a debugging kludge anyways
>and it's not a good idea to bloat core data structures for it.
>
Is there a way to disable it (smaps), then?
Jan Engelhardt
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-12 12:49 ` Jan Engelhardt
@ 2006-06-12 12:54 ` Andi Kleen
0 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2006-06-12 12:54 UTC (permalink / raw)
To: Jan Engelhardt
Cc: Arjan van de Ven, rohitseth, Andrew Morton, Linux-mm, Linux-kernel
On Monday 12 June 2006 14:49, Jan Engelhardt wrote:
> >
> >I agree it's a bad idea. smaps is only a debugging kludge anyways
> >and it's not a good idea to bloat core data structures for it.
> >
> Is there a way to disable it (smaps), then?
Just don't use it?
Not set CONFIG_NUMA?
-Andi
* Re: [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it
2006-06-10 1:33 [PATCH]: Adding a counter in vma to indicate the number of physical pages backing it Rohit Seth
` (2 preceding siblings ...)
2006-06-11 16:09 ` Arjan van de Ven
@ 2006-06-12 16:43 ` Christoph Lameter
3 siblings, 0 replies; 23+ messages in thread
From: Christoph Lameter @ 2006-06-12 16:43 UTC (permalink / raw)
To: Rohit Seth; +Cc: Andrew Morton, Linux-mm, Linux-kernel
On Fri, 9 Jun 2006, Rohit Seth wrote:
> There is currently /proc/<pid>/smaps, which prints detailed information
> about the usage of physical pages, but that is a very expensive operation
> as it traverses all the page tables (for someone who is just interested
> in getting that data for each vma).
Adding a new counter to a vma may cause a bouncing cacheline etc. I
would think that such a counter is far more expensive than occasional
scans through the page tables because someone is curious about the
number of pages in use. /proc/<pid>/numa_maps also uses these scans to
determine dirty pages etc.