* [PATCH 00/14] Zoned VM counters V2
@ 2006-06-08 23:02 Christoph Lameter
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
` (13 more replies)
0 siblings, 14 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:02 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Zone based VM statistics are necessary to determine the state of the memory
in a single zone. In a NUMA system this can be helpful for local
reclaim and other memory optimizations that may be able to shift VM load
in order to get more balanced memory use.
It is also helpful to know how the computing load affects the memory
allocations on various zones.
The patchset introduces a framework for counters that is a cross between the
existing page_state counters (which are simply global counters split per cpu) and
the approach of deferred incremental updates implemented for nr_pagecache.
Small per cpu 8 bit counters are added to struct zone. If the counter
exceeds certain thresholds then the counters are accumulated in an array of
atomic_long in the zone and in a global array that sums up all
zone values.
Access to VM counter information for a zone and for the whole machine
is then possible by simply indexing an array (thanks to Nick Piggin for
pointing out that approach). Access to the total number of pages of
various types no longer requires summing up all per cpu counters.
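A condensed sketch of the update path (the names match the code introduced in
patch 01 below; this is only an illustration, not the full implementation):

	#define STAT_THRESHOLD 32

	static void sketch_mod_zone_counter(struct zone *zone,
					enum zone_stat_item item, int delta)
	{
		/* small per cpu differential for this zone and item */
		s8 *p = &zone_pcp(zone, smp_processor_id())->vm_stat_diff[item];
		long x = *p + delta;

		if (x > STAT_THRESHOLD || x < -STAT_THRESHOLD) {
			/* fold into the zone wide and global atomics */
			atomic_long_add(x, &zone->vm_stat[item]);
			atomic_long_add(x, &vm_stat[item]);
			x = 0;
		}
		*p = x;
	}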
Benefits of this patchset right now:
- zone_reclaim_interval vanishes since VM stats can now determine
when it is worthwhile to scan for reclaimable pages.
- loops over all processors are avoided in writeback and
reclaim paths.
- Get rid of the nr_pagecache atomic for the single processor case
(Marcelo's idea).
- Accurate counters in /sys/devices/system/node/node*/meminfo. The current
counters are based on where the pages were allocated, so they were
not useful for showing the actual use of pages on a node.
- Detailed VM counters available in more /proc and /sys status files.
References to earlier discussions:
V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2
Earlier approaches: http://marc.theaimsgroup.com/?l=linux-kernel&m=113460596414687&w=2
Performance tests with AIM7 did not show any regressions; it even seems to be
a tad faster. Tested on ia64/NUMA. Builds fine on i386, SMP / UP.
Changelog
V1->V2:
- Cleanup code, resequence and base patches on 2.6.17-rc6-mm1
- Reduce interrupt holdoffs
- Add zone reclaim interval removal patch
- Rename EVENT_COUNTER to VM_EVENT_COUNTERS (also all variables and functions)
The patchset consists of 14 patches. These are:
01/14 Per zone counter infrastructure
Sets up the functionality to handle per zone counters but does not
define any.
02/14 Add zoned counters to /proc/vmstat
Adds the display of zoned counters
03/14 Conversion of nr_mapped to a per zone counter
Converts nr_mapped and sets up the first per zone counters. This allows
optimizations in various places that avoid looping over counters from all
processors.
04/14 Conversion of nr_pagecache to a per zone counter
Replace the single atomic variable with a per cpu counter. For UP this means
that no atomic operations have to be used for nr_pagecache anymore. Remove
special nr_pagecache code.
05/14 Use zoned counters instead of zone_reclaim_interval
Replace the zone_reclaim_interval logic with a check for
unmapped pages.
06/14 Extend proc per node, per zone stats by adding per zone counters
Adds new counters to various places where we display counters.
07/14 Conversion of nr_slab to a per zone counter
This avoids looping over processors in the reclaim code and allows accurate
accounting of slab use per zone.
08/14 Conversion of nr_pagetable to a per zone counter
Allows accurate accounting of pagetable pages per zone.
09/14 Conversion of nr_dirty to a per zone counter
Avoids loop over processors in writeback state determination
10/14 Conversion of nr_writeback to a per zone counter
Avoids loop over processors in writeback state determination.
11/14 Conversion of nr_unstable to a per zone counter
Avoids loop over processors in writeback state determination.
12/14 Remove get_page_state functions
There is no need anymore for the get_page_state functions, so remove them.
13/14 Convert nr_bounce to a per zone counter
nr_bounce also counts a type of page.
14/14 Remove writeback structures
There is really no need anymore to cache writeback information since
the counters are readily available. Remove the writeback information
structure.
* [PATCH 01/14] Per zone counter functionality
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
@ 2006-06-08 23:02 ` Christoph Lameter
2006-06-09 4:00 ` Andrew Morton
2006-06-09 4:28 ` Andi Kleen
2006-06-08 23:02 ` [PATCH 02/14] Include per zone counters in /proc/vmstat Christoph Lameter
` (12 subsequent siblings)
13 siblings, 2 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:02 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Per zone counter infrastructure
The counters that we currently have for the VM are split per processor. They
count the counter increments/decrements occurring per cpu. However, this has
no relation to the pages in use in a zone and we cannot tell, for example, how many
ZONE_DMA pages are dirty. So we are blind to potential imbalances in the use
of the various zones. In a NUMA system we cannot tell what pages are used for
what purpose. If we knew then we could put measures into the VM to balance
the use of memory between different zones and different nodes in a NUMA system.
For example it would be possible to limit the dirty pages per node so that
fast local memory is kept available even if a process is dirtying huge numbers
of pages.
Another example is zone reclaim. We do not know how many unmapped pages
exist per zone. So we just have to try to reclaim. If it is not working
then we pause and try again later. It would be better if we knew when
it makes sense to reclaim unmapped pages from a zone. This patchset allows
the determination of the number of unmapped pages per zone. We can remove
the zone reclaim interval with the counters introduced here.
Furthermore, the ability to have various usage statistics available will
allow the development of new NUMA balancing algorithms that may be able
to improve the decision making in the scheduler about when to move a process
to another node. Hopefully this will also enable automatic page migration
through a user space program that can analyse the memory load distribution
and then rebalance memory use in order to increase performance.
The counter framework here implements differential counters for each processor
in struct zone. The differential counters are consolidated when a threshold
is exceeded (as done in the current implementation for nr_pagecache), when
slab reaping occurs or when a consolidation function is called.
Consolidation uses atomic operations and accumulates counters per zone in
the zone structure and also globally in the vm_stat array. VM functions can
access the counts by simply indexing a global or zone specific array.
The arrangement of counters in an array also simplifies processing when output
has to be generated for /proc/*.
Counter updates can be triggered by calling *_zone_page_state or
__*_zone_page_state. The __ variants may be used when it is known that
interrupts are disabled.
Specially optimized increment and decrement functions are provided. These
can avoid certain checks and use increment or decrement instructions that
an architecture may provide.
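A minimal usage sketch (NR_FOO is a hypothetical counter item used purely for
illustration; this patch defines no items yet):

	/* interrupt state unknown: use the irq safe variants */
	inc_zone_page_state(page, NR_FOO);
	mod_zone_page_state(page_zone(page), NR_FOO, -3);

	/* caller already runs with interrupts disabled: the __ variants suffice */
	__dec_zone_page_state(page, NR_FOO);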
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:20:10.476033192 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:42:25.204930191 -0700
@@ -46,6 +46,19 @@ struct zone_padding {
#define ZONE_PADDING(name)
#endif
+enum zone_stat_item {
+ NR_STAT_ITEMS };
+
+#ifdef CONFIG_SMP
+typedef atomic_long_t vm_stat_t;
+#define VM_STAT_GET(x) atomic_long_read(&(x))
+#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
+#else
+typedef unsigned long vm_stat_t;
+#define VM_STAT_GET(x) (x)
+#define VM_STAT_ADD(x,v) (x) += (v)
+#endif
+
struct per_cpu_pages {
int count; /* number of pages in the list */
int high; /* high watermark, emptying needed */
@@ -55,6 +68,10 @@ struct per_cpu_pages {
struct per_cpu_pageset {
struct per_cpu_pages pcp[2]; /* 0: hot. 1: cold */
+#ifdef CONFIG_SMP
+ s8 vm_stat_diff[NR_STAT_ITEMS];
+#endif
+
#ifdef CONFIG_NUMA
unsigned long numa_hit; /* allocated in intended node */
unsigned long numa_miss; /* allocated in non intended node */
@@ -170,6 +187,8 @@ struct zone {
/* A count of how many reclaimers are scanning this zone */
atomic_t reclaim_in_progress;
+ /* Zone statistics */
+ vm_stat_t vm_stat[NR_STAT_ITEMS];
/*
* timestamp (in jiffies) of the last zone reclaim that did not
* result in freeing of pages. This is used to avoid repeated scans
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:20:10.627391012 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:42:25.205906693 -0700
@@ -233,6 +233,52 @@ extern void __mod_page_state_offset(unsi
} while (0)
/*
+ * Zone based accounting with per cpu differentials.
+ */
+extern vm_stat_t vm_stat[NR_STAT_ITEMS];
+
+static inline unsigned long global_page_state(enum zone_stat_item item)
+{
+ long x = VM_STAT_GET(vm_stat[item]);
+#ifdef CONFIG_SMP
+ if (x < 0)
+ x = 0;
+#endif
+ return x;
+}
+
+static inline unsigned long zone_page_state(struct zone *zone,
+ enum zone_stat_item item)
+{
+ long x = VM_STAT_GET(zone->vm_stat[item]);
+#ifdef CONFIG_SMP
+ if (x < 0)
+ x = 0;
+#endif
+ return x;
+}
+
+#ifdef CONFIG_NUMA
+unsigned long node_page_state(int node, enum zone_stat_item);
+#else
+#define node_page_state(node, item) global_page_state(item)
+#endif
+
+void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
+void __inc_zone_page_state(struct page *, enum zone_stat_item);
+void __dec_zone_page_state(struct page *, enum zone_stat_item);
+
+#define __add_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i, __d)
+#define __sub_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i,-(__d))
+
+void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
+void inc_zone_page_state(struct page *, enum zone_stat_item);
+void dec_zone_page_state(struct page *, enum zone_stat_item);
+
+#define add_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, __d)
+#define sub_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, -(__d))
+
+/*
* Manipulation of page state flags
*/
#define PageLocked(page) \
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:20:11.552138471 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:43:40.691466807 -0700
@@ -628,8 +628,279 @@ static int rmqueue_bulk(struct zone *zon
return i;
}
+/*
+ * Manage combined zone based / global counters
+ *
+ * vm_stat contains the global counters
+ */
+vm_stat_t vm_stat[NR_STAT_ITEMS];
+
+static inline void zone_page_state_add(long x, struct zone *zone,
+ enum zone_stat_item item)
+{
+ VM_STAT_ADD(zone->vm_stat[item], x);
+ VM_STAT_ADD(vm_stat[item], x);
+}
+
+#ifdef CONFIG_SMP
+
+#define STAT_THRESHOLD 32
+
+/*
+ * Determine pointer to currently valid differential byte given a zone and
+ * the item number.
+ *
+ * Preemption must be off
+ */
+static inline s8 *diff_pointer(struct zone *zone, enum zone_stat_item item)
+{
+ return &zone_pcp(zone, smp_processor_id())->vm_stat_diff[item];
+}
+
+/*
+ * For use when we know that interrupts are disabled.
+ */
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ s8 *p;
+ long x;
+
+ p = diff_pointer(zone, item);
+ x = delta + *p;
+
+ if (unlikely(x > STAT_THRESHOLD || x < -STAT_THRESHOLD)) {
+ zone_page_state_add(x, zone, item);
+ x = 0;
+ }
+
+ *p = x;
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+/*
+ * For an unknown interrupt state
+ */
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ __mod_zone_page_state(zone, item, delta);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+/*
+ * Optimized increment and decrement functions.
+ *
+ * These are only for a single page and therefore can take a struct page *
+ * argument instead of struct zone *. This allows the inclusion of the code
+ * generated for page_zone(page) into the optimized functions.
+ *
+ * No overflow check is necessary and therefore the differential can be
+ * incremented or decremented in place which may allow the compilers to
+ * generate better code.
+ *
+ * The increment or decrement is known and therefore one boundary check can
+ * be omitted.
+ *
+ * Some processors have inc/dec instructions that are atomic vs an interrupt.
+ * However, the code must first determine the differential location in a zone
+ * based on the processor number and then inc/dec the counter. There is no
+ * guarantee without disabling preemption that the processor will not change
+ * in between and therefore the atomicity vs. interrupt cannot be exploited
+ * in a useful way here.
+ */
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ zone = page_zone(page);
+ local_irq_save(flags);
+ p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ zone = page_zone(page);
+ local_irq_save(flags);
+ p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+
+/*
+ * Update the zone counters for one cpu.
+ */
+void refresh_cpu_vm_stats(int cpu)
+{
+ struct zone *zone;
+ int i;
+ unsigned long flags;
+
+ for_each_zone(zone) {
+ struct per_cpu_pageset *pcp;
+
+ pcp = zone_pcp(zone, cpu);
+
+ for (i = 0; i < NR_STAT_ITEMS; i++)
+ if (pcp->vm_stat_diff[i]) {
+ local_irq_save(flags);
+ zone_page_state_add(pcp->vm_stat_diff[i],
+ zone, i);
+ pcp->vm_stat_diff[i] = 0;
+ local_irq_restore(flags);
+ }
+ }
+}
+
+static void __refresh_cpu_vm_stats(void *dummy)
+{
+ refresh_cpu_vm_stats(smp_processor_id());
+}
+
+/*
+ * Consolidate all counters.
+ *
+ * Note that the result is less inaccurate but still inaccurate
+ * if concurrent processes are allowed to run.
+ */
+void refresh_vm_stats(void)
+{
+ on_each_cpu(__refresh_cpu_vm_stats, NULL, 0, 1);
+}
+EXPORT_SYMBOL(refresh_vm_stats);
+
+#else /* CONFIG_SMP */
+
+/*
+ * We do not maintain differentials in a single processor configuration.
+ * The functions directly modify the zone and global counters.
+ */
+
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ zone_page_state_add(delta, zone, item);
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(delta, zone, item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(-1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(-1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+#endif
+
#ifdef CONFIG_NUMA
/*
+ * Determine the per node value of a stat item. This is done by cycling
+ * through all the zones of a node.
+ */
+unsigned long node_page_state(int node, enum zone_stat_item item)
+{
+ struct zone *zones = NODE_DATA(node)->node_zones;
+ int i;
+ long v = 0;
+
+ for (i = 0; i < MAX_NR_ZONES; i++)
+ v += VM_STAT_GET(zones[i].vm_stat[item]);
+ if (v < 0)
+ v = 0;
+ return v;
+}
+EXPORT_SYMBOL(node_page_state);
+
+/*
* Called from the slab reaper to drain pagesets on a particular node that
* belong to the currently executing processor.
* Note that this function must be called with the thread pinned to
@@ -2278,6 +2549,7 @@ static void __meminit free_area_init_cor
zone->nr_scan_inactive = 0;
zone->nr_active = 0;
zone->nr_inactive = 0;
+ memset(zone->vm_stat, 0, sizeof(zone->vm_stat));
atomic_set(&zone->reclaim_in_progress, 0);
if (!size)
continue;
@@ -2661,7 +2933,8 @@ static int page_alloc_cpu_notify(struct
}
local_irq_enable();
- }
+ refresh_cpu_vm_stats(cpu);
+ }
return NOTIFY_OK;
}
#endif /* CONFIG_HOTPLUG_CPU */
Index: linux-2.6.17-rc6-mm1/include/linux/gfp.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/gfp.h 2006-06-08 15:20:10.249484712 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/gfp.h 2006-06-08 15:43:40.689513803 -0700
@@ -163,4 +163,12 @@ void drain_node_pages(int node);
static inline void drain_node_pages(int node) { };
#endif
+#ifdef CONFIG_SMP
+void refresh_cpu_vm_stats(int);
+void refresh_vm_stats(void);
+#else
+static inline void refresh_cpu_vm_stats(int cpu) { };
+static inline void refresh_vm_stats(void) { };
+#endif
+
#endif /* __LINUX_GFP_H */
Index: linux-2.6.17-rc6-mm1/mm/slab.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/slab.c 2006-06-08 15:20:11.585339541 -0700
+++ linux-2.6.17-rc6-mm1/mm/slab.c 2006-06-08 15:42:25.210789203 -0700
@@ -3826,6 +3826,7 @@ next:
check_irq_on();
mutex_unlock(&cache_chain_mutex);
next_reap_node();
+ refresh_cpu_vm_stats(smp_processor_id());
/* Set up the next iteration */
schedule_delayed_work(&__get_cpu_var(reap_work), REAPTIMEOUT_CPUC);
}
* [PATCH 02/14] Include per zone counters in /proc/vmstat
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
@ 2006-06-08 23:02 ` Christoph Lameter
2006-06-08 23:02 ` [PATCH 03/14] Conversion of nr_mapped to per zone counter Christoph Lameter
` (11 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:02 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Add zoned counters to /proc/vmstat
This makes vmstat print counters from a combined array of zoned counters
plus the current page_state (which will later be replaced by the
event counter patchset).
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 13:51:16.145258328 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 13:53:18.336915650 -0700
@@ -2800,6 +2800,9 @@ struct seq_operations zoneinfo_op = {
};
static char *vmstat_text[] = {
+ /* Zoned VM counters */
+
+ /* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
@@ -2857,19 +2860,25 @@ static char *vmstat_text[] = {
static void *vmstat_start(struct seq_file *m, loff_t *pos)
{
+ unsigned long *v;
struct page_state *ps;
+ int i;
if (*pos >= ARRAY_SIZE(vmstat_text))
return NULL;
- ps = kmalloc(sizeof(*ps), GFP_KERNEL);
- m->private = ps;
- if (!ps)
+ v = kmalloc(NR_STAT_ITEMS * sizeof(unsigned long)
+ + sizeof(struct page_state), GFP_KERNEL);
+ m->private = v;
+ if (!v)
return ERR_PTR(-ENOMEM);
+ for (i = 0; i < NR_STAT_ITEMS; i++)
+ v[i] = global_page_state(i);
+ ps = (struct page_state *)(v + NR_STAT_ITEMS);
get_full_page_state(ps);
ps->pgpgin /= 2; /* sectors -> kbytes */
ps->pgpgout /= 2;
- return (unsigned long *)ps + *pos;
+ return v + *pos;
}
static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)
* [PATCH 03/14] Conversion of nr_mapped to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
2006-06-08 23:02 ` [PATCH 02/14] Include per zone counters in /proc/vmstat Christoph Lameter
@ 2006-06-08 23:02 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 04/14] Conversion of nr_pagecache " Christoph Lameter
` (10 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:02 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_mapped to a per zone counter
nr_mapped is important because it allows a determination of how many pages of a
zone are not mapped, which allows a more efficient means of determining
when we need to reclaim memory in a zone.
We take the nr_mapped field out of the page state structure and
define a new per zone counter named NR_MAPPED.
We replace the use of nr_mapped in various kernel locations. This avoids
looping over all processors in try_to_free_pages(), writeback and reclaim
(swap + zone reclaim).
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 15:20:05.426540998 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:44:58.889749567 -0700
@@ -44,18 +44,18 @@ static ssize_t node_read_meminfo(struct
unsigned long inactive;
unsigned long active;
unsigned long free;
+ unsigned long nr_mapped;
si_meminfo_node(&i, nid);
get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
+ nr_mapped = node_page_state(nid, NR_MAPPED);
/* Check for negative values in these approximate counters */
if ((long)ps.nr_dirty < 0)
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_mapped < 0)
- ps.nr_mapped = 0;
if ((long)ps.nr_slab < 0)
ps.nr_slab = 0;
@@ -84,7 +84,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
- nid, K(ps.nr_mapped),
+ nid, K(nr_mapped),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 15:20:08.334564156 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 15:44:58.890726069 -0700
@@ -190,7 +190,7 @@ static int meminfo_read_proc(char *page,
K(i.freeswap),
K(ps.nr_dirty),
K(ps.nr_writeback),
- K(ps.nr_mapped),
+ K(global_page_state(NR_MAPPED)),
K(ps.nr_slab),
K(allowed),
K(committed),
Index: linux-2.6.17-rc6-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/vmscan.c 2006-06-08 15:20:11.595104562 -0700
+++ linux-2.6.17-rc6-mm1/mm/vmscan.c 2006-06-08 15:44:58.891702571 -0700
@@ -997,7 +997,7 @@ unsigned long try_to_free_pages(struct z
}
for (priority = DEF_PRIORITY; priority >= 0; priority--) {
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_MAPPED);
sc.nr_scanned = 0;
if (!priority)
disable_swap_token();
@@ -1082,7 +1082,7 @@ loop_again:
total_scanned = 0;
nr_reclaimed = 0;
sc.may_writepage = !laptop_mode,
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_MAPPED);
inc_page_state(pageoutrun);
@@ -1417,7 +1417,7 @@ unsigned long shrink_all_memory(unsigned
for (prio = DEF_PRIORITY; prio >= 0; prio--) {
unsigned long nr_to_scan = nr_pages - ret;
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_MAPPED);
sc.nr_scanned = 0;
ret += shrink_all_zones(nr_to_scan, prio, pass, &sc);
@@ -1559,7 +1559,7 @@ static int __zone_reclaim(struct zone *z
struct scan_control sc = {
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_swap = !!(zone_reclaim_mode & RECLAIM_SWAP),
- .nr_mapped = read_page_state(nr_mapped),
+ .nr_mapped = global_page_state(NR_MAPPED),
.swap_cluster_max = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-08 15:20:11.553114973 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-08 15:44:58.892679074 -0700
@@ -111,7 +111,7 @@ static void get_writeback_state(struct w
{
wbs->nr_dirty = read_page_state(nr_dirty);
wbs->nr_unstable = read_page_state(nr_unstable);
- wbs->nr_mapped = read_page_state(nr_mapped);
+ wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = read_page_state(nr_writeback);
}
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:44:55.859663761 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:44:58.893655576 -0700
@@ -1815,7 +1815,7 @@ void show_free_areas(void)
ps.nr_unstable,
nr_free_pages(),
ps.nr_slab,
- ps.nr_mapped,
+ global_page_state(NR_MAPPED),
ps.nr_page_table_pages);
for_each_zone(zone) {
@@ -2801,13 +2801,13 @@ struct seq_operations zoneinfo_op = {
static char *vmstat_text[] = {
/* Zoned VM counters */
+ "nr_mapped",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_mapped",
"nr_slab",
"pgpgin",
Index: linux-2.6.17-rc6-mm1/mm/rmap.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/rmap.c 2006-06-08 15:20:11.579480529 -0700
+++ linux-2.6.17-rc6-mm1/mm/rmap.c 2006-06-08 15:44:58.894632078 -0700
@@ -455,7 +455,7 @@ static void __page_set_anon_rmap(struct
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_MAPPED);
}
/**
@@ -499,7 +499,7 @@ void page_add_new_anon_rmap(struct page
void page_add_file_rmap(struct page *page)
{
if (atomic_inc_and_test(&page->_mapcount))
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_MAPPED);
}
/**
@@ -531,7 +531,7 @@ void page_remove_rmap(struct page *page)
*/
if (page_test_and_clear_dirty(page))
set_page_dirty(page);
- __dec_page_state(nr_mapped);
+ __dec_zone_page_state(page, NR_MAPPED);
}
}
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:42:25.204930191 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:44:58.895608580 -0700
@@ -47,6 +47,9 @@ struct zone_padding {
#endif
enum zone_stat_item {
+ NR_MAPPED, /* mapped into pagetables.
+ only modified from process context */
+
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:42:25.205906693 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:44:58.896585082 -0700
@@ -123,8 +123,6 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_mapped; /* mapped into pagetables.
- * only modified from process context */
unsigned long nr_slab; /* In slab */
#define GET_PAGE_STATE_LAST nr_slab
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 15:20:11.590222051 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 15:44:58.896585082 -0700
@@ -394,7 +394,7 @@ static int prefetch_suitable(void)
* even if the slab is being allocated on a remote node. This
* would be expensive to fix and not of great significance.
*/
- limit = ps.nr_mapped + ps.nr_slab + ps.nr_dirty +
+ limit = global_page_state(NR_MAPPED) + ps.nr_slab + ps.nr_dirty +
ps.nr_unstable + total_swapcache_pages;
if (limit > ns->prefetch_watermark) {
node_clear(node, sp_stat.prefetch_nodes);
Index: linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/i386/mm/pgtable.c 2006-06-05 17:57:02.000000000 -0700
+++ linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c 2006-06-08 15:45:17.903220642 -0700
@@ -61,7 +61,7 @@ void show_mem(void)
get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
- printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
+ printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_MAPPED));
printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
}
* [PATCH 04/14] Conversion of nr_pagecache to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (2 preceding siblings ...)
2006-06-08 23:02 ` [PATCH 03/14] Conversion of nr_mapped to per zone counter Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval Christoph Lameter
` (9 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_pagecache to a per zone counter
Currently a single atomic variable is used to establish the size of the page
cache in the whole machine. The zoned VM counters have the same method of
implementation as the nr_pagecache code but also allow the determination
of the pagecache size per zone.
Remove the special implementation for nr_pagecache and make it a zoned
counter.
Updates of the page cache counters are always performed with interrupts off.
We can therefore use the __ variant here.
This will make UP no longer require atomic operations for nr_pagecache.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/include/linux/pagemap.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/pagemap.h 2006-06-08 13:03:39.304520730 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/pagemap.h 2006-06-08 14:07:35.879548728 -0700
@@ -115,51 +115,6 @@ int add_to_page_cache_lru(struct page *p
extern void remove_from_page_cache(struct page *page);
extern void __remove_from_page_cache(struct page *page);
-extern atomic_t nr_pagecache;
-
-#ifdef CONFIG_SMP
-
-#define PAGECACHE_ACCT_THRESHOLD max(16, NR_CPUS * 2)
-DECLARE_PER_CPU(long, nr_pagecache_local);
-
-/*
- * pagecache_acct implements approximate accounting for pagecache.
- * vm_enough_memory() do not need high accuracy. Writers will keep
- * an offset in their per-cpu arena and will spill that into the
- * global count whenever the absolute value of the local count
- * exceeds the counter's threshold.
- *
- * MUST be protected from preemption.
- * current protection is mapping->page_lock.
- */
-static inline void pagecache_acct(int count)
-{
- long *local;
-
- local = &__get_cpu_var(nr_pagecache_local);
- *local += count;
- if (*local > PAGECACHE_ACCT_THRESHOLD || *local < -PAGECACHE_ACCT_THRESHOLD) {
- atomic_add(*local, &nr_pagecache);
- *local = 0;
- }
-}
-
-#else
-
-static inline void pagecache_acct(int count)
-{
- atomic_add(count, &nr_pagecache);
-}
-#endif
-
-static inline unsigned long get_page_cache_size(void)
-{
- int ret = atomic_read(&nr_pagecache);
- if (unlikely(ret < 0))
- ret = 0;
- return ret;
-}
-
/*
* Return byte-offset into filesystem object for page.
*/
Index: linux-2.6.17-rc6-mm1/mm/swap_state.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_state.c 2006-06-08 13:03:39.924599579 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_state.c 2006-06-08 14:07:35.880525230 -0700
@@ -89,7 +89,7 @@ static int __add_to_swap_cache(struct pa
SetPageSwapCache(page);
set_page_private(page, entry.val);
total_swapcache_pages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_PAGECACHE);
}
write_unlock_irq(&swapper_space.tree_lock);
radix_tree_preload_end();
@@ -135,7 +135,7 @@ void __delete_from_swap_cache(struct pag
set_page_private(page, 0);
ClearPageSwapCache(page);
total_swapcache_pages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_PAGECACHE);
INC_CACHE_INFO(del_total);
}
Index: linux-2.6.17-rc6-mm1/mm/filemap.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/filemap.c 2006-06-08 13:03:39.864056448 -0700
+++ linux-2.6.17-rc6-mm1/mm/filemap.c 2006-06-08 14:07:35.881501732 -0700
@@ -126,7 +126,7 @@ void __remove_from_page_cache(struct pag
radix_tree_delete(&mapping->page_tree, page->index);
page->mapping = NULL;
mapping->nrpages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_PAGECACHE);
}
EXPORT_SYMBOL(__remove_from_page_cache);
@@ -424,7 +424,7 @@ int add_to_page_cache(struct page *page,
page->mapping = mapping;
page->index = offset;
mapping->nrpages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_PAGECACHE);
}
write_unlock_irq(&mapping->tree_lock);
radix_tree_preload_end();
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 13:55:23.119232821 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 14:07:35.883454737 -0700
@@ -1583,12 +1583,6 @@ static void show_node(struct zone *zone)
*/
static DEFINE_PER_CPU(struct page_state, page_states) = {0};
-atomic_t nr_pagecache = ATOMIC_INIT(0);
-EXPORT_SYMBOL(nr_pagecache);
-#ifdef CONFIG_SMP
-DEFINE_PER_CPU(long, nr_pagecache_local) = 0;
-#endif
-
static void __get_page_state(struct page_state *ret, int nr, cpumask_t *cpumask)
{
unsigned cpu;
@@ -2802,6 +2796,7 @@ struct seq_operations zoneinfo_op = {
static char *vmstat_text[] = {
/* Zoned VM counters */
"nr_mapped",
+ "nr_pagecache",
/* Page state */
"nr_dirty",
Index: linux-2.6.17-rc6-mm1/mm/mmap.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/mmap.c 2006-06-08 13:03:39.900187026 -0700
+++ linux-2.6.17-rc6-mm1/mm/mmap.c 2006-06-08 14:07:35.884431239 -0700
@@ -96,7 +96,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;
- free = get_page_cache_size();
+ free = global_page_state(NR_PAGECACHE);
free += nr_swap_pages;
/*
Index: linux-2.6.17-rc6-mm1/mm/nommu.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/nommu.c 2006-06-05 17:57:02.000000000 -0700
+++ linux-2.6.17-rc6-mm1/mm/nommu.c 2006-06-08 14:07:36.292609102 -0700
@@ -1122,7 +1122,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;
- free = get_page_cache_size();
+ free = global_page_state(NR_PAGECACHE);
free += nr_swap_pages;
/*
Index: linux-2.6.17-rc6-mm1/arch/sparc64/kernel/sys_sunos32.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/sparc64/kernel/sys_sunos32.c 2006-06-08 13:03:31.354816936 -0700
+++ linux-2.6.17-rc6-mm1/arch/sparc64/kernel/sys_sunos32.c 2006-06-08 14:07:36.469355975 -0700
@@ -155,7 +155,7 @@ asmlinkage int sunos_brk(u32 baddr)
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_PAGECACHE);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.17-rc6-mm1/arch/sparc/kernel/sys_sunos.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/sparc/kernel/sys_sunos.c 2006-06-08 13:03:31.362628953 -0700
+++ linux-2.6.17-rc6-mm1/arch/sparc/kernel/sys_sunos.c 2006-06-08 14:07:36.481074000 -0700
@@ -196,7 +196,7 @@ asmlinkage int sunos_brk(unsigned long b
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_PAGECACHE);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 13:55:23.115326813 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 14:07:36.482050502 -0700
@@ -142,7 +142,7 @@ static int meminfo_read_proc(char *page,
allowed = ((totalram_pages - hugetlb_total_pages())
* sysctl_overcommit_ratio / 100) + total_swap_pages;
- cached = get_page_cache_size() - total_swapcache_pages - i.bufferram;
+ cached = global_page_state(NR_PAGECACHE) - total_swapcache_pages - i.bufferram;
if (cached < 0)
cached = 0;
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 13:55:23.121185825 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 14:07:36.482050502 -0700
@@ -49,7 +49,7 @@ struct zone_padding {
enum zone_stat_item {
NR_MAPPED, /* mapped into pagetables.
only modified from process context */
-
+ NR_PAGECACHE, /* file backed pages */
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
* [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (3 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 04/14] Conversion of nr_pagecache " Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-09 4:00 ` Andrew Morton
2006-06-08 23:03 ` [PATCH 06/14] Add per zone counters to zone node and global VM statistics Christoph Lameter
` (8 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Use zoned counters to remove zone_reclaim_interval
The zone_reclaim_interval was necessary because we were not able to determine
how many unmapped pages exist in a zone. Therefore we had to scan at intervals
to figure out if any additional unmapped pages had been created.
With the zoned counters we now know the number of pagecache pages
and the number of mapped pages in a zone. So we are able to
establish the number of unmapped pages.
Caveat: The number of mapped pages includes anonymous pages.
The current check works but is a bit too cautious. We could perform
zone reclaim down to the last unmapped page if we split NR_MAPPED
into NR_MAPPED_PAGECACHE and NR_MAPPED_ANON. Maybe later.
Drop all support for zone_reclaim_interval.
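For illustration only (the real hunk is in the diff below): the interval check
is replaced by a simple comparison of the two zoned counters,

	if (zone_page_state(zone, NR_PAGECACHE) <
			zone_page_state(zone, NR_MAPPED))
		return 0;	/* nothing clearly reclaimable, go off node */

e.g. a zone with 10000 pagecache pages and 4000 mapped pages has at least
6000 unmapped pagecache pages, so zone reclaim is attempted. Because
NR_MAPPED also counts anonymous pages the difference underestimates the
number of unmapped pagecache pages, hence the cautious behaviour noted above.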
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/swap.h 2006-06-07 22:11:37.574190076 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/swap.h 2006-06-07 22:17:53.246235576 -0700
@@ -190,7 +190,6 @@ extern long vm_total_pages;
#ifdef CONFIG_NUMA
extern int zone_reclaim_mode;
-extern int zone_reclaim_interval;
extern int zone_reclaim(struct zone *, gfp_t, unsigned int);
#else
#define zone_reclaim_mode 0
Index: linux-2.6.17-rc6-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/vmscan.c 2006-06-07 22:11:57.798523443 -0700
+++ linux-2.6.17-rc6-mm1/mm/vmscan.c 2006-06-07 22:22:04.724800800 -0700
@@ -1534,11 +1534,6 @@ int zone_reclaim_mode __read_mostly;
#define RECLAIM_SLAB (1<<3) /* Do a global slab shrink if the zone is out of memory */
/*
- * Mininum time between zone reclaim scans
- */
-int zone_reclaim_interval __read_mostly = 30*HZ;
-
-/*
* Priority for ZONE_RECLAIM. This determines the fraction of pages
* of a node considered for each zone_reclaim. 4 scans 1/16th of
* a zone.
@@ -1604,16 +1599,6 @@ static int __zone_reclaim(struct zone *z
p->reclaim_state = NULL;
current->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE);
-
- if (nr_reclaimed == 0) {
- /*
- * We were unable to reclaim enough pages to stay on node. We
- * now allow off node accesses for a certain time period before
- * trying again to reclaim pages from the local zone.
- */
- zone->last_unsuccessful_zone_reclaim = jiffies;
- }
-
return nr_reclaimed >= nr_pages;
}
@@ -1623,13 +1608,14 @@ int zone_reclaim(struct zone *zone, gfp_
int node_id;
/*
- * Do not reclaim if there was a recent unsuccessful attempt at zone
- * reclaim. In that case we let allocations go off node for the
- * zone_reclaim_interval. Otherwise we would scan for each off-node
- * page allocation.
+ * Do not reclaim if there are not enough reclaimable pages in this
+ * zone. We decide this based on the number of mapped pages
+ * in relation to the number of page cache pages in this zone.
+ * If there are more pagecache pages than mapped pages then we can
+ * be certain that pages can be reclaimed.
*/
- if (time_before(jiffies,
- zone->last_unsuccessful_zone_reclaim + zone_reclaim_interval))
+ if (zone_page_state(zone, NR_PAGECACHE) <
+ zone_page_state(zone, NR_MAPPED))
return 0;
/*
Index: linux-2.6.17-rc6-mm1/kernel/sysctl.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/kernel/sysctl.c 2006-06-07 22:11:39.249867545 -0700
+++ linux-2.6.17-rc6-mm1/kernel/sysctl.c 2006-06-07 22:17:26.248884215 -0700
@@ -1027,15 +1027,6 @@ static ctl_table vm_table[] = {
.strategy = &sysctl_intvec,
.extra1 = &zero,
},
- {
- .ctl_name = VM_ZONE_RECLAIM_INTERVAL,
- .procname = "zone_reclaim_interval",
- .data = &zone_reclaim_interval,
- .maxlen = sizeof(zone_reclaim_interval),
- .mode = 0644,
- .proc_handler = &proc_dointvec_jiffies,
- .strategy = &sysctl_jiffies,
- },
#endif
#ifdef CONFIG_X86_32
{
Index: linux-2.6.17-rc6-mm1/include/linux/sysctl.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/sysctl.h 2006-06-07 22:11:37.606414643 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/sysctl.h 2006-06-07 22:22:30.026944631 -0700
@@ -191,7 +191,6 @@ enum
VM_DROP_PAGECACHE=29, /* int: nuke lots of pagecache */
VM_PERCPU_PAGELIST_FRACTION=30,/* int: fraction of pages in each percpu_pagelist */
VM_ZONE_RECLAIM_MODE=31, /* reclaim local zone memory before going off node */
- VM_ZONE_RECLAIM_INTERVAL=32, /* time period to wait after reclaim failure */
VM_PANIC_ON_OOM=33, /* panic at out-of-memory */
VM_VDSO_ENABLED=34, /* map VDSO into new processes? */
VM_SWAP_PREFETCH=35, /* swap prefetch */
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-07 22:11:57.963552285 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-07 22:16:48.392830386 -0700
@@ -192,12 +192,6 @@ struct zone {
/* Zone statistics */
vm_stat_t vm_stat[NR_STAT_ITEMS];
- /*
- * timestamp (in jiffies) of the last zone reclaim that did not
- * result in freeing of pages. This is used to avoid repeated scans
- * if all memory in the zone is in use.
- */
- unsigned long last_unsuccessful_zone_reclaim;
/*
* prev_priority holds the scanning priority for this zone. It is
* [PATCH 06/14] Add per zone counters to zone node and global VM statistics
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (4 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-09 4:01 ` Andrew Morton
2006-06-08 23:03 ` [PATCH 07/14] Conversion of nr_slab to per zone counter Christoph Lameter
` (7 subsequent siblings)
13 siblings, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Extend per node and per zone statistics by printing the additional counters now available.
- Add new counters to per node statistics
- Add new counters to per zone statistics
- Provide an array describing zoned VM counters
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 14:29:45.931956736 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 14:57:17.733967150 -0700
@@ -44,12 +44,14 @@ static ssize_t node_read_meminfo(struct
unsigned long inactive;
unsigned long active;
unsigned long free;
- unsigned long nr_mapped;
+ int j;
+ unsigned long nr[NR_STAT_ITEMS];
si_meminfo_node(&i, nid);
get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
- nr_mapped = node_page_state(nid, NR_MAPPED);
+ for (j = 0; j < NR_STAT_ITEMS; j++)
+ nr[j] = node_page_state(nid, j);
/* Check for negative values in these approximate counters */
if ((long)ps.nr_dirty < 0)
@@ -72,6 +74,7 @@ static ssize_t node_read_meminfo(struct
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
+ "Node %d Pagecache: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -84,7 +87,8 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
- nid, K(nr_mapped),
+ nid, K(nr[NR_MAPPED]),
+ nid, K(nr[NR_PAGECACHE]),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 14:29:46.317675014 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 14:57:05.712250246 -0700
@@ -628,6 +628,8 @@ static int rmqueue_bulk(struct zone *zon
return i;
}
+char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
+
/*
* Manage combined zone based / global counters
*
@@ -2724,6 +2726,11 @@ static int zoneinfo_show(struct seq_file
zone->nr_scan_active, zone->nr_scan_inactive,
zone->spanned_pages,
zone->present_pages);
+ for(i = 0; i < NR_STAT_ITEMS; i++)
+ seq_printf(m, "\n %-8s %lu",
+ vm_stat_item_descr[i],
+ zone_page_state(zone, i));
+
seq_printf(m,
"\n protection: (%lu",
zone->lowmem_reserve[0]);
* [PATCH 07/14] Conversion of nr_slab to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (5 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 06/14] Add per zone counters to zone node and global VM statistics Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 08/14] Conversion of nr_pagetable " Christoph Lameter
` (6 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_slab to a per zone counter
- Allows reclaim to access counter without looping over processor counts.
- Allows accurate statistics on how many pages are used in a zone by
the slab. This may become useful to balance slab allocations over
various zones.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 15:45:42.633134614 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:45:44.040274043 -0700
@@ -58,8 +58,6 @@ static ssize_t node_read_meminfo(struct
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_slab < 0)
- ps.nr_slab = 0;
n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"
@@ -89,7 +87,7 @@ static ssize_t node_read_meminfo(struct
nid, K(ps.nr_writeback),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
- nid, K(ps.nr_slab));
+ nid, K(nr[NR_SLAB]));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
}
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 15:45:25.571691103 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 15:45:44.041250545 -0700
@@ -191,7 +191,7 @@ static int meminfo_read_proc(char *page,
K(ps.nr_dirty),
K(ps.nr_writeback),
K(global_page_state(NR_MAPPED)),
- K(ps.nr_slab),
+ K(global_page_state(NR_SLAB)),
K(allowed),
K(committed),
K(ps.nr_page_table_pages),
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:45:42.636064120 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:45:44.043203549 -0700
@@ -628,7 +628,7 @@ static int rmqueue_bulk(struct zone *zon
return i;
}
-char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
+char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab" };
/*
* Manage combined zone based / global counters
@@ -1810,7 +1810,7 @@ void show_free_areas(void)
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
- ps.nr_slab,
+ global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
ps.nr_page_table_pages);
@@ -2804,13 +2804,13 @@ static char *vmstat_text[] = {
/* Zoned VM counters */
"nr_mapped",
"nr_pagecache",
+ "nr_slab",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_slab",
"pgpgin",
"pgpgout",
Index: linux-2.6.17-rc6-mm1/mm/slab.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/slab.c 2006-06-08 15:42:25.210789203 -0700
+++ linux-2.6.17-rc6-mm1/mm/slab.c 2006-06-08 15:45:44.046133055 -0700
@@ -1525,7 +1525,7 @@ static void *kmem_getpages(struct kmem_c
nr_pages = (1 << cachep->gfporder);
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
atomic_add(nr_pages, &slab_reclaim_pages);
- add_page_state(nr_slab, nr_pages);
+ add_zone_page_state(page_zone(page), NR_SLAB, nr_pages);
for (i = 0; i < nr_pages; i++)
__SetPageSlab(page + i);
return page_address(page);
@@ -1545,7 +1545,7 @@ static void kmem_freepages(struct kmem_c
__ClearPageSlab(page);
page++;
}
- sub_page_state(nr_slab, nr_freed);
+ sub_zone_page_state(page_zone(page), NR_SLAB, nr_freed);
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += nr_freed;
free_pages((unsigned long)addr, cachep->gfporder);
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:45:41.276773291 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:45:44.046133055 -0700
@@ -50,6 +50,7 @@ enum zone_stat_item {
NR_MAPPED, /* mapped into pagetables.
only modified from process context */
NR_PAGECACHE, /* file backed pages */
+ NR_SLAB, /* used by slab allocator */
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:44:58.896585082 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:45:44.047109557 -0700
@@ -123,8 +123,7 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_slab; /* In slab */
-#define GET_PAGE_STATE_LAST nr_slab
+#define GET_PAGE_STATE_LAST nr_page_table_pages
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 15:44:58.896585082 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 15:45:44.048086059 -0700
@@ -394,7 +394,9 @@ static int prefetch_suitable(void)
* even if the slab is being allocated on a remote node. This
* would be expensive to fix and not of great significance.
*/
- limit = global_page_state(NR_MAPPED) + ps.nr_slab + ps.nr_dirty +
+ limit = global_page_state(NR_MAPPED) +
+ global_page_state(NR_SLAB) +
+ ps.nr_dirty +
ps.nr_unstable + total_swapcache_pages;
if (limit > ns->prefetch_watermark) {
node_clear(node, sp_stat.prefetch_nodes);
Index: linux-2.6.17-rc6-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/vmscan.c 2006-06-08 15:45:41.273843785 -0700
+++ linux-2.6.17-rc6-mm1/mm/vmscan.c 2006-06-08 15:45:44.049062561 -0700
@@ -1375,7 +1375,7 @@ unsigned long shrink_all_memory(unsigned
for_each_zone(zone)
lru_pages += zone->nr_active + zone->nr_inactive;
- nr_slab = read_page_state(nr_slab);
+ nr_slab = global_page_state(NR_SLAB);
/* If slab caches are huge, it's better to hit them first */
while (nr_slab >= lru_pages) {
reclaim_state.reclaimed_slab = 0;
Index: linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/i386/mm/pgtable.c 2006-06-08 15:45:17.903220642 -0700
+++ linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c 2006-06-08 15:46:08.094448609 -0700
@@ -62,7 +62,7 @@ void show_mem(void)
printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_MAPPED));
- printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
+ printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
}
--
* [PATCH 08/14] Conversion of nr_pagetable to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (6 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 07/14] Conversion of nr_slab to per zone counter Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 09/14] Conversion of nr_dirty " Christoph Lameter
` (5 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_page_table_pages to a per zone counter
Signed-off-by: Christoph Lameter <clameter@sgi.com>
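For readers skimming the diff, the whole conversion boils down to one pattern; an illustrative sketch only, the real hunks follow:

	/* allocation side (__pte_alloc): charge the zone the new pte page is in */
	mm->nr_ptes++;
	inc_zone_page_state(new, NR_PAGETABLE);

	/* free side (free_pte_range): credit the same zone back */
	dec_zone_page_state(page, NR_PAGETABLE);
	tlb->mm->nr_ptes--;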
Index: linux-2.6.17-rc6-mm1/mm/memory.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/memory.c 2006-06-08 15:20:11.539443944 -0700
+++ linux-2.6.17-rc6-mm1/mm/memory.c 2006-06-08 15:46:10.307202214 -0700
@@ -127,7 +127,7 @@ static void free_pte_range(struct mmu_ga
pmd_clear(pmd);
pte_lock_deinit(page);
pte_free_tlb(tlb, page);
- dec_page_state(nr_page_table_pages);
+ dec_zone_page_state(page, NR_PAGETABLE);
tlb->mm->nr_ptes--;
}
@@ -312,7 +312,7 @@ int __pte_alloc(struct mm_struct *mm, pm
pte_free(new);
} else {
mm->nr_ptes++;
- inc_page_state(nr_page_table_pages);
+ inc_zone_page_state(new, NR_PAGETABLE);
pmd_populate(mm, pmd, new);
}
spin_unlock(&mm->page_table_lock);
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:45:44.043203549 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:46:10.309155218 -0700
@@ -628,7 +628,7 @@ static int rmqueue_bulk(struct zone *zon
return i;
}
-char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab" };
+char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab", "pagetable" };
/*
* Manage combined zone based / global counters
@@ -1812,7 +1812,7 @@ void show_free_areas(void)
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
- ps.nr_page_table_pages);
+ global_page_state(NR_PAGETABLE));
for_each_zone(zone) {
int i;
@@ -2805,12 +2805,12 @@ static char *vmstat_text[] = {
"nr_mapped",
"nr_pagecache",
"nr_slab",
+ "nr_page_table_pages",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
- "nr_page_table_pages",
"pgpgin",
"pgpgout",
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:45:44.047109557 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:46:10.309155218 -0700
@@ -122,8 +122,7 @@ struct page_state {
unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
- unsigned long nr_page_table_pages;/* Pages used for pagetables */
-#define GET_PAGE_STATE_LAST nr_page_table_pages
+#define GET_PAGE_STATE_LAST nr_unstable
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:45:44.046133055 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:46:10.310131720 -0700
@@ -51,6 +51,7 @@ enum zone_stat_item {
only modified from process context */
NR_PAGECACHE, /* file backed pages */
NR_SLAB, /* used by slab allocator */
+ NR_PAGETABLE, /* used for pagetables */
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 15:45:44.041250545 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 15:46:10.311108222 -0700
@@ -194,7 +194,7 @@ static int meminfo_read_proc(char *page,
K(global_page_state(NR_SLAB)),
K(allowed),
K(committed),
- K(ps.nr_page_table_pages),
+ K(global_page_state(NR_PAGETABLE)),
(unsigned long)VMALLOC_TOTAL >> 10,
vmi.used >> 10,
vmi.largest_chunk >> 10
Index: linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/i386/mm/pgtable.c 2006-06-08 15:46:08.094448609 -0700
+++ linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c 2006-06-08 15:46:34.925794955 -0700
@@ -63,7 +63,7 @@ void show_mem(void)
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_MAPPED));
printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
- printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
+ printk(KERN_INFO "%lu pages pagetables\n", global_page_state(NR_PAGETABLE));
}
/*
--
* [PATCH 09/14] Conversion of nr_dirty to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (7 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 08/14] Conversion of nr_pagetable " Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 10/14] Conversion of nr_writeback " Christoph Lameter
` (4 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_dirty to a per zone counter
This makes nr_dirty a per zone counter. Looping over all processors
is avoided during writeback state determination.
The counter aggregation for nr_dirty had to be undone in the NFS layer
since it summed up the page counts from multiple zones.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
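To make the NFS hunks easier to follow: a batched update of the global counter is no longer possible, because every page has to be charged to the zone it actually sits in. The accounting therefore moves to the individual request; sketch only, both calls appear verbatim in the diff below:

	/* old: one bulk adjustment, zones unknown */
	sub_page_state(nr_dirty, res);

	/* new: each request's page is accounted in its own zone */
	inc_zone_page_state(req->wb_page, NR_DIRTY);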
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 14:41:12.656973943 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 14:41:29.656898506 -0700
@@ -628,7 +628,9 @@ static int rmqueue_bulk(struct zone *zon
return i;
}
-char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache", "slab", "pagetable" };
+char *vm_stat_item_descr[NR_STAT_ITEMS] = {
+ "mapped", "pagecache", "slab", "pagetable", "dirty"
+};
/*
* Manage combined zone based / global counters
@@ -1806,7 +1808,7 @@ void show_free_areas(void)
"unstable:%lu free:%u slab:%lu mapped:%lu pagetables:%lu\n",
active,
inactive,
- ps.nr_dirty,
+ global_page_state(NR_DIRTY),
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
@@ -2806,9 +2808,9 @@ static char *vmstat_text[] = {
"nr_pagecache",
"nr_slab",
"nr_page_table_pages",
+ "nr_dirty",
/* Page state */
- "nr_dirty",
"nr_writeback",
"nr_unstable",
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 14:41:12.657950445 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 14:41:13.609063463 -0700
@@ -119,7 +119,6 @@
* commented here.
*/
struct page_state {
- unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-08 14:29:45.935862744 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-08 14:41:13.609063463 -0700
@@ -109,7 +109,7 @@ struct writeback_state
static void get_writeback_state(struct writeback_state *wbs)
{
- wbs->nr_dirty = read_page_state(nr_dirty);
+ wbs->nr_dirty = global_page_state(NR_DIRTY);
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = read_page_state(nr_writeback);
@@ -640,7 +640,7 @@ int __set_page_dirty_nobuffers(struct pa
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
@@ -727,9 +727,9 @@ int test_clear_page_dirty(struct page *p
radix_tree_tag_clear(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
- write_unlock_irqrestore(&mapping->tree_lock, flags);
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ __dec_zone_page_state(page, NR_DIRTY);
+ write_unlock_irqrestore(&mapping->tree_lock, flags);
return 1;
}
write_unlock_irqrestore(&mapping->tree_lock, flags);
@@ -760,7 +760,7 @@ int clear_page_dirty_for_io(struct page
if (mapping) {
if (TestClearPageDirty(page)) {
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ dec_zone_page_state(page, NR_DIRTY);
return 1;
}
return 0;
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 14:41:12.658926947 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 14:41:13.610039965 -0700
@@ -52,6 +52,7 @@ enum zone_stat_item {
NR_PAGECACHE, /* file backed pages */
NR_SLAB, /* used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
+ NR_DIRTY,
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 14:41:11.901161340 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 14:41:13.611016467 -0700
@@ -54,8 +54,6 @@ static ssize_t node_read_meminfo(struct
nr[j] = node_page_state(nid, j);
/* Check for negative values in these approximate counters */
- if ((long)ps.nr_dirty < 0)
- ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
@@ -83,7 +81,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freehigh),
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
- nid, K(ps.nr_dirty),
+ nid, K(nr[NR_DIRTY]),
nid, K(ps.nr_writeback),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
Index: linux-2.6.17-rc6-mm1/fs/fs-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/fs-writeback.c 2006-06-08 13:03:36.220727022 -0700
+++ linux-2.6.17-rc6-mm1/fs/fs-writeback.c 2006-06-08 14:41:13.611992969 -0700
@@ -472,7 +472,7 @@ void sync_inodes_sb(struct super_block *
.range_start = 0,
.range_end = LLONG_MAX,
};
- unsigned long nr_dirty = read_page_state(nr_dirty);
+ unsigned long nr_dirty = global_page_state(NR_DIRTY);
unsigned long nr_unstable = read_page_state(nr_unstable);
wbc.nr_to_write = nr_dirty + nr_unstable +
Index: linux-2.6.17-rc6-mm1/fs/buffer.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/buffer.c 2006-06-08 13:03:36.036168121 -0700
+++ linux-2.6.17-rc6-mm1/fs/buffer.c 2006-06-08 14:41:13.612969471 -0700
@@ -854,7 +854,7 @@ int __set_page_dirty_buffers(struct page
write_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 14:41:12.659903449 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 14:41:13.613945973 -0700
@@ -188,7 +188,7 @@ static int meminfo_read_proc(char *page,
K(i.freeram-i.freehigh),
K(i.totalswap),
K(i.freeswap),
- K(ps.nr_dirty),
+ K(global_page_state(NR_DIRTY)),
K(ps.nr_writeback),
K(global_page_state(NR_MAPPED)),
K(global_page_state(NR_SLAB)),
Index: linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/i386/mm/pgtable.c 2006-06-05 17:57:02.000000000 -0700
+++ linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c 2006-06-08 14:41:13.614922475 -0700
@@ -59,7 +59,7 @@ void show_mem(void)
printk(KERN_INFO "%d pages swap cached\n", cached);
get_page_state(&ps);
- printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
+ printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_DIRTY));
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
Index: linux-2.6.17-rc6-mm1/fs/reiser4/page_cache.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/reiser4/page_cache.c 2006-06-08 13:03:36.616210382 -0700
+++ linux-2.6.17-rc6-mm1/fs/reiser4/page_cache.c 2006-06-08 14:41:13.615898977 -0700
@@ -464,7 +464,7 @@ int set_page_dirty_internal(struct page
if (!TestSetPageDirty(page)) {
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ inc_zone_page_state(page, NR_DIRTY);
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
}
Index: linux-2.6.17-rc6-mm1/fs/nfs/write.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/nfs/write.c 2006-06-08 13:03:36.484382596 -0700
+++ linux-2.6.17-rc6-mm1/fs/nfs/write.c 2006-06-08 14:41:13.616875479 -0700
@@ -497,7 +497,7 @@ nfs_mark_request_dirty(struct nfs_page *
nfs_list_add_request(req, &nfsi->dirty);
nfsi->ndirty++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_dirty);
+ inc_zone_page_state(req->wb_page, NR_DIRTY);
mark_inode_dirty(inode);
}
@@ -598,7 +598,6 @@ nfs_scan_dirty(struct inode *inode, stru
if (nfsi->ndirty != 0) {
res = nfs_scan_lock_dirty(nfsi, dst, idx_start, npages);
nfsi->ndirty -= res;
- sub_page_state(nr_dirty,res);
if ((nfsi->ndirty == 0) != list_empty(&nfsi->dirty))
printk(KERN_ERR "NFS: desynchronized value of nfs_i.ndirty.\n");
}
Index: linux-2.6.17-rc6-mm1/fs/reiser4/as_ops.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/reiser4/as_ops.c 2006-06-08 13:03:36.524419183 -0700
+++ linux-2.6.17-rc6-mm1/fs/reiser4/as_ops.c 2006-06-08 14:41:13.616875479 -0700
@@ -83,7 +83,7 @@ int reiser4_set_page_dirty(struct page *
if (page->mapping) {
assert("vs-1652", page->mapping == mapping);
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page->index,
PAGECACHE_TAG_REISER4_MOVED);
Index: linux-2.6.17-rc6-mm1/fs/nfs/pagelist.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/nfs/pagelist.c 2006-06-05 17:57:02.000000000 -0700
+++ linux-2.6.17-rc6-mm1/fs/nfs/pagelist.c 2006-06-08 14:41:13.617851982 -0700
@@ -315,6 +315,7 @@ nfs_scan_lock_dirty(struct nfs_inode *nf
req->wb_index, NFS_PAGE_TAG_DIRTY);
nfs_list_remove_request(req);
nfs_list_add_request(req, dst);
+ inc_zone_page_state(req->wb_page, NR_DIRTY);
res++;
}
}
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 14:41:11.908973356 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 14:41:13.617851982 -0700
@@ -396,7 +396,7 @@ static int prefetch_suitable(void)
*/
limit = global_page_state(NR_MAPPED) +
global_page_state(NR_SLAB) +
- ps.nr_dirty +
+ global_page_state(NR_DIRTY) +
ps.nr_unstable + total_swapcache_pages;
if (limit > ns->prefetch_watermark) {
node_clear(node, sp_stat.prefetch_nodes);
--
* [PATCH 10/14] Conversion of nr_writeback to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (8 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 09/14] Conversion of nr_dirty " Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 11/14] Conversion of nr_unstable " Christoph Lameter
` (3 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_writeback to per zone counter
Avoids the per-processor loop during writeback state determination.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
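Most of the work is in page-flags.h: the writeback page flag helpers now maintain the zone counter as a side effect. Roughly, the changed macros expand to the following (illustrative only):

	if (!test_and_set_bit(PG_writeback, &(page)->flags))
		inc_zone_page_state(page, NR_WRITEBACK);

	if (test_and_clear_bit(PG_writeback, &(page)->flags))
		dec_zone_page_state(page, NR_WRITEBACK);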
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 15:46:37.439311186 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:46:56.901973196 -0700
@@ -53,9 +53,6 @@ static ssize_t node_read_meminfo(struct
for (j = 0; j < NR_STAT_ITEMS; j++)
nr[j] = node_page_state(nid, j);
- /* Check for negative values in these approximate counters */
- if ((long)ps.nr_writeback < 0)
- ps.nr_writeback = 0;
n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"
@@ -82,7 +79,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
nid, K(nr[NR_DIRTY]),
- nid, K(ps.nr_writeback),
+ nid, K(nr[NR_WRITEBACK]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
nid, K(nr[NR_SLAB]));
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 15:46:37.442240692 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 15:46:56.902949698 -0700
@@ -189,7 +189,7 @@ static int meminfo_read_proc(char *page,
K(i.totalswap),
K(i.freeswap),
K(global_page_state(NR_DIRTY)),
- K(ps.nr_writeback),
+ K(global_page_state(NR_WRITEBACK)),
K(global_page_state(NR_MAPPED)),
K(global_page_state(NR_SLAB)),
K(allowed),
Index: linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/arch/i386/mm/pgtable.c 2006-06-08 15:46:48.173998029 -0700
+++ linux-2.6.17-rc6-mm1/arch/i386/mm/pgtable.c 2006-06-08 15:47:41.868915289 -0700
@@ -30,7 +30,6 @@ void show_mem(void)
struct page *page;
pg_data_t *pgdat;
unsigned long i;
- struct page_state ps;
unsigned long flags;
printk(KERN_INFO "Mem-info:\n");
@@ -58,9 +57,8 @@ void show_mem(void)
printk(KERN_INFO "%d pages shared\n", shared);
printk(KERN_INFO "%d pages swap cached\n", cached);
- get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_DIRTY));
- printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
+ printk(KERN_INFO "%lu pages writeback\n", global_page_state(NR_WRITEBACK));
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_MAPPED));
printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
printk(KERN_INFO "%lu pages pagetables\n", global_page_state(NR_PAGETABLE));
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:46:37.436381680 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:46:56.905879204 -0700
@@ -629,7 +629,7 @@ static int rmqueue_bulk(struct zone *zon
}
char *vm_stat_item_descr[NR_STAT_ITEMS] = {
- "mapped", "pagecache", "slab", "pagetable", "dirty"
+ "mapped", "pagecache", "slab", "pagetable", "dirty", "writeback"
};
/*
@@ -1809,7 +1809,7 @@ void show_free_areas(void)
active,
inactive,
global_page_state(NR_DIRTY),
- ps.nr_writeback,
+ global_page_state(NR_WRITEBACK),
ps.nr_unstable,
nr_free_pages(),
global_page_state(NR_SLAB),
@@ -2809,9 +2809,9 @@ static char *vmstat_text[] = {
"nr_slab",
"nr_page_table_pages",
"nr_dirty",
+ "nr_writeback",
/* Page state */
- "nr_writeback",
"nr_unstable",
"pgpgin",
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:46:37.437358182 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:46:56.906855706 -0700
@@ -119,7 +119,6 @@
* commented here.
*/
struct page_state {
- unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable
@@ -349,7 +348,7 @@ void dec_zone_page_state(struct page *,
do { \
if (!test_and_set_bit(PG_writeback, \
&(page)->flags)) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestSetPageWriteback(page) \
({ \
@@ -357,14 +356,14 @@ void dec_zone_page_state(struct page *,
ret = test_and_set_bit(PG_writeback, \
&(page)->flags); \
if (!ret) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
ret; \
})
#define ClearPageWriteback(page) \
do { \
if (test_and_clear_bit(PG_writeback, \
&(page)->flags)) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestClearPageWriteback(page) \
({ \
@@ -372,7 +371,7 @@ void dec_zone_page_state(struct page *,
ret = test_and_clear_bit(PG_writeback, \
&(page)->flags); \
if (ret) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
ret; \
})
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-08 15:46:37.438334684 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-08 15:46:56.907832208 -0700
@@ -112,7 +112,7 @@ static void get_writeback_state(struct w
wbs->nr_dirty = global_page_state(NR_DIRTY);
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_MAPPED);
- wbs->nr_writeback = read_page_state(nr_writeback);
+ wbs->nr_writeback = global_page_state(NR_WRITEBACK);
}
/*
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:46:37.438334684 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:46:56.908808710 -0700
@@ -53,6 +53,7 @@ enum zone_stat_item {
NR_SLAB, /* used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
NR_DIRTY,
+ NR_WRITEBACK,
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 15:46:37.447123203 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 15:46:56.908808710 -0700
@@ -381,7 +381,7 @@ static int prefetch_suitable(void)
get_page_state_node(&ps, node);
/* We shouldn't prefetch when we are doing writeback */
- if (ps.nr_writeback) {
+ if (global_page_state(NR_WRITEBACK)) {
node_clear(node, sp_stat.prefetch_nodes);
continue;
}
--
* [PATCH 11/14] Conversion of nr_unstable to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (9 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 10/14] Conversion of nr_writeback " Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 12/14] Remove unused get_page_stat functions Christoph Lameter
` (2 subsequent siblings)
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_unstable to a per zone counter
Avoids looping over all processors to establish the writeback state.
This converts the last critical page state counter for the VM, and therefore
GET_PAGE_STATE_LAST becomes invalid. The next patch is needed to
make the kernel compile again.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/fs/fs-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/fs-writeback.c 2006-06-08 15:46:37.440287688 -0700
+++ linux-2.6.17-rc6-mm1/fs/fs-writeback.c 2006-06-08 15:47:46.620574179 -0700
@@ -473,7 +473,7 @@ void sync_inodes_sb(struct super_block *
.range_end = LLONG_MAX,
};
unsigned long nr_dirty = global_page_state(NR_DIRTY);
- unsigned long nr_unstable = read_page_state(nr_unstable);
+ unsigned long nr_unstable = global_page_state(NR_UNSTABLE);
wbc.nr_to_write = nr_dirty + nr_unstable +
(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:46:56.905879204 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:47:46.622527183 -0700
@@ -629,7 +629,8 @@ static int rmqueue_bulk(struct zone *zon
}
char *vm_stat_item_descr[NR_STAT_ITEMS] = {
- "mapped", "pagecache", "slab", "pagetable", "dirty", "writeback"
+ "mapped", "pagecache", "slab", "pagetable", "dirty", "writeback",
+ "unstable"
};
/*
@@ -1810,7 +1811,7 @@ void show_free_areas(void)
inactive,
global_page_state(NR_DIRTY),
global_page_state(NR_WRITEBACK),
- ps.nr_unstable,
+ global_page_state(NR_UNSTABLE),
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_MAPPED),
@@ -2810,10 +2811,9 @@ static char *vmstat_text[] = {
"nr_page_table_pages",
"nr_dirty",
"nr_writeback",
-
- /* Page state */
"nr_unstable",
+ /* Page state */
"pgpgin",
"pgpgout",
"pswpin",
Index: linux-2.6.17-rc6-mm1/fs/nfs/write.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/nfs/write.c 2006-06-08 15:46:37.445170199 -0700
+++ linux-2.6.17-rc6-mm1/fs/nfs/write.c 2006-06-08 15:47:46.624480187 -0700
@@ -525,7 +525,7 @@ nfs_mark_request_commit(struct nfs_page
nfs_list_add_request(req, &nfsi->commit);
nfsi->ncommit++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_unstable);
+ inc_zone_page_state(req->wb_page, NR_UNSTABLE);
mark_inode_dirty(inode);
}
#endif
@@ -1382,7 +1382,6 @@ static void nfs_commit_done(struct rpc_t
{
struct nfs_write_data *data = calldata;
struct nfs_page *req;
- int res = 0;
dprintk("NFS: %4d nfs_commit_done (status %d)\n",
task->tk_pid, task->tk_status);
@@ -1420,9 +1419,8 @@ static void nfs_commit_done(struct rpc_t
nfs_mark_request_dirty(req);
next:
nfs_clear_page_writeback(req);
- res++;
+ dec_zone_page_state(req->wb_page, NR_UNSTABLE);
}
- sub_page_state(nr_unstable,res);
}
static const struct rpc_call_ops nfs_commit_ops = {
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:46:56.906855706 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:47:46.624480187 -0700
@@ -119,8 +119,7 @@
* commented here.
*/
struct page_state {
- unsigned long nr_unstable; /* NFS unstable pages */
-#define GET_PAGE_STATE_LAST nr_unstable
+#define GET_PAGE_STATE_LAST xxx
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-08 15:46:56.907832208 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-08 15:47:46.625456689 -0700
@@ -110,7 +110,7 @@ struct writeback_state
static void get_writeback_state(struct writeback_state *wbs)
{
wbs->nr_dirty = global_page_state(NR_DIRTY);
- wbs->nr_unstable = read_page_state(nr_unstable);
+ wbs->nr_unstable = global_page_state(NR_UNSTABLE);
wbs->nr_mapped = global_page_state(NR_MAPPED);
wbs->nr_writeback = global_page_state(NR_WRITEBACK);
}
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 15:46:56.908808710 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:47:46.626433191 -0700
@@ -54,6 +54,7 @@ enum zone_stat_item {
NR_PAGETABLE, /* used for pagetables */
NR_DIRTY,
NR_WRITEBACK,
+ NR_UNSTABLE, /* NFS unstable pages */
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 15:46:56.901973196 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:47:46.627409693 -0700
@@ -66,6 +66,7 @@ static ssize_t node_read_meminfo(struct
"Node %d LowFree: %8lu kB\n"
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
+ "Node %d Unstable: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d Pagecache: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
@@ -80,6 +81,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(nr[NR_DIRTY]),
nid, K(nr[NR_WRITEBACK]),
+ nid, K(nr[NR_UNSTABLE]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
nid, K(nr[NR_SLAB]));
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 15:46:56.908808710 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 15:47:46.627409693 -0700
@@ -397,7 +397,8 @@ static int prefetch_suitable(void)
limit = global_page_state(NR_MAPPED) +
global_page_state(NR_SLAB) +
global_page_state(NR_DIRTY) +
- ps.nr_unstable + total_swapcache_pages;
+ global_page_state(NR_UNSTABLE) +
+ total_swapcache_pages;
if (limit > ns->prefetch_watermark) {
node_clear(node, sp_stat.prefetch_nodes);
continue;
--
* [PATCH 12/14] Remove unused get_page_stat functions
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (10 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 11/14] Conversion of nr_unstable " Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 13/14] Conversion of nr_bounce to per zone counter Christoph Lameter
2006-06-08 23:03 ` [PATCH 14/14] Remove useless writeback structure Christoph Lameter
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Remove get_page_state functions / structures
We can remove all the get_page_state related functions after all the basic
page state variables have been moved to the zone based scheme.
The last patch broke the compile. This one fixes it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
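For example, drivers/base/node.c now fills its per-node figures straight from the zone counters instead of summing per-cpu page state (sketch, matching the context of the hunk below):

	unsigned long nr[NR_STAT_ITEMS];
	int j;

	for (j = 0; j < NR_STAT_ITEMS; j++)
		nr[j] = node_page_state(nid, j);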
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 15:48:25.563475234 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:48:28.536923923 -0700
@@ -119,8 +119,6 @@
* commented here.
*/
struct page_state {
-#define GET_PAGE_STATE_LAST xxx
-
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
* to add up all these.
@@ -173,8 +171,6 @@ struct page_state {
unsigned long nr_bounce; /* pages for bounce buffers */
};
-extern void get_page_state(struct page_state *ret);
-extern void get_page_state_node(struct page_state *ret, int node);
extern void get_full_page_state(struct page_state *ret);
extern unsigned long read_page_state_offset(unsigned long offset);
extern void mod_page_state_offset(unsigned long offset, unsigned long delta);
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 15:48:25.565428238 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:48:28.537900425 -0700
@@ -40,7 +40,6 @@ static ssize_t node_read_meminfo(struct
int n;
int nid = dev->id;
struct sysinfo i;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
@@ -48,7 +47,6 @@ static ssize_t node_read_meminfo(struct
unsigned long nr[NR_STAT_ITEMS];
si_meminfo_node(&i, nid);
- get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
for (j = 0; j < NR_STAT_ITEMS; j++)
nr[j] = node_page_state(nid, j);
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 15:48:25.561522230 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:48:28.540829931 -0700
@@ -1613,28 +1613,6 @@ static void __get_page_state(struct page
}
}
-void get_page_state_node(struct page_state *ret, int node)
-{
- int nr;
- cpumask_t mask = node_to_cpumask(node);
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr+1, &mask);
-}
-
-void get_page_state(struct page_state *ret)
-{
- int nr;
- cpumask_t mask = CPU_MASK_ALL;
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr + 1, &mask);
-}
-
void get_full_page_state(struct page_state *ret)
{
cpumask_t mask = CPU_MASK_ALL;
@@ -1766,7 +1744,6 @@ void si_meminfo_node(struct sysinfo *val
*/
void show_free_areas(void)
{
- struct page_state ps;
int cpu, temperature;
unsigned long active;
unsigned long inactive;
@@ -1798,7 +1775,6 @@ void show_free_areas(void)
}
}
- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);
printk("Free pages: %11ukB (%ukB HighMem)\n",
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-08 15:48:16.932173769 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-08 15:48:28.541806433 -0700
@@ -120,7 +120,6 @@ static int meminfo_read_proc(char *page,
{
struct sysinfo i;
int len;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
@@ -129,7 +128,6 @@ static int meminfo_read_proc(char *page,
struct vmalloc_info vmi;
long cached;
- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);
/*
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-08 15:48:25.566404740 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-08 15:48:28.541806433 -0700
@@ -357,7 +357,6 @@ static int prefetch_suitable(void)
*/
for_each_node_mask(node, sp_stat.prefetch_nodes) {
struct node_stats *ns = &sp_stat.node[node];
- struct page_state ps;
/*
* We check to see that pages are not being allocated
@@ -378,8 +377,6 @@ static int prefetch_suitable(void)
if (!test_pagestate)
continue;
- get_page_state_node(&ps, node);
-
/* We shouldn't prefetch when we are doing writeback */
if (global_page_state(NR_WRITEBACK)) {
node_clear(node, sp_stat.prefetch_nodes);
--
* [PATCH 13/14] Conversion of nr_bounce to per zone counter
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (11 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 12/14] Remove unused get_page_stat functions Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 14/14] Remove useless writeback structure Christoph Lameter
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Conversion of nr_bounce to a per zone counter
nr_bounce is only used for proc output, so it could be left as an event counter.
However, the event counters are not accurate, and nr_bounce categorizes one
type of page in a zone, so it really needs to be a per zone counter as well.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-08 14:57:31.889341062 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-08 15:01:43.192115178 -0700
@@ -55,6 +55,7 @@ enum zone_stat_item {
NR_DIRTY,
NR_WRITEBACK,
NR_UNSTABLE, /* NFS unstable pages */
+ NR_BOUNCE,
NR_STAT_ITEMS };
#ifdef CONFIG_SMP
Index: linux-2.6.17-rc6-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/page-flags.h 2006-06-08 14:58:56.228837442 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/page-flags.h 2006-06-08 15:01:43.193091680 -0700
@@ -168,7 +168,6 @@ struct page_state {
unsigned long allocstall; /* direct reclaim calls */
unsigned long pgrotated; /* pages rotated to tail of the LRU */
- unsigned long nr_bounce; /* pages for bounce buffers */
};
extern void get_full_page_state(struct page_state *ret);
Index: linux-2.6.17-rc6-mm1/mm/highmem.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/highmem.c 2006-06-05 17:57:02.000000000 -0700
+++ linux-2.6.17-rc6-mm1/mm/highmem.c 2006-06-08 15:01:43.194068182 -0700
@@ -316,7 +316,7 @@ static void bounce_end_io(struct bio *bi
continue;
mempool_free(bvec->bv_page, pool);
- dec_page_state(nr_bounce);
+ dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
}
bio_endio(bio_orig, bio_orig->bi_size, err);
@@ -397,7 +397,7 @@ static void __blk_queue_bounce(request_q
to->bv_page = mempool_alloc(pool, q->bounce_gfp);
to->bv_len = from->bv_len;
to->bv_offset = from->bv_offset;
- inc_page_state(nr_bounce);
+ inc_zone_page_state(to->bv_page, NR_BOUNCE);
if (rw == WRITE) {
char *vto, *vfrom;
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 14:58:56.232743450 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 15:02:13.761508395 -0700
@@ -630,7 +630,7 @@ static int rmqueue_bulk(struct zone *zon
char *vm_stat_item_descr[NR_STAT_ITEMS] = {
"mapped", "pagecache", "slab", "pagetable", "dirty", "writeback",
- "unstable"
+ "unstable", "bounce"
};
/*
@@ -2788,6 +2788,7 @@ static char *vmstat_text[] = {
"nr_dirty",
"nr_writeback",
"nr_unstable",
+ "nr_bounce",
/* Page state */
"pgpgin",
@@ -2834,8 +2835,7 @@ static char *vmstat_text[] = {
"pageoutrun",
"allocstall",
- "pgrotated",
- "nr_bounce",
+ "pgrotated"
};
static void *vmstat_start(struct seq_file *m, loff_t *pos)
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-08 14:58:56.229813944 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-08 15:01:43.196997687 -0700
@@ -67,7 +67,8 @@ static ssize_t node_read_meminfo(struct
"Node %d Unstable: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d Pagecache: %8lu kB\n"
- "Node %d Slab: %8lu kB\n",
+ "Node %d Slab: %8lu kB\n"
+ "Node %d Bounce: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
nid, K(i.totalram - i.freeram),
@@ -82,7 +83,8 @@ static ssize_t node_read_meminfo(struct
nid, K(nr[NR_UNSTABLE]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
- nid, K(nr[NR_SLAB]));
+ nid, K(nr[NR_SLAB]),
+ nid, K(nr[NR_BOUNCE]));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
}
--
* [PATCH 14/14] Remove useless writeback structure
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
` (12 preceding siblings ...)
2006-06-08 23:03 ` [PATCH 13/14] Conversion of nr_bounce to per zone counter Christoph Lameter
@ 2006-06-08 23:03 ` Christoph Lameter
13 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-08 23:03 UTC (permalink / raw)
To: linux-kernel
Cc: akpm, Hugh Dickins, Nick Piggin, linux-mm, Andi Kleen,
Marcelo Tosatti, Christoph Lameter
Remove the writeback_state structure
We can now remove the writeback_state structure and the functions that were
needed to gather page state for writeback control, since these statistics
are directly available from the zoned counters.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
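The net effect in the dirty throttling path is that totals are read on demand instead of being snapshotted into a struct writeback_state first. A rough sketch of the resulting balance check (names as in the hunks below; the real code loops rather than returning):

	long background_thresh, dirty_thresh, nr_reclaimable;

	get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
	nr_reclaimable = global_page_state(NR_DIRTY) +
			 global_page_state(NR_UNSTABLE);
	if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
		return;		/* below the dirty limit, nothing to do */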
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-07 18:57:51.435777984 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-07 19:21:47.052964181 -0700
@@ -99,22 +99,6 @@ EXPORT_SYMBOL(laptop_mode);
static void background_writeout(unsigned long _min_pages);
-struct writeback_state
-{
- unsigned long nr_dirty;
- unsigned long nr_unstable;
- unsigned long nr_mapped;
- unsigned long nr_writeback;
-};
-
-static void get_writeback_state(struct writeback_state *wbs)
-{
- wbs->nr_dirty = global_page_state(NR_DIRTY);
- wbs->nr_unstable = global_page_state(NR_UNSTABLE);
- wbs->nr_mapped = global_page_state(NR_MAPPED);
- wbs->nr_writeback = global_page_state(NR_WRITEBACK);
-}
-
/*
* Work out the current dirty-memory clamping and background writeout
* thresholds.
@@ -133,8 +117,7 @@ static void get_writeback_state(struct w
* clamping level.
*/
static void
-get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty,
- struct address_space *mapping)
+get_dirty_limits(long *pbackground, long *pdirty, struct address_space *mapping)
{
int background_ratio; /* Percentages */
int dirty_ratio;
@@ -144,8 +127,6 @@ get_dirty_limits(struct writeback_state
unsigned long available_memory = total_pages;
struct task_struct *tsk;
- get_writeback_state(wbs);
-
#ifdef CONFIG_HIGHMEM
/*
* If this mapping can only allocate from low memory,
@@ -156,7 +137,7 @@ get_dirty_limits(struct writeback_state
#endif
- unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
+ unmapped_ratio = 100 - (global_page_state(NR_MAPPED) * 100) / total_pages;
dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
@@ -189,7 +170,6 @@ get_dirty_limits(struct writeback_state
*/
static void balance_dirty_pages(struct address_space *mapping)
{
- struct writeback_state wbs;
long nr_reclaimable;
long background_thresh;
long dirty_thresh;
@@ -207,10 +187,9 @@ static void balance_dirty_pages(struct a
.range_cyclic = 1,
};
- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
+ get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;
if (!dirty_exceeded)
@@ -224,10 +203,9 @@ static void balance_dirty_pages(struct a
*/
if (nr_reclaimable) {
writeback_inodes(&wbc);
- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
+ get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;
pages_written += write_chunk - wbc.nr_to_write;
if (pages_written >= write_chunk)
@@ -236,8 +214,9 @@ static void balance_dirty_pages(struct a
blk_congestion_wait(WRITE, HZ/10);
}
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh && dirty_exceeded)
- dirty_exceeded = 0;
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK)
+ <= dirty_thresh && dirty_exceeded)
+ dirty_exceeded = 0;
if (writeback_in_progress(bdi))
return; /* pdflush is already working this queue */
@@ -299,12 +278,11 @@ EXPORT_SYMBOL(balance_dirty_pages_rateli
void throttle_vm_writeout(void)
{
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;
for ( ; ; ) {
- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
/*
* Boost the allowable dirty threshold a bit for page
@@ -312,7 +290,7 @@ void throttle_vm_writeout(void)
*/
dirty_thresh += dirty_thresh / 10; /* wheeee... */
- if (wbs.nr_unstable + wbs.nr_writeback <= dirty_thresh)
+ if (global_page_state(NR_UNSTABLE) + global_page_state(NR_WRITEBACK) <= dirty_thresh)
break;
blk_congestion_wait(WRITE, HZ/10);
}
@@ -336,12 +314,11 @@ static void background_writeout(unsigned
};
for ( ; ; ) {
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;
- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
- if (wbs.nr_dirty + wbs.nr_unstable < background_thresh
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
+ if (global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE) < background_thresh
&& min_pages <= 0)
break;
wbc.encountered_congestion = 0;
@@ -365,12 +342,8 @@ static void background_writeout(unsigned
*/
int wakeup_pdflush(long nr_pages)
{
- if (nr_pages == 0) {
- struct writeback_state wbs;
-
- get_writeback_state(&wbs);
- nr_pages = wbs.nr_dirty + wbs.nr_unstable;
- }
+ if (nr_pages == 0)
+ nr_pages = global_page_state(NR_DIRTY) + global_page_state(NR_UNSTABLE);
return pdflush_operation(background_writeout, nr_pages);
}
@@ -401,7 +374,6 @@ static void wb_kupdate(unsigned long arg
unsigned long start_jif;
unsigned long next_jif;
long nr_to_write;
- struct writeback_state wbs;
struct writeback_control wbc = {
.bdi = NULL,
.sync_mode = WB_SYNC_NONE,
@@ -414,11 +386,11 @@ static void wb_kupdate(unsigned long arg
sync_supers();
- get_writeback_state(&wbs);
oldest_jif = jiffies - dirty_expire_interval;
start_jif = jiffies;
next_jif = start_jif + dirty_writeback_interval;
- nr_to_write = wbs.nr_dirty + wbs.nr_unstable +
+ nr_to_write = global_page_state(NR_DIRTY) +
+ global_page_state(NR_UNSTABLE) +
(inodes_stat.nr_inodes - inodes_stat.nr_unused);
while (nr_to_write > 0) {
wbc.encountered_congestion = 0;
--
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
@ 2006-06-09 4:00 ` Andrew Morton
2006-06-09 4:38 ` Andi Kleen
` (2 more replies)
2006-06-09 4:28 ` Andi Kleen
1 sibling, 3 replies; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 4:00 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Thu, 8 Jun 2006 16:02:44 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> Per zone counter infrastructure
>
Is the use of 8-bit accumulators more efficient than using 32-bit ones?
Obviously it's better from a cache POV, given that we have a pretty large
array of them. But is there a downside on some architectures in not using
the natural wordsize? I assume not, but I don't really know...
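(Back-of-the-envelope, assuming the eight stat items this series ends up defining: the per-cpu differential array is 8 x 1 byte = 8 bytes per zone per cpu with s8 accumulators, versus 8 x 4 = 32 bytes with int ones.)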
> +#ifdef CONFIG_SMP
> +typedef atomic_long_t vm_stat_t;
> +#define VM_STAT_GET(x) atomic_long_read(&(x))
> +#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
> +#else
> +typedef unsigned long vm_stat_t;
> +#define VM_STAT_GET(x) (x)
> +#define VM_STAT_ADD(x,v) (x) += (v)
> +#endif
Is there a need to do this? On !SMP the atomic ops for well-cared-for
architectures use nonatomic RMWs anyway. For most architectures I'd expect
that we can simply use atomic_long_foo() in both cases with no loss of
efficiency.
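i.e. something like the following could replace both branches (sketch of the suggestion, untested):

typedef atomic_long_t vm_stat_t;
#define VM_STAT_GET(x)		atomic_long_read(&(x))
#define VM_STAT_ADD(x,v)	atomic_long_add(v, &(x))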
> +/*
> + * Update the zone counters for one cpu.
> + */
> +void refresh_cpu_vm_stats(int cpu)
> +{
> + struct zone *zone;
> + int i;
> + unsigned long flags;
> +
> + for_each_zone(zone) {
> + struct per_cpu_pageset *pcp;
> +
> + pcp = zone_pcp(zone, cpu);
> +
> + for (i = 0; i < NR_STAT_ITEMS; i++)
> + if (pcp->vm_stat_diff[i]) {
> + local_irq_save(flags);
> + zone_page_state_add(pcp->vm_stat_diff[i],
> + zone, i);
> + pcp->vm_stat_diff[i] = 0;
> + local_irq_restore(flags);
> + }
> + }
> +}
Note that when this function is called via on_each_cpu(), local interrupts
are already disabled. So a small efficiency gain would come from changing
the API definition here to "caller must have disabled local interrupts".
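With that convention the function shrinks to something like this (sketch, assuming every caller really does run with interrupts off):

/* Caller must have disabled local interrupts. */
void refresh_cpu_vm_stats(int cpu)
{
	struct zone *zone;
	int i;

	for_each_zone(zone) {
		struct per_cpu_pageset *pcp = zone_pcp(zone, cpu);

		for (i = 0; i < NR_STAT_ITEMS; i++)
			if (pcp->vm_stat_diff[i]) {
				zone_page_state_add(pcp->vm_stat_diff[i],
							zone, i);
				pcp->vm_stat_diff[i] = 0;
			}
	}
}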
> +void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> + int delta)
> +{
> + zone_page_state_add(delta, zone, item);
> +}
> +EXPORT_SYMBOL(__mod_zone_page_state);
> +
> +void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> + int delta)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + zone_page_state_add(delta, zone, item);
> + local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL(mod_zone_page_state);
> +
> +void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> + zone_page_state_add(1, page_zone(page), item);
> +}
> +EXPORT_SYMBOL(__inc_zone_page_state);
> +
> +void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> + zone_page_state_add(-1, page_zone(page), item);
> +}
> +EXPORT_SYMBOL(__dec_zone_page_state);
> +
> +void inc_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + zone_page_state_add(1, page_zone(page), item);
> + local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL(inc_zone_page_state);
> +
> +void dec_zone_page_state(struct page *page, enum zone_stat_item item)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + zone_page_state_add( -1, page_zone(page), item);
> + local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL(dec_zone_page_state);
> +#endif
Now my head is spinning ;) But it looks sane.
We're sure all these exports are needed?
> #ifdef CONFIG_NUMA
> /*
> + * Determine the per node value of a stat item. This is done by cycling
> + * through all the zones of a node.
> + */
> +unsigned long node_page_state(int node, enum zone_stat_item item)
> +{
> + struct zone *zones = NODE_DATA(node)->node_zones;
> + int i;
> + long v = 0;
> +
> + for (i = 0; i < MAX_NR_ZONES; i++)
> + v += VM_STAT_GET(zones[i].vm_stat[item]);
> + if (v < 0)
> + v = 0;
> + return v;
> +}
> +EXPORT_SYMBOL(node_page_state);
Well I guess if this doesn't oops then we've finally answered that "Should
this ever happen" in __alloc_pages().
> +#ifdef CONFIG_SMP
> +void refresh_cpu_vm_stats(int);
> +void refresh_vm_stats(void);
> +#else
> +static inline void refresh_cpu_vm_stats(int cpu) { };
> +static inline void refresh_vm_stats(void) { };
> +#endif
do {} while (0), please. Always. All other forms (afaik) have problems.
In this case,
if (something)
refresh_vm_stats();
else
foo();
will not compile.
Always...
Would it be possible/sensible to move all this stuff into a new .c file?
page_alloc.c is getting awfully large and multipurpose, and this code is a
single logical chunk.
--
* Re: [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval
2006-06-08 23:03 ` [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval Christoph Lameter
@ 2006-06-09 4:00 ` Andrew Morton
2006-06-09 18:54 ` zoned VM stats: Add NR_ANON Christoph Lameter
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 4:00 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Thu, 8 Jun 2006 16:03:05 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> Caveat: The number of mapped pages includes anonymous pages.
> The current check works but is a bit too cautious. We could perform
> zone reclaim down to the last unmapped page if we would split NR_MAPPED
> into NR_MAPPED_PAGECACHE and NR_MAPPED_ANON. Maybe later.
That caveat should be in a code comment, please. Otherwise we'll forget.
You have two [patch 04/14]s and no [patch 05/14].
--
* Re: [PATCH 06/14] Add per zone counters to zone node and global VM statistics
2006-06-08 23:03 ` [PATCH 06/14] Add per zone counters to zone node and global VM statistics Christoph Lameter
@ 2006-06-09 4:01 ` Andrew Morton
2006-06-09 15:55 ` Christoph Lameter
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 4:01 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Thu, 8 Jun 2006 16:03:10 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> --- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-08 14:29:46.317675014 -0700
> +++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-08 14:57:05.712250246 -0700
> @@ -628,6 +628,8 @@ static int rmqueue_bulk(struct zone *zon
> return i;
> }
>
> +char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
static?
--
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
2006-06-09 4:00 ` Andrew Morton
@ 2006-06-09 4:28 ` Andi Kleen
2006-06-09 16:00 ` Christoph Lameter
1 sibling, 1 reply; 32+ messages in thread
From: Andi Kleen @ 2006-06-09 4:28 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, akpm, Hugh Dickins, Nick Piggin, linux-mm, Marcelo Tosatti
> +/*
> + * For an unknown interrupt state
> + */
> +void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
> + int delta)
> +{
> + unsigned long flags;
> +
> + local_irq_save(flags);
> + __mod_zone_page_state(zone, item, delta);
> + local_irq_restore(flags);
It would be nicer to use some variant of local_t - then you could do that
without turning off interrupts (which some CPUs like P4 don't like).
There is currently no 1-byte local_t, but one could be added.
Mind you, it would only make sense if most of the calls are not already
made with interrupts disabled.
-Andi
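A minimal sketch of what such a local_t-based update could look like (the
vm_stat_local field and the helper name are hypothetical, not part of this
patchset). Note that preemption still has to be pinned so that the
differential we pick belongs to the CPU that does the local_add():

static void mod_zone_page_state_local(struct zone *zone,
				      enum zone_stat_item item, int delta)
{
	int cpu = get_cpu();	/* disables preemption, no local_irq_save() */
	local_t *p = &zone_pcp(zone, cpu)->vm_stat_local[item];

	local_add(delta, p);	/* atomic vs. interrupts on this CPU */
	put_cpu();
	/* threshold check and fold into the zone-wide counters omitted */
}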
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 4:00 ` Andrew Morton
@ 2006-06-09 4:38 ` Andi Kleen
2006-06-09 9:22 ` Peter Zijlstra
2006-06-09 15:54 ` Christoph Lameter
2 siblings, 0 replies; 32+ messages in thread
From: Andi Kleen @ 2006-06-09 4:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Christoph Lameter, linux-kernel, hugh, nickpiggin, linux-mm,
marcelo.tosatti
On Friday 09 June 2006 06:00, Andrew Morton wrote:
> On Thu, 8 Jun 2006 16:02:44 -0700 (PDT)
> Christoph Lameter <clameter@sgi.com> wrote:
>
> > Per zone counter infrastructure
> >
>
> Is the use of 8-bit accumulators more efficient than using 32-bit ones?
> Obviously it's better from a cache POV, given that we have a pretty large
> array of them. But is there a downside on some architectures in not using
> the natural wordsize?
Maybe on very old Alphas, which didn't have 8-bit stores; they need an RMW cycle.
Other than that I wouldn't expect any problems. RISCs will just do the usual
32-bit add in registers, but an 8-bit load/store.
-Andi
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 4:00 ` Andrew Morton
2006-06-09 4:38 ` Andi Kleen
@ 2006-06-09 9:22 ` Peter Zijlstra
2006-06-09 9:29 ` Andrew Morton
2006-06-09 18:19 ` Horst von Brand
2006-06-09 15:54 ` Christoph Lameter
2 siblings, 2 replies; 32+ messages in thread
From: Peter Zijlstra @ 2006-06-09 9:22 UTC (permalink / raw)
To: Andrew Morton
Cc: Christoph Lameter, linux-kernel, hugh, nickpiggin, linux-mm, ak,
marcelo.tosatti
On Thu, 2006-06-08 at 21:00 -0700, Andrew Morton wrote:
> On Thu, 8 Jun 2006 16:02:44 -0700 (PDT)
> Christoph Lameter <clameter@sgi.com> wrote:
> > +#ifdef CONFIG_SMP
> > +void refresh_cpu_vm_stats(int);
> > +void refresh_vm_stats(void);
> > +#else
> > +static inline void refresh_cpu_vm_stats(int cpu) { };
> > +static inline void refresh_vm_stats(void) { };
> > +#endif
>
> do {} while (0), please. Always. All other forms (afaik) have problems.
> In this case,
>
> if (something)
> refresh_vm_stats();
> else
> foo();
>
> will not compile.
It surely will, 'static inline' does not make it less of a function.
Although the trailing ; is not needed in the function definition.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 9:22 ` Peter Zijlstra
@ 2006-06-09 9:29 ` Andrew Morton
2006-06-09 18:19 ` Horst von Brand
1 sibling, 0 replies; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 9:29 UTC (permalink / raw)
To: Peter Zijlstra
Cc: clameter, linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Fri, 09 Jun 2006 11:22:13 +0200
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2006-06-08 at 21:00 -0700, Andrew Morton wrote:
> > On Thu, 8 Jun 2006 16:02:44 -0700 (PDT)
> > Christoph Lameter <clameter@sgi.com> wrote:
>
> > > +#ifdef CONFIG_SMP
> > > +void refresh_cpu_vm_stats(int);
> > > +void refresh_vm_stats(void);
> > > +#else
> > > +static inline void refresh_cpu_vm_stats(int cpu) { };
> > > +static inline void refresh_vm_stats(void) { };
> > > +#endif
> >
> > do {} while (0), please. Always. All other forms (afaik) have problems.
> > In this case,
> >
> > if (something)
> > refresh_vm_stats();
> > else
> > foo();
> >
> > will not compile.
>
> It surely will, 'static inline' does not make it less of a function.
> Although the trailing ; is not needed in the function definition.
doh, I read it as a #define. Ignore.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 4:00 ` Andrew Morton
2006-06-09 4:38 ` Andi Kleen
2006-06-09 9:22 ` Peter Zijlstra
@ 2006-06-09 15:54 ` Christoph Lameter
2006-06-09 17:06 ` Andrew Morton
2 siblings, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2006-06-09 15:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Thu, 8 Jun 2006, Andrew Morton wrote:
> Is the use of 8-bit accumulators more efficient than using 32-bit ones?
> Obviously it's better from a cache POV, given that we have a pretty large
> array of them. But is there a downside on some architectures in not using
> the natural wordsize? I assume not, but I don't really know...
The advantage is that the whole thing fits into one cacheline right along with
the pcp information. Some architectures need additional cycles for the 8-bit
accesses, but the higher cache hit rate more than makes up for it: a memory
access costs far more than those extra cycles.
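As a rough illustration of that layout (the NUMA statistics fields of the real
struct are omitted here), the 8-bit differentials simply ride along in the
per-cpu pageset that the allocator hot path already touches:

struct per_cpu_pageset {
	struct per_cpu_pages pcp[2];	/* hot and cold page lists */
	s8 vm_stat_diff[NR_STAT_ITEMS];	/* one byte per item instead of a
					 * long, so the deltas share the
					 * pageset's cacheline */
} ____cacheline_aligned_in_smp;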
> > +#ifdef CONFIG_SMP
> > +typedef atomic_long_t vm_stat_t;
> > +#define VM_STAT_GET(x) atomic_long_read(&(x))
> > +#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
> > +#else
> > +typedef unsigned long vm_stat_t;
> > +#define VM_STAT_GET(x) (x)
> > +#define VM_STAT_ADD(x,v) (x) += (v)
> > +#endif
>
> Is there a need to do this? On !SMP the atomic ops for well-cared-for
> architectures use nonatomic RMWs anyway. For most architectures I'd expect
> that we can simply use atomic_long_foo() in both cases with no loss of
> efficiency.
Maybe I am not too up to date on !SMP. I thought they still needed
atomic ops for MMU races.
> > +void refresh_cpu_vm_stats(int cpu)
> > +{
> > + struct zone *zone;
> > + int i;
> > + unsigned long flags;
> > +
> > + for_each_zone(zone) {
> > + struct per_cpu_pageset *pcp;
> > +
> > + pcp = zone_pcp(zone, cpu);
> > +
> > + for (i = 0; i < NR_STAT_ITEMS; i++)
> > + if (pcp->vm_stat_diff[i]) {
> > + local_irq_save(flags);
> > + zone_page_state_add(pcp->vm_stat_diff[i],
> > + zone, i);
> > + pcp->vm_stat_diff[i] = 0;
> > + local_irq_restore(flags);
> > + }
> > + }
> > +}
>
> Note that when this function is called via on_each_cpu(), local interrupts
> are already disabled. So a small efficiency gain would come from changing
> the API definition here to "caller must have disabled local interrupts".
Interrupts are enabled for on_each_cpu on IA64. The function is also
called from memory hotplug.
> We're sure all these exports are needed?
Hummm... Maybe some functions are not used right now.
> > #ifdef CONFIG_NUMA
> > /*
> > + * Determine the per node value of a stat item. This is done by cycling
> > + * through all the zones of a node.
> > + */
> > +unsigned long node_page_state(int node, enum zone_stat_item item)
> > +{
> > + struct zone *zones = NODE_DATA(node)->node_zones;
> > + int i;
> > + long v = 0;
> > +
> > + for (i = 0; i < MAX_NR_ZONES; i++)
> > + v += VM_STAT_GET(zones[i].vm_stat[item]);
> > + if (v < 0)
> > + v = 0;
> > + return v;
> > +}
> > +EXPORT_SYMBOL(node_page_state);
>
> Well I guess if this doesn't oops then we've finally answered that "Should
> this ever happen" in __alloc_pages().
Why would this oops? I thought all the zones are always populated?
> > +#ifdef CONFIG_SMP
> > +void refresh_cpu_vm_stats(int);
> > +void refresh_vm_stats(void);
> > +#else
> > +static inline void refresh_cpu_vm_stats(int cpu) { };
> > +static inline void refresh_vm_stats(void) { };
> > +#endif
>
> do {} while (0), please. Always. All other forms (afaik) have problems.
> In this case,
These are inline definitions and not macros.
> Would it be possible/sensible to move all this stuff into a new .c file?
> page_alloc.c is getting awfully large and multipurpose, and this code is a
> single logical chunk.
Right, I thought about that one as well. Can we stabilize this first before I
do another big reorg?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 06/14] Add per zone counters to zone node and global VM statistics
2006-06-09 4:01 ` Andrew Morton
@ 2006-06-09 15:55 ` Christoph Lameter
0 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-09 15:55 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Thu, 8 Jun 2006, Andrew Morton wrote:
> > +char *vm_stat_item_descr[NR_STAT_ITEMS] = { "mapped","pagecache" };
>
> static?
It is accessed from drivers/base/node.c.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 4:28 ` Andi Kleen
@ 2006-06-09 16:00 ` Christoph Lameter
0 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-09 16:00 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-kernel, akpm, Hugh Dickins, Nick Piggin, linux-mm, Marcelo Tosatti
On Fri, 9 Jun 2006, Andi Kleen wrote:
> It would be nicer to use some variant of local_t - then you could do that
> without turning off interrupts (which some CPUs like P4 don't like).
>
> There is currently no 1-byte local_t, but one could be added.
>
> Mind you, it would only make sense if most of the calls are not already
> made with interrupts disabled.
We have discussed this before and there is a comment in the patch:
+ *
+ * Some processors have inc/dec instructions that are atomic vs an interrupt.
+ * However, the code must first determine the differential location in a zone
+ * based on the processor number and then inc/dec the counter. There is no
+ * guarantee without disabling preemption that the processor will not change
+ * in between and therefore the atomicity vs. interrupt cannot be exploited
+ * in a useful way here.
+ */
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
AFAIK the restrictions on local_t use are such that it is barely usable.
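To put a number on the accuracy cost of the deferred updates in
__inc_zone_page_state() quoted above: each per-cpu differential is roughly
bounded by STAT_THRESHOLD, so a reader of the folded counters can be off by at
most about num_online_cpus() * STAT_THRESHOLD pages per item and zone.
Assuming, for illustration, a threshold of 32 (the actual value is defined in
the patch but not quoted here), a 16-CPU machine could lag reality by up to
16 * 32 = 512 pages, i.e. 2 MB worth of 4 kB pages; presumably that is what
refresh_vm_stats() is for when an exact snapshot is needed.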
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 15:54 ` Christoph Lameter
@ 2006-06-09 17:06 ` Andrew Morton
2006-06-09 17:18 ` Christoph Lameter
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 17:06 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Fri, 9 Jun 2006 08:54:39 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> On Thu, 8 Jun 2006, Andrew Morton wrote:
>
> > Is the use of 8-bit accumulators more efficient than using 32-bit ones?
> > Obviously it's better from a cache POV, given that we have a pretty large
> > array of them. But is there a downside on some architectures in not using
> > the natural wordsize? I assume not, but I don't really know...
>
> The advantage is that the whole thing fits into one cacheline right along with
> the pcp information. Some architectures need additional cycles for the 8-bit
> accesses, but the higher cache hit rate more than makes up for it: a memory
> access costs far more than those extra cycles.
>
> > > +#ifdef CONFIG_SMP
> > > +typedef atomic_long_t vm_stat_t;
> > > +#define VM_STAT_GET(x) atomic_long_read(&(x))
> > > +#define VM_STAT_ADD(x,v) atomic_long_add(v, &(x))
> > > +#else
> > > +typedef unsigned long vm_stat_t;
> > > +#define VM_STAT_GET(x) (x)
> > > +#define VM_STAT_ADD(x,v) (x) += (v)
> > > +#endif
> >
> > Is there a need to do this? On !SMP the atomic ops for well-cared-for
> > architectures use nonatomic RMWs anyway. For most architectures I'd expect
> > that we can simply use atomic_long_foo() in both cases with no loss of
> > efficiency.
>
> Maybe I am not too up to date on !SMP. I thought they still needed
> atomic ops for MMU races.
There's no need for an atomic op - at the most the architecture would need
local_irq_disable() protection, and that's only if it doesn't have an
atomic-wrt-this-cpu add instruction.
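A sketch of the simplification being suggested, assuming the zone array keeps
the vm_stat[] name used elsewhere in the patch and a global array of the same
element type (the global array's actual name is not shown in this excerpt):

/* One definition for SMP and UP alike. */
typedef atomic_long_t vm_stat_t;

static inline void zone_page_state_add(long x, struct zone *zone,
				       enum zone_stat_item item)
{
	atomic_long_add(x, &zone->vm_stat[item]);
	atomic_long_add(x, &vm_stat[item]);	/* machine-wide totals */
}

On UP, atomic_long_add()/atomic_long_read() compile down to plain loads and
read-modify-write sequences anyway, which is the point being made here.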
> > > +void refresh_cpu_vm_stats(int cpu)
> > > +{
> > > + struct zone *zone;
> > > + int i;
> > > + unsigned long flags;
> > > +
> > > + for_each_zone(zone) {
> > > + struct per_cpu_pageset *pcp;
> > > +
> > > + pcp = zone_pcp(zone, cpu);
> > > +
> > > + for (i = 0; i < NR_STAT_ITEMS; i++)
> > > + if (pcp->vm_stat_diff[i]) {
> > > + local_irq_save(flags);
> > > + zone_page_state_add(pcp->vm_stat_diff[i],
> > > + zone, i);
> > > + pcp->vm_stat_diff[i] = 0;
> > > + local_irq_restore(flags);
> > > + }
> > > + }
> > > +}
> >
> > Note that when this function is called via on_each_cpu(), local interrupts
> > are already disabled. So a small efficiency gain would come from changing
> > the API definition here to "caller must have disabled local interrupts".
>
> Interrupts are enabled for on_each_cpu on IA64.
Not from my reading of arch/ia64/kernel/smp.c:handle_IPI(). And if I've
misread it, ia64 has broken invalidate_bh_lrus() and who knows what else.
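If handle_IPI() does keep interrupts off around the callback, the convention
change suggested above would let the loop drop its own flag juggling; roughly
(a sketch, not the final patch):

/* Caller must have local interrupts disabled. */
void refresh_cpu_vm_stats(int cpu)
{
	struct zone *zone;
	int i;

	for_each_zone(zone) {
		struct per_cpu_pageset *pcp = zone_pcp(zone, cpu);

		for (i = 0; i < NR_STAT_ITEMS; i++)
			if (pcp->vm_stat_diff[i]) {
				zone_page_state_add(pcp->vm_stat_diff[i], zone, i);
				pcp->vm_stat_diff[i] = 0;
			}
	}
}

The memory hotplug caller mentioned earlier would then have to wrap the call in
local_irq_save()/local_irq_restore() itself.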
> > Well I guess if this doesn't oops then we've finally answered that "Should
> > this ever happen" in __alloc_pages().
>
> Why would this oops? I thought all the zones are always populated?
That's my point - probably the check in __alloc_pages() isn't needed.
> > Would it be possible/sensible to move all this stuff into a new .c file?
> > page_alloc.c is getting awfully large and multipurpose, and this code is a
> > single logical chunk.
>
> Right, I thought about that one as well. Can we stabilize this first before I
> do another big reorg?
That's unfortunate patch ordering. Do it (much) later I guess.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 17:06 ` Andrew Morton
@ 2006-06-09 17:18 ` Christoph Lameter
2006-06-09 17:38 ` Andrew Morton
0 siblings, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2006-06-09 17:18 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Fri, 9 Jun 2006, Andrew Morton wrote:
> There's no need for an atomic op - at the most the architecture would need
> local_irq_disable() protection, and that's only if it doesn't have an
> atomic-wrt-this-cpu add instruction.
So I can drop the VM_STATS() definitions?
> > Right, I thought about that one as well. Can we stabilize this first before I
> > do another big reorg?
>
> That's unfortunate patch ordering. Do it (much) later I guess.
Well there are a couple of trailing issues that would have to be resolved
before that happens. I have another patchset here that does something more
to the remaining counters.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 17:18 ` Christoph Lameter
@ 2006-06-09 17:38 ` Andrew Morton
0 siblings, 0 replies; 32+ messages in thread
From: Andrew Morton @ 2006-06-09 17:38 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, hugh, nickpiggin, linux-mm, ak, marcelo.tosatti
On Fri, 9 Jun 2006 10:18:23 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> On Fri, 9 Jun 2006, Andrew Morton wrote:
>
> > There's no need for an atomic op - at the most the architecture would need
> > local_irq_disable() protection, and that's only if it doesn't have an
> > atomic-wrt-this-cpu add instruction.
>
> So I can drop the VM_STATS() definitions?
I _think_ so. But a bit of a review of the existing atomic ops for the
major architectures wouldn't hurt.
> > > Right, I thought about that one as well. Can we stabilize this first before I
> > > do another big reorg?
> >
> > That's unfortunate patch ordering. Do it (much) later I guess.
>
> Well there are a couple of trailing issues that would have to be resolved
> before that happens. I have another patchset here that does something more
> to the remaining counters.
It's a relatively minor issue - we can do this little cleanup much later on.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/14] Per zone counter functionality
2006-06-09 9:22 ` Peter Zijlstra
2006-06-09 9:29 ` Andrew Morton
@ 2006-06-09 18:19 ` Horst von Brand
1 sibling, 0 replies; 32+ messages in thread
From: Horst von Brand @ 2006-06-09 18:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Andrew Morton, Christoph Lameter, linux-kernel, hugh, nickpiggin,
linux-mm, ak, marcelo.tosatti
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2006-06-08 at 21:00 -0700, Andrew Morton wrote:
> > On Thu, 8 Jun 2006 16:02:44 -0700 (PDT)
> > Christoph Lameter <clameter@sgi.com> wrote:
>
> > > +#ifdef CONFIG_SMP
> > > +void refresh_cpu_vm_stats(int);
> > > +void refresh_vm_stats(void);
> > > +#else
> > > +static inline void refresh_cpu_vm_stats(int cpu) { };
> > > +static inline void refresh_vm_stats(void) { };
> > > +#endif
> >
> > do {} while (0), please. Always. All other forms (afaik) have problems.
> > In this case,
> >
> > if (something)
> > refresh_vm_stats();
> > else
> > foo();
> >
> > will not compile.
>
> It surely will, 'static inline' does not make it less of a function.
> Although the trailing ; is not needed in the function definition.
The trailing ';' is broken.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
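For completeness, the UP stubs without the stray semicolons that Peter and
Horst point out (an extra ';' after a function body is an empty declaration at
file scope, which gcc -pedantic and some older compilers complain about):

static inline void refresh_cpu_vm_stats(int cpu) { }
static inline void refresh_vm_stats(void) { }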
^ permalink raw reply [flat|nested] 32+ messages in thread
* zoned VM stats: Add NR_ANON
2006-06-09 4:00 ` Andrew Morton
@ 2006-06-09 18:54 ` Christoph Lameter
2006-06-10 4:32 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2006-06-09 18:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, hugh, npiggin, linux-mm, ak
On Thu, 8 Jun 2006, Andrew Morton wrote:
> On Thu, 8 Jun 2006 16:03:05 -0700 (PDT)
> Christoph Lameter <clameter@sgi.com> wrote:
>
> > Caveat: The number of mapped pages includes anonymous pages.
> > The current check works but is a bit too cautious. We could perform
> > zone reclaim down to the last unmapped page if we would split NR_MAPPED
> > into NR_MAPPED_PAGECACHE and NR_MAPPED_ANON. Maybe later.
>
> That caveat should be in a code comment, please. Otherwise we'll forget.
Maybe we can deal with this issue immediately by introducing NR_ANON?
zoned VM stats: Add NR_ANON
The current NR_MAPPED is used by zone reclaim and the dirty load calculation
as the number of mapped pagecache pages. However, that is not true: NR_MAPPED
also includes the mapped anonymous pages. This patch clearly separates the two
and therefore allows accurate tracking of the anonymous pages per zone and of
the number of mapped pagecache pages in each zone.
We can then more accurately determine when zone reclaim is to be run.
Also, it may now be possible to determine the mapped/unmapped
ratio in get_dirty_limits(). Isn't the number of anonymous pages
irrelevant in that calculation?
Note that this will change the meaning of the number of mapped pages
reported in /proc/vmstat /proc/meminfo and in the per node statistics.
This may affect user space tools that monitor these counters!
However, NR_MAPPED then works like NR_DIRTY. It is only valid for
pagecache pages.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc6-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/drivers/base/node.c 2006-06-09 10:30:52.414461763 -0700
+++ linux-2.6.17-rc6-mm1/drivers/base/node.c 2006-06-09 11:26:59.385565250 -0700
@@ -65,6 +65,7 @@ static ssize_t node_read_meminfo(struct
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
"Node %d Unstable: %8lu kB\n"
+ "Node %d Anonymous: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d Pagecache: %8lu kB\n"
"Node %d Slab: %8lu kB\n"
@@ -81,6 +82,7 @@ static ssize_t node_read_meminfo(struct
nid, K(nr[NR_DIRTY]),
nid, K(nr[NR_WRITEBACK]),
nid, K(nr[NR_UNSTABLE]),
+ nid, K(nr[NR_ANON]),
nid, K(nr[NR_MAPPED]),
nid, K(nr[NR_PAGECACHE]),
nid, K(nr[NR_SLAB]),
Index: linux-2.6.17-rc6-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page_alloc.c 2006-06-09 10:30:52.414461763 -0700
+++ linux-2.6.17-rc6-mm1/mm/page_alloc.c 2006-06-09 11:41:32.525809812 -0700
@@ -629,8 +629,9 @@ static int rmqueue_bulk(struct zone *zon
}
char *vm_stat_item_descr[NR_STAT_ITEMS] = {
- "mapped", "pagecache", "slab", "pagetable", "dirty", "writeback",
- "unstable", "bounce"
+ "anon", "mapped", "pagecache", "slab",
+ "pagetable", "dirty", "writeback", "unstable",
+ "bounce"
};
/*
@@ -2781,6 +2782,7 @@ struct seq_operations zoneinfo_op = {
static char *vmstat_text[] = {
/* Zoned VM counters */
+ "nr_anon",
"nr_mapped",
"nr_pagecache",
"nr_slab",
Index: linux-2.6.17-rc6-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc6-mm1.orig/include/linux/mmzone.h 2006-06-09 10:30:52.400790734 -0700
+++ linux-2.6.17-rc6-mm1/include/linux/mmzone.h 2006-06-09 11:26:59.388494756 -0700
@@ -47,7 +47,8 @@ struct zone_padding {
#endif
enum zone_stat_item {
- NR_MAPPED, /* mapped into pagetables.
+ NR_ANON, /* Mapped anonymous pages */
+ NR_MAPPED, /* pagecache pages mapped into pagetables.
only modified from process context */
NR_PAGECACHE, /* file backed pages */
NR_SLAB, /* used by slab allocator */
Index: linux-2.6.17-rc6-mm1/mm/rmap.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/rmap.c 2006-06-09 10:30:51.768993888 -0700
+++ linux-2.6.17-rc6-mm1/mm/rmap.c 2006-06-09 11:26:59.389471258 -0700
@@ -455,7 +455,7 @@ static void __page_set_anon_rmap(struct
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
- __inc_zone_page_state(page, NR_MAPPED);
+ __inc_zone_page_state(page, NR_ANON);
}
/**
@@ -531,7 +531,7 @@ void page_remove_rmap(struct page *page)
*/
if (page_test_and_clear_dirty(page))
set_page_dirty(page);
- __dec_zone_page_state(page, NR_MAPPED);
+ __dec_zone_page_state(page, PageAnon(page) ? NR_ANON : NR_MAPPED);
}
}
Index: linux-2.6.17-rc6-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/vmscan.c 2006-06-09 11:17:08.125321243 -0700
+++ linux-2.6.17-rc6-mm1/mm/vmscan.c 2006-06-09 11:30:34.159368970 -0700
@@ -747,7 +747,8 @@ static void shrink_active_list(unsigned
* how much memory
* is mapped.
*/
- mapped_ratio = global_page_state(NR_MAPPED) / vm_total_pages;
+ mapped_ratio = (global_page_state(NR_MAPPED) +
+ global_page_state(NR_ANON)) / vm_total_pages;
/*
* Now decide how much we really want to unmap some pages. The
@@ -1603,13 +1604,16 @@ int zone_reclaim(struct zone *zone, gfp_
/*
* Do not reclaim if there are not enough reclaimable pages in this
- * zone. We decide this based on the number of mapped pages
- * in relation to the number of page cache pages in this zone.
- * If there are more pagecache pages than mapped pages then we can
- * be certain that pages can be reclaimed.
+ * zone that would satisfy this allocation.
+ *
+ * All unmapped pagecache pages are reclaimable.
+ *
+ * Both counters may be temporarily off a bit so we use
+ * SWAP_CLUSTER_MAX as the boundary. It may also be good to
+ * leave a few frequently used unmapped pagecache pages around.
*/
- if (zone_page_state(zone, NR_PAGECACHE) <
- zone_page_state(zone, NR_MAPPED))
+ if (zone_page_state(zone, NR_PAGECACHE) -
+ zone_page_state(zone, NR_MAPPED) < SWAP_CLUSTER_MAX)
return 0;
/*
Index: linux-2.6.17-rc6-mm1/mm/swap_prefetch.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/swap_prefetch.c 2006-06-09 11:17:22.126406920 -0700
+++ linux-2.6.17-rc6-mm1/mm/swap_prefetch.c 2006-06-09 11:27:25.602691036 -0700
@@ -388,6 +388,7 @@ static int prefetch_suitable(void)
* dirty, we need to leave some free for pagecache.
*/
limit = global_page_state(NR_MAPPED) +
+ global_page_state(NR_ANON) +
global_page_state(NR_SLAB) +
global_page_state(NR_DIRTY) +
global_page_state(NR_UNSTABLE) +
Index: linux-2.6.17-rc6-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/mm/page-writeback.c 2006-06-09 10:30:52.452545344 -0700
+++ linux-2.6.17-rc6-mm1/mm/page-writeback.c 2006-06-09 11:32:08.435754999 -0700
@@ -137,7 +137,9 @@ get_dirty_limits(long *pbackground, long
#endif
- unmapped_ratio = 100 - (global_page_state(NR_MAPPED) * 100) / total_pages;
+ unmapped_ratio = 100 - ((global_page_state(NR_MAPPED) +
+ global_page_state(NR_ANON)) * 100) /
+ total_pages;
dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
Index: linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-rc6-mm1.orig/fs/proc/proc_misc.c 2006-06-09 10:30:52.363683655 -0700
+++ linux-2.6.17-rc6-mm1/fs/proc/proc_misc.c 2006-06-09 11:33:46.053730933 -0700
@@ -165,6 +165,7 @@ static int meminfo_read_proc(char *page,
"SwapFree: %8lu kB\n"
"Dirty: %8lu kB\n"
"Writeback: %8lu kB\n"
+ "Anonymous: %8lu kB\n"
"Mapped: %8lu kB\n"
"Slab: %8lu kB\n"
"CommitLimit: %8lu kB\n"
@@ -188,6 +189,7 @@ static int meminfo_read_proc(char *page,
K(i.freeswap),
K(global_page_state(NR_DIRTY)),
K(global_page_state(NR_WRITEBACK)),
+ K(global_page_state(NR_ANON)),
K(global_page_state(NR_MAPPED)),
K(global_page_state(NR_SLAB)),
K(allowed),
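To make the new vmscan.c check above concrete: zone reclaim now keys off the
unmapped pagecache pages in the zone. Assuming SWAP_CLUSTER_MAX is 32, as it is
in mainline, a zone with 10,000 NR_PAGECACHE pages of which 9,980 are also
counted in NR_MAPPED has only 20 unmapped pagecache pages; 20 < 32, so
zone_reclaim() returns 0 and the allocator falls back to other zones. With
9,900 mapped pages there would be 100 reclaimable pagecache pages and local
reclaim would proceed.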
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: zoned VM stats: Add NR_ANON
2006-06-09 18:54 ` zoned VM stats: Add NR_ANON Christoph Lameter
@ 2006-06-10 4:32 ` KAMEZAWA Hiroyuki
2006-06-10 4:52 ` Christoph Lameter
0 siblings, 1 reply; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2006-06-10 4:32 UTC (permalink / raw)
To: Christoph Lameter; +Cc: akpm, linux-kernel, hugh, npiggin, linux-mm, ak
On Fri, 9 Jun 2006 11:54:07 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> Note that this will change the meaning of the number of mapped pages
> reported in /proc/vmstat /proc/meminfo and in the per node statistics.
> This may affect user space tools that monitor these counters!
>
> However, NR_MAPPED then works like NR_DIRTY. It is only valid for
> pagecache pages.
> Index: linux-2.6.17-rc6-mm1/mm/rmap.c
> ===================================================================
> --- linux-2.6.17-rc6-mm1.orig/mm/rmap.c 2006-06-09 10:30:51.768993888 -0700
> +++ linux-2.6.17-rc6-mm1/mm/rmap.c 2006-06-09 11:26:59.389471258 -0700
> @@ -455,7 +455,7 @@ static void __page_set_anon_rmap(struct
> * nr_mapped state can be updated without turning off
> * interrupts because it is not modified via interrupt.
> */
> - __inc_zone_page_state(page, NR_MAPPED);
> + __inc_zone_page_state(page, NR_ANON);
> }
>
> /**
> @@ -531,7 +531,7 @@ void page_remove_rmap(struct page *page)
> */
> if (page_test_and_clear_dirty(page))
> set_page_dirty(page);
> - __dec_zone_page_state(page, NR_MAPPED);
> + __dec_zone_page_state(page, PageAnon(page) ? NR_ANON : NR_MAPPED);
> }
> }
Can this accounting catch page migration, or is that TBD?
Now that all counters are counted per zone, migration needs to be taken into account.
-Kame
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: zoned VM stats: Add NR_ANON
2006-06-10 4:32 ` KAMEZAWA Hiroyuki
@ 2006-06-10 4:52 ` Christoph Lameter
0 siblings, 0 replies; 32+ messages in thread
From: Christoph Lameter @ 2006-06-10 4:52 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: akpm, linux-kernel, hugh, npiggin, linux-mm, ak
On Sat, 10 Jun 2006, KAMEZAWA Hiroyuki wrote:
> Can this accounting catch page migration, or is that TBD?
> Now that all counters are counted per zone, migration needs to be taken into account.
Page migration removes the reverse mapping for the old page and installs
the mappings for the new page later. This means that the counters are taken
care of.
try_to_unmap_one removes the mapping and decrements the zone counter.
remove_migration_pte adds the mapping to the new page and increments the
relevant zone counter.
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2006-06-10 4:52 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-06-08 23:02 [PATCH 00/14] Zoned VM counters V2 Christoph Lameter
2006-06-08 23:02 ` [PATCH 01/14] Per zone counter functionality Christoph Lameter
2006-06-09 4:00 ` Andrew Morton
2006-06-09 4:38 ` Andi Kleen
2006-06-09 9:22 ` Peter Zijlstra
2006-06-09 9:29 ` Andrew Morton
2006-06-09 18:19 ` Horst von Brand
2006-06-09 15:54 ` Christoph Lameter
2006-06-09 17:06 ` Andrew Morton
2006-06-09 17:18 ` Christoph Lameter
2006-06-09 17:38 ` Andrew Morton
2006-06-09 4:28 ` Andi Kleen
2006-06-09 16:00 ` Christoph Lameter
2006-06-08 23:02 ` [PATCH 02/14] Include per zone counters in /proc/vmstat Christoph Lameter
2006-06-08 23:02 ` [PATCH 03/14] Conversion of nr_mapped to per zone counter Christoph Lameter
2006-06-08 23:03 ` [PATCH 04/14] Conversion of nr_pagecache " Christoph Lameter
2006-06-08 23:03 ` [PATCH 04/14] Use per zone counters to remove zone_reclaim_interval Christoph Lameter
2006-06-09 4:00 ` Andrew Morton
2006-06-09 18:54 ` zoned VM stats: Add NR_ANON Christoph Lameter
2006-06-10 4:32 ` KAMEZAWA Hiroyuki
2006-06-10 4:52 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 06/14] Add per zone counters to zone node and global VM statistics Christoph Lameter
2006-06-09 4:01 ` Andrew Morton
2006-06-09 15:55 ` Christoph Lameter
2006-06-08 23:03 ` [PATCH 07/14] Conversion of nr_slab to per zone counter Christoph Lameter
2006-06-08 23:03 ` [PATCH 08/14] Conversion of nr_pagetable " Christoph Lameter
2006-06-08 23:03 ` [PATCH 09/14] Conversion of nr_dirty " Christoph Lameter
2006-06-08 23:03 ` [PATCH 10/14] Conversion of nr_writeback " Christoph Lameter
2006-06-08 23:03 ` [PATCH 11/14] Conversion of nr_unstable " Christoph Lameter
2006-06-08 23:03 ` [PATCH 12/14] Remove unused get_page_stat functions Christoph Lameter
2006-06-08 23:03 ` [PATCH 13/14] Conversion of nr_bounce to per zone counter Christoph Lameter
2006-06-08 23:03 ` [PATCH 14/14] Remove useless writeback structure Christoph Lameter