* [PATCH 00/14] Zoned VM counters V6
@ 2006-06-22 16:40 Christoph Lameter
2006-06-22 16:40 ` [PATCH 01/14] Create vmstat.c/.h from page_alloc.c/.h Christoph Lameter
` (14 more replies)
0 siblings, 15 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Zone based counters need to be reliable whereas event counters do not need to be.
Zone based VM statistics are necessary to be able to determine what the state
of memory in one zone is. In a NUMA system this can be helpful for local
reclaim and other memory optimizations that may be able to shift VM load
in order to get more balanced memory use.
It is also useful to know how the computing load affects the memory
allocations on various zones. This patchset allows the retrieval of that
data from userspace.
The patchset introduces a framework for counters that is a cross between the
existing page_state counters (which are simply global counters split per cpu) and
the approach of deferred incremental updates implemented for nr_pagecache.
Small per cpu 8 bit counters are added to the per cpu pagesets of struct zone.
If a counter exceeds a certain threshold then its value is folded into an array of
atomic_long_t in the zone and into a global array that sums up all
zone values. The small 8 bit counters sit next to the per cpu page lists
and so they will likely already be in the cpu cache when pages are allocated and
freed.
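In outline, an update adds its delta to the local 8 bit differential and only
folds the accumulated value into the shared atomics once a threshold is
crossed. A minimal sketch of the update path (zone, item and delta are the
function arguments; STAT_THRESHOLD, vm_stat_diff and vm_stat come from the
ZVC patch later in this series):

	s8 *p = &zone_pcp(zone, smp_processor_id())->vm_stat_diff[item];
	long x = delta + *p;

	if (unlikely(x > STAT_THRESHOLD || x < -STAT_THRESHOLD)) {
		/* Fold into the per zone total and into the global total */
		atomic_long_add(x, &zone->vm_stat[item]);
		atomic_long_add(x, &vm_stat[item]);
		x = 0;
	}
	*p = x;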
Access to VM counter information for a zone and for the whole machine
is then possible by simply indexing an array (thanks to Nick Piggin for
pointing out that approach). Determining the total number of pages of
various types no longer requires summing up all per cpu counters.
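As a small usage sketch (it relies on the helpers and the NR_FILE_MAPPED item
that later patches in this series introduce), a VM function can then obtain
the mapped page counts at zone, node and machine scope with simple reads:

	/* per zone, per node and machine wide counts of mapped pages */
	unsigned long zone_mapped = zone_page_state(zone, NR_FILE_MAPPED);
	unsigned long node_mapped = node_page_state(zone->zone_pgdat->node_id,
							NR_FILE_MAPPED);
	unsigned long total_mapped = global_page_state(NR_FILE_MAPPED);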
Benefits of this patchset right now:
- Ability for UP and SMP configurations to determine how memory
is balanced between the DMA, NORMAL and HIGHMEM zones.
- Loops over all processors are avoided in the writeback and
reclaim paths. We no longer need to cache the writeback information
because the needed information is directly accessible.
- Special handling for nr_pagecache removed.
- zone_reclaim_interval vanishes since the VM stats can now determine
when it is worthwhile to do local reclaim.
- Fast inline per node page state determination.
- Accurate counters in /sys/devices/system/node/node*/meminfo. The current
counters simply track on which processor a page was allocated and
guesstimate from that, so they were not useful for showing the actual
distribution of page use across the zones of a node.
- The swap_prefetch patch requires per node statistics in order to
figure out when processors of a node can prefetch. This patch provides
some of the needed numbers.
- Detailed VM counters available in more /proc and /sys status files.
References to earlier discussions:
V1 http://marc.theaimsgroup.com/?l=linux-kernel&m=113511649910826&w=2
V2 http://marc.theaimsgroup.com/?l=linux-kernel&m=114980851924230&w=2
V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115014697910351&w=2
V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767318740&w=2
Performance tests with AIM7 did not show any regressions; it even seems to be
a tad faster. Tested on ia64/NUMA. Builds fine on i386, SMP / UP. Includes
fixes for s390/arm/uml arch code.
Changelog
V1->V2:
- Cleanup code, resequence and base patches on 2.6.17-rc6-mm1
- Reduce interrupt holdoffs
- Add zone reclaim interval removal patch
V2->V3:
- Against temp tree by Andrew. (2.6.17-rc6-mm2 - old patches)
Temp patch at http://www.zip.com.au/~akpm/linux/patches/stuff/cl.bz2
- Incorporate additional fixes for arch code.
- Create vmstat.c/h from pieces of page_alloc.c.
- Do the swap prefetch support patches the right way.
- Reorganize patchset so that the tree compiles after each
patch (However, swap prefetch/reiser4 patches are separate.
So if a swap prefetch patch follows then two patches must
be applied for the kernel to compile again).
- Do various prescribed tests. Make sure that there is no remaining
reference to page state in some arch code.
- Optimize the node_page_state function so that it can be used inline.
V3->V4:
- nr_pagecache definition was not cleaned up in V3.
- Fix nfs issues with NR_UNSTABLE where the page reference was not valid
and with NR_DIRTY.
- Update swap_prefetch patches after feedback from Colin.
- Rename NR_STAT_ITEMS to NR_VM_ZONE_STAT_ITEMS.
- IA64: Make CONFIG_DMA_IS_NORMAL depend on SGI_SN2. Others
may be added in the future.
- Fix order issues with vmstat
- Limit crossposting
V4->V5:
- Drop special patches for swap prefetch and reiser4
- Rediff against 2.6.17-mm1.
- Rename NR_UNSTABLE -> NR_UNSTABLE_NFS
- Rename NR_DIRTY -> NR_FILE_DIRTY
- Rename NR_MAPPED -> NR_FILE_MAPPED
- Rename NR_PAGECACHE -> NR_FILE_PAGES
- Rename NR_ANON -> NR_ANON_PAGES
- Update strings displayed in /proc files but leave established strings as is.
V5->V6:
- Restore the removal of individual counters from the page state that
was deferred into a later patch when going from V2->V3. This also
caused the removal of get_page_state_node and get_page_state() to
drop out of the patch that converted nr_unstable.
- Fix mailing list address.
* [PATCH 01/14] Create vmstat.c/.h from page_alloc.c/.h
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter
2006-06-22 16:40 ` [PATCH 02/14] Basic ZVC (zoned vm counter) implementation, zoned vm counters: per zone counter functionality Christoph Lameter
` (13 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Move counter code from page_alloc.c/page-flags.h to vmstat.c/h.
Create vmstat.c/vmstat.h by separating the counter code and the proc functions.
Move the vm_stat_text array before zoneinfo_show.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-mm1/mm/Makefile
===================================================================
--- linux-2.6.17-mm1.orig/mm/Makefile 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/Makefile 2006-06-21 07:44:15.332346034 -0700
@@ -10,7 +10,7 @@ mmu-$(CONFIG_MMU) := fremap.o highmem.o
obj-y := bootmem.o filemap.o mempool.o oom_kill.o fadvise.o \
page_alloc.o page-writeback.o pdflush.o \
readahead.o swap.o truncate.o vmscan.o \
- prio_tree.o util.o mmzone.o $(mmu-y)
+ prio_tree.o util.o mmzone.o vmstat.o $(mmu-y)
obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o thrash.o
obj-$(CONFIG_HUGETLBFS) += hugetlb.o
Index: linux-2.6.17-mm1/include/linux/mm.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mm.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/include/linux/mm.h 2006-06-21 07:44:15.333322536 -0700
@@ -4,6 +4,7 @@
#include <linux/sched.h>
#include <linux/errno.h>
#include <linux/capability.h>
+#include <linux/vmstat.h>
#ifdef __KERNEL__
@@ -37,7 +38,6 @@ extern int sysctl_legacy_va_layout;
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/processor.h>
-#include <asm/atomic.h>
#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-21 07:44:15.334299038 -0700
@@ -0,0 +1,135 @@
+#ifndef _LINUX_VMSTAT_H
+#define _LINUX_VMSTAT_H
+
+#include <linux/types.h>
+
+/*
+ * Global page accounting. One instance per CPU. Only unsigned longs are
+ * allowed.
+ *
+ * - Fields can be modified with xxx_page_state and xxx_page_state_zone at
+ * any time safely (which protects the instance from modification by
+ * interrupt.
+ * - The __xxx_page_state variants can be used safely when interrupts are
+ * disabled.
+ * - The __xxx_page_state variants can be used if the field is only
+ * modified from process context and protected from preemption, or only
+ * modified from interrupt context. In this case, the field should be
+ * commented here.
+ */
+struct page_state {
+ unsigned long nr_dirty; /* Dirty writeable pages */
+ unsigned long nr_writeback; /* Pages under writeback */
+ unsigned long nr_unstable; /* NFS unstable pages */
+ unsigned long nr_page_table_pages;/* Pages used for pagetables */
+ unsigned long nr_mapped; /* mapped into pagetables.
+ * only modified from process context */
+ unsigned long nr_slab; /* In slab */
+#define GET_PAGE_STATE_LAST nr_slab
+
+ /*
+ * The below are zeroed by get_page_state(). Use get_full_page_state()
+ * to add up all these.
+ */
+ unsigned long pgpgin; /* Disk reads */
+ unsigned long pgpgout; /* Disk writes */
+ unsigned long pswpin; /* swap reads */
+ unsigned long pswpout; /* swap writes */
+
+ unsigned long pgalloc_high; /* page allocations */
+ unsigned long pgalloc_normal;
+ unsigned long pgalloc_dma32;
+ unsigned long pgalloc_dma;
+
+ unsigned long pgfree; /* page freeings */
+ unsigned long pgactivate; /* pages moved inactive->active */
+ unsigned long pgdeactivate; /* pages moved active->inactive */
+
+ unsigned long pgfault; /* faults (major+minor) */
+ unsigned long pgmajfault; /* faults (major only) */
+
+ unsigned long pgrefill_high; /* inspected in refill_inactive_zone */
+ unsigned long pgrefill_normal;
+ unsigned long pgrefill_dma32;
+ unsigned long pgrefill_dma;
+
+ unsigned long pgsteal_high; /* total highmem pages reclaimed */
+ unsigned long pgsteal_normal;
+ unsigned long pgsteal_dma32;
+ unsigned long pgsteal_dma;
+
+ unsigned long pgscan_kswapd_high;/* total highmem pages scanned */
+ unsigned long pgscan_kswapd_normal;
+ unsigned long pgscan_kswapd_dma32;
+ unsigned long pgscan_kswapd_dma;
+
+ unsigned long pgscan_direct_high;/* total highmem pages scanned */
+ unsigned long pgscan_direct_normal;
+ unsigned long pgscan_direct_dma32;
+ unsigned long pgscan_direct_dma;
+
+ unsigned long pginodesteal; /* pages reclaimed via inode freeing */
+ unsigned long slabs_scanned; /* slab objects scanned */
+ unsigned long kswapd_steal; /* pages reclaimed by kswapd */
+ unsigned long kswapd_inodesteal;/* reclaimed via kswapd inode freeing */
+ unsigned long pageoutrun; /* kswapd's calls to page reclaim */
+ unsigned long allocstall; /* direct reclaim calls */
+
+ unsigned long pgrotated; /* pages rotated to tail of the LRU */
+ unsigned long nr_bounce; /* pages for bounce buffers */
+};
+
+extern void get_page_state(struct page_state *ret);
+extern void get_page_state_node(struct page_state *ret, int node);
+extern void get_full_page_state(struct page_state *ret);
+extern unsigned long read_page_state_offset(unsigned long offset);
+extern void mod_page_state_offset(unsigned long offset, unsigned long delta);
+extern void __mod_page_state_offset(unsigned long offset, unsigned long delta);
+
+#define read_page_state(member) \
+ read_page_state_offset(offsetof(struct page_state, member))
+
+#define mod_page_state(member, delta) \
+ mod_page_state_offset(offsetof(struct page_state, member), (delta))
+
+#define __mod_page_state(member, delta) \
+ __mod_page_state_offset(offsetof(struct page_state, member), (delta))
+
+#define inc_page_state(member) mod_page_state(member, 1UL)
+#define dec_page_state(member) mod_page_state(member, 0UL - 1)
+#define add_page_state(member,delta) mod_page_state(member, (delta))
+#define sub_page_state(member,delta) mod_page_state(member, 0UL - (delta))
+
+#define __inc_page_state(member) __mod_page_state(member, 1UL)
+#define __dec_page_state(member) __mod_page_state(member, 0UL - 1)
+#define __add_page_state(member,delta) __mod_page_state(member, (delta))
+#define __sub_page_state(member,delta) __mod_page_state(member, 0UL - (delta))
+
+#define page_state(member) (*__page_state(offsetof(struct page_state, member)))
+
+#define state_zone_offset(zone, member) \
+({ \
+ unsigned offset; \
+ if (is_highmem(zone)) \
+ offset = offsetof(struct page_state, member##_high); \
+ else if (is_normal(zone)) \
+ offset = offsetof(struct page_state, member##_normal); \
+ else if (is_dma32(zone)) \
+ offset = offsetof(struct page_state, member##_dma32); \
+ else \
+ offset = offsetof(struct page_state, member##_dma); \
+ offset; \
+})
+
+#define __mod_page_state_zone(zone, member, delta) \
+ do { \
+ __mod_page_state_offset(state_zone_offset(zone, member), (delta)); \
+ } while (0)
+
+#define mod_page_state_zone(zone, member, delta) \
+ do { \
+ mod_page_state_offset(state_zone_offset(zone, member), (delta)); \
+ } while (0)
+
+#endif /* _LINUX_VMSTAT_H */
+
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-21 07:44:15.335275540 -0700
@@ -0,0 +1,417 @@
+/*
+ * linux/mm/vmstat.c
+ *
+ * Manages VM statistics
+ * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds
+ */
+
+#include <linux/config.h>
+#include <linux/mm.h>
+
+/*
+ * Accumulate the page_state information across all CPUs.
+ * The result is unavoidably approximate - it can change
+ * during and after execution of this function.
+ */
+static DEFINE_PER_CPU(struct page_state, page_states) = {0};
+
+atomic_t nr_pagecache = ATOMIC_INIT(0);
+EXPORT_SYMBOL(nr_pagecache);
+#ifdef CONFIG_SMP
+DEFINE_PER_CPU(long, nr_pagecache_local) = 0;
+#endif
+
+static void __get_page_state(struct page_state *ret, int nr, cpumask_t *cpumask)
+{
+ unsigned cpu;
+
+ memset(ret, 0, nr * sizeof(unsigned long));
+ cpus_and(*cpumask, *cpumask, cpu_online_map);
+
+ for_each_cpu_mask(cpu, *cpumask) {
+ unsigned long *in;
+ unsigned long *out;
+ unsigned off;
+ unsigned next_cpu;
+
+ in = (unsigned long *)&per_cpu(page_states, cpu);
+
+ next_cpu = next_cpu(cpu, *cpumask);
+ if (likely(next_cpu < NR_CPUS))
+ prefetch(&per_cpu(page_states, next_cpu));
+
+ out = (unsigned long *)ret;
+ for (off = 0; off < nr; off++)
+ *out++ += *in++;
+ }
+}
+
+void get_page_state_node(struct page_state *ret, int node)
+{
+ int nr;
+ cpumask_t mask = node_to_cpumask(node);
+
+ nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
+ nr /= sizeof(unsigned long);
+
+ __get_page_state(ret, nr+1, &mask);
+}
+
+void get_page_state(struct page_state *ret)
+{
+ int nr;
+ cpumask_t mask = CPU_MASK_ALL;
+
+ nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
+ nr /= sizeof(unsigned long);
+
+ __get_page_state(ret, nr + 1, &mask);
+}
+
+void get_full_page_state(struct page_state *ret)
+{
+ cpumask_t mask = CPU_MASK_ALL;
+
+ __get_page_state(ret, sizeof(*ret) / sizeof(unsigned long), &mask);
+}
+
+unsigned long read_page_state_offset(unsigned long offset)
+{
+ unsigned long ret = 0;
+ int cpu;
+
+ for_each_online_cpu(cpu) {
+ unsigned long in;
+
+ in = (unsigned long)&per_cpu(page_states, cpu) + offset;
+ ret += *((unsigned long *)in);
+ }
+ return ret;
+}
+
+void __mod_page_state_offset(unsigned long offset, unsigned long delta)
+{
+ void *ptr;
+
+ ptr = &__get_cpu_var(page_states);
+ *(unsigned long *)(ptr + offset) += delta;
+}
+EXPORT_SYMBOL(__mod_page_state_offset);
+
+void mod_page_state_offset(unsigned long offset, unsigned long delta)
+{
+ unsigned long flags;
+ void *ptr;
+
+ local_irq_save(flags);
+ ptr = &__get_cpu_var(page_states);
+ *(unsigned long *)(ptr + offset) += delta;
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_page_state_offset);
+
+void __get_zone_counts(unsigned long *active, unsigned long *inactive,
+ unsigned long *free, struct pglist_data *pgdat)
+{
+ struct zone *zones = pgdat->node_zones;
+ int i;
+
+ *active = 0;
+ *inactive = 0;
+ *free = 0;
+ for (i = 0; i < MAX_NR_ZONES; i++) {
+ *active += zones[i].nr_active;
+ *inactive += zones[i].nr_inactive;
+ *free += zones[i].free_pages;
+ }
+}
+
+void get_zone_counts(unsigned long *active,
+ unsigned long *inactive, unsigned long *free)
+{
+ struct pglist_data *pgdat;
+
+ *active = 0;
+ *inactive = 0;
+ *free = 0;
+ for_each_online_pgdat(pgdat) {
+ unsigned long l, m, n;
+ __get_zone_counts(&l, &m, &n, pgdat);
+ *active += l;
+ *inactive += m;
+ *free += n;
+ }
+}
+
+#ifdef CONFIG_PROC_FS
+
+#include <linux/seq_file.h>
+
+static void *frag_start(struct seq_file *m, loff_t *pos)
+{
+ pg_data_t *pgdat;
+ loff_t node = *pos;
+ for (pgdat = first_online_pgdat();
+ pgdat && node;
+ pgdat = next_online_pgdat(pgdat))
+ --node;
+
+ return pgdat;
+}
+
+static void *frag_next(struct seq_file *m, void *arg, loff_t *pos)
+{
+ pg_data_t *pgdat = (pg_data_t *)arg;
+
+ (*pos)++;
+ return next_online_pgdat(pgdat);
+}
+
+static void frag_stop(struct seq_file *m, void *arg)
+{
+}
+
+/*
+ * This walks the free areas for each zone.
+ */
+static int frag_show(struct seq_file *m, void *arg)
+{
+ pg_data_t *pgdat = (pg_data_t *)arg;
+ struct zone *zone;
+ struct zone *node_zones = pgdat->node_zones;
+ unsigned long flags;
+ int order;
+
+ for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
+ if (!populated_zone(zone))
+ continue;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name);
+ for (order = 0; order < MAX_ORDER; ++order)
+ seq_printf(m, "%6lu ", zone->free_area[order].nr_free);
+ spin_unlock_irqrestore(&zone->lock, flags);
+ seq_putc(m, '\n');
+ }
+ return 0;
+}
+
+struct seq_operations fragmentation_op = {
+ .start = frag_start,
+ .next = frag_next,
+ .stop = frag_stop,
+ .show = frag_show,
+};
+
+static char *vmstat_text[] = {
+ "nr_dirty",
+ "nr_writeback",
+ "nr_unstable",
+ "nr_page_table_pages",
+ "nr_mapped",
+ "nr_slab",
+
+ "pgpgin",
+ "pgpgout",
+ "pswpin",
+ "pswpout",
+
+ "pgalloc_high",
+ "pgalloc_normal",
+ "pgalloc_dma32",
+ "pgalloc_dma",
+
+ "pgfree",
+ "pgactivate",
+ "pgdeactivate",
+
+ "pgfault",
+ "pgmajfault",
+
+ "pgrefill_high",
+ "pgrefill_normal",
+ "pgrefill_dma32",
+ "pgrefill_dma",
+
+ "pgsteal_high",
+ "pgsteal_normal",
+ "pgsteal_dma32",
+ "pgsteal_dma",
+
+ "pgscan_kswapd_high",
+ "pgscan_kswapd_normal",
+ "pgscan_kswapd_dma32",
+ "pgscan_kswapd_dma",
+
+ "pgscan_direct_high",
+ "pgscan_direct_normal",
+ "pgscan_direct_dma32",
+ "pgscan_direct_dma",
+
+ "pginodesteal",
+ "slabs_scanned",
+ "kswapd_steal",
+ "kswapd_inodesteal",
+ "pageoutrun",
+ "allocstall",
+
+ "pgrotated",
+ "nr_bounce",
+};
+
+/*
+ * Output information about zones in @pgdat.
+ */
+static int zoneinfo_show(struct seq_file *m, void *arg)
+{
+ pg_data_t *pgdat = arg;
+ struct zone *zone;
+ struct zone *node_zones = pgdat->node_zones;
+ unsigned long flags;
+
+ for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; zone++) {
+ int i;
+
+ if (!populated_zone(zone))
+ continue;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name);
+ seq_printf(m,
+ "\n pages free %lu"
+ "\n min %lu"
+ "\n low %lu"
+ "\n high %lu"
+ "\n active %lu"
+ "\n inactive %lu"
+ "\n scanned %lu (a: %lu i: %lu)"
+ "\n spanned %lu"
+ "\n present %lu",
+ zone->free_pages,
+ zone->pages_min,
+ zone->pages_low,
+ zone->pages_high,
+ zone->nr_active,
+ zone->nr_inactive,
+ zone->pages_scanned,
+ zone->nr_scan_active, zone->nr_scan_inactive,
+ zone->spanned_pages,
+ zone->present_pages);
+ seq_printf(m,
+ "\n protection: (%lu",
+ zone->lowmem_reserve[0]);
+ for (i = 1; i < ARRAY_SIZE(zone->lowmem_reserve); i++)
+ seq_printf(m, ", %lu", zone->lowmem_reserve[i]);
+ seq_printf(m,
+ ")"
+ "\n pagesets");
+ for_each_online_cpu(i) {
+ struct per_cpu_pageset *pageset;
+ int j;
+
+ pageset = zone_pcp(zone, i);
+ for (j = 0; j < ARRAY_SIZE(pageset->pcp); j++) {
+ if (pageset->pcp[j].count)
+ break;
+ }
+ if (j == ARRAY_SIZE(pageset->pcp))
+ continue;
+ for (j = 0; j < ARRAY_SIZE(pageset->pcp); j++) {
+ seq_printf(m,
+ "\n cpu: %i pcp: %i"
+ "\n count: %i"
+ "\n high: %i"
+ "\n batch: %i",
+ i, j,
+ pageset->pcp[j].count,
+ pageset->pcp[j].high,
+ pageset->pcp[j].batch);
+ }
+#ifdef CONFIG_NUMA
+ seq_printf(m,
+ "\n numa_hit: %lu"
+ "\n numa_miss: %lu"
+ "\n numa_foreign: %lu"
+ "\n interleave_hit: %lu"
+ "\n local_node: %lu"
+ "\n other_node: %lu",
+ pageset->numa_hit,
+ pageset->numa_miss,
+ pageset->numa_foreign,
+ pageset->interleave_hit,
+ pageset->local_node,
+ pageset->other_node);
+#endif
+ }
+ seq_printf(m,
+ "\n all_unreclaimable: %u"
+ "\n prev_priority: %i"
+ "\n temp_priority: %i"
+ "\n start_pfn: %lu",
+ zone->all_unreclaimable,
+ zone->prev_priority,
+ zone->temp_priority,
+ zone->zone_start_pfn);
+ spin_unlock_irqrestore(&zone->lock, flags);
+ seq_putc(m, '\n');
+ }
+ return 0;
+}
+
+struct seq_operations zoneinfo_op = {
+ .start = frag_start, /* iterate over all zones. The same as in
+ * fragmentation. */
+ .next = frag_next,
+ .stop = frag_stop,
+ .show = zoneinfo_show,
+};
+
+static void *vmstat_start(struct seq_file *m, loff_t *pos)
+{
+ struct page_state *ps;
+
+ if (*pos >= ARRAY_SIZE(vmstat_text))
+ return NULL;
+
+ ps = kmalloc(sizeof(*ps), GFP_KERNEL);
+ m->private = ps;
+ if (!ps)
+ return ERR_PTR(-ENOMEM);
+ get_full_page_state(ps);
+ ps->pgpgin /= 2; /* sectors -> kbytes */
+ ps->pgpgout /= 2;
+ return (unsigned long *)ps + *pos;
+}
+
+static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)
+{
+ (*pos)++;
+ if (*pos >= ARRAY_SIZE(vmstat_text))
+ return NULL;
+ return (unsigned long *)m->private + *pos;
+}
+
+static int vmstat_show(struct seq_file *m, void *arg)
+{
+ unsigned long *l = arg;
+ unsigned long off = l - (unsigned long *)m->private;
+
+ seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
+ return 0;
+}
+
+static void vmstat_stop(struct seq_file *m, void *arg)
+{
+ kfree(m->private);
+ m->private = NULL;
+}
+
+struct seq_operations vmstat_op = {
+ .start = vmstat_start,
+ .next = vmstat_next,
+ .stop = vmstat_stop,
+ .show = vmstat_show,
+};
+
+#endif /* CONFIG_PROC_FS */
+
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-21 07:44:15.338205046 -0700
@@ -1226,141 +1226,6 @@ static void show_node(struct zone *zone)
#define show_node(zone) do { } while (0)
#endif
-/*
- * Accumulate the page_state information across all CPUs.
- * The result is unavoidably approximate - it can change
- * during and after execution of this function.
- */
-static DEFINE_PER_CPU(struct page_state, page_states) = {0};
-
-atomic_t nr_pagecache = ATOMIC_INIT(0);
-EXPORT_SYMBOL(nr_pagecache);
-#ifdef CONFIG_SMP
-DEFINE_PER_CPU(long, nr_pagecache_local) = 0;
-#endif
-
-static void __get_page_state(struct page_state *ret, int nr, cpumask_t *cpumask)
-{
- unsigned cpu;
-
- memset(ret, 0, nr * sizeof(unsigned long));
- cpus_and(*cpumask, *cpumask, cpu_online_map);
-
- for_each_cpu_mask(cpu, *cpumask) {
- unsigned long *in;
- unsigned long *out;
- unsigned off;
- unsigned next_cpu;
-
- in = (unsigned long *)&per_cpu(page_states, cpu);
-
- next_cpu = next_cpu(cpu, *cpumask);
- if (likely(next_cpu < NR_CPUS))
- prefetch(&per_cpu(page_states, next_cpu));
-
- out = (unsigned long *)ret;
- for (off = 0; off < nr; off++)
- *out++ += *in++;
- }
-}
-
-void get_page_state_node(struct page_state *ret, int node)
-{
- int nr;
- cpumask_t mask = node_to_cpumask(node);
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr+1, &mask);
-}
-
-void get_page_state(struct page_state *ret)
-{
- int nr;
- cpumask_t mask = CPU_MASK_ALL;
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr + 1, &mask);
-}
-
-void get_full_page_state(struct page_state *ret)
-{
- cpumask_t mask = CPU_MASK_ALL;
-
- __get_page_state(ret, sizeof(*ret) / sizeof(unsigned long), &mask);
-}
-
-unsigned long read_page_state_offset(unsigned long offset)
-{
- unsigned long ret = 0;
- int cpu;
-
- for_each_online_cpu(cpu) {
- unsigned long in;
-
- in = (unsigned long)&per_cpu(page_states, cpu) + offset;
- ret += *((unsigned long *)in);
- }
- return ret;
-}
-
-void __mod_page_state_offset(unsigned long offset, unsigned long delta)
-{
- void *ptr;
-
- ptr = &__get_cpu_var(page_states);
- *(unsigned long *)(ptr + offset) += delta;
-}
-EXPORT_SYMBOL(__mod_page_state_offset);
-
-void mod_page_state_offset(unsigned long offset, unsigned long delta)
-{
- unsigned long flags;
- void *ptr;
-
- local_irq_save(flags);
- ptr = &__get_cpu_var(page_states);
- *(unsigned long *)(ptr + offset) += delta;
- local_irq_restore(flags);
-}
-EXPORT_SYMBOL(mod_page_state_offset);
-
-void __get_zone_counts(unsigned long *active, unsigned long *inactive,
- unsigned long *free, struct pglist_data *pgdat)
-{
- struct zone *zones = pgdat->node_zones;
- int i;
-
- *active = 0;
- *inactive = 0;
- *free = 0;
- for (i = 0; i < MAX_NR_ZONES; i++) {
- *active += zones[i].nr_active;
- *inactive += zones[i].nr_inactive;
- *free += zones[i].free_pages;
- }
-}
-
-void get_zone_counts(unsigned long *active,
- unsigned long *inactive, unsigned long *free)
-{
- struct pglist_data *pgdat;
-
- *active = 0;
- *inactive = 0;
- *free = 0;
- for_each_online_pgdat(pgdat) {
- unsigned long l, m, n;
- __get_zone_counts(&l, &m, &n, pgdat);
- *active += l;
- *inactive += m;
- *free += n;
- }
-}
-
void si_meminfo(struct sysinfo *val)
{
val->totalram = totalram_pages;
@@ -2178,278 +2043,6 @@ void __init free_area_init(unsigned long
__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
}
-#ifdef CONFIG_PROC_FS
-
-#include <linux/seq_file.h>
-
-static void *frag_start(struct seq_file *m, loff_t *pos)
-{
- pg_data_t *pgdat;
- loff_t node = *pos;
- for (pgdat = first_online_pgdat();
- pgdat && node;
- pgdat = next_online_pgdat(pgdat))
- --node;
-
- return pgdat;
-}
-
-static void *frag_next(struct seq_file *m, void *arg, loff_t *pos)
-{
- pg_data_t *pgdat = (pg_data_t *)arg;
-
- (*pos)++;
- return next_online_pgdat(pgdat);
-}
-
-static void frag_stop(struct seq_file *m, void *arg)
-{
-}
-
-/*
- * This walks the free areas for each zone.
- */
-static int frag_show(struct seq_file *m, void *arg)
-{
- pg_data_t *pgdat = (pg_data_t *)arg;
- struct zone *zone;
- struct zone *node_zones = pgdat->node_zones;
- unsigned long flags;
- int order;
-
- for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
- if (!populated_zone(zone))
- continue;
-
- spin_lock_irqsave(&zone->lock, flags);
- seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name);
- for (order = 0; order < MAX_ORDER; ++order)
- seq_printf(m, "%6lu ", zone->free_area[order].nr_free);
- spin_unlock_irqrestore(&zone->lock, flags);
- seq_putc(m, '\n');
- }
- return 0;
-}
-
-struct seq_operations fragmentation_op = {
- .start = frag_start,
- .next = frag_next,
- .stop = frag_stop,
- .show = frag_show,
-};
-
-/*
- * Output information about zones in @pgdat.
- */
-static int zoneinfo_show(struct seq_file *m, void *arg)
-{
- pg_data_t *pgdat = arg;
- struct zone *zone;
- struct zone *node_zones = pgdat->node_zones;
- unsigned long flags;
-
- for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; zone++) {
- int i;
-
- if (!populated_zone(zone))
- continue;
-
- spin_lock_irqsave(&zone->lock, flags);
- seq_printf(m, "Node %d, zone %8s", pgdat->node_id, zone->name);
- seq_printf(m,
- "\n pages free %lu"
- "\n min %lu"
- "\n low %lu"
- "\n high %lu"
- "\n active %lu"
- "\n inactive %lu"
- "\n scanned %lu (a: %lu i: %lu)"
- "\n spanned %lu"
- "\n present %lu",
- zone->free_pages,
- zone->pages_min,
- zone->pages_low,
- zone->pages_high,
- zone->nr_active,
- zone->nr_inactive,
- zone->pages_scanned,
- zone->nr_scan_active, zone->nr_scan_inactive,
- zone->spanned_pages,
- zone->present_pages);
- seq_printf(m,
- "\n protection: (%lu",
- zone->lowmem_reserve[0]);
- for (i = 1; i < ARRAY_SIZE(zone->lowmem_reserve); i++)
- seq_printf(m, ", %lu", zone->lowmem_reserve[i]);
- seq_printf(m,
- ")"
- "\n pagesets");
- for_each_online_cpu(i) {
- struct per_cpu_pageset *pageset;
- int j;
-
- pageset = zone_pcp(zone, i);
- for (j = 0; j < ARRAY_SIZE(pageset->pcp); j++) {
- if (pageset->pcp[j].count)
- break;
- }
- if (j == ARRAY_SIZE(pageset->pcp))
- continue;
- for (j = 0; j < ARRAY_SIZE(pageset->pcp); j++) {
- seq_printf(m,
- "\n cpu: %i pcp: %i"
- "\n count: %i"
- "\n high: %i"
- "\n batch: %i",
- i, j,
- pageset->pcp[j].count,
- pageset->pcp[j].high,
- pageset->pcp[j].batch);
- }
-#ifdef CONFIG_NUMA
- seq_printf(m,
- "\n numa_hit: %lu"
- "\n numa_miss: %lu"
- "\n numa_foreign: %lu"
- "\n interleave_hit: %lu"
- "\n local_node: %lu"
- "\n other_node: %lu",
- pageset->numa_hit,
- pageset->numa_miss,
- pageset->numa_foreign,
- pageset->interleave_hit,
- pageset->local_node,
- pageset->other_node);
-#endif
- }
- seq_printf(m,
- "\n all_unreclaimable: %u"
- "\n prev_priority: %i"
- "\n temp_priority: %i"
- "\n start_pfn: %lu",
- zone->all_unreclaimable,
- zone->prev_priority,
- zone->temp_priority,
- zone->zone_start_pfn);
- spin_unlock_irqrestore(&zone->lock, flags);
- seq_putc(m, '\n');
- }
- return 0;
-}
-
-struct seq_operations zoneinfo_op = {
- .start = frag_start, /* iterate over all zones. The same as in
- * fragmentation. */
- .next = frag_next,
- .stop = frag_stop,
- .show = zoneinfo_show,
-};
-
-static char *vmstat_text[] = {
- "nr_dirty",
- "nr_writeback",
- "nr_unstable",
- "nr_page_table_pages",
- "nr_mapped",
- "nr_slab",
-
- "pgpgin",
- "pgpgout",
- "pswpin",
- "pswpout",
-
- "pgalloc_high",
- "pgalloc_normal",
- "pgalloc_dma32",
- "pgalloc_dma",
-
- "pgfree",
- "pgactivate",
- "pgdeactivate",
-
- "pgfault",
- "pgmajfault",
-
- "pgrefill_high",
- "pgrefill_normal",
- "pgrefill_dma32",
- "pgrefill_dma",
-
- "pgsteal_high",
- "pgsteal_normal",
- "pgsteal_dma32",
- "pgsteal_dma",
-
- "pgscan_kswapd_high",
- "pgscan_kswapd_normal",
- "pgscan_kswapd_dma32",
- "pgscan_kswapd_dma",
-
- "pgscan_direct_high",
- "pgscan_direct_normal",
- "pgscan_direct_dma32",
- "pgscan_direct_dma",
-
- "pginodesteal",
- "slabs_scanned",
- "kswapd_steal",
- "kswapd_inodesteal",
- "pageoutrun",
- "allocstall",
-
- "pgrotated",
- "nr_bounce",
-};
-
-static void *vmstat_start(struct seq_file *m, loff_t *pos)
-{
- struct page_state *ps;
-
- if (*pos >= ARRAY_SIZE(vmstat_text))
- return NULL;
-
- ps = kmalloc(sizeof(*ps), GFP_KERNEL);
- m->private = ps;
- if (!ps)
- return ERR_PTR(-ENOMEM);
- get_full_page_state(ps);
- ps->pgpgin /= 2; /* sectors -> kbytes */
- ps->pgpgout /= 2;
- return (unsigned long *)ps + *pos;
-}
-
-static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)
-{
- (*pos)++;
- if (*pos >= ARRAY_SIZE(vmstat_text))
- return NULL;
- return (unsigned long *)m->private + *pos;
-}
-
-static int vmstat_show(struct seq_file *m, void *arg)
-{
- unsigned long *l = arg;
- unsigned long off = l - (unsigned long *)m->private;
-
- seq_printf(m, "%s %lu\n", vmstat_text[off], *l);
- return 0;
-}
-
-static void vmstat_stop(struct seq_file *m, void *arg)
-{
- kfree(m->private);
- m->private = NULL;
-}
-
-struct seq_operations vmstat_op = {
- .start = vmstat_start,
- .next = vmstat_next,
- .stop = vmstat_stop,
- .show = vmstat_show,
-};
-
-#endif /* CONFIG_PROC_FS */
-
#ifdef CONFIG_HOTPLUG_CPU
static int page_alloc_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu)
Index: linux-2.6.17-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/page-flags.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/include/linux/page-flags.h 2006-06-21 07:45:02.657540879 -0700
@@ -5,9 +5,7 @@
#ifndef PAGE_FLAGS_H
#define PAGE_FLAGS_H
-#include <linux/percpu.h>
-#include <linux/cache.h>
-#include <asm/pgtable.h>
+#include <linux/vmstat.h>
/*
* Various page->flags bits:
@@ -91,134 +89,6 @@
#define PG_uncached 20 /* Page has been mapped as uncached */
/*
- * Global page accounting. One instance per CPU. Only unsigned longs are
- * allowed.
- *
- * - Fields can be modified with xxx_page_state and xxx_page_state_zone at
- * any time safely (which protects the instance from modification by
- * interrupt.
- * - The __xxx_page_state variants can be used safely when interrupts are
- * disabled.
- * - The __xxx_page_state variants can be used if the field is only
- * modified from process context and protected from preemption, or only
- * modified from interrupt context. In this case, the field should be
- * commented here.
- */
-struct page_state {
- unsigned long nr_dirty; /* Dirty writeable pages */
- unsigned long nr_writeback; /* Pages under writeback */
- unsigned long nr_unstable; /* NFS unstable pages */
- unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_mapped; /* mapped into pagetables.
- * only modified from process context */
- unsigned long nr_slab; /* In slab */
-#define GET_PAGE_STATE_LAST nr_slab
-
- /*
- * The below are zeroed by get_page_state(). Use get_full_page_state()
- * to add up all these.
- */
- unsigned long pgpgin; /* Disk reads */
- unsigned long pgpgout; /* Disk writes */
- unsigned long pswpin; /* swap reads */
- unsigned long pswpout; /* swap writes */
-
- unsigned long pgalloc_high; /* page allocations */
- unsigned long pgalloc_normal;
- unsigned long pgalloc_dma32;
- unsigned long pgalloc_dma;
-
- unsigned long pgfree; /* page freeings */
- unsigned long pgactivate; /* pages moved inactive->active */
- unsigned long pgdeactivate; /* pages moved active->inactive */
-
- unsigned long pgfault; /* faults (major+minor) */
- unsigned long pgmajfault; /* faults (major only) */
-
- unsigned long pgrefill_high; /* inspected in refill_inactive_zone */
- unsigned long pgrefill_normal;
- unsigned long pgrefill_dma32;
- unsigned long pgrefill_dma;
-
- unsigned long pgsteal_high; /* total highmem pages reclaimed */
- unsigned long pgsteal_normal;
- unsigned long pgsteal_dma32;
- unsigned long pgsteal_dma;
-
- unsigned long pgscan_kswapd_high;/* total highmem pages scanned */
- unsigned long pgscan_kswapd_normal;
- unsigned long pgscan_kswapd_dma32;
- unsigned long pgscan_kswapd_dma;
-
- unsigned long pgscan_direct_high;/* total highmem pages scanned */
- unsigned long pgscan_direct_normal;
- unsigned long pgscan_direct_dma32;
- unsigned long pgscan_direct_dma;
-
- unsigned long pginodesteal; /* pages reclaimed via inode freeing */
- unsigned long slabs_scanned; /* slab objects scanned */
- unsigned long kswapd_steal; /* pages reclaimed by kswapd */
- unsigned long kswapd_inodesteal;/* reclaimed via kswapd inode freeing */
- unsigned long pageoutrun; /* kswapd's calls to page reclaim */
- unsigned long allocstall; /* direct reclaim calls */
-
- unsigned long pgrotated; /* pages rotated to tail of the LRU */
- unsigned long nr_bounce; /* pages for bounce buffers */
-};
-
-extern void get_page_state(struct page_state *ret);
-extern void get_page_state_node(struct page_state *ret, int node);
-extern void get_full_page_state(struct page_state *ret);
-extern unsigned long read_page_state_offset(unsigned long offset);
-extern void mod_page_state_offset(unsigned long offset, unsigned long delta);
-extern void __mod_page_state_offset(unsigned long offset, unsigned long delta);
-
-#define read_page_state(member) \
- read_page_state_offset(offsetof(struct page_state, member))
-
-#define mod_page_state(member, delta) \
- mod_page_state_offset(offsetof(struct page_state, member), (delta))
-
-#define __mod_page_state(member, delta) \
- __mod_page_state_offset(offsetof(struct page_state, member), (delta))
-
-#define inc_page_state(member) mod_page_state(member, 1UL)
-#define dec_page_state(member) mod_page_state(member, 0UL - 1)
-#define add_page_state(member,delta) mod_page_state(member, (delta))
-#define sub_page_state(member,delta) mod_page_state(member, 0UL - (delta))
-
-#define __inc_page_state(member) __mod_page_state(member, 1UL)
-#define __dec_page_state(member) __mod_page_state(member, 0UL - 1)
-#define __add_page_state(member,delta) __mod_page_state(member, (delta))
-#define __sub_page_state(member,delta) __mod_page_state(member, 0UL - (delta))
-
-#define page_state(member) (*__page_state(offsetof(struct page_state, member)))
-
-#define state_zone_offset(zone, member) \
-({ \
- unsigned offset; \
- if (is_highmem(zone)) \
- offset = offsetof(struct page_state, member##_high); \
- else if (is_normal(zone)) \
- offset = offsetof(struct page_state, member##_normal); \
- else if (is_dma32(zone)) \
- offset = offsetof(struct page_state, member##_dma32); \
- else \
- offset = offsetof(struct page_state, member##_dma); \
- offset; \
-})
-
-#define __mod_page_state_zone(zone, member, delta) \
- do { \
- __mod_page_state_offset(state_zone_offset(zone, member), (delta)); \
- } while (0)
-
-#define mod_page_state_zone(zone, member, delta) \
- do { \
- mod_page_state_offset(state_zone_offset(zone, member), (delta)); \
- } while (0)
-
-/*
* Manipulation of page state flags
*/
#define PageLocked(page) \
* [PATCH 02/14] Basic ZVC (zoned vm counter) implementation, zoned vm counters: per zone counter functionality
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
2006-06-22 16:40 ` [PATCH 01/14] Create vmstat.c/.h from page_alloc.c/.h Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter
2006-06-22 16:40 ` [PATCH 03/14] Convert nr_mapped to per zone counter, zoned vm counters: conversion of nr_mapped to per zone counter Christoph Lameter
` (12 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Per zone counter infrastructure
The counters that we currently have for the VM are split per processor.
The processor, however, does not have much to do with the zone these pages belong
to. We cannot tell, for example, how many ZONE_DMA pages are dirty.
So we are blind to potential imbalances in the usage of memory in various
zones. For example, in a NUMA system we cannot tell how many pages are dirty on
a particular node. If we knew, then we could put measures into the VM to balance
the use of memory between different zones and different nodes in a NUMA
system. For example, it would be possible to limit the dirty pages per node
so that fast local memory is kept available even if a process is dirtying
huge amounts of pages.
Another example is zone reclaim. We do not know how many unmapped pages exist
per zone, so we just have to try to reclaim. If that does not work then we
pause and try again later. It would be better if we knew when it makes sense
to reclaim unmapped pages from a zone. This patchset allows the determination
of the number of unmapped pages per zone. We can remove the zone reclaim
interval with the counters introduced here.
Furthermore, the ability to have various usage statistics available will allow
the development of new NUMA balancing algorithms that may be able to improve
the decision making in the scheduler about when to move a process to another node.
Hopefully this will also enable automatic page migration through a user space
program that can analyse the memory load distribution and then rebalance
memory use in order to increase performance.
The counter framework here implements differential counters for each processor
in struct zone. The differential counters are consolidated when a threshold
is exceeded (as done in the current implementation for nr_pagecache), when
slab reaping occurs or when a consolidation function is called.
Consolidation uses atomic operations and accumulates counters per zone in the
zone structure and also globally in the vm_stat array. VM functions can
access the counts by simply indexing a global or zone specific array.
The arrangement of counters in an array also simplifies processing when output
has to be generated for /proc/*.
Counters can be updated by calling inc/dec_zone_page_state or
__inc/__dec_zone_page_state analogous to *_page_state. The second group of
functions can be called if it is known that interrupts are disabled.
Special optimized increment and decrement functions are provided. These can
avoid certain checks and use increment or decrement instructions that an
architecture may provide.
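A short usage sketch (NR_FILE_MAPPED is only introduced by the following
patch and serves purely as an example item here):

	/* Caller has interrupts disabled: use the cheaper __ variants */
	__inc_zone_page_state(page, NR_FILE_MAPPED);
	__mod_zone_page_state(page_zone(page), NR_FILE_MAPPED, 4);

	/* Interrupt state unknown: these wrappers disable interrupts themselves */
	dec_zone_page_state(page, NR_FILE_MAPPED);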
We also add a new CONFIG_DMA_IS_NORMAL option that signifies that an architecture
can do DMA to all memory and therefore ZONE_NORMAL will not be populated.
This is currently only set for IA64 SGI SN2 and only affects
node_page_state(). In the best case node_page_state() can be reduced to
retrieving a single counter for the one zone on the node.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-21 07:33:56.667598796 -0700
@@ -47,6 +47,9 @@ struct zone_padding {
#define ZONE_PADDING(name)
#endif
+enum zone_stat_item {
+ NR_VM_ZONE_STAT_ITEMS };
+
struct per_cpu_pages {
int count; /* number of pages in the list */
int high; /* high watermark, emptying needed */
@@ -56,6 +59,10 @@ struct per_cpu_pages {
struct per_cpu_pageset {
struct per_cpu_pages pcp[2]; /* 0: hot. 1: cold */
+#ifdef CONFIG_SMP
+ s8 vm_stat_diff[NR_VM_ZONE_STAT_ITEMS];
+#endif
+
#ifdef CONFIG_NUMA
unsigned long numa_hit; /* allocated in intended node */
unsigned long numa_miss; /* allocated in non intended node */
@@ -166,6 +173,8 @@ struct zone {
/* A count of how many reclaimers are scanning this zone */
atomic_t reclaim_in_progress;
+ /* Zone statistics */
+ atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
/*
* timestamp (in jiffies) of the last zone reclaim that did not
* result in freeing of pages. This is used to avoid repeated scans
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-21 07:33:51.539010735 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-21 07:33:56.668575298 -0700
@@ -1971,6 +1971,7 @@ static void __init free_area_init_core(s
zone->nr_scan_inactive = 0;
zone->nr_active = 0;
zone->nr_inactive = 0;
+ zap_zone_vm_stats(zone);
atomic_set(&zone->reclaim_in_progress, 0);
if (!size)
continue;
@@ -2072,6 +2073,7 @@ static int page_alloc_cpu_notify(struct
}
local_irq_enable();
+ refresh_cpu_vm_stats(cpu);
}
return NOTIFY_OK;
}
Index: linux-2.6.17-mm1/mm/slab.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/slab.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/slab.c 2006-06-21 07:33:56.671504804 -0700
@@ -3763,6 +3763,7 @@ next:
check_irq_on();
mutex_unlock(&cache_chain_mutex);
next_reap_node();
+ refresh_cpu_vm_stats(smp_processor_id());
/* Set up the next iteration */
schedule_delayed_work(&__get_cpu_var(reap_work), REAPTIMEOUT_CPUC);
}
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-21 07:28:50.423904770 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-21 07:33:56.671504804 -0700
@@ -2,6 +2,9 @@
#define _LINUX_VMSTAT_H
#include <linux/types.h>
+#include <linux/config.h>
+#include <linux/mmzone.h>
+#include <asm/atomic.h>
/*
* Global page accounting. One instance per CPU. Only unsigned longs are
@@ -131,5 +134,84 @@ extern void __mod_page_state_offset(unsi
mod_page_state_offset(state_zone_offset(zone, member), (delta)); \
} while (0)
+/*
+ * Zone based page accounting with per cpu differentials.
+ */
+extern atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
+
+static inline unsigned long global_page_state(enum zone_stat_item item)
+{
+ long x = atomic_long_read(&vm_stat[item]);
+#ifdef CONFIG_SMP
+ if (x < 0)
+ x = 0;
+#endif
+ return x;
+}
+
+static inline unsigned long zone_page_state(struct zone *zone,
+ enum zone_stat_item item)
+{
+ long x = atomic_long_read(&zone->vm_stat[item]);
+#ifdef CONFIG_SMP
+ if (x < 0)
+ x = 0;
+#endif
+ return x;
+}
+
+#ifdef CONFIG_NUMA
+/*
+ * Determine the per node value of a stat item. This function
+ * is called frequently in a NUMA machine, so try to be as
+ * frugal as possible.
+ */
+static inline unsigned long node_page_state(int node,
+ enum zone_stat_item item)
+{
+ struct zone *zones = NODE_DATA(node)->node_zones;
+
+ return
+#ifndef CONFIG_DMA_IS_NORMAL
+#if !defined(CONFIG_DMA_IS_DMA32) && BITS_PER_LONG >= 64
+ zone_page_state(&zones[ZONE_DMA32], item) +
+#endif
+ zone_page_state(&zones[ZONE_NORMAL], item) +
+#endif
+#ifdef CONFIG_HIGHMEM
+ zone_page_state(&zones[ZONE_HIGHMEM], item) +
+#endif
+ zone_page_state(&zones[ZONE_DMA], item);
+}
+#else
+#define node_page_state(node, item) global_page_state(item)
+#endif
+
+void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
+void __inc_zone_page_state(struct page *, enum zone_stat_item);
+void __dec_zone_page_state(struct page *, enum zone_stat_item);
+
+#define __add_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i, __d)
+#define __sub_zone_page_state(__z, __i, __d) __mod_zone_page_state(__z, __i,-(__d))
+
+void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
+void inc_zone_page_state(struct page *, enum zone_stat_item);
+void dec_zone_page_state(struct page *, enum zone_stat_item);
+
+#define add_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, __d)
+#define sub_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, -(__d))
+
+static inline void zap_zone_vm_stats(struct zone *zone) {
+ memset(zone->vm_stat, 0, sizeof(zone->vm_stat));
+}
+
+#ifdef CONFIG_SMP
+void refresh_cpu_vm_stats(int);
+void refresh_vm_stats(void);
+#else
+static inline void refresh_cpu_vm_stats(int cpu) { }
+static inline void refresh_vm_stats(void) { }
+#endif
+
#endif /* _LINUX_VMSTAT_H */
Index: linux-2.6.17-mm1/arch/ia64/Kconfig
===================================================================
--- linux-2.6.17-mm1.orig/arch/ia64/Kconfig 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/ia64/Kconfig 2006-06-21 07:33:56.672481306 -0700
@@ -70,6 +70,11 @@ config DMA_IS_DMA32
bool
default y
+config DMA_IS_NORMAL
+ bool
+ depends on IA64_SGI_SN2
+ default y
+
choice
prompt "System type"
default IA64_GENERIC
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-21 07:28:50.424881272 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-21 07:33:56.673457808 -0700
@@ -3,10 +3,15 @@
*
* Manages VM statistics
* Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds
+ *
+ * zoned VM statistics
+ * Copyright (C) 2006 Silicon Graphics, Inc.,
+ * Christoph Lameter <christoph@lameter.com>
*/
#include <linux/config.h>
#include <linux/mm.h>
+#include <linux/module.h>
/*
* Accumulate the page_state information across all CPUs.
@@ -143,6 +148,259 @@ void get_zone_counts(unsigned long *acti
}
}
+/*
+ * Manage combined zone based / global counters
+ *
+ * vm_stat contains the global counters
+ */
+atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
+
+static inline void zone_page_state_add(long x, struct zone *zone,
+ enum zone_stat_item item)
+{
+ atomic_long_add(x, &zone->vm_stat[item]);
+ atomic_long_add(x, &vm_stat[item]);
+}
+
+#ifdef CONFIG_SMP
+
+#define STAT_THRESHOLD 32
+
+/*
+ * Determine pointer to currently valid differential byte given a zone and
+ * the item number.
+ *
+ * Preemption must be off
+ */
+static inline s8 *diff_pointer(struct zone *zone, enum zone_stat_item item)
+{
+ return &zone_pcp(zone, smp_processor_id())->vm_stat_diff[item];
+}
+
+/*
+ * For use when we know that interrupts are disabled.
+ */
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ s8 *p;
+ long x;
+
+ p = diff_pointer(zone, item);
+ x = delta + *p;
+
+ if (unlikely(x > STAT_THRESHOLD || x < -STAT_THRESHOLD)) {
+ zone_page_state_add(x, zone, item);
+ x = 0;
+ }
+
+ *p = x;
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+/*
+ * For an unknown interrupt state
+ */
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ __mod_zone_page_state(zone, item, delta);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+/*
+ * Optimized increment and decrement functions.
+ *
+ * These are only for a single page and therefore can take a struct page *
+ * argument instead of struct zone *. This allows the inclusion of the code
+ * generated for page_zone(page) into the optimized functions.
+ *
+ * No overflow check is necessary and therefore the differential can be
+ * incremented or decremented in place which may allow the compilers to
+ * generate better code.
+ *
+ * The increment or decrement is known and therefore one boundary check can
+ * be omitted.
+ *
+ * Some processors have inc/dec instructions that are atomic vs an interrupt.
+ * However, the code must first determine the differential location in a zone
+ * based on the processor number and then inc/dec the counter. There is no
+ * guarantee without disabling preemption that the processor will not change
+ * in between and therefore the atomicity vs. interrupt cannot be exploited
+ * in a useful way here.
+ */
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ struct zone *zone = page_zone(page);
+ s8 *p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ zone = page_zone(page);
+ local_irq_save(flags);
+ p = diff_pointer(zone, item);
+
+ (*p)++;
+
+ if (unlikely(*p > STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+ struct zone *zone;
+ s8 *p;
+
+ zone = page_zone(page);
+ local_irq_save(flags);
+ p = diff_pointer(zone, item);
+
+ (*p)--;
+
+ if (unlikely(*p < -STAT_THRESHOLD)) {
+ zone_page_state_add(*p, zone, item);
+ *p = 0;
+ }
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+
+/*
+ * Update the zone counters for one cpu.
+ */
+void refresh_cpu_vm_stats(int cpu)
+{
+ struct zone *zone;
+ int i;
+ unsigned long flags;
+
+ for_each_zone(zone) {
+ struct per_cpu_pageset *pcp;
+
+ pcp = zone_pcp(zone, cpu);
+
+ for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+ if (pcp->vm_stat_diff[i]) {
+ local_irq_save(flags);
+ zone_page_state_add(pcp->vm_stat_diff[i],
+ zone, i);
+ pcp->vm_stat_diff[i] = 0;
+ local_irq_restore(flags);
+ }
+ }
+}
+
+static void __refresh_cpu_vm_stats(void *dummy)
+{
+ refresh_cpu_vm_stats(smp_processor_id());
+}
+
+/*
+ * Consolidate all counters.
+ *
+ * Note that the result is less inaccurate but still inaccurate
+ * if concurrent processes are allowed to run.
+ */
+void refresh_vm_stats(void)
+{
+ on_each_cpu(__refresh_cpu_vm_stats, NULL, 0, 1);
+}
+EXPORT_SYMBOL(refresh_vm_stats);
+
+#else /* CONFIG_SMP */
+
+/*
+ * We do not maintain differentials in a single processor configuration.
+ * The functions directly modify the zone and global counters.
+ */
+
+void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ zone_page_state_add(delta, zone, item);
+}
+EXPORT_SYMBOL(__mod_zone_page_state);
+
+void mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
+ int delta)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(delta, zone, item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(mod_zone_page_state);
+
+void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__inc_zone_page_state);
+
+void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ zone_page_state_add(-1, page_zone(page), item);
+}
+EXPORT_SYMBOL(__dec_zone_page_state);
+
+void inc_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add(1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(inc_zone_page_state);
+
+void dec_zone_page_state(struct page *page, enum zone_stat_item item)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ zone_page_state_add( -1, page_zone(page), item);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state);
+#endif
+
#ifdef CONFIG_PROC_FS
#include <linux/seq_file.h>
@@ -204,6 +462,9 @@ struct seq_operations fragmentation_op =
};
static char *vmstat_text[] = {
+ /* Zoned VM counters */
+
+ /* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
@@ -297,6 +558,11 @@ static int zoneinfo_show(struct seq_file
zone->nr_scan_active, zone->nr_scan_inactive,
zone->spanned_pages,
zone->present_pages);
+
+ for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+ seq_printf(m, "\n %-12s %lu", vmstat_text[i],
+ zone_page_state(zone, i));
+
seq_printf(m,
"\n protection: (%lu",
zone->lowmem_reserve[0]);
@@ -368,19 +634,25 @@ struct seq_operations zoneinfo_op = {
static void *vmstat_start(struct seq_file *m, loff_t *pos)
{
+ unsigned long *v;
struct page_state *ps;
+ int i;
if (*pos >= ARRAY_SIZE(vmstat_text))
return NULL;
- ps = kmalloc(sizeof(*ps), GFP_KERNEL);
- m->private = ps;
- if (!ps)
+ v = kmalloc(NR_VM_ZONE_STAT_ITEMS * sizeof(unsigned long)
+ + sizeof(*ps), GFP_KERNEL);
+ m->private = v;
+ if (!v)
return ERR_PTR(-ENOMEM);
+ for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
+ v[i] = global_page_state(i);
+ ps = (struct page_state *)(v + NR_VM_ZONE_STAT_ITEMS);
get_full_page_state(ps);
ps->pgpgin /= 2; /* sectors -> kbytes */
ps->pgpgout /= 2;
- return (unsigned long *)ps + *pos;
+ return v + *pos;
}
static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)
* [PATCH 03/14] Convert nr_mapped to per zone counter, zoned vm counters: conversion of nr_mapped to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
2006-06-22 16:40 ` [PATCH 01/14] Create vmstat.c/.h from page_alloc.c/.h Christoph Lameter
2006-06-22 16:40 ` [PATCH 02/14] Basic ZVC (zoned vm counter) implementation, zoned vm counters: per zone counter functionality Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter
2006-06-22 16:40 ` [PATCH 04/14] Conversion of nr_pagecache to per zone counter, zoned vm counters: conversion of nr_pagecache " Christoph Lameter
` (11 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
nr_mapped is important because it allows a determination of how many pages of
a zone are not mapped, which allows a more efficient means of determining
when we need to reclaim memory in a zone.
We take the nr_mapped field out of the page state structure and define a new
per zone counter named NR_FILE_MAPPED (the anonymous pages will be split
off from NR_MAPPED in the next patch).
We replace the use of nr_mapped in various kernel locations. This avoids
looping over all processors in try_to_free_pages(), writeback and reclaim
(swap + zone reclaim).
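To illustrate the kind of decision this enables (a sketch only, with an
arbitrary threshold; the actual zone reclaim heuristics are adjusted in a
later patch of this series), a reclaim path can now check a zone's mapped
page count without touching other processors' counters:

static int zone_worth_local_reclaim(struct zone *zone)
{
	unsigned long mapped = zone_page_state(zone, NR_FILE_MAPPED);

	/* Example threshold: reclaim looks promising if less than half
	 * of the zone's pages are mapped into pagetables. */
	return mapped < zone->present_pages / 2;
}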
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/i386/mm/pgtable.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/i386/mm/pgtable.c 2006-06-21 08:06:14.414759983 -0700
@@ -61,7 +61,7 @@ void show_mem(void)
get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
- printk(KERN_INFO "%lu pages mapped\n", ps.nr_mapped);
+ printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_FILE_MAPPED));
printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
}
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-21 08:06:14.415736485 -0700
@@ -53,8 +53,6 @@ static ssize_t node_read_meminfo(struct
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_mapped < 0)
- ps.nr_mapped = 0;
if ((long)ps.nr_slab < 0)
ps.nr_slab = 0;
@@ -83,7 +81,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
- nid, K(ps.nr_mapped),
+ nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-21 08:06:14.416712987 -0700
@@ -190,7 +190,7 @@ static int meminfo_read_proc(char *page,
K(i.freeswap),
K(ps.nr_dirty),
K(ps.nr_writeback),
- K(ps.nr_mapped),
+ K(global_page_state(NR_FILE_MAPPED)),
K(ps.nr_slab),
K(allowed),
K(committed),
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-21 08:04:57.507406207 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-21 08:06:14.418665992 -0700
@@ -48,6 +48,9 @@ struct zone_padding {
#endif
enum zone_stat_item {
+ NR_FILE_MAPPED, /* mapped into pagetables.
+ only modified from process context */
+
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-21 08:04:57.509359211 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-21 08:06:14.419642494 -0700
@@ -1314,7 +1314,7 @@ void show_free_areas(void)
ps.nr_unstable,
nr_free_pages(),
ps.nr_slab,
- ps.nr_mapped,
+ global_page_state(NR_FILE_MAPPED),
ps.nr_page_table_pages);
for_each_zone(zone) {
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-21 08:06:14.420618996 -0700
@@ -111,7 +111,7 @@ static void get_writeback_state(struct w
{
wbs->nr_dirty = read_page_state(nr_dirty);
wbs->nr_unstable = read_page_state(nr_unstable);
- wbs->nr_mapped = read_page_state(nr_mapped);
+ wbs->nr_mapped = global_page_state(NR_FILE_MAPPED);
wbs->nr_writeback = read_page_state(nr_writeback);
}
Index: linux-2.6.17-mm1/mm/rmap.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/rmap.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/rmap.c 2006-06-21 08:06:14.421595498 -0700
@@ -493,7 +493,7 @@ static void __page_set_anon_rmap(struct
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_FILE_MAPPED);
}
/**
@@ -537,7 +537,7 @@ void page_add_new_anon_rmap(struct page
void page_add_file_rmap(struct page *page)
{
if (atomic_inc_and_test(&page->_mapcount))
- __inc_page_state(nr_mapped);
+ __inc_zone_page_state(page, NR_FILE_MAPPED);
}
/**
@@ -569,7 +569,7 @@ void page_remove_rmap(struct page *page)
*/
if (page_test_and_clear_dirty(page))
set_page_dirty(page);
- __dec_page_state(nr_mapped);
+ __dec_zone_page_state(page, NR_FILE_MAPPED);
}
}
Index: linux-2.6.17-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmscan.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/vmscan.c 2006-06-21 08:06:14.422572000 -0700
@@ -972,7 +972,7 @@ unsigned long try_to_free_pages(struct z
}
for (priority = DEF_PRIORITY; priority >= 0; priority--) {
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_FILE_MAPPED);
sc.nr_scanned = 0;
if (!priority)
disable_swap_token();
@@ -1062,7 +1062,7 @@ loop_again:
total_scanned = 0;
nr_reclaimed = 0;
sc.may_writepage = !laptop_mode;
- sc.nr_mapped = read_page_state(nr_mapped);
+ sc.nr_mapped = global_page_state(NR_FILE_MAPPED);
inc_page_state(pageoutrun);
@@ -1412,7 +1412,7 @@ static int __zone_reclaim(struct zone *z
struct scan_control sc = {
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_swap = !!(zone_reclaim_mode & RECLAIM_SWAP),
- .nr_mapped = read_page_state(nr_mapped),
+ .nr_mapped = global_page_state(NR_FILE_MAPPED),
.swap_cluster_max = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-21 08:04:57.514241722 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-21 08:06:14.423548502 -0700
@@ -463,13 +463,13 @@ struct seq_operations fragmentation_op =
static char *vmstat_text[] = {
/* Zoned VM counters */
+ "nr_mapped",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_mapped",
"nr_slab",
"pgpgin",
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-21 08:04:57.512288718 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:10:13.219810354 -0700
@@ -25,8 +25,6 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_mapped; /* mapped into pagetables.
- * only modified from process context */
unsigned long nr_slab; /* In slab */
#define GET_PAGE_STATE_LAST nr_slab
* [PATCH 04/14] Conversion of nr_pagecache to per zone counter, zoned vm counters: conversion of nr_pagecache to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (2 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 03/14] Convert nr_mapped to per zone counter, zoned vm counters: conversion of nr_mapped to per zone counter Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 05/14] Remove NR_FILE_MAPPED from scan control structure, zoned VM stats: Remove nr_mapped from scan control Christoph Lameter, Christoph Lameter
` (10 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Currently a single atomic variable is used to establish the size of the page
cache for the whole machine. The zoned VM counters use the same per-cpu
differential approach as the nr_pagecache code, but additionally allow the
pagecache size to be determined per zone.
Remove the special implementation for nr_pagecache and make it a zoned
counter named NR_FILE_PAGES.
Updates of the page cache counters are always performed with interrupts off.
We can therefore use the __ variant here.
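As a rough sketch (hypothetical helper, not part of the patch; it only
mirrors the add_to_page_cache() hunk further down), the non-atomic __
variant is sufficient because the update always happens under the
irq-disabling tree_lock:

	static int charge_page_cache(struct address_space *mapping,
				     struct page *page)
	{
		write_lock_irq(&mapping->tree_lock);	/* interrupts off */
		mapping->nrpages++;
		/* safe without local_irq_save(): irqs already disabled */
		__inc_zone_page_state(page, NR_FILE_PAGES);
		write_unlock_irq(&mapping->tree_lock);
		return 0;
	}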
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/arch/sparc64/kernel/sys_sunos32.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/sparc64/kernel/sys_sunos32.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/sparc64/kernel/sys_sunos32.c 2006-06-21 07:36:17.677405206 -0700
@@ -155,7 +155,7 @@ asmlinkage int sunos_brk(u32 baddr)
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_FILE_PAGES);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.17-mm1/arch/sparc/kernel/sys_sunos.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/sparc/kernel/sys_sunos.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/sparc/kernel/sys_sunos.c 2006-06-21 07:36:17.678381708 -0700
@@ -196,7 +196,7 @@ asmlinkage int sunos_brk(unsigned long b
* simple, it hopefully works in most obvious cases.. Easy to
* fool it, but this should catch most mistakes.
*/
- freepages = get_page_cache_size();
+ freepages = global_page_state(NR_FILE_PAGES);
freepages >>= 1;
freepages += nr_free_pages();
freepages += nr_swap_pages;
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-21 07:34:08.376833270 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-21 07:36:17.679358210 -0700
@@ -142,7 +142,8 @@ static int meminfo_read_proc(char *page,
allowed = ((totalram_pages - hugetlb_total_pages())
* sysctl_overcommit_ratio / 100) + total_swap_pages;
- cached = get_page_cache_size() - total_swapcache_pages - i.bufferram;
+ cached = global_page_state(NR_FILE_PAGES) -
+ total_swapcache_pages - i.bufferram;
if (cached < 0)
cached = 0;
Index: linux-2.6.17-mm1/include/linux/pagemap.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/pagemap.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/include/linux/pagemap.h 2006-06-21 07:36:17.680334712 -0700
@@ -106,51 +106,6 @@ int add_to_page_cache_lru(struct page *p
extern void remove_from_page_cache(struct page *page);
extern void __remove_from_page_cache(struct page *page);
-extern atomic_t nr_pagecache;
-
-#ifdef CONFIG_SMP
-
-#define PAGECACHE_ACCT_THRESHOLD max(16, NR_CPUS * 2)
-DECLARE_PER_CPU(long, nr_pagecache_local);
-
-/*
- * pagecache_acct implements approximate accounting for pagecache.
- * vm_enough_memory() do not need high accuracy. Writers will keep
- * an offset in their per-cpu arena and will spill that into the
- * global count whenever the absolute value of the local count
- * exceeds the counter's threshold.
- *
- * MUST be protected from preemption.
- * current protection is mapping->page_lock.
- */
-static inline void pagecache_acct(int count)
-{
- long *local;
-
- local = &__get_cpu_var(nr_pagecache_local);
- *local += count;
- if (*local > PAGECACHE_ACCT_THRESHOLD || *local < -PAGECACHE_ACCT_THRESHOLD) {
- atomic_add(*local, &nr_pagecache);
- *local = 0;
- }
-}
-
-#else
-
-static inline void pagecache_acct(int count)
-{
- atomic_add(count, &nr_pagecache);
-}
-#endif
-
-static inline unsigned long get_page_cache_size(void)
-{
- int ret = atomic_read(&nr_pagecache);
- if (unlikely(ret < 0))
- ret = 0;
- return ret;
-}
-
/*
* Return byte-offset into filesystem object for page.
*/
Index: linux-2.6.17-mm1/mm/filemap.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/filemap.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/filemap.c 2006-06-21 07:36:17.682287716 -0700
@@ -120,7 +120,7 @@ void __remove_from_page_cache(struct pag
radix_tree_delete(&mapping->page_tree, page->index);
page->mapping = NULL;
mapping->nrpages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_FILE_PAGES);
}
void remove_from_page_cache(struct page *page)
@@ -415,7 +415,7 @@ int add_to_page_cache(struct page *page,
page->mapping = mapping;
page->index = offset;
mapping->nrpages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_FILE_PAGES);
}
write_unlock_irq(&mapping->tree_lock);
radix_tree_preload_end();
Index: linux-2.6.17-mm1/mm/mmap.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/mmap.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/mmap.c 2006-06-21 07:36:17.684240720 -0700
@@ -96,7 +96,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;
- free = get_page_cache_size();
+ free = global_page_state(NR_FILE_PAGES);
free += nr_swap_pages;
/*
Index: linux-2.6.17-mm1/mm/nommu.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/nommu.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/nommu.c 2006-06-21 07:36:17.684240720 -0700
@@ -1122,7 +1122,7 @@ int __vm_enough_memory(long pages, int c
if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) {
unsigned long n;
- free = get_page_cache_size();
+ free = global_page_state(NR_FILE_PAGES);
free += nr_swap_pages;
/*
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-21 07:34:08.379762776 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-21 07:36:17.686193724 -0700
@@ -2049,16 +2049,11 @@ static int page_alloc_cpu_notify(struct
unsigned long action, void *hcpu)
{
int cpu = (unsigned long)hcpu;
- long *count;
unsigned long *src, *dest;
if (action == CPU_DEAD) {
int i;
- /* Drain local pagecache count. */
- count = &per_cpu(nr_pagecache_local, cpu);
- atomic_add(*count, &nr_pagecache);
- *count = 0;
local_irq_disable();
__drain_pages(cpu);
Index: linux-2.6.17-mm1/mm/swap_state.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/swap_state.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/swap_state.c 2006-06-21 07:36:17.686193724 -0700
@@ -87,7 +87,7 @@ static int __add_to_swap_cache(struct pa
SetPageSwapCache(page);
set_page_private(page, entry.val);
total_swapcache_pages++;
- pagecache_acct(1);
+ __inc_zone_page_state(page, NR_FILE_PAGES);
}
write_unlock_irq(&swapper_space.tree_lock);
radix_tree_preload_end();
@@ -132,7 +132,7 @@ void __delete_from_swap_cache(struct pag
set_page_private(page, 0);
ClearPageSwapCache(page);
total_swapcache_pages--;
- pagecache_acct(-1);
+ __dec_zone_page_state(page, NR_FILE_PAGES);
INC_CACHE_INFO(del_total);
}
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-21 07:34:08.377809772 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-21 07:36:17.687170225 -0700
@@ -50,7 +50,7 @@ struct zone_padding {
enum zone_stat_item {
NR_FILE_MAPPED, /* mapped into pagetables.
only modified from process context */
-
+ NR_FILE_PAGES,
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/arch/s390/appldata/appldata_mem.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/s390/appldata/appldata_mem.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/s390/appldata/appldata_mem.c 2006-06-21 07:36:17.688146727 -0700
@@ -130,7 +130,8 @@ static void appldata_get_mem_data(void *
mem_data->totalhigh = P2K(val.totalhigh);
mem_data->freehigh = P2K(val.freehigh);
mem_data->bufferram = P2K(val.bufferram);
- mem_data->cached = P2K(atomic_read(&nr_pagecache) - val.bufferram);
+ mem_data->cached = P2K(global_page_state(NR_FILE_PAGES)
+ - val.bufferram);
si_swapinfo(&val);
mem_data->totalswap = P2K(val.totalswap);
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-21 07:34:08.375856768 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-21 07:36:17.689123229 -0700
@@ -68,6 +68,7 @@ static ssize_t node_read_meminfo(struct
"Node %d LowFree: %8lu kB\n"
"Node %d Dirty: %8lu kB\n"
"Node %d Writeback: %8lu kB\n"
+ "Node %d FilePages: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
@@ -81,6 +82,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freeram - i.freehigh),
nid, K(ps.nr_dirty),
nid, K(ps.nr_writeback),
+ nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-21 07:34:08.382692282 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-21 07:36:17.689123229 -0700
@@ -20,12 +20,6 @@
*/
static DEFINE_PER_CPU(struct page_state, page_states) = {0};
-atomic_t nr_pagecache = ATOMIC_INIT(0);
-EXPORT_SYMBOL(nr_pagecache);
-#ifdef CONFIG_SMP
-DEFINE_PER_CPU(long, nr_pagecache_local) = 0;
-#endif
-
static void __get_page_state(struct page_state *ret, int nr, cpumask_t *cpumask)
{
unsigned cpu;
@@ -464,6 +458,7 @@ struct seq_operations fragmentation_op =
static char *vmstat_text[] = {
/* Zoned VM counters */
"nr_mapped",
+ "nr_file_pages",
/* Page state */
"nr_dirty",
* [PATCH 05/14] Remove NR_FILE_MAPPED from scan control structure, zoned VM stats: Remove nr_mapped from scan control
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (3 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 04/14] Conversion of nr_pagecache to per zone counter, zoned vm counters: conversion of nr_pagecache " Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 06/14] Split NR_ANON_PAGES off from NR_FILE_MAPPED, zoned VM stats: Add NR_ANON_PAGES Christoph Lameter, Christoph Lameter
` (9 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
The number of mapped pages can now be obtained inexpensively in
shrink_active_list(), so drop the nr_mapped field from scan_control.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmscan.c 2006-06-21 07:35:39.744211663 -0700
+++ linux-2.6.17-mm1/mm/vmscan.c 2006-06-21 07:37:24.659577925 -0700
@@ -46,8 +46,6 @@ struct scan_control {
/* Incremented by the number of inactive pages that were scanned */
unsigned long nr_scanned;
- unsigned long nr_mapped; /* From page_state */
-
/* This context's GFP mask */
gfp_t gfp_mask;
@@ -727,7 +725,8 @@ static void shrink_active_list(unsigned
* how much memory
* is mapped.
*/
- mapped_ratio = (sc->nr_mapped * 100) / total_memory;
+ mapped_ratio = (global_page_state(NR_FILE_MAPPED) * 100) /
+ total_memory;
/*
* Now decide how much we really want to unmap some pages. The
@@ -972,7 +971,6 @@ unsigned long try_to_free_pages(struct z
}
for (priority = DEF_PRIORITY; priority >= 0; priority--) {
- sc.nr_mapped = global_page_state(NR_FILE_MAPPED);
sc.nr_scanned = 0;
if (!priority)
disable_swap_token();
@@ -1062,8 +1060,6 @@ loop_again:
total_scanned = 0;
nr_reclaimed = 0;
sc.may_writepage = !laptop_mode;
- sc.nr_mapped = global_page_state(NR_FILE_MAPPED);
-
inc_page_state(pageoutrun);
for (i = 0; i < pgdat->nr_zones; i++) {
@@ -1412,7 +1408,6 @@ static int __zone_reclaim(struct zone *z
struct scan_control sc = {
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_swap = !!(zone_reclaim_mode & RECLAIM_SWAP),
- .nr_mapped = global_page_state(NR_FILE_MAPPED),
.swap_cluster_max = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
* [PATCH 06/14] Split NR_ANON_PAGES off from NR_FILE_MAPPED, zoned VM stats: Add NR_ANON_PAGES
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (4 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 05/14] Remove NR_FILE_MAPPED from scan control structure, zoned VM stats: Remove nr_mapped from scan control Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 07/14] zone_reclaim: remove /proc/sys/vm/zone_reclaim_interval, zoned vm counters: use per zone counters to remove zone_reclaim_interval Christoph Lameter, Christoph Lameter
` (8 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
The current NR_FILE_MAPPED is used by zone reclaim and the dirty load
calculation as if it were the number of mapped pagecache pages. However,
that is not the case: NR_FILE_MAPPED also includes mapped anonymous pages.
This patch separates the two and therefore allows accurate tracking of
anonymous pages per zone.
It then becomes possible to determine the number of unmapped pages
per zone and we can avoid scanning for unmapped pages if there
are none.
Also it may now be possible to determine the mapped/unmapped ratio in
get_dirty_limit. Isn't the number of anonymous pages irrelevant in that
calculation?
Note that this changes the meaning of the number of mapped pages reported
in /proc/vmstat, /proc/meminfo and in the per node statistics. This may
affect user space tools that monitor these counters!
NR_FILE_MAPPED works like NR_FILE_DIRTY. It is only valid for pagecache pages.
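A monitoring tool that wants the old combined notion of "mapped" pages now
has to add the two counters, exactly as the writeback hunk below does; a
hypothetical helper makes that explicit:

	static unsigned long total_mapped_pages(void)
	{
		/* file-backed mapped pages plus mapped anonymous pages */
		return global_page_state(NR_FILE_MAPPED) +
		       global_page_state(NR_ANON_PAGES);
	}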
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-21 08:08:30.098752388 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-21 08:11:09.531275713 -0700
@@ -168,6 +168,7 @@ static int meminfo_read_proc(char *page,
"SwapFree: %8lu kB\n"
"Dirty: %8lu kB\n"
"Writeback: %8lu kB\n"
+ "AnonPages: %8lu kB\n"
"Mapped: %8lu kB\n"
"Slab: %8lu kB\n"
"CommitLimit: %8lu kB\n"
@@ -191,6 +192,7 @@ static int meminfo_read_proc(char *page,
K(i.freeswap),
K(ps.nr_dirty),
K(ps.nr_writeback),
+ K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
K(ps.nr_slab),
K(allowed),
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-21 08:08:30.106564405 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-21 08:11:09.532252216 -0700
@@ -48,7 +48,8 @@ struct zone_padding {
#endif
enum zone_stat_item {
- NR_FILE_MAPPED, /* mapped into pagetables.
+ NR_ANON_PAGES, /* Mapped anonymous pages */
+ NR_FILE_MAPPED, /* pagecache pages mapped into pagetables.
only modified from process context */
NR_FILE_PAGES,
NR_VM_ZONE_STAT_ITEMS };
Index: linux-2.6.17-mm1/mm/rmap.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/rmap.c 2006-06-21 08:06:14.421595498 -0700
+++ linux-2.6.17-mm1/mm/rmap.c 2006-06-21 08:11:09.533228718 -0700
@@ -493,7 +493,7 @@ static void __page_set_anon_rmap(struct
* nr_mapped state can be updated without turning off
* interrupts because it is not modified via interrupt.
*/
- __inc_zone_page_state(page, NR_FILE_MAPPED);
+ __inc_zone_page_state(page, NR_ANON_PAGES);
}
/**
@@ -569,7 +569,8 @@ void page_remove_rmap(struct page *page)
*/
if (page_test_and_clear_dirty(page))
set_page_dirty(page);
- __dec_zone_page_state(page, NR_FILE_MAPPED);
+ __dec_zone_page_state(page,
+ PageAnon(page) ? NR_ANON_PAGES : NR_FILE_MAPPED);
}
}
Index: linux-2.6.17-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmscan.c 2006-06-21 08:11:00.302354268 -0700
+++ linux-2.6.17-mm1/mm/vmscan.c 2006-06-21 08:11:09.535181722 -0700
@@ -725,7 +725,8 @@ static void shrink_active_list(unsigned
* how much memory
* is mapped.
*/
- mapped_ratio = (global_page_state(NR_FILE_MAPPED) * 100) /
+ mapped_ratio = ((global_page_state(NR_FILE_MAPPED) +
+ global_page_state(NR_ANON_PAGES)) * 100) /
total_memory;
/*
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-21 08:10:58.765339946 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-21 08:11:09.535181722 -0700
@@ -70,6 +70,7 @@ static ssize_t node_read_meminfo(struct
"Node %d Writeback: %8lu kB\n"
"Node %d FilePages: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
+ "Node %d AnonPages: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -84,6 +85,7 @@ static ssize_t node_read_meminfo(struct
nid, K(ps.nr_writeback),
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
+ nid, K(node_page_state(nid, NR_ANON_PAGES)),
nid, K(ps.nr_slab));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-21 08:06:14.420618996 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-21 08:11:09.536158224 -0700
@@ -111,7 +111,8 @@ static void get_writeback_state(struct w
{
wbs->nr_dirty = read_page_state(nr_dirty);
wbs->nr_unstable = read_page_state(nr_unstable);
- wbs->nr_mapped = global_page_state(NR_FILE_MAPPED);
+ wbs->nr_mapped = global_page_state(NR_FILE_MAPPED) +
+ global_page_state(NR_ANON_PAGES);
wbs->nr_writeback = read_page_state(nr_writeback);
}
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-21 08:08:30.108517409 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-21 08:11:09.537134726 -0700
@@ -457,6 +457,7 @@ struct seq_operations fragmentation_op =
static char *vmstat_text[] = {
/* Zoned VM counters */
+ "nr_anon_pages",
"nr_mapped",
"nr_file_pages",
* [PATCH 07/14] zone_reclaim: remove /proc/sys/vm/zone_reclaim_interval, zoned vm counters: use per zone counters to remove zone_reclaim_interval
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (5 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 06/14] Split NR_ANON_PAGES off from NR_FILE_MAPPED, zoned VM stats: Add NR_ANON_PAGES Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 08/14] Conversion of nr_slab to per zone counter, zoned vm counters: conversion of nr_slab to per zone counter Christoph Lameter, Christoph Lameter
` (7 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
zone_reclaim_interval was necessary because we could not determine how many
unmapped pages existed in a zone, so we had to rescan at fixed intervals to
figure out whether any pages were unmapped.
With the zoned counters and NR_ANON_PAGES we now know the number of pagecache
pages and the number of mapped pages in a zone, so we can simply skip the
reclaim if the number of unmapped pages is insufficient. We use
SWAP_CLUSTER_MAX as the boundary.
Drop all support for /proc/sys/vm/zone_reclaim_interval.
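The heuristic that replaces the interval can be sketched as follows
(hypothetical helper, not from the patch; the real check is in the
zone_reclaim() hunk below):

	static int zone_worth_reclaiming(struct zone *zone)
	{
		/* all unmapped pagecache pages are considered reclaimable */
		long unmapped = zone_page_state(zone, NR_FILE_PAGES) -
				zone_page_state(zone, NR_FILE_MAPPED);

		/* both counters may be slightly off, so require a full batch */
		return unmapped >= SWAP_CLUSTER_MAX;
	}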
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-21 07:37:46.333038070 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-21 07:38:54.090553468 -0700
@@ -179,12 +179,6 @@ struct zone {
/* Zone statistics */
atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
- /*
- * timestamp (in jiffies) of the last zone reclaim that did not
- * result in freeing of pages. This is used to avoid repeated scans
- * if all memory in the zone is in use.
- */
- unsigned long last_unsuccessful_zone_reclaim;
/*
* prev_priority holds the scanning priority for this zone. It is
Index: linux-2.6.17-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/swap.h 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/include/linux/swap.h 2006-06-21 07:38:54.091529970 -0700
@@ -194,7 +194,6 @@ extern pageout_t pageout(struct page *pa
#ifdef CONFIG_NUMA
extern int zone_reclaim_mode;
-extern int zone_reclaim_interval;
extern int zone_reclaim(struct zone *, gfp_t, unsigned int);
#else
#define zone_reclaim_mode 0
Index: linux-2.6.17-mm1/kernel/sysctl.c
===================================================================
--- linux-2.6.17-mm1.orig/kernel/sysctl.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/kernel/sysctl.c 2006-06-21 07:38:54.094459475 -0700
@@ -905,15 +905,6 @@ static ctl_table vm_table[] = {
.strategy = &sysctl_intvec,
.extra1 = &zero,
},
- {
- .ctl_name = VM_ZONE_RECLAIM_INTERVAL,
- .procname = "zone_reclaim_interval",
- .data = &zone_reclaim_interval,
- .maxlen = sizeof(zone_reclaim_interval),
- .mode = 0644,
- .proc_handler = &proc_dointvec_jiffies,
- .strategy = &sysctl_jiffies,
- },
#endif
{ .ctl_name = 0 }
};
Index: linux-2.6.17-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmscan.c 2006-06-21 07:38:28.451518977 -0700
+++ linux-2.6.17-mm1/mm/vmscan.c 2006-06-21 07:38:54.095435977 -0700
@@ -1384,11 +1384,6 @@ int zone_reclaim_mode __read_mostly;
#define RECLAIM_SLAB (1<<3) /* Do a global slab shrink if the zone is out of memory */
/*
- * Mininum time between zone reclaim scans
- */
-int zone_reclaim_interval __read_mostly = 30*HZ;
-
-/*
* Priority for ZONE_RECLAIM. This determines the fraction of pages
* of a node considered for each zone_reclaim. 4 scans 1/16th of
* a zone.
@@ -1452,16 +1447,6 @@ static int __zone_reclaim(struct zone *z
p->reclaim_state = NULL;
current->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE);
-
- if (nr_reclaimed == 0) {
- /*
- * We were unable to reclaim enough pages to stay on node. We
- * now allow off node accesses for a certain time period before
- * trying again to reclaim pages from the local zone.
- */
- zone->last_unsuccessful_zone_reclaim = jiffies;
- }
-
return nr_reclaimed >= nr_pages;
}
@@ -1471,13 +1456,17 @@ int zone_reclaim(struct zone *zone, gfp_
int node_id;
/*
- * Do not reclaim if there was a recent unsuccessful attempt at zone
- * reclaim. In that case we let allocations go off node for the
- * zone_reclaim_interval. Otherwise we would scan for each off-node
- * page allocation.
+ * Do not reclaim if there are not enough reclaimable pages in this
+ * zone that would satify this allocations.
+ *
+ * All unmapped pagecache pages are reclaimable.
+ *
+ * Both counters may be temporarily off a bit so we use
+ * SWAP_CLUSTER_MAX as the boundary. It may also be good to
+ * leave a few frequently used unmapped pagecache pages around.
*/
- if (time_before(jiffies,
- zone->last_unsuccessful_zone_reclaim + zone_reclaim_interval))
+ if (zone_page_state(zone, NR_FILE_PAGES) -
+ zone_page_state(zone, NR_FILE_MAPPED) < SWAP_CLUSTER_MAX)
return 0;
/*
Index: linux-2.6.17-mm1/Documentation/sysctl/vm.txt
===================================================================
--- linux-2.6.17-mm1.orig/Documentation/sysctl/vm.txt 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/Documentation/sysctl/vm.txt 2006-06-21 07:39:34.186698961 -0700
@@ -28,7 +28,6 @@ Currently, these files are in /proc/sys/
- block_dump
- drop-caches
- zone_reclaim_mode
-- zone_reclaim_interval
==============================================================
@@ -166,15 +165,3 @@ use of files and builds up large slab ca
shrink operation is global, may take a long time and free slabs
in all nodes of the system.
-================================================================
-
-zone_reclaim_interval:
-
-The time allowed for off node allocations after zone reclaim
-has failed to reclaim enough pages to allow a local allocation.
-
-Time is set in seconds and set by default to 30 seconds.
-
-Reduce the interval if undesired off node allocations occur. However, too
-frequent scans will have a negative impact onoff node allocation performance.
-
* [PATCH 08/14] Conversion of nr_slab to per zone counter, zoned vm counters: conversion of nr_slab to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (6 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 07/14] zone_reclaim: remove /proc/sys/vm/zone_reclaim_interval, zoned vm counters: use per zone counters to remove zone_reclaim_interval Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 09/14] Conversion of nr_pagetables to per zone counter, zoned vm counters: conversion of nr_pagetable " Christoph Lameter, Christoph Lameter
` (6 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
- Allows reclaim to access the counter without looping over all processors.
- Allows accurate statistics on how many pages are used in a zone by the
slab allocator. This may become useful for balancing slab allocations over
various zones.
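The slab pages are charged against the zone they actually came from, as the
mm/slab.c hunks below do; a simplified sketch with hypothetical helper names:

	static void account_slab_alloc(struct page *page, int order)
	{
		/* charge the zone that provided the pages */
		add_zone_page_state(page_zone(page), NR_SLAB, 1 << order);
	}

	static void account_slab_free(struct page *page, int nr_freed)
	{
		sub_zone_page_state(page_zone(page), NR_SLAB, nr_freed);
	}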
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/i386/mm/pgtable.c 2006-06-21 08:06:14.414759983 -0700
+++ linux-2.6.17-mm1/arch/i386/mm/pgtable.c 2006-06-22 08:23:00.263168234 -0700
@@ -62,7 +62,7 @@ void show_mem(void)
printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_FILE_MAPPED));
- printk(KERN_INFO "%lu pages slab\n", ps.nr_slab);
+ printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
}
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:22:54.960761923 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 08:23:00.264144736 -0700
@@ -53,8 +53,6 @@ static ssize_t node_read_meminfo(struct
ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
- if ((long)ps.nr_slab < 0)
- ps.nr_slab = 0;
n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"
@@ -86,7 +84,7 @@ static ssize_t node_read_meminfo(struct
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(node_page_state(nid, NR_ANON_PAGES)),
- nid, K(ps.nr_slab));
+ nid, K(node_page_state(nid, NR_SLAB)));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
}
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:22:54.955879413 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 08:23:00.265121238 -0700
@@ -194,7 +194,7 @@ static int meminfo_read_proc(char *page,
K(ps.nr_writeback),
K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
- K(ps.nr_slab),
+ K(global_page_state(NR_SLAB)),
K(allowed),
K(committed),
K(ps.nr_page_table_pages),
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:22:58.091427601 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 08:23:00.266097740 -0700
@@ -52,6 +52,7 @@ enum zone_stat_item {
NR_FILE_MAPPED, /* pagecache pages mapped into pagetables.
only modified from process context */
NR_FILE_PAGES,
+ NR_SLAB, /* Pages used by slab allocator */
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-22 08:19:49.773977174 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-22 08:23:00.267074242 -0700
@@ -1313,7 +1313,7 @@ void show_free_areas(void)
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
- ps.nr_slab,
+ global_page_state(NR_SLAB),
global_page_state(NR_FILE_MAPPED),
ps.nr_page_table_pages);
Index: linux-2.6.17-mm1/mm/slab.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/slab.c 2006-06-21 08:04:57.511312216 -0700
+++ linux-2.6.17-mm1/mm/slab.c 2006-06-22 08:23:00.270003748 -0700
@@ -1469,7 +1469,7 @@ static void *kmem_getpages(struct kmem_c
i = (1 << cachep->gfporder);
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
atomic_add(i, &slab_reclaim_pages);
- add_page_state(nr_slab, i);
+ add_zone_page_state(page_zone(page), NR_SLAB, i);
while (i--) {
__SetPageSlab(page);
page++;
@@ -1491,7 +1491,7 @@ static void kmem_freepages(struct kmem_c
__ClearPageSlab(page);
page++;
}
- sub_page_state(nr_slab, nr_freed);
+ sub_zone_page_state(page_zone(page), NR_SLAB, nr_freed);
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += nr_freed;
free_pages((unsigned long)addr, cachep->gfporder);
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:22:54.961738425 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 08:23:00.270003748 -0700
@@ -460,13 +460,13 @@ static char *vmstat_text[] = {
"nr_anon_pages",
"nr_mapped",
"nr_file_pages",
+ "nr_slab",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
"nr_page_table_pages",
- "nr_slab",
"pgpgin",
"pgpgout",
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 08:10:13.219810354 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:23:17.482805961 -0700
@@ -25,8 +25,7 @@ struct page_state {
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
unsigned long nr_page_table_pages;/* Pages used for pagetables */
- unsigned long nr_slab; /* In slab */
-#define GET_PAGE_STATE_LAST nr_slab
+#define GET_PAGE_STATE_LAST nr_page_table_pages
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
* [PATCH 09/14] Conversion of nr_pagetables to per zone counter, zoned vm counters: conversion of nr_pagetable to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (7 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 08/14] Conversion of nr_slab to per zone counter, zoned vm counters: conversion of nr_slab to per zone counter Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:40 ` [PATCH 10/14] Conversion of nr_dirty to per zone counter, zoned vm counters: conversion of nr_dirty " Christoph Lameter, Christoph Lameter
` (5 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Conversion of nr_page_table_pages to a per zone counter
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/i386/mm/pgtable.c 2006-06-22 08:23:00.263168234 -0700
+++ linux-2.6.17-mm1/arch/i386/mm/pgtable.c 2006-06-22 08:37:08.449894359 -0700
@@ -63,7 +63,8 @@ void show_mem(void)
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_FILE_MAPPED));
printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
- printk(KERN_INFO "%lu pages pagetables\n", ps.nr_page_table_pages);
+ printk(KERN_INFO "%lu pages pagetables\n",
+ global_page_state(NR_PAGETABLE));
}
/*
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:23:00.265121238 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 08:37:08.450870861 -0700
@@ -171,9 +171,9 @@ static int meminfo_read_proc(char *page,
"AnonPages: %8lu kB\n"
"Mapped: %8lu kB\n"
"Slab: %8lu kB\n"
+ "PageTables: %8lu kB\n"
"CommitLimit: %8lu kB\n"
"Committed_AS: %8lu kB\n"
- "PageTables: %8lu kB\n"
"VmallocTotal: %8lu kB\n"
"VmallocUsed: %8lu kB\n"
"VmallocChunk: %8lu kB\n",
@@ -195,9 +195,9 @@ static int meminfo_read_proc(char *page,
K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
K(global_page_state(NR_SLAB)),
+ K(global_page_state(NR_PAGETABLE)),
K(allowed),
K(committed),
- K(ps.nr_page_table_pages),
(unsigned long)VMALLOC_TOTAL >> 10,
vmi.used >> 10,
vmi.largest_chunk >> 10
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:23:00.266097740 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 08:37:08.451847363 -0700
@@ -53,6 +53,7 @@ enum zone_stat_item {
only modified from process context */
NR_FILE_PAGES,
NR_SLAB, /* Pages used by slab allocator */
+ NR_PAGETABLE, /* used for pagetables */
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/memory.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/memory.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/memory.c 2006-06-22 08:37:08.452823865 -0700
@@ -126,7 +126,7 @@ static void free_pte_range(struct mmu_ga
pmd_clear(pmd);
pte_lock_deinit(page);
pte_free_tlb(tlb, page);
- dec_page_state(nr_page_table_pages);
+ dec_zone_page_state(page, NR_PAGETABLE);
tlb->mm->nr_ptes--;
}
@@ -311,7 +311,7 @@ int __pte_alloc(struct mm_struct *mm, pm
pte_free(new);
} else {
mm->nr_ptes++;
- inc_page_state(nr_page_table_pages);
+ inc_zone_page_state(new, NR_PAGETABLE);
pmd_populate(mm, pmd, new);
}
spin_unlock(&mm->page_table_lock);
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-22 08:23:00.267074242 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-22 08:37:08.454776869 -0700
@@ -1315,7 +1315,7 @@ void show_free_areas(void)
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_FILE_MAPPED),
- ps.nr_page_table_pages);
+ global_page_state(NR_PAGETABLE));
for_each_zone(zone) {
int i;
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:23:00.264144736 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 08:37:08.455753371 -0700
@@ -69,6 +69,7 @@ static ssize_t node_read_meminfo(struct
"Node %d FilePages: %8lu kB\n"
"Node %d Mapped: %8lu kB\n"
"Node %d AnonPages: %8lu kB\n"
+ "Node %d PageTables: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -84,6 +85,7 @@ static ssize_t node_read_meminfo(struct
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(node_page_state(nid, NR_ANON_PAGES)),
+ nid, K(node_page_state(nid, NR_PAGETABLE)),
nid, K(node_page_state(nid, NR_SLAB)));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-mm1/arch/um/kernel/skas/mmu.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/um/kernel/skas/mmu.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/um/kernel/skas/mmu.c 2006-06-22 08:37:08.456729873 -0700
@@ -152,7 +152,7 @@ void destroy_context_skas(struct mm_stru
free_page(mmu->id.stack);
pte_lock_deinit(virt_to_page(mmu->last_page_table));
pte_free_kernel((pte_t *) mmu->last_page_table);
- dec_page_state(nr_page_table_pages);
+ dec_zone_page_state(virt_to_page(mmu->last_page_table), NR_PAGETABLE);
#ifdef CONFIG_3_LEVEL_PGTABLES
pmd_free((pmd_t *) mmu->last_pmd);
#endif
Index: linux-2.6.17-mm1/arch/arm/mm/mm-armv.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/arm/mm/mm-armv.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/arch/arm/mm/mm-armv.c 2006-06-22 08:37:08.456729873 -0700
@@ -227,7 +227,7 @@ void free_pgd_slow(pgd_t *pgd)
pte = pmd_page(*pmd);
pmd_clear(pmd);
- dec_page_state(nr_page_table_pages);
+ dec_zone_page_state(virt_to_page((unsigned long *)pgd), NR_PAGETABLE);
pte_lock_deinit(pte);
pte_free(pte);
pmd_free(pmd);
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:23:00.270003748 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 08:37:08.457706375 -0700
@@ -461,12 +461,12 @@ static char *vmstat_text[] = {
"nr_mapped",
"nr_file_pages",
"nr_slab",
+ "nr_page_table_pages",
/* Page state */
"nr_dirty",
"nr_writeback",
"nr_unstable",
- "nr_page_table_pages",
"pgpgin",
"pgpgout",
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 08:23:17.482805961 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:37:27.048350906 -0700
@@ -24,8 +24,7 @@ struct page_state {
unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
- unsigned long nr_page_table_pages;/* Pages used for pagetables */
-#define GET_PAGE_STATE_LAST nr_page_table_pages
+#define GET_PAGE_STATE_LAST nr_unstable
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
* [PATCH 10/14] Conversion of nr_dirty to per zone counter, zoned vm counters: conversion of nr_dirty to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (8 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 09/14] Conversion of nr_pagetables to per zone counter, zoned vm counters: conversion of nr_pagetable " Christoph Lameter, Christoph Lameter
@ 2006-06-22 16:40 ` Christoph Lameter, Christoph Lameter
2006-06-22 16:41 ` [PATCH 11/14] Conversion of nr_writeback to per zone counter, zoned vm counters: conversion of nr_writeback " Christoph Lameter, Christoph Lameter
` (4 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:40 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
This makes nr_dirty a per zone counter. Looping over all processors is
avoided during writeback state determination.
The batched counter updates for nr_dirty had to be undone in the NFS layer
since the pages in one batch may come from multiple zones. Someone more
familiar with NFS should probably review what I have done.
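A rough sketch of why the bulk update no longer works (hypothetical helper,
not taken from the patch): the dirty pages gathered in one NFS scan may sit
in different zones, so each page has to be uncharged against its own zone:

	static void uncharge_dirty_pages(struct page **pages, int nr)
	{
		int i;

		/* each page may belong to a different zone */
		for (i = 0; i < nr; i++)
			dec_zone_page_state(pages[i], NR_FILE_DIRTY);
	}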
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/i386/mm/pgtable.c 2006-06-22 08:37:08.449894359 -0700
+++ linux-2.6.17-mm1/arch/i386/mm/pgtable.c 2006-06-22 08:43:21.308619504 -0700
@@ -59,7 +59,7 @@ void show_mem(void)
printk(KERN_INFO "%d pages swap cached\n", cached);
get_page_state(&ps);
- printk(KERN_INFO "%lu pages dirty\n", ps.nr_dirty);
+ printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_FILE_DIRTY));
printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_FILE_MAPPED));
printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:37:08.455753371 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 08:43:21.309596006 -0700
@@ -49,8 +49,6 @@ static ssize_t node_read_meminfo(struct
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
/* Check for negative values in these approximate counters */
- if ((long)ps.nr_dirty < 0)
- ps.nr_dirty = 0;
if ((long)ps.nr_writeback < 0)
ps.nr_writeback = 0;
@@ -80,7 +78,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.freehigh),
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
- nid, K(ps.nr_dirty),
+ nid, K(node_page_state(nid, NR_FILE_DIRTY)),
nid, K(ps.nr_writeback),
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
Index: linux-2.6.17-mm1/fs/buffer.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/buffer.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/fs/buffer.c 2006-06-22 08:43:21.311549010 -0700
@@ -854,7 +854,7 @@ int __set_page_dirty_buffers(struct page
write_lock_irq(&mapping->tree_lock);
if (page->mapping) { /* Race with truncate? */
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_FILE_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
Index: linux-2.6.17-mm1/fs/fs-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/fs-writeback.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/fs/fs-writeback.c 2006-06-22 08:43:21.312525512 -0700
@@ -462,7 +462,7 @@ void sync_inodes_sb(struct super_block *
struct writeback_control wbc = {
.sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
};
- unsigned long nr_dirty = read_page_state(nr_dirty);
+ unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY);
unsigned long nr_unstable = read_page_state(nr_unstable);
wbc.nr_to_write = nr_dirty + nr_unstable +
Index: linux-2.6.17-mm1/fs/nfs/pagelist.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/nfs/pagelist.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/fs/nfs/pagelist.c 2006-06-22 08:43:21.313502014 -0700
@@ -315,6 +315,7 @@ nfs_scan_lock_dirty(struct nfs_inode *nf
req->wb_index, NFS_PAGE_TAG_DIRTY);
nfs_list_remove_request(req);
nfs_list_add_request(req, dst);
+ dec_zone_page_state(req->wb_page, NR_FILE_DIRTY);
res++;
}
}
Index: linux-2.6.17-mm1/fs/nfs/write.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/nfs/write.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/fs/nfs/write.c 2006-06-22 08:43:21.314478516 -0700
@@ -501,7 +501,7 @@ nfs_mark_request_dirty(struct nfs_page *
nfs_list_add_request(req, &nfsi->dirty);
nfsi->ndirty++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_dirty);
+ inc_zone_page_state(req->wb_page, NR_FILE_DIRTY);
mark_inode_dirty(inode);
}
@@ -602,7 +602,6 @@ nfs_scan_dirty(struct inode *inode, stru
if (nfsi->ndirty != 0) {
res = nfs_scan_lock_dirty(nfsi, dst, idx_start, npages);
nfsi->ndirty -= res;
- sub_page_state(nr_dirty,res);
if ((nfsi->ndirty == 0) != list_empty(&nfsi->dirty))
printk(KERN_ERR "NFS: desynchronized value of nfs_i.ndirty.\n");
}
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:37:08.450870861 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 08:43:21.314478516 -0700
@@ -190,7 +190,7 @@ static int meminfo_read_proc(char *page,
K(i.freeram-i.freehigh),
K(i.totalswap),
K(i.freeswap),
- K(ps.nr_dirty),
+ K(global_page_state(NR_FILE_DIRTY)),
K(ps.nr_writeback),
K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:37:08.451847363 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 08:43:21.315455018 -0700
@@ -54,6 +54,7 @@ enum zone_stat_item {
NR_FILE_PAGES,
NR_SLAB, /* Pages used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
+ NR_FILE_DIRTY,
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-22 08:37:08.454776869 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-22 08:43:21.317408022 -0700
@@ -1309,7 +1309,7 @@ void show_free_areas(void)
"unstable:%lu free:%u slab:%lu mapped:%lu pagetables:%lu\n",
active,
inactive,
- ps.nr_dirty,
+ global_page_state(NR_FILE_DIRTY),
ps.nr_writeback,
ps.nr_unstable,
nr_free_pages(),
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-22 08:22:54.960761923 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-22 08:43:21.318384524 -0700
@@ -109,7 +109,7 @@ struct writeback_state
static void get_writeback_state(struct writeback_state *wbs)
{
- wbs->nr_dirty = read_page_state(nr_dirty);
+ wbs->nr_dirty = global_page_state(NR_FILE_DIRTY);
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES);
@@ -638,7 +638,7 @@ int __set_page_dirty_nobuffers(struct pa
if (mapping2) { /* Race with truncate? */
BUG_ON(mapping2 != mapping);
if (mapping_cap_account_dirty(mapping))
- inc_page_state(nr_dirty);
+ __inc_zone_page_state(page, NR_FILE_DIRTY);
radix_tree_tag_set(&mapping->page_tree,
page_index(page), PAGECACHE_TAG_DIRTY);
}
@@ -725,9 +725,9 @@ int test_clear_page_dirty(struct page *p
radix_tree_tag_clear(&mapping->page_tree,
page_index(page),
PAGECACHE_TAG_DIRTY);
- write_unlock_irqrestore(&mapping->tree_lock, flags);
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ __dec_zone_page_state(page, NR_FILE_DIRTY);
+ write_unlock_irqrestore(&mapping->tree_lock, flags);
return 1;
}
write_unlock_irqrestore(&mapping->tree_lock, flags);
@@ -758,7 +758,7 @@ int clear_page_dirty_for_io(struct page
if (mapping) {
if (TestClearPageDirty(page)) {
if (mapping_cap_account_dirty(mapping))
- dec_page_state(nr_dirty);
+ dec_zone_page_state(page, NR_FILE_DIRTY);
return 1;
}
return 0;
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:37:08.457706375 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 08:43:21.318384524 -0700
@@ -462,9 +462,9 @@ static char *vmstat_text[] = {
"nr_file_pages",
"nr_slab",
"nr_page_table_pages",
+ "nr_dirty",
/* Page state */
- "nr_dirty",
"nr_writeback",
"nr_unstable",
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 08:37:27.048350906 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:43:32.863567390 -0700
@@ -21,7 +21,6 @@
* commented here.
*/
struct page_state {
- unsigned long nr_dirty; /* Dirty writeable pages */
unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable
* [PATCH 11/14] Conversion of nr_writeback to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (9 preceding siblings ...)
2006-06-22 16:40 ` [PATCH 10/14] Conversion of nr_dirty to per zone counter Christoph Lameter
@ 2006-06-22 16:41 ` Christoph Lameter
2006-06-22 16:41 ` [PATCH 12/14] Conversion of nr_unstable to per zone counter Christoph Lameter
` (3 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:41 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Conversion of nr_writeback to a per zone counter.
This removes the last page_state counter used in arch/i386/mm/pgtable.c,
so struct page_state can be dropped from show_mem() there.
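For illustration only (not part of the patch, and the helper names are made
up), the call pattern after this conversion looks roughly like this:

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/vmstat.h>

/* Sketch: account a page entering writeback against its own zone. */
static void example_account_writeback(struct page *page)
{
	inc_zone_page_state(page, NR_WRITEBACK);
}

/* Sketch: report the machine wide total without looping over all cpus. */
static void example_report_writeback(void)
{
	printk(KERN_INFO "%lu pages writeback\n",
		global_page_state(NR_WRITEBACK));
}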
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.17-mm1.orig/arch/i386/mm/pgtable.c 2006-06-22 08:43:21.308619504 -0700
+++ linux-2.6.17-mm1/arch/i386/mm/pgtable.c 2006-06-22 08:49:30.151016328 -0700
@@ -30,7 +30,6 @@ void show_mem(void)
struct page *page;
pg_data_t *pgdat;
unsigned long i;
- struct page_state ps;
unsigned long flags;
printk(KERN_INFO "Mem-info:\n");
@@ -58,9 +57,9 @@ void show_mem(void)
printk(KERN_INFO "%d pages shared\n", shared);
printk(KERN_INFO "%d pages swap cached\n", cached);
- get_page_state(&ps);
printk(KERN_INFO "%lu pages dirty\n", global_page_state(NR_FILE_DIRTY));
- printk(KERN_INFO "%lu pages writeback\n", ps.nr_writeback);
+ printk(KERN_INFO "%lu pages writeback\n",
+ global_page_state(NR_WRITEBACK));
printk(KERN_INFO "%lu pages mapped\n", global_page_state(NR_FILE_MAPPED));
printk(KERN_INFO "%lu pages slab\n", global_page_state(NR_SLAB));
printk(KERN_INFO "%lu pages pagetables\n",
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:43:21.309596006 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 08:49:30.151992830 -0700
@@ -48,9 +48,6 @@ static ssize_t node_read_meminfo(struct
get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
- /* Check for negative values in these approximate counters */
- if ((long)ps.nr_writeback < 0)
- ps.nr_writeback = 0;
n = sprintf(buf, "\n"
"Node %d MemTotal: %8lu kB\n"
@@ -79,7 +76,7 @@ static ssize_t node_read_meminfo(struct
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh),
nid, K(node_page_state(nid, NR_FILE_DIRTY)),
- nid, K(ps.nr_writeback),
+ nid, K(node_page_state(nid, NR_WRITEBACK)),
nid, K(node_page_state(nid, NR_FILE_PAGES)),
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(node_page_state(nid, NR_ANON_PAGES)),
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:43:21.314478516 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 08:49:30.152969332 -0700
@@ -191,7 +191,7 @@ static int meminfo_read_proc(char *page,
K(i.totalswap),
K(i.freeswap),
K(global_page_state(NR_FILE_DIRTY)),
- K(ps.nr_writeback),
+ K(global_page_state(NR_WRITEBACK)),
K(global_page_state(NR_ANON_PAGES)),
K(global_page_state(NR_FILE_MAPPED)),
K(global_page_state(NR_SLAB)),
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:43:21.315455018 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 08:49:30.152969332 -0700
@@ -55,6 +55,7 @@ enum zone_stat_item {
NR_SLAB, /* Pages used by slab allocator */
NR_PAGETABLE, /* used for pagetables */
NR_FILE_DIRTY,
+ NR_WRITEBACK,
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/include/linux/page-flags.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/page-flags.h 2006-06-21 07:45:02.657540879 -0700
+++ linux-2.6.17-mm1/include/linux/page-flags.h 2006-06-22 08:49:30.153945834 -0700
@@ -164,7 +164,7 @@
do { \
if (!test_and_set_bit(PG_writeback, \
&(page)->flags)) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestSetPageWriteback(page) \
({ \
@@ -172,14 +172,14 @@
ret = test_and_set_bit(PG_writeback, \
&(page)->flags); \
if (!ret) \
- inc_page_state(nr_writeback); \
+ inc_zone_page_state(page, NR_WRITEBACK); \
ret; \
})
#define ClearPageWriteback(page) \
do { \
if (test_and_clear_bit(PG_writeback, \
&(page)->flags)) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
} while (0)
#define TestClearPageWriteback(page) \
({ \
@@ -187,7 +187,7 @@
ret = test_and_clear_bit(PG_writeback, \
&(page)->flags); \
if (ret) \
- dec_page_state(nr_writeback); \
+ dec_zone_page_state(page, NR_WRITEBACK); \
ret; \
})
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-22 08:43:21.317408022 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-22 08:49:30.155898838 -0700
@@ -1310,7 +1310,7 @@ void show_free_areas(void)
active,
inactive,
global_page_state(NR_FILE_DIRTY),
- ps.nr_writeback,
+ global_page_state(NR_WRITEBACK),
ps.nr_unstable,
nr_free_pages(),
global_page_state(NR_SLAB),
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-22 08:43:21.318384524 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-22 08:49:30.155898838 -0700
@@ -113,7 +113,7 @@ static void get_writeback_state(struct w
wbs->nr_unstable = read_page_state(nr_unstable);
wbs->nr_mapped = global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES);
- wbs->nr_writeback = read_page_state(nr_writeback);
+ wbs->nr_writeback = global_page_state(NR_WRITEBACK);
}
/*
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:43:21.318384524 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 08:49:30.156875341 -0700
@@ -463,9 +463,9 @@ static char *vmstat_text[] = {
"nr_slab",
"nr_page_table_pages",
"nr_dirty",
+ "nr_writeback",
/* Page state */
- "nr_writeback",
"nr_unstable",
"pgpgin",
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 08:43:32.863567390 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:49:41.977433405 -0700
@@ -21,7 +21,6 @@
* commented here.
*/
struct page_state {
- unsigned long nr_writeback; /* Pages under writeback */
unsigned long nr_unstable; /* NFS unstable pages */
#define GET_PAGE_STATE_LAST nr_unstable
* [PATCH 12/14] Conversion of nr_unstable to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (10 preceding siblings ...)
2006-06-22 16:41 ` [PATCH 11/14] Conversion of nr_writeback to per zone counter Christoph Lameter
@ 2006-06-22 16:41 ` Christoph Lameter
2006-06-22 16:41 ` [PATCH 13/14] Conversion of nr_bounce to per zone counter Christoph Lameter
` (2 subsequent siblings)
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:41 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Conversion of nr_unstable to a per zone counter.
The NFS code needs some special modifications: a request can be disposed
of in several ways, and a page reference must be held for the accounting
to stay correct.
This converts the last critical page state counter of the VM, so the
functions that depended on GET_PAGE_STATE_LAST (get_page_state() and
get_page_state_node()) are removed to keep the kernel compiling.
Only event type counters remain in struct page_state.
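A minimal sketch of the accounting rule applied to the NFS code below
(illustrative only; nfs_account_unstable is not a helper added by this
patch): NR_UNSTABLE_NFS is only touched while the request still holds its
page reference.

#include <linux/mm.h>
#include <linux/nfs_page.h>
#include <linux/vmstat.h>

/* Sketch: tie the NR_UNSTABLE_NFS counter to the page ref held by req. */
static void nfs_account_unstable(struct nfs_page *req, int commit)
{
	if (!req->wb_page)	/* no page reference, nothing to account */
		return;
	if (commit)
		inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
	else
		dec_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
}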
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:49:30.151992830 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 08:55:49.724221334 -0700
@@ -39,13 +39,11 @@ static ssize_t node_read_meminfo(struct
int n;
int nid = dev->id;
struct sysinfo i;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
si_meminfo_node(&i, nid);
- get_page_state_node(&ps, nid);
__get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
@@ -65,6 +63,7 @@ static ssize_t node_read_meminfo(struct
"Node %d Mapped: %8lu kB\n"
"Node %d AnonPages: %8lu kB\n"
"Node %d PageTables: %8lu kB\n"
+ "Node %d NFS Unstable: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -81,6 +80,7 @@ static ssize_t node_read_meminfo(struct
nid, K(node_page_state(nid, NR_FILE_MAPPED)),
nid, K(node_page_state(nid, NR_ANON_PAGES)),
nid, K(node_page_state(nid, NR_PAGETABLE)),
+ nid, K(node_page_state(nid, NR_UNSTABLE_NFS)),
nid, K(node_page_state(nid, NR_SLAB)));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-mm1/fs/fs-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/fs-writeback.c 2006-06-22 08:43:21.312525512 -0700
+++ linux-2.6.17-mm1/fs/fs-writeback.c 2006-06-22 08:55:49.725197836 -0700
@@ -463,7 +463,7 @@ void sync_inodes_sb(struct super_block *
.sync_mode = wait ? WB_SYNC_ALL : WB_SYNC_HOLD,
};
unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY);
- unsigned long nr_unstable = read_page_state(nr_unstable);
+ unsigned long nr_unstable = global_page_state(NR_UNSTABLE_NFS);
wbc.nr_to_write = nr_dirty + nr_unstable +
(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
Index: linux-2.6.17-mm1/fs/nfs/write.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/nfs/write.c 2006-06-22 08:43:21.314478516 -0700
+++ linux-2.6.17-mm1/fs/nfs/write.c 2006-06-22 08:55:49.726174339 -0700
@@ -529,7 +529,7 @@ nfs_mark_request_commit(struct nfs_page
nfs_list_add_request(req, &nfsi->commit);
nfsi->ncommit++;
spin_unlock(&nfsi->req_lock);
- inc_page_state(nr_unstable);
+ inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
mark_inode_dirty(inode);
}
#endif
@@ -1386,7 +1386,6 @@ static void nfs_commit_done(struct rpc_t
{
struct nfs_write_data *data = calldata;
struct nfs_page *req;
- int res = 0;
dprintk("NFS: %4d nfs_commit_done (status %d)\n",
task->tk_pid, task->tk_status);
@@ -1423,10 +1422,10 @@ static void nfs_commit_done(struct rpc_t
dprintk(" mismatch\n");
nfs_mark_request_dirty(req);
next:
+ if (req->wb_page)
+ dec_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
nfs_clear_page_writeback(req);
- res++;
}
- sub_page_state(nr_unstable,res);
}
static const struct rpc_call_ops nfs_commit_ops = {
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:49:30.152969332 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 08:55:49.727150841 -0700
@@ -56,6 +56,7 @@ enum zone_stat_item {
NR_PAGETABLE, /* used for pagetables */
NR_FILE_DIRTY,
NR_WRITEBACK,
+ NR_UNSTABLE_NFS, /* NFS unstable pages */
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page_alloc.c 2006-06-22 08:49:30.155898838 -0700
+++ linux-2.6.17-mm1/mm/page_alloc.c 2006-06-22 08:55:49.728127343 -0700
@@ -1266,7 +1266,6 @@ void si_meminfo_node(struct sysinfo *val
*/
void show_free_areas(void)
{
- struct page_state ps;
int cpu, temperature;
unsigned long active;
unsigned long inactive;
@@ -1298,7 +1297,6 @@ void show_free_areas(void)
}
}
- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);
printk("Free pages: %11ukB (%ukB HighMem)\n",
@@ -1311,7 +1309,7 @@ void show_free_areas(void)
inactive,
global_page_state(NR_FILE_DIRTY),
global_page_state(NR_WRITEBACK),
- ps.nr_unstable,
+ global_page_state(NR_UNSTABLE_NFS),
nr_free_pages(),
global_page_state(NR_SLAB),
global_page_state(NR_FILE_MAPPED),
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-22 08:49:30.155898838 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-22 08:55:49.729103845 -0700
@@ -110,7 +110,7 @@ struct writeback_state
static void get_writeback_state(struct writeback_state *wbs)
{
wbs->nr_dirty = global_page_state(NR_FILE_DIRTY);
- wbs->nr_unstable = read_page_state(nr_unstable);
+ wbs->nr_unstable = global_page_state(NR_UNSTABLE_NFS);
wbs->nr_mapped = global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES);
wbs->nr_writeback = global_page_state(NR_WRITEBACK);
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:49:30.152969332 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 08:55:49.730080347 -0700
@@ -120,7 +120,6 @@ static int meminfo_read_proc(char *page,
{
struct sysinfo i;
int len;
- struct page_state ps;
unsigned long inactive;
unsigned long active;
unsigned long free;
@@ -129,7 +128,6 @@ static int meminfo_read_proc(char *page,
struct vmalloc_info vmi;
long cached;
- get_page_state(&ps);
get_zone_counts(&active, &inactive, &free);
/*
@@ -172,6 +170,7 @@ static int meminfo_read_proc(char *page,
"Mapped: %8lu kB\n"
"Slab: %8lu kB\n"
"PageTables: %8lu kB\n"
+ "NFS Unstable: %8lu kB\n"
"CommitLimit: %8lu kB\n"
"Committed_AS: %8lu kB\n"
"VmallocTotal: %8lu kB\n"
@@ -196,6 +195,7 @@ static int meminfo_read_proc(char *page,
K(global_page_state(NR_FILE_MAPPED)),
K(global_page_state(NR_SLAB)),
K(global_page_state(NR_PAGETABLE)),
+ K(global_page_state(NR_UNSTABLE_NFS)),
K(allowed),
K(committed),
(unsigned long)VMALLOC_TOTAL >> 10,
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:49:30.156875341 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 08:56:55.160603832 -0700
@@ -45,28 +45,6 @@ static void __get_page_state(struct page
}
}
-void get_page_state_node(struct page_state *ret, int node)
-{
- int nr;
- cpumask_t mask = node_to_cpumask(node);
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr+1, &mask);
-}
-
-void get_page_state(struct page_state *ret)
-{
- int nr;
- cpumask_t mask = CPU_MASK_ALL;
-
- nr = offsetof(struct page_state, GET_PAGE_STATE_LAST);
- nr /= sizeof(unsigned long);
-
- __get_page_state(ret, nr + 1, &mask);
-}
-
void get_full_page_state(struct page_state *ret)
{
cpumask_t mask = CPU_MASK_ALL;
@@ -464,10 +442,9 @@ static char *vmstat_text[] = {
"nr_page_table_pages",
"nr_dirty",
"nr_writeback",
-
- /* Page state */
"nr_unstable",
+ /* Event counters */
"pgpgin",
"pgpgout",
"pswpin",
Index: linux-2.6.17-mm1/fs/nfs/pagelist.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/nfs/pagelist.c 2006-06-22 08:43:21.313502014 -0700
+++ linux-2.6.17-mm1/fs/nfs/pagelist.c 2006-06-22 08:55:49.731056849 -0700
@@ -154,6 +154,7 @@ void nfs_clear_request(struct nfs_page *
{
struct page *page = req->wb_page;
if (page != NULL) {
+ dec_zone_page_state(page, NR_UNSTABLE_NFS);
page_cache_release(page);
req->wb_page = NULL;
}
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 08:49:41.977433405 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 08:57:08.529894129 -0700
@@ -21,9 +21,6 @@
* commented here.
*/
struct page_state {
- unsigned long nr_unstable; /* NFS unstable pages */
-#define GET_PAGE_STATE_LAST nr_unstable
-
/*
* The below are zeroed by get_page_state(). Use get_full_page_state()
* to add up all these.
@@ -76,8 +73,6 @@ struct page_state {
unsigned long nr_bounce; /* pages for bounce buffers */
};
-extern void get_page_state(struct page_state *ret);
-extern void get_page_state_node(struct page_state *ret, int node);
extern void get_full_page_state(struct page_state *ret);
extern unsigned long read_page_state_offset(unsigned long offset);
extern void mod_page_state_offset(unsigned long offset, unsigned long delta);
* [PATCH 13/14] Conversion of nr_bounce to per zone counter
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (11 preceding siblings ...)
2006-06-22 16:41 ` [PATCH 12/14] Conversion of nr_unstable to per zone counter Christoph Lameter
@ 2006-06-22 16:41 ` Christoph Lameter
2006-06-22 16:41 ` [PATCH 14/14] Remove useless struct wbs Christoph Lameter
2006-06-22 17:19 ` [PATCH 00/14] Zoned VM counters V6 Andrew Morton
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:41 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Conversion of nr_bounce to a per zone counter.
nr_bounce is only used for proc output, so it could have been left as an
event counter. However, event counters need not be accurate, and nr_bounce
categorizes a type of page within a zone, so it really needs to become a
per zone counter as well.
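For illustration only (the helper name is made up), the bounce buffer
accounting after this patch charges the counter to the zone of the bounce
page itself:

#include <linux/mempool.h>
#include <linux/mm.h>
#include <linux/vmstat.h>

/* Sketch: allocate a bounce page and account it in its own zone. */
static struct page *example_alloc_bounce_page(mempool_t *pool, gfp_t gfp)
{
	struct page *page = mempool_alloc(pool, gfp);

	if (page)
		inc_zone_page_state(page, NR_BOUNCE);
	return page;
}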
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/drivers/base/node.c
===================================================================
--- linux-2.6.17-mm1.orig/drivers/base/node.c 2006-06-22 08:55:49.724221334 -0700
+++ linux-2.6.17-mm1/drivers/base/node.c 2006-06-22 09:15:08.668078402 -0700
@@ -64,6 +64,7 @@ static ssize_t node_read_meminfo(struct
"Node %d AnonPages: %8lu kB\n"
"Node %d PageTables: %8lu kB\n"
"Node %d NFS Unstable: %8lu kB\n"
+ "Node %d Bounce: %8lu kB\n"
"Node %d Slab: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
@@ -81,6 +82,7 @@ static ssize_t node_read_meminfo(struct
nid, K(node_page_state(nid, NR_ANON_PAGES)),
nid, K(node_page_state(nid, NR_PAGETABLE)),
nid, K(node_page_state(nid, NR_UNSTABLE_NFS)),
+ nid, K(node_page_state(nid, NR_BOUNCE)),
nid, K(node_page_state(nid, NR_SLAB)));
n += hugetlb_report_node_meminfo(nid, buf + n);
return n;
Index: linux-2.6.17-mm1/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/mmzone.h 2006-06-22 08:55:49.727150841 -0700
+++ linux-2.6.17-mm1/include/linux/mmzone.h 2006-06-22 09:15:08.669054904 -0700
@@ -57,6 +57,7 @@ enum zone_stat_item {
NR_FILE_DIRTY,
NR_WRITEBACK,
NR_UNSTABLE_NFS, /* NFS unstable pages */
+ NR_BOUNCE,
NR_VM_ZONE_STAT_ITEMS };
struct per_cpu_pages {
Index: linux-2.6.17-mm1/mm/highmem.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/highmem.c 2006-06-17 18:49:35.000000000 -0700
+++ linux-2.6.17-mm1/mm/highmem.c 2006-06-22 09:15:08.670031406 -0700
@@ -316,7 +316,7 @@ static void bounce_end_io(struct bio *bi
continue;
mempool_free(bvec->bv_page, pool);
- dec_page_state(nr_bounce);
+ dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
}
bio_endio(bio_orig, bio_orig->bi_size, err);
@@ -397,7 +397,7 @@ static void __blk_queue_bounce(request_q
to->bv_page = mempool_alloc(pool, q->bounce_gfp);
to->bv_len = from->bv_len;
to->bv_offset = from->bv_offset;
- inc_page_state(nr_bounce);
+ inc_zone_page_state(to->bv_page, NR_BOUNCE);
if (rw == WRITE) {
char *vto, *vfrom;
Index: linux-2.6.17-mm1/mm/vmstat.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/vmstat.c 2006-06-22 08:56:55.160603832 -0700
+++ linux-2.6.17-mm1/mm/vmstat.c 2006-06-22 09:15:08.671007908 -0700
@@ -443,6 +443,7 @@ static char *vmstat_text[] = {
"nr_dirty",
"nr_writeback",
"nr_unstable",
+ "nr_bounce",
/* Event counters */
"pgpgin",
@@ -490,7 +491,6 @@ static char *vmstat_text[] = {
"allocstall",
"pgrotated",
- "nr_bounce",
};
/*
Index: linux-2.6.17-mm1/fs/proc/proc_misc.c
===================================================================
--- linux-2.6.17-mm1.orig/fs/proc/proc_misc.c 2006-06-22 08:55:49.730080347 -0700
+++ linux-2.6.17-mm1/fs/proc/proc_misc.c 2006-06-22 09:15:08.671984410 -0700
@@ -171,6 +171,7 @@ static int meminfo_read_proc(char *page,
"Slab: %8lu kB\n"
"PageTables: %8lu kB\n"
"NFS Unstable: %8lu kB\n"
+ "Bounce: %8lu kB\n"
"CommitLimit: %8lu kB\n"
"Committed_AS: %8lu kB\n"
"VmallocTotal: %8lu kB\n"
@@ -196,6 +197,7 @@ static int meminfo_read_proc(char *page,
K(global_page_state(NR_SLAB)),
K(global_page_state(NR_PAGETABLE)),
K(global_page_state(NR_UNSTABLE_NFS)),
+ K(global_page_state(NR_BOUNCE)),
K(allowed),
K(committed),
(unsigned long)VMALLOC_TOTAL >> 10,
Index: linux-2.6.17-mm1/include/linux/vmstat.h
===================================================================
--- linux-2.6.17-mm1.orig/include/linux/vmstat.h 2006-06-22 09:15:20.479846399 -0700
+++ linux-2.6.17-mm1/include/linux/vmstat.h 2006-06-22 09:15:22.334223666 -0700
@@ -70,7 +70,6 @@ struct page_state {
unsigned long allocstall; /* direct reclaim calls */
unsigned long pgrotated; /* pages rotated to tail of the LRU */
- unsigned long nr_bounce; /* pages for bounce buffers */
};
extern void get_full_page_state(struct page_state *ret);
* [PATCH 14/14] Remove useless struct wbs
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (12 preceding siblings ...)
2006-06-22 16:41 ` [PATCH 13/14] Conversion of nr_bounce to per zone counter Christoph Lameter
@ 2006-06-22 16:41 ` Christoph Lameter
2006-06-22 17:19 ` [PATCH 00/14] Zoned VM counters V6 Andrew Morton
14 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 16:41 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, Christoph Lameter
Remove the useless writeback state structure.
struct writeback_state and get_writeback_state() were only needed to
gather page state for writeback control. Now that these statistics are
directly available as zoned counters, both can be removed.
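A minimal sketch of the resulting pattern (illustrative only; the helper
name is made up): the dirty limit checks read the zoned counters directly
instead of first copying them into a struct writeback_state.

#include <linux/mm.h>
#include <linux/vmstat.h>

/* Sketch: check the dirty threshold straight from the zoned counters. */
static int example_over_dirty_threshold(unsigned long dirty_thresh)
{
	unsigned long nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
					global_page_state(NR_UNSTABLE_NFS);

	return nr_reclaimable + global_page_state(NR_WRITEBACK) > dirty_thresh;
}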
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-mm1/mm/page-writeback.c
===================================================================
--- linux-2.6.17-mm1.orig/mm/page-writeback.c 2006-06-21 07:45:22.931676221 -0700
+++ linux-2.6.17-mm1/mm/page-writeback.c 2006-06-21 07:45:30.543509618 -0700
@@ -99,23 +99,6 @@ EXPORT_SYMBOL(laptop_mode);
static void background_writeout(unsigned long _min_pages);
-struct writeback_state
-{
- unsigned long nr_dirty;
- unsigned long nr_unstable;
- unsigned long nr_mapped;
- unsigned long nr_writeback;
-};
-
-static void get_writeback_state(struct writeback_state *wbs)
-{
- wbs->nr_dirty = global_page_state(NR_FILE_DIRTY);
- wbs->nr_unstable = global_page_state(NR_UNSTABLE_NFS);
- wbs->nr_mapped = global_page_state(NR_FILE_MAPPED) +
- global_page_state(NR_ANON_PAGES);
- wbs->nr_writeback = global_page_state(NR_WRITEBACK);
-}
-
/*
* Work out the current dirty-memory clamping and background writeout
* thresholds.
@@ -134,8 +117,8 @@ static void get_writeback_state(struct w
* clamping level.
*/
static void
-get_dirty_limits(struct writeback_state *wbs, long *pbackground, long *pdirty,
- struct address_space *mapping)
+get_dirty_limits(long *pbackground, long *pdirty,
+ struct address_space *mapping)
{
int background_ratio; /* Percentages */
int dirty_ratio;
@@ -145,8 +128,6 @@ get_dirty_limits(struct writeback_state
unsigned long available_memory = total_pages;
struct task_struct *tsk;
- get_writeback_state(wbs);
-
#ifdef CONFIG_HIGHMEM
/*
* If this mapping can only allocate from low memory,
@@ -157,7 +138,9 @@ get_dirty_limits(struct writeback_state
#endif
- unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
+ unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
+ global_page_state(NR_ANON_PAGES)) * 100) /
+ total_pages;
dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
@@ -190,7 +173,6 @@ get_dirty_limits(struct writeback_state
*/
static void balance_dirty_pages(struct address_space *mapping)
{
- struct writeback_state wbs;
long nr_reclaimable;
long background_thresh;
long dirty_thresh;
@@ -207,11 +189,12 @@ static void balance_dirty_pages(struct a
.nr_to_write = write_chunk,
};
- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
- break;
+ get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS);
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK) <=
+ dirty_thresh)
+ break;
if (!dirty_exceeded)
dirty_exceeded = 1;
@@ -224,11 +207,14 @@ static void balance_dirty_pages(struct a
*/
if (nr_reclaimable) {
writeback_inodes(&wbc);
- get_dirty_limits(&wbs, &background_thresh,
- &dirty_thresh, mapping);
- nr_reclaimable = wbs.nr_dirty + wbs.nr_unstable;
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh)
- break;
+ get_dirty_limits(&background_thresh,
+ &dirty_thresh, mapping);
+ nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS);
+ if (nr_reclaimable +
+ global_page_state(NR_WRITEBACK)
+ <= dirty_thresh)
+ break;
pages_written += write_chunk - wbc.nr_to_write;
if (pages_written >= write_chunk)
break; /* We've done our duty */
@@ -236,8 +222,9 @@ static void balance_dirty_pages(struct a
blk_congestion_wait(WRITE, HZ/10);
}
- if (nr_reclaimable + wbs.nr_writeback <= dirty_thresh && dirty_exceeded)
- dirty_exceeded = 0;
+ if (nr_reclaimable + global_page_state(NR_WRITEBACK)
+ <= dirty_thresh && dirty_exceeded)
+ dirty_exceeded = 0;
if (writeback_in_progress(bdi))
return; /* pdflush is already working this queue */
@@ -299,12 +286,11 @@ EXPORT_SYMBOL(balance_dirty_pages_rateli
void throttle_vm_writeout(void)
{
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;
for ( ; ; ) {
- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
/*
* Boost the allowable dirty threshold a bit for page
@@ -312,8 +298,9 @@ void throttle_vm_writeout(void)
*/
dirty_thresh += dirty_thresh / 10; /* wheeee... */
- if (wbs.nr_unstable + wbs.nr_writeback <= dirty_thresh)
- break;
+ if (global_page_state(NR_UNSTABLE_NFS) +
+ global_page_state(NR_WRITEBACK) <= dirty_thresh)
+ break;
blk_congestion_wait(WRITE, HZ/10);
}
}
@@ -335,12 +322,12 @@ static void background_writeout(unsigned
};
for ( ; ; ) {
- struct writeback_state wbs;
long background_thresh;
long dirty_thresh;
- get_dirty_limits(&wbs, &background_thresh, &dirty_thresh, NULL);
- if (wbs.nr_dirty + wbs.nr_unstable < background_thresh
+ get_dirty_limits(&background_thresh, &dirty_thresh, NULL);
+ if (global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS) < background_thresh
&& min_pages <= 0)
break;
wbc.encountered_congestion = 0;
@@ -364,12 +351,9 @@ static void background_writeout(unsigned
*/
int wakeup_pdflush(long nr_pages)
{
- if (nr_pages == 0) {
- struct writeback_state wbs;
-
- get_writeback_state(&wbs);
- nr_pages = wbs.nr_dirty + wbs.nr_unstable;
- }
+ if (nr_pages == 0)
+ nr_pages = global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS);
return pdflush_operation(background_writeout, nr_pages);
}
@@ -400,7 +384,6 @@ static void wb_kupdate(unsigned long arg
unsigned long start_jif;
unsigned long next_jif;
long nr_to_write;
- struct writeback_state wbs;
struct writeback_control wbc = {
.bdi = NULL,
.sync_mode = WB_SYNC_NONE,
@@ -412,11 +395,11 @@ static void wb_kupdate(unsigned long arg
sync_supers();
- get_writeback_state(&wbs);
oldest_jif = jiffies - dirty_expire_interval;
start_jif = jiffies;
next_jif = start_jif + dirty_writeback_interval;
- nr_to_write = wbs.nr_dirty + wbs.nr_unstable +
+ nr_to_write = global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS) +
(inodes_stat.nr_inodes - inodes_stat.nr_unused);
while (nr_to_write > 0) {
wbc.encountered_congestion = 0;
* Re: [PATCH 00/14] Zoned VM counters V6
2006-06-22 16:40 [PATCH 00/14] Zoned VM counters V6 Christoph Lameter
` (13 preceding siblings ...)
2006-06-22 16:41 ` [PATCH 14/14] Remove useless struct wbs Christoph Lameter
@ 2006-06-22 17:19 ` Andrew Morton
2006-06-22 17:59 ` Christoph Lameter
14 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2006-06-22 17:19 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
On Thu, 22 Jun 2006 09:40:04 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:
> V5->V6
> - Restore the removal of individual counters from the page state that
> was deferred into a later patch when going from V2->V3. This also
> caused the removal of get_page_state_node and get_page_state() to
> drop out of the patch that converted nr_unstable.
argh. I'm happy with the patches I have now - they compile at each step
and the machine doesn't hang.
> - Fix mailing list address.
A single patch for this would be good.
* Re: [PATCH 00/14] Zoned VM counters V6
2006-06-22 17:19 ` [PATCH 00/14] Zoned VM counters V6 Andrew Morton
@ 2006-06-22 17:59 ` Christoph Lameter
0 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2006-06-22 17:59 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Thu, 22 Jun 2006, Andrew Morton wrote:
> On Thu, 22 Jun 2006 09:40:04 -0700 (PDT)
> Christoph Lameter <clameter@sgi.com> wrote:
>
> > V5->V6
> > - Restore the removal of individual counters from the page state that
> > was deferred into a later patch when going from V2->V3. This also
> > caused the removal of get_page_state_node and get_page_state() to
> > drop out of the patch that converted nr_unstable.
>
> argh. I'm happy with the patches I have now - they compile at each step
> and the machine doesn't hang.
Ok. Just skip this one.
>
> > - Fix mailing list address.
>
> A single patch for this would be good.
I did that too. Seems that all is fine without this set then.