From: Mel Gorman <mgorman@suse.de>
To: Linux-MM <linux-mm@kvack.org>
Cc: Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: [PATCH 01/16] mm: Disable zone_reclaim_mode by default
Date: Fri, 18 Apr 2014 15:50:28 +0100
Message-ID: <1397832643-14275-2-git-send-email-mgorman@suse.de>
In-Reply-To: <1397832643-14275-1-git-send-email-mgorman@suse.de>
zone_reclaim_mode causes processes to prefer reclaiming memory from the local
node instead of spilling over to other nodes. This made sense initially when
NUMA machines were almost exclusively HPC and the workload was partitioned
into nodes. The NUMA penalties were sufficiently high to justify reclaiming
the memory. On current machines and workloads it is often the case that
zone_reclaim_mode destroys performance but not all users know how to detect
this. Favour the common case and disable it by default. Users that are
sophisticated enough to know they need zone_reclaim_mode can enable it.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
---
Documentation/sysctl/vm.txt | 17 +++++++++--------
arch/ia64/include/asm/topology.h | 3 ++-
arch/powerpc/include/asm/topology.h | 8 ++------
include/linux/topology.h | 3 ++-
mm/page_alloc.c | 2 --
5 files changed, 15 insertions(+), 18 deletions(-)
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index dd9d0e3..5b6da0f 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -772,16 +772,17 @@ This is value ORed together of
2 = Zone reclaim writes dirty pages out
4 = Zone reclaim swaps pages
-zone_reclaim_mode is set during bootup to 1 if it is determined that pages
-from remote zones will cause a measurable performance reduction. The
-page allocator will then reclaim easily reusable pages (those page
-cache pages that are currently not used) before allocating off node pages.
-
-It may be beneficial to switch off zone reclaim if the system is
-used for a file server and all of memory should be used for caching files
-from disk. In that case the caching effect is more important than
+zone_reclaim_mode is disabled by default. For file servers or workloads
+that benefit from having their data cached, zone_reclaim_mode should be
+left disabled as the caching effect is likely to be more important than
data locality.
+zone_reclaim may be enabled if it's known that the workload is partitioned
+such that each partition fits within a NUMA node and that accessing remote
+memory would cause a measurable performance reduction. The page allocator
+will then reclaim easily reusable pages (those page cache pages that are
+currently not used) before allocating off node pages.
+
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
reclaim will write out dirty pages if a zone fills up and so effectively
diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h
index 5cb55a1..3555fdd 100644
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -21,7 +21,8 @@
#define PENALTY_FOR_NODE_WITH_CPUS 255
/*
- * Distance above which we begin to use zone reclaim
+ * Nodes within this distance are eligible for reclaim by zone_reclaim() when
+ * zone_reclaim_mode is enabled.
*/
#define RECLAIM_DISTANCE 15
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index c920215..6c8a8c5 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -9,12 +9,8 @@ struct device_node;
#ifdef CONFIG_NUMA
/*
- * Before going off node we want the VM to try and reclaim from the local
- * node. It does this if the remote distance is larger than RECLAIM_DISTANCE.
- * With the default REMOTE_DISTANCE of 20 and the default RECLAIM_DISTANCE of
- * 20, we never reclaim and go off node straight away.
- *
- * To fix this we choose a smaller value of RECLAIM_DISTANCE.
+ * If zone_reclaim_mode is enabled, a RECLAIM_DISTANCE of 10 will mean that
+ * all zones on all nodes will be eligible for zone_reclaim().
*/
#define RECLAIM_DISTANCE 10
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 7062330..53261e2 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -58,7 +58,8 @@ int arch_update_cpu_topology(void);
/*
* If the distance between nodes in a system is larger than RECLAIM_DISTANCE
* (in whatever arch specific measurement units returned by node_distance())
- * then switch on zone reclaim on boot.
+ * and zone_reclaim_mode is enabled then the VM will only call zone_reclaim()
+ * on nodes within this distance.
*/
#define RECLAIM_DISTANCE 30
#endif
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5dba293..628f1e7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1860,8 +1860,6 @@ static void __paginginit init_zone_allows_reclaim(int nid)
for_each_node_state(i, N_MEMORY)
if (node_distance(nid, i) <= RECLAIM_DISTANCE)
node_set(i, NODE_DATA(nid)->reclaim_nodes);
- else
- zone_reclaim_mode = 1;
}
#else /* CONFIG_NUMA */
--
1.8.4.5