* [PATCH] mm: readahead: get back a sensible upper limit
@ 2015-02-24 12:58 Rafael Aquini
  2015-02-24 20:50 ` David Rientjes
  2015-02-24 21:56 ` Linus Torvalds
  0 siblings, 2 replies; 7+ messages in thread
From: Rafael Aquini @ 2015-02-24 12:58 UTC
  To: linux-mm
  Cc: akpm, jweiner, riel, rientjes, linux-kernel, loberman, lwoodman,
	raghavendra.kt

commit 6d2be915e589 ("mm/readahead.c: fix readahead failure for memoryless NUMA
nodes and limit readahead pages")[1] imposed a 2 MB hard limit on readahead by
changing max_sane_readahead() to sort out a corner case where a thread running on
a memoryless NUMA node would have its readahead capability disabled.

The aforementioned change, despite fixing that corner case, is detrimental to
other ordinary workloads that memory map big files and rely on the readahead() or
posix_fadvise(WILLNEED) syscalls to get most of the file into the system's page cache.
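
For illustration only, a minimal userspace sketch of the kind of call such
workloads make. The helper name prefetch_file() is hypothetical and not taken
from the bug reports; it simply asks the kernel to prefetch a whole file. How
much of the request is actually honoured depends on max_sane_readahead().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): hint that the whole file at
 * 'path' will be needed soon. Returns 0 on success, -1 on any failure.
 */
static int prefetch_file(const char *path)
{
	struct stat st;
	int fd, rc;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) != 0) {
		close(fd);
		return -1;
	}

	/*
	 * posix_fadvise() returns an error number directly instead of
	 * setting errno. The kernel may read fewer pages than requested.
	 */
	rc = posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);

	close(fd);
	return rc == 0 ? 0 : -1;
}
```

A caller would typically invoke prefetch_file() before mmap()ing the file, so
that later page faults hit the page cache instead of going to disk.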

Laurence Oberman reports, via https://bugzilla.redhat.com/show_bug.cgi?id=1187940,
slowdowns of up to 3-4 times after the changes from the mentioned commit [1] were
introduced in the RHEL kernel. We also have an upstream bugzilla open for a similar
complaint: https://bugzilla.kernel.org/show_bug.cgi?id=79111

This patch brings back the old behavior of max_sane_readahead(), where we used to
consider NR_INACTIVE_FILE and NR_FREE_PAGES pages to derive a sensible, adjustable
readahead upper limit. This patch also keeps the 2 MB ceiling scheme introduced by
commit [1] to avoid regressions on CONFIG_HAVE_MEMORYLESS_NODES systems,
where numa_mem_id(), due to some bug, might end up not returning
the 'local memory' for a memoryless node CPU.
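
As a rough illustration of the arithmetic, here is a userspace model of the two
limits. It is a sketch, not kernel code: the MODEL_* names and helper functions
are hypothetical, a 4096-byte page is assumed, and the page counts are passed
in as plain arguments instead of being read via node_page_state().

```c
#include <assert.h>

/* Assumption for this model: 4 KiB pages, so 2 MB is 512 pages. */
#define MODEL_PAGE_SIZE      4096UL
#define MODEL_MAX_READAHEAD  ((512UL * 4096UL) / MODEL_PAGE_SIZE)

static_assert(MODEL_MAX_READAHEAD == 512, "2 MB expressed in 4 KiB pages");

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Limit as of commit [1]: never more than 2 MB worth of pages. */
static unsigned long capped_limit(unsigned long nr)
{
	return min_ul(nr, MODEL_MAX_READAHEAD);
}

/*
 * Limit as proposed here: half of (inactive file + free) pages on the
 * local memory node, but never a bound below the 2 MB value.
 */
static unsigned long proposed_limit(unsigned long nr,
				    unsigned long inactive_file,
				    unsigned long free_pages)
{
	unsigned long half = (inactive_file + free_pages) / 2;

	return min_ul(nr, max_ul(MODEL_MAX_READAHEAD, half));
}
```

With these assumptions, a 1 GB request (262144 pages) on a node with 100000
inactive file pages and 100000 free pages is limited to 100000 pages by the
proposed formula, versus 512 pages under the fixed cap. If the node reports
zero pages for both counters, the bound falls back to 512 pages.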

Reported-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Rafael Aquini <aquini@redhat.com>
---
 mm/readahead.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 9356758..73f934d 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -203,6 +203,7 @@ out:
 	return ret;
 }
 
+#define MAX_READAHEAD   ((512 * 4096) / PAGE_CACHE_SIZE)
 /*
  * Chunk the readahead into 2 megabyte units, so that we don't pin too much
  * memory at once.
@@ -217,7 +218,7 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	while (nr_to_read) {
 		int err;
 
-		unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_CACHE_SIZE;
+		unsigned long this_chunk = MAX_READAHEAD;
 
 		if (this_chunk > nr_to_read)
 			this_chunk = nr_to_read;
@@ -232,14 +233,15 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	return 0;
 }
 
-#define MAX_READAHEAD   ((512*4096)/PAGE_CACHE_SIZE)
 /*
  * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
  * sensible upper limit.
  */
 unsigned long max_sane_readahead(unsigned long nr)
 {
-	return min(nr, MAX_READAHEAD);
+	return min(nr, max(MAX_READAHEAD,
+			  (node_page_state(numa_mem_id(), NR_INACTIVE_FILE) +
+			   node_page_state(numa_mem_id(), NR_FREE_PAGES)) / 2));
 }
 
 /*
-- 
1.9.3


Thread overview: 7+ messages
2015-02-24 12:58 [PATCH] mm: readahead: get back a sensible upper limit Rafael Aquini
2015-02-24 20:50 ` David Rientjes
2015-02-24 21:13   ` Rafael Aquini
2015-02-24 21:56 ` Linus Torvalds
2015-02-24 22:08   ` Rafael Aquini
2015-02-24 22:12     ` Linus Torvalds
2015-02-24 22:54       ` Laurence Oberman