* [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages
@ 2014-01-09 19:24 Raghavendra K T
2014-01-10 8:36 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Raghavendra K T @ 2014-01-09 19:24 UTC (permalink / raw)
To: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
Damien Ramonda, jack, Linus
Cc: linux-mm, linux-kernel, Raghavendra K T
We limit the number of readahead pages to 4k.
max_sane_readahead returns zero on the cpu having no local memory
node. Fix that by returning a sanitized number of pages viz.,
minimum of (requested pages, 4k, number of local free pages)
Result:
fadvise experiment with FADV_WILLNEED on an x240 machine with a 1GB testfile
32GB* 4G RAM numa machine (12 iterations) yielded
kernel Avg Stddev
base 7.264 0.56%
patched 7.285 1.14%
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---
mm/readahead.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
V4: incorporated the 16MB limit suggested by Linus for readahead and
fixed the transition-to-large-readahead anomaly pointed out by Andrew Morton,
following Honza's suggestion.
Test results show no significant overhead with the current changes.
(Do I have to break patches into two??)
Suggestions/Comments please let me know.
diff --git a/mm/readahead.c b/mm/readahead.c
index 7cdbb44..2f561a0 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -237,14 +237,30 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	return ret;
 }
 
+#define MAX_REMOTE_READAHEAD   4096UL
 /*
  * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
  * sensible upper limit.
  */
 unsigned long max_sane_readahead(unsigned long nr)
 {
-	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
-		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
+	unsigned long local_free_page;
+	unsigned long sane_nr;
+	int nid;
+
+	nid = numa_node_id();
+	sane_nr = min(nr, MAX_REMOTE_READAHEAD);
+
+	local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
+			+ node_page_state(nid, NR_FREE_PAGES);
+
+	/*
+	 * Readahead onto remote memory is better than no readahead when local
+	 * numa node does not have memory. We sanitize readahead size depending
+	 * on free memory in the local node but limiting to 4k pages.
+	 */
+	return node_present_pages(nid) ?
+		min(sane_nr, local_free_page / 2) : sane_nr;
 }
 
 /*
--
1.7.11.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages
2014-01-09 19:24 [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages Raghavendra K T
@ 2014-01-10 8:36 ` Jan Kara
2014-01-10 9:52 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2014-01-10 8:36 UTC (permalink / raw)
To: Raghavendra K T
Cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
Damien Ramonda, jack, Linus, linux-mm, linux-kernel
On Fri 10-01-14 00:54:50, Raghavendra K T wrote:
> We limit the number of readahead pages to 4k.
>
> max_sane_readahead returns zero on the cpu having no local memory
> node. Fix that by returning a sanitized number of pages viz.,
> minimum of (requested pages, 4k, number of local free pages)
>
> Result:
> fadvise experiment with FADV_WILLNEED on an x240 machine with a 1GB testfile
> 32GB* 4G RAM numa machine (12 iterations) yielded
>
> kernel Avg Stddev
> base 7.264 0.56%
> patched 7.285 1.14%
OK, looks good to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
>
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
> mm/readahead.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> V4: incorporated the 16MB limit suggested by Linus for readahead and
> fixed the transition-to-large-readahead anomaly pointed out by Andrew Morton,
> following Honza's suggestion.
>
> Test results show no significant overhead with the current changes.
>
> (Do I have to break patches into two??)
>
> Suggestions/Comments please let me know.
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 7cdbb44..2f561a0 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -237,14 +237,30 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
> return ret;
> }
>
> +#define MAX_REMOTE_READAHEAD 4096UL
> /*
> * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
> * sensible upper limit.
> */
> unsigned long max_sane_readahead(unsigned long nr)
> {
> - return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> - + node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> + unsigned long local_free_page;
> + unsigned long sane_nr;
> + int nid;
> +
> + nid = numa_node_id();
> + sane_nr = min(nr, MAX_REMOTE_READAHEAD);
> +
> + local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
> + + node_page_state(nid, NR_FREE_PAGES);
> +
> + /*
> + * Readahead onto remote memory is better than no readahead when local
> + * numa node does not have memory. We sanitize readahead size depending
> + * on free memory in the local node but limiting to 4k pages.
> + */
> + return node_present_pages(nid) ?
> + min(sane_nr, local_free_page / 2) : sane_nr;
> }
>
> /*
> --
> 1.7.11.7
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages
2014-01-10 8:36 ` Jan Kara
@ 2014-01-10 9:52 ` Jan Kara
2014-01-10 10:27 ` Raghavendra K T
2014-01-16 11:23 ` Raghavendra K T
0 siblings, 2 replies; 5+ messages in thread
From: Jan Kara @ 2014-01-10 9:52 UTC (permalink / raw)
To: Raghavendra K T
Cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
Damien Ramonda, jack, Linus, linux-mm, linux-kernel
On Fri 10-01-14 09:36:56, Jan Kara wrote:
> On Fri 10-01-14 00:54:50, Raghavendra K T wrote:
> > We limit the number of readahead pages to 4k.
> >
> > max_sane_readahead returns zero on the cpu having no local memory
> > node. Fix that by returning a sanitized number of pages viz.,
> > minimum of (requested pages, 4k, number of local free pages)
> >
> > Result:
> > fadvise experiment with FADV_WILLNEED on an x240 machine with a 1GB testfile
> > 32GB* 4G RAM numa machine (12 iterations) yielded
> >
> > kernel Avg Stddev
> > base 7.264 0.56%
> > patched 7.285 1.14%
> OK, looks good to me. You can add:
> Reviewed-by: Jan Kara <jack@suse.cz>
Hum, while doing some other work I've realized there may still be a
problem hiding with the 16 MB limitation. E.g. the dynamic linker is
doing MADV_WILLNEED on the shared libraries. If the library (or executable)
is larger than 16 MB, then it may cause performance problems since access
is random in nature and we don't really know which part of the file we
need first.
I'm not sure what others think about this but I'm now more inclined to be a
bit more careful and introduce the 16 MB limit only for the NUMA case, i.e.
something like:
	unsigned long local_free_page;
	int nid;

	nid = numa_node_id();
	if (node_present_pages(nid)) {
		/*
		 * We sanitize readahead size depending on free memory in
		 * the local node.
		 */
		local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
				  + node_page_state(nid, NR_FREE_PAGES);
		return min(nr, local_free_page / 2);
	}

	/*
	 * Readahead onto remote memory is better than no readahead when local
	 * numa node does not have memory. We limit the readahead to 4k
	 * pages though to avoid trashing page cache.
	 */
	return min(nr, MAX_REMOTE_READAHEAD);
Honza
> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> > ---
> > mm/readahead.c | 20 ++++++++++++++++++--
> > 1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > V4: incorporated the 16MB limit suggested by Linus for readahead and
> > fixed the transition-to-large-readahead anomaly pointed out by Andrew Morton,
> > following Honza's suggestion.
> >
> > Test results show no significant overhead with the current changes.
> >
> > (Do I have to break patches into two??)
> >
> > Suggestions/Comments please let me know.
> >
> > diff --git a/mm/readahead.c b/mm/readahead.c
> > index 7cdbb44..2f561a0 100644
> > --- a/mm/readahead.c
> > +++ b/mm/readahead.c
> > @@ -237,14 +237,30 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
> > return ret;
> > }
> >
> > +#define MAX_REMOTE_READAHEAD 4096UL
> > /*
> > * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
> > * sensible upper limit.
> > */
> > unsigned long max_sane_readahead(unsigned long nr)
> > {
> > - return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> > - + node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> > + unsigned long local_free_page;
> > + unsigned long sane_nr;
> > + int nid;
> > +
> > + nid = numa_node_id();
> > + sane_nr = min(nr, MAX_REMOTE_READAHEAD);
> > +
> > + local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
> > + + node_page_state(nid, NR_FREE_PAGES);
> > +
> > + /*
> > + * Readahead onto remote memory is better than no readahead when local
> > + * numa node does not have memory. We sanitize readahead size depending
> > + * on free memory in the local node but limiting to 4k pages.
> > + */
> > + return node_present_pages(nid) ?
> > + min(sane_nr, local_free_page / 2) : sane_nr;
> > }
> >
> > /*
> > --
> > 1.7.11.7
> >
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages
2014-01-10 9:52 ` Jan Kara
@ 2014-01-10 10:27 ` Raghavendra K T
2014-01-16 11:23 ` Raghavendra K T
1 sibling, 0 replies; 5+ messages in thread
From: Raghavendra K T @ 2014-01-10 10:27 UTC (permalink / raw)
To: Jan Kara, Andrew Morton, Linus
Cc: Fengguang Wu, David Cohen, Al Viro, Damien Ramonda, linux-mm,
linux-kernel
On 01/10/2014 03:22 PM, Jan Kara wrote:
> On Fri 10-01-14 09:36:56, Jan Kara wrote:
>> On Fri 10-01-14 00:54:50, Raghavendra K T wrote:
>>> We limit the number of readahead pages to 4k.
>>>
>>> max_sane_readahead returns zero on the cpu having no local memory
>>> node. Fix that by returning a sanitized number of pages viz.,
>>> minimum of (requested pages, 4k, number of local free pages)
>>>
>>> Result:
>>> fadvise experiment with FADV_WILLNEED on an x240 machine with a 1GB testfile
>>> 32GB* 4G RAM numa machine (12 iterations) yielded
>>>
>>> kernel Avg Stddev
>>> base 7.264 0.56%
>>> patched 7.285 1.14%
>> OK, looks good to me. You can add:
>> Reviewed-by: Jan Kara <jack@suse.cz>
> Hum, while doing some other work I've realized there may still be a
> problem hiding with the 16 MB limitation. E.g. the dynamic linker is
> doing MADV_WILLNEED on the shared libraries. If the library (or executable)
> is larger than 16 MB, then it may cause performance problems since access
> is random in nature and we don't really know which part of the file we
> need first.
>
> I'm not sure what others think about this but I'm now more inclined to be a
> bit more careful and introduce the 16 MB limit only for the NUMA case, i.e.
> something like:
Your suggestion makes sense; I do not have any strong preference.
Maybe we should wait for Linus's/Andrew's comments (if any), since Linus
suggested the 16MB idea.
>
> 	unsigned long local_free_page;
> 	int nid;
>
> 	nid = numa_node_id();
> 	if (node_present_pages(nid)) {
> 		/*
> 		 * We sanitize readahead size depending on free memory in
> 		 * the local node.
> 		 */
> 		local_free_page = node_page_state(nid, NR_INACTIVE_FILE)
> 				  + node_page_state(nid, NR_FREE_PAGES);
> 		return min(nr, local_free_page / 2);
> 	}
>
> 	/*
> 	 * Readahead onto remote memory is better than no readahead when local
> 	 * numa node does not have memory. We limit the readahead to 4k
> 	 * pages though to avoid trashing page cache.
> 	 */
> 	return min(nr, MAX_REMOTE_READAHEAD);
>
* Re: [RFC PATCH V4] mm readahead: Fix readahead fail for no local memory and limit readahead pages
2014-01-10 9:52 ` Jan Kara
2014-01-10 10:27 ` Raghavendra K T
@ 2014-01-16 11:23 ` Raghavendra K T
1 sibling, 0 replies; 5+ messages in thread
From: Raghavendra K T @ 2014-01-16 11:23 UTC (permalink / raw)
To: Andrew Morton, Linus
Cc: Jan Kara, Fengguang Wu, David Cohen, Al Viro, Damien Ramonda,
linux-mm, linux-kernel
On 01/10/2014 03:22 PM, Jan Kara wrote:
> On Fri 10-01-14 09:36:56, Jan Kara wrote:
>> On Fri 10-01-14 00:54:50, Raghavendra K T wrote:
>>> We limit the number of readahead pages to 4k.
>>>
>>> max_sane_readahead returns zero on the cpu having no local memory
>>> node. Fix that by returning a sanitized number of pages viz.,
>>> minimum of (requested pages, 4k, number of local free pages)
>>>
>>> Result:
>>> fadvise experiment with FADV_WILLNEED on an x240 machine with a 1GB testfile
>>> 32GB* 4G RAM numa machine (12 iterations) yielded
>>>
>>> kernel Avg Stddev
>>> base 7.264 0.56%
>>> patched 7.285 1.14%
>> OK, looks good to me. You can add:
>> Reviewed-by: Jan Kara <jack@suse.cz>
> Hum, while doing some other work I've realized there may still be a
> problem hiding with the 16 MB limitation. E.g. the dynamic linker is
> doing MADV_WILLNEED on the shared libraries. If the library (or executable)
> is larger than 16 MB, then it may cause performance problems since access
> is random in nature and we don't really know which part of the file we
> need first.
>
> I'm not sure what others think about this but I'm now more inclined to be a
> bit more careful and introduce the 16 MB limit only for the NUMA case, i.e.
> something like:
>
Hi Linus, Andrew,
Could you please let us know your suggestions or comments?