From: Mel Gorman <mel@csn.ul.ie>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Con Kolivas <kernel@kolivas.org>,
linux-mm@kvack.org
Subject: Re: [patch] mm: adjust kswapd nice level for high priority page allocators
Date: Mon, 1 Mar 2010 13:52:42 +0000
Message-ID: <20100301135242.GE3852@csn.ul.ie>
In-Reply-To: <alpine.DEB.2.00.1003010213480.26824@chino.kir.corp.google.com>
On Mon, Mar 01, 2010 at 02:14:39AM -0800, David Rientjes wrote:
> From: Con Kolivas <kernel@kolivas.org>
>
> When kswapd is awoken due to reclaim by a running task, set the priority
> of kswapd to that of the task allocating pages thus making memory reclaim
> cpu activity affected by nice level.
>
Why?
When a process kicks kswapd, the watermark at which a process enters
direct reclaim has not been reached yet. In other words, there is no
guarantee that a process will stall due to memory pressure.
The exception would be if there are many high-priority processes allocating
pages at a steady rate that are starving kswapd of CPU time and
consequently entering direct reclaim. In this case, the high-priority
processes effectively should stall until they have reclaimed the pages.
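
The distinction between waking kswapd and entering direct reclaim can be
sketched as below. This is a simplified illustration only: struct zone_model,
enum alloc_action and classify_alloc() are made-up names, and the real
allocator also accounts for allocation order, lowmem reserves and GFP flags.
It assumes the usual arrangement where kswapd is woken below the low
watermark and direct reclaim starts below the min watermark.

```c
/* Hypothetical, simplified model of one zone; not the kernel's struct zone. */
struct zone_model {
	unsigned long free_pages;
	unsigned long wmark_min;	/* below this, the caller enters direct reclaim */
	unsigned long wmark_low;	/* below this, the caller wakes kswapd */
};

enum alloc_action {
	ALLOC_FAST,		/* enough free pages, nothing to do */
	ALLOC_WAKE_KSWAPD,	/* background reclaim kicked, caller need not stall */
	ALLOC_DIRECT_RECLAIM	/* caller reclaims pages itself and may stall */
};

/* Decide what an allocating task does, given the zone's free page count. */
static enum alloc_action classify_alloc(const struct zone_model *z)
{
	if (z->free_pages < z->wmark_min)
		return ALLOC_DIRECT_RECLAIM;
	if (z->free_pages < z->wmark_low)
		return ALLOC_WAKE_KSWAPD;
	return ALLOC_FAST;
}
```

In this model a task that only reaches ALLOC_WAKE_KSWAPD has not stalled at
all, which is why boosting kswapd at that point needs a justification.
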
As Con is involved, I'm guessing there are high-priority interactive
processes that jitter in low-memory situations but as I've never
observed such a scenario I'm not sure.
My main concern is the case where there is a mix of high- and low-priority
processes and kswapd ends up towards the higher priority as a result of this
patch. kswapd could then be taking CPU time from well-behaved low-priority
processes, which would make less forward progress as a result.
I'm not against it as such, but I'd like to know more about the problem
this solves and what the before and after behaviour looks like.
> [rientjes@google.com: refactor for current]
> Cc: Mel Gorman <mel@csn.ul.ie>
> Signed-off-by: Con Kolivas <kernel@kolivas.org>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> mm/vmscan.c | 33 ++++++++++++++++++++++++++++++++-
> 1 files changed, 32 insertions(+), 1 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1658,6 +1658,33 @@ static void shrink_zone(int priority, struct zone *zone,
> }
>
> /*
> + * Helper functions to adjust nice level of kswapd, based on the priority of
> + * the task allocating pages. If it is already higher priority we do not
> + * demote its nice level since it is still working on behalf of a higher
> + * priority task. With kernel threads we leave it at nice 0.
> + *
> + * We don't ever run kswapd real time, so if a real time task calls kswapd we
> + * set it to highest SCHED_NORMAL priority.
> + */
> +static int effective_sc_prio(struct task_struct *p)
> +{
> + if (likely(p->mm)) {
> + if (rt_task(p))
> + return -20;
> + return task_nice(p);
> + }
> + return 0;
> +}
> +
> +static void set_kswapd_nice(struct task_struct *kswapd, int active)
> +{
> + long nice = effective_sc_prio(current);
> +
> + if (task_nice(kswapd) > nice || !active)
> + set_user_nice(kswapd, nice);
> +}
> +
> +/*
> * This is the direct reclaim path, for page-allocating processes. We only
> * try to reclaim pages from zones which will satisfy the caller's allocation
> * request.
> @@ -2257,6 +2284,7 @@ static int kswapd(void *p)
> }
> }
>
> + set_user_nice(tsk, 0);
> order = pgdat->kswapd_max_order;
> }
> finish_wait(&pgdat->kswapd_wait, &wait);
> @@ -2281,6 +2309,7 @@ static int kswapd(void *p)
> void wakeup_kswapd(struct zone *zone, int order)
> {
> pg_data_t *pgdat;
> + int active;
>
> if (!populated_zone(zone))
> return;
> @@ -2292,7 +2321,9 @@ void wakeup_kswapd(struct zone *zone, int order)
> pgdat->kswapd_max_order = order;
> if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
> return;
> - if (!waitqueue_active(&pgdat->kswapd_wait))
> + active = waitqueue_active(&pgdat->kswapd_wait);
> + set_kswapd_nice(pgdat->kswapd, active);
> + if (!active)
> return;
> wake_up_interruptible(&pgdat->kswapd_wait);
> }
>
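
To make the mixed-priority concern above concrete, here is a minimal
userspace model of the two quoted helpers. It is a sketch, not kernel code:
struct task_model, model_effective_prio() and model_kswapd_nice() are
hypothetical names standing in for task_struct, effective_sc_prio() and
set_kswapd_nice(), and the `active` argument is simply whatever
wakeup_kswapd() obtained from waitqueue_active() in the patch.

```c
#include <stdbool.h>

#define NICE_MIN (-20)	/* highest SCHED_NORMAL priority */

/* Hypothetical stand-in for the few task_struct fields the helper reads. */
struct task_model {
	bool has_mm;	/* false for kernel threads */
	bool is_rt;	/* true for real-time tasks */
	int nice;
};

/* Mirrors effective_sc_prio() from the quoted patch. */
static int model_effective_prio(const struct task_model *p)
{
	if (p->has_mm)
		return p->is_rt ? NICE_MIN : p->nice;
	return 0;	/* kernel threads are treated as nice 0 */
}

/*
 * Mirrors the condition in set_kswapd_nice(): returns the nice level
 * kswapd would have after being kicked by `waker`.
 */
static int model_kswapd_nice(int kswapd_nice, const struct task_model *waker,
			     bool active)
{
	int nice = model_effective_prio(waker);

	if (kswapd_nice > nice || !active)
		return nice;
	return kswapd_nice;
}
```

With `active` true, one nice -10 waker pulls the modelled kswapd from 0 to
-10, and a later nice 19 waker leaves it at -10. In the patch it only drops
back when the kswapd loop calls set_user_nice(tsk, 0), and that window is
where the low-priority tasks could lose CPU time.
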
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab