From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-numa@vger.kernel.org,
Mel Gorman <mel@csn.ul.ie>,
Randy Dunlap <randy.dunlap@oracle.com>,
Nishanth Aravamudan <nacc@us.ibm.com>,
Adam Litke <agl@us.ibm.com>, Andy Whitcroft <apw@canonical.com>,
Christoph Lameter <cl@linux-foundation.org>,
eric.whitney@hp.com, Yasunori Goto <y-goto@jp.fujitsu.com>
Subject: Re: [patch] mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined
Date: Wed, 07 Oct 2009 12:48:07 -0400
Message-ID: <1254934087.4483.227.camel@useless.americas.hpqcorp.net>
In-Reply-To: <alpine.DEB.1.00.0910070043140.16136@chino.kir.corp.google.com>
On Wed, 2009-10-07 at 01:24 -0700, David Rientjes wrote:
> On Mon, 5 Oct 2009, Lee Schermerhorn wrote:
>
> > [PATCH 11/11] hugetlb: offload [un]registration of sysfs attr to worker thread
> >
> > Against: 2.6.31-mmotm-090925-1435
> >
> > New in V6
> >
> > V7: + remove redundant check for memory{ful|less} node from
> > node_hugetlb_work(). Rely on [added] return from
> > hugetlb_register_node() to differentiate between transitions
> > to/from memoryless state.
> >
>
> That doesn't work because the memory hotplug code doesn't clear the
> N_HIGH_MEMORY bit for status_change_nid on MEM_OFFLINE, so
> hugetlb_register_node() will always return true under such conditions.
>
> The following should fix it. Christoph?
>
>
Almost missed this one because of the subject.

What shall we do with this for the huge pages controls series?

Options:

1) leave series as is, and note that it depends on this patch?

2) Include this patch [or the subset that clears the N_HIGH_MEMORY node
   state--maybe leave the kswapd handling separate?] in the series?

Lee
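
For readers following the thread, here is a minimal userspace sketch of the
state transition David describes. It assumes a single bitmask in place of
the kernel's N_HIGH_MEMORY node state and a plain array in place of
node_present_pages(). All names (model_online_pages, model_offline_pages,
node_has_memory) are hypothetical and exist only for this illustration; it
is not the kernel code.

```c
#include <stdbool.h>

#define MAX_NODES 4

/* Hypothetical stand-in for the N_HIGH_MEMORY node state: bit n set
 * means node n is considered to have memory. */
unsigned long n_high_memory;
/* Hypothetical stand-in for node_present_pages(). */
unsigned long present_pages[MAX_NODES];

bool node_has_memory(int nid)
{
	return (n_high_memory >> nid) & 1UL;
}

/* Bring pages online; the node's bit is set once it has any pages. */
void model_online_pages(int nid, unsigned long nr)
{
	present_pages[nid] += nr;
	if (present_pages[nid])
		n_high_memory |= 1UL << nid;
}

/*
 * Take pages offline.  With clear_state == false the bit is left set even
 * when no pages remain, which is the situation described in the quoted
 * text above.  With clear_state == true the bit is cleared once the last
 * page is gone, which is the behaviour the patch adds.
 */
void model_offline_pages(int nid, unsigned long nr, bool clear_state)
{
	if (nr > present_pages[nid])
		nr = present_pages[nid];	/* never go below zero */
	present_pages[nid] -= nr;
	if (clear_state && !present_pages[nid])
		n_high_memory &= ~(1UL << nid);
}
```

With clear_state false, anything that keys off the bit still sees the node
as having memory after its last page goes away, so it cannot tell a
transition to the memoryless state from any other offline event.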
>
> mm: clear node in N_HIGH_MEMORY and stop kswapd when all memory is offlined
>
> When memory is hot-removed, its node must be cleared in N_HIGH_MEMORY if
> there are no present pages left.
>
> In such a situation, kswapd must also be stopped since it has nothing
> left to do.
>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
> Cc: Mel Gorman <mel@csn.ul.ie>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  include/linux/swap.h |    1 +
>  mm/memory_hotplug.c  |    4 ++++
>  mm/vmscan.c          |   28 ++++++++++++++++++++++------
>  3 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -273,6 +273,7 @@ extern int scan_unevictable_register_node(struct node *node);
>  extern void scan_unevictable_unregister_node(struct node *node);
>
>  extern int kswapd_run(int nid);
> +extern void kswapd_stop(int nid);
>
>  #ifdef CONFIG_MMU
>  /* linux/mm/shmem.c */
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -838,6 +838,10 @@ repeat:
>
>  	setup_per_zone_wmarks();
>  	calculate_zone_inactive_ratio(zone);
> +	if (!node_present_pages(node)) {
> +		node_clear_state(node, N_HIGH_MEMORY);
> +		kswapd_stop(node);
> +	}
>
>  	vm_total_pages = nr_free_pagecache_pages();
>  	writeback_set_ratelimit();
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2163,6 +2163,7 @@ static int kswapd(void *p)
>  	order = 0;
>  	for ( ; ; ) {
>  		unsigned long new_order;
> +		int ret;
>
>  		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
>  		new_order = pgdat->kswapd_max_order;
> @@ -2174,19 +2175,23 @@ static int kswapd(void *p)
>  			 */
>  			order = new_order;
>  		} else {
> -			if (!freezing(current))
> +			if (!freezing(current) && !kthread_should_stop())
>  				schedule();
>
>  			order = pgdat->kswapd_max_order;
>  		}
>  		finish_wait(&pgdat->kswapd_wait, &wait);
>
> -		if (!try_to_freeze()) {
> -			/* We can speed up thawing tasks if we don't call
> -			 * balance_pgdat after returning from the refrigerator
> -			 */
> +		ret = try_to_freeze();
> +		if (kthread_should_stop())
> +			break;
> +
> +		/*
> +		 * We can speed up thawing tasks if we don't call balance_pgdat
> +		 * after returning from the refrigerator
> +		 */
> +		if (!ret)
>  			balance_pgdat(pgdat, order);
> -		}
>  	}
>  	return 0;
>  }
> @@ -2441,6 +2446,17 @@ int kswapd_run(int nid)
>  	return ret;
>  }
>
> +/*
> + * Called by memory hotplug when all memory in a node is offlined.
> + */
> +void kswapd_stop(int nid)
> +{
> +	struct task_struct *kswapd = NODE_DATA(nid)->kswapd;
> +
> +	if (kswapd)
> +		kthread_stop(kswapd);
> +}
> +
>  static int __init kswapd_init(void)
>  {
>  	int nid;
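
The control flow that the mm/vmscan.c hunks give the kswapd loop can be
sketched as two pure functions. This is only a userspace model of the
decisions, under the assumption that the booleans stand in for the results
of freezing(current), kthread_should_stop() and try_to_freeze(). The enum
and function names are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical names: what one pass of the modelled loop ends up doing. */
enum kswapd_action {
	KSWAPD_EXIT,		/* break out of the for (;;) loop */
	KSWAPD_SKIP_BALANCE,	/* just thawed: loop again without reclaim */
	KSWAPD_BALANCE,		/* call balance_pgdat() */
};

/*
 * Models "if (!freezing(current) && !kthread_should_stop()) schedule();".
 * Returns true when the thread would go to sleep.
 */
bool model_may_sleep(bool freezing, bool should_stop)
{
	return !freezing && !should_stop;
}

/*
 * Models the tail of the loop after finish_wait().  was_frozen stands in
 * for the return value of try_to_freeze().  The stop request is checked
 * first, so a stopping thread never starts another balancing pass.
 */
enum kswapd_action model_after_wait(bool was_frozen, bool should_stop)
{
	if (should_stop)
		return KSWAPD_EXIT;
	if (!was_frozen)
		return KSWAPD_BALANCE;
	return KSWAPD_SKIP_BALANCE;
}
```

In this model a pending stop request both keeps the thread from sleeping
and makes it leave the loop on the same pass.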
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .