* [PATCH] mm/page_alloc: deferred meminit: replace rwsem with completion
From: Nicolai Stange @ 2015-07-06 0:17 UTC
To: Andrew Morton
Cc: Mel Gorman, Vlastimil Babka, Johannes Weiner, Michal Hocko,
Joonsoo Kim, David Rientjes, Alexander Duyck, Sasha Levin,
linux-mm, linux-kernel
Commit 0e1cc95b4cc7
("mm: meminit: finish initialisation of struct pages before basic setup")
introduced a rwsem to signal completion of the initialization workers.

Lockdep complains about possible recursive locking:
=============================================
[ INFO: possible recursive locking detected ]
4.1.0-12802-g1dc51b8 #3 Not tainted
---------------------------------------------
swapper/0/1 is trying to acquire lock:
(pgdat_init_rwsem){++++.+},
at: [<ffffffff8424c7fb>] page_alloc_init_late+0xc7/0xe6

but task is already holding lock:
(pgdat_init_rwsem){++++.+},
at: [<ffffffff8424c772>] page_alloc_init_late+0x3e/0xe6

Replace the rwsem by a completion together with an atomic
"outstanding work counter".

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
mm/page_alloc.c | 34 +++++++++++++++++++++++++++-------
1 file changed, 27 insertions(+), 7 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 506eac8..3886e66 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -18,7 +18,9 @@
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/interrupt.h>
-#include <linux/rwsem.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
+#include <asm/barrier.h>
 #include <linux/pagemap.h>
 #include <linux/jiffies.h>
 #include <linux/bootmem.h>
@@ -1062,7 +1064,20 @@ static void __init deferred_free_range(struct page *page,
 		__free_pages_boot_core(page, pfn, 0);
 }

-static __initdata DECLARE_RWSEM(pgdat_init_rwsem);
+/* counter and completion tracking outstanding deferred_init_memmap()
+   threads */
+static atomic_t pgdat_init_n_undone __initdata;
+static __initdata DECLARE_COMPLETION(pgdat_init_all_done_comp);
+
+static inline void __init pgdat_init_report_one_done(void)
+{
+	/* Write barrier is paired with read barrier in
+	   page_alloc_init_late(). It makes all writes visible to
+	   readers seeing our decrement on pgdat_init_n_undone. */
+	smp_wmb();
+	if (atomic_dec_and_test(&pgdat_init_n_undone))
+		complete(&pgdat_init_all_done_comp);
+}

 /* Initialise remaining memory on a node */
 static int __init deferred_init_memmap(void *data)
@@ -1079,7 +1094,7 @@ static int __init deferred_init_memmap(void *data)
 	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);

 	if (first_init_pfn == ULONG_MAX) {
-		up_read(&pgdat_init_rwsem);
+		pgdat_init_report_one_done();
 		return 0;
 	}

@@ -1179,7 +1194,8 @@ free_range:

 	pr_info("node %d initialised, %lu pages in %ums\n", nid, nr_pages,
 					jiffies_to_msecs(jiffies - start));
-	up_read(&pgdat_init_rwsem);
+
+	pgdat_init_report_one_done();
 	return 0;
 }

@@ -1187,14 +1203,18 @@ void __init page_alloc_init_late(void)
 {
 	int nid;

+	/* There will be num_node_state(N_MEMORY) threads */
+	atomic_set(&pgdat_init_n_undone, num_node_state(N_MEMORY));
 	for_each_node_state(nid, N_MEMORY) {
-		down_read(&pgdat_init_rwsem);
 		kthread_run(deferred_init_memmap, NODE_DATA(nid), "pgdatinit%d", nid);
 	}

 	/* Block until all are initialised */
-	down_write(&pgdat_init_rwsem);
-	up_write(&pgdat_init_rwsem);
+	wait_for_completion(&pgdat_init_all_done_comp);
+
+	/* Paired with write barrier in deferred_init_memmap(),
+	   ensures a consistent view of all its writes. */
+	smp_rmb();
 }
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */

--
2.4.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH] mm/page_alloc: deferred meminit: replace rwsem with completion
From: Mel Gorman @ 2015-07-06 8:21 UTC
To: Nicolai Stange
Cc: Andrew Morton, Vlastimil Babka, Johannes Weiner, Michal Hocko,
Joonsoo Kim, David Rientjes, Alexander Duyck, Sasha Levin,
linux-mm, linux-kernel
On Mon, Jul 06, 2015 at 02:17:30AM +0200, Nicolai Stange wrote:
> Commit 0e1cc95b4cc7
> ("mm: meminit: finish initialisation of struct pages before basic setup")
> introduced a rwsem to signal completion of the initialization workers.
>
> Lockdep complains about possible recursive locking:
> =============================================
> [ INFO: possible recursive locking detected ]
> 4.1.0-12802-g1dc51b8 #3 Not tainted
> ---------------------------------------------
> swapper/0/1 is trying to acquire lock:
> (pgdat_init_rwsem){++++.+},
> at: [<ffffffff8424c7fb>] page_alloc_init_late+0xc7/0xe6
>
> but task is already holding lock:
> (pgdat_init_rwsem){++++.+},
> at: [<ffffffff8424c772>] page_alloc_init_late+0x3e/0xe6
>
> Replace the rwsem by a completion together with an atomic
> "outstanding work counter".
>
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
--
Mel Gorman
SUSE Labs
* Re: [PATCH] mm/page_alloc: deferred meminit: replace rwsem with completion
From: Nicolai Stange @ 2015-07-06 15:11 UTC
To: Mel Gorman
Cc: Nicolai Stange, Andrew Morton, Vlastimil Babka, Johannes Weiner,
Michal Hocko, Joonsoo Kim, David Rientjes, Alexander Duyck,
Sasha Levin, linux-mm, linux-kernel
Mel Gorman <mgorman@suse.de> writes:
> Acked-by: Mel Gorman <mgorman@suse.de>
Thank you!