linux-mm.kvack.org archive mirror
* kcompactd hang during memory offlining
@ 2016-05-03 17:02 Reza Arbab
  2016-05-03 22:16 ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Reza Arbab @ 2016-05-03 17:02 UTC
  To: Vlastimil Babka, Arnd Bergmann, Paul Gortmaker, Andrew Morton
  Cc: Andrea Arcangeli, Kirill A. Shutemov, Rik van Riel, Joonsoo Kim,
	Mel Gorman, David Rientjes, Michal Hocko, Johannes Weiner,
	Hugh Dickins, linux-mm, linux-kernel

Assume memory47 is the last online block left in node1. This will hang:

# echo offline > /sys/devices/system/node/node1/memory47/state

After a couple of minutes, the following pops up in dmesg:

INFO: task bash:957 blocked for more than 120 seconds.
       Not tainted 4.6.0-rc6+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bash            D ffff8800b7adbaf8     0   957    951 0x00000000
  ffff8800b7adbaf8 ffff880034d5b880 ffff8800b698d4c0 ffff8800b7adc000
  7fffffffffffffff ffff88003381ff10 ffff8800b698d4c0 0000000000180000
  ffff8800b7adbb10 ffffffff817be0b5 ffff88003381ff08 ffff8800b7adbbc0
Call Trace:
  [<ffffffff817be0b5>] schedule+0x35/0x80
  [<ffffffff817c100c>] schedule_timeout+0x1ac/0x270
  [<ffffffff810d9750>] ? check_preempt_wakeup+0x100/0x220
  [<ffffffff810ce0a0>] ? check_preempt_curr+0x80/0x90
  [<ffffffff817bf501>] wait_for_completion+0xe1/0x120
  [<ffffffff810cefc0>] ? wake_up_q+0x70/0x70
  [<ffffffff810c42ff>] kthread_stop+0x4f/0x110
  [<ffffffff811e1046>] kcompactd_stop+0x26/0x40
  [<ffffffff817b7a16>] __offline_pages.constprop.28+0x7e6/0x840
  [<ffffffff8121ee61>] offline_pages+0x11/0x20
  [<ffffffff8151a073>] memory_block_action+0x73/0x1d0
  [<ffffffff8151a217>] memory_subsys_offline+0x47/0x60
  [<ffffffff81502dc6>] device_offline+0x86/0xb0
  [<ffffffff8151a8fa>] store_mem_state+0xda/0xf0
  [<ffffffff814ffea8>] dev_attr_store+0x18/0x30
  [<ffffffff812c1097>] sysfs_kf_write+0x37/0x40
  [<ffffffff812c062d>] kernfs_fop_write+0x11d/0x170
  [<ffffffff8123e797>] __vfs_write+0x37/0x120
  [<ffffffff8134d1ad>] ? security_file_permission+0x3d/0xc0
  [<ffffffff810eed32>] ? percpu_down_read+0x12/0x50
  [<ffffffff8123f969>] vfs_write+0xa9/0x1a0
  [<ffffffff8134d543>] ? security_file_fcntl+0x43/0x60
  [<ffffffff81240dc5>] SyS_write+0x55/0xc0
  [<ffffffff817c21b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4

Bisect ends on commit 698b1b306 ("mm, compaction: introduce kcompactd").

-- 
Reza Arbab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: kcompactd hang during memory offlining
  2016-05-03 17:02 kcompactd hang during memory offlining Reza Arbab
@ 2016-05-03 22:16 ` Vlastimil Babka
  2016-05-04 18:09   ` Reza Arbab
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2016-05-03 22:16 UTC
  To: Reza Arbab, Arnd Bergmann, Paul Gortmaker, Andrew Morton
  Cc: Andrea Arcangeli, Kirill A. Shutemov, Rik van Riel, Joonsoo Kim,
	Mel Gorman, David Rientjes, Michal Hocko, Johannes Weiner,
	Hugh Dickins, linux-mm, linux-kernel

On 05/03/2016 07:02 PM, Reza Arbab wrote:
> Assume memory47 is the last online block left in node1. This will hang:
> 
> # echo offline > /sys/devices/system/node/node1/memory47/state
> 
> After a couple of minutes, the following pops up in dmesg:
> 
> INFO: task bash:957 blocked for more than 120 seconds.

Damn, can you test this patch? I hope it's just a simple mistake: when
kcompactd is woken up to exit, it keeps waiting for kcompactd_max_order > 0
instead of actually exiting.
No idea what happens if memory actually gets offlined during compaction's
pfn scan, but that wouldn't be new or specific to kcompactd.

----8<----
diff --git a/mm/compaction.c b/mm/compaction.c
index 481004c73c90..0e28981d4510 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1852,7 +1852,7 @@ void compaction_unregister_node(struct node *node)
 
 static inline bool kcompactd_work_requested(pg_data_t *pgdat)
 {
-       return pgdat->kcompactd_max_order > 0;
+       return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
 }
 
 static bool kcompactd_node_suitable(pg_data_t *pgdat)
@@ -1916,6 +1916,8 @@ static void kcompactd_do_work(pg_data_t *pgdat)
                INIT_LIST_HEAD(&cc.freepages);
                INIT_LIST_HEAD(&cc.migratepages);
 
+               if (kthread_should_stop())
+                       return;
                status = compact_zone(zone, &cc);
 
                if (zone_watermark_ok(zone, cc.order, low_wmark_pages(zone),
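
The effect of the first hunk can be modelled in userspace. The sketch
below is only an illustration, not kernel code. It assumes that kcompactd
sleeps in a wait_event-style wait that re-checks
kcompactd_work_requested() on every wake-up, which the patch and the
trace suggest but this thread does not show. struct node_model,
work_requested_old(), work_requested_new() and idle_thread_leaves_wait()
are hypothetical names made up for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-node state the thread looks at. */
struct node_model {
	int  max_order;    /* models pgdat->kcompactd_max_order */
	bool should_stop;  /* models what kthread_should_stop() would return */
};

/* Wait condition before the patch: only pending work counts. */
static bool work_requested_old(const struct node_model *n)
{
	return n->max_order > 0;
}

/* Wait condition after the patch: a stop request also ends the wait. */
static bool work_requested_new(const struct node_model *n)
{
	return n->max_order > 0 || n->should_stop;
}

/*
 * Models a stop request reaching an idle thread: the stop flag is set,
 * the sleeper is woken, and it re-evaluates its wait condition.
 * Returns true if the thread would leave the wait (and could then exit),
 * false if it would go straight back to sleep.
 */
static bool idle_thread_leaves_wait(bool (*cond)(const struct node_model *))
{
	struct node_model n = { .max_order = 0, .should_stop = false };

	n.should_stop = true;	/* stop requested while no work is queued */
	return cond(&n);	/* condition re-checked on wake-up */
}
```

Under these assumptions, idle_thread_leaves_wait(work_requested_old)
is false, which corresponds to kthread_stop() waiting forever as in the
reported trace, while idle_thread_leaves_wait(work_requested_new) is
true.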


* Re: kcompactd hang during memory offlining
  2016-05-03 22:16 ` Vlastimil Babka
@ 2016-05-04 18:09   ` Reza Arbab
  2016-05-04 21:23     ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Reza Arbab @ 2016-05-04 18:09 UTC
  To: Vlastimil Babka
  Cc: Arnd Bergmann, Paul Gortmaker, Andrew Morton, Andrea Arcangeli,
	Kirill A. Shutemov, Rik van Riel, Joonsoo Kim, Mel Gorman,
	David Rientjes, Michal Hocko, Johannes Weiner, Hugh Dickins,
	linux-mm, linux-kernel

On Wed, May 04, 2016 at 12:16:42AM +0200, Vlastimil Babka wrote:
>Damn, can you test this patch?

That fixed the regression for me. Thanks!

Tested-by: Reza Arbab <arbab@linux.vnet.ibm.com>

-- 
Reza Arbab


* Re: kcompactd hang during memory offlining
  2016-05-04 18:09   ` Reza Arbab
@ 2016-05-04 21:23     ` Vlastimil Babka
  0 siblings, 0 replies; 4+ messages in thread
From: Vlastimil Babka @ 2016-05-04 21:23 UTC
  To: Reza Arbab
  Cc: Arnd Bergmann, Paul Gortmaker, Andrew Morton, Andrea Arcangeli,
	Kirill A. Shutemov, Rik van Riel, Joonsoo Kim, Mel Gorman,
	David Rientjes, Michal Hocko, Johannes Weiner, Hugh Dickins,
	linux-mm, linux-kernel

On 4.5.2016 20:09, Reza Arbab wrote:
> On Wed, May 04, 2016 at 12:16:42AM +0200, Vlastimil Babka wrote:
>> Damn, can you test this patch?
> 
> That fixed the regression for me. Thanks!
> 
> Tested-by: Reza Arbab <arbab@linux.vnet.ibm.com>

Thanks for testing, and thanks to Andrew for already picking up the patch!

Vlastimil
