* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
       [not found] <hbd4dk$5ac$1@ultimate100.geggus.net>
@ 2009-10-17 20:30 ` Frans Pop
       [not found]   ` <hbd9v8$7rf$1@ultimate100.geggus.net>
  0 siblings, 1 reply; 4+ messages in thread
From: Frans Pop @ 2009-10-17 20:30 UTC
To: Sven Geggus; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List

Hello Sven,

Sven Geggus wrote:
> I can reproducibly crash my machine by writing bulk data from a
> socket to an encrypted partition. It always crashes after a few
> gigabytes have been written.
>
> The partition in question uses a dm-crypt+xfs filesystem.

This is without any doubt related to an issue that's already being
investigated. I have to warn you that the thread is very long:
http://thread.gmane.org/gmane.linux.kernel/896714

What is the _exact_ command sequence you use to reproduce it? I already
have a test case, but a second test case, or a simpler one, may be useful.
In all cases reported so far, and also in your case, networking is involved
in the actual allocation errors.

It would also be useful if you could try to bisect the issue independently.
For me bisection has proven difficult because the symptoms change between
.30 and .31. The suspicion is that more than one change is involved in the
regression.

Cheers,
FJP

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
[parent not found: <hbd9v8$7rf$1@ultimate100.geggus.net>]
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
       [not found]   ` <hbd9v8$7rf$1@ultimate100.geggus.net>
@ 2009-10-18 23:41     ` Frans Pop
  2009-10-20 19:16       ` Sven Geggus
  0 siblings, 1 reply; 4+ messages in thread
From: Frans Pop @ 2009-10-18 23:41 UTC
To: Sven Geggus; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List

(Sven: For kernel mailing lists please always do a "reply to all". Although
some other communities do not want that, it is standard for the kernel
community. It is needed because otherwise, with the huge amount of traffic
on the linux-kernel list, people are too likely to miss replies.)

Sven Geggus wrote:
> Frans Pop <elendil@planet.nl> wrote:
>
>> What is the _exact_ command sequence you use to reproduce it? I already
>> have a test case, but a second test case, or a simpler one, may be
>> useful.
>
> Not a particularly easy test case. This is what I did:
>
> On the crashing machine with the dm-encrypted xfs volume:
> ionice -c 3 socat TCP4-LISTEN:5555 - >backup.tar
>
> On the source machine:
> tar cv dir |socat - TCP4:targetmachine:5555
>
> You will certainly not need to use tar.
>
> socat /dev/zero TCP4:targetmachine:5555 should work as well.
>
> I don't know if TCP traffic is really needed; probably it is.

Thanks.

In the meantime I've been able to trace the culprit. Could you please check
whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
for you?

Cheers,
FJP

[1] The first commit is a build fix for the second.
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
  2009-10-18 23:41     ` Frans Pop
@ 2009-10-20 19:16       ` Sven Geggus
  2009-10-20 20:02         ` Mel Gorman
  0 siblings, 1 reply; 4+ messages in thread
From: Sven Geggus @ 2009-10-20 19:16 UTC
To: Frans Pop; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List

Frans Pop wrote on Monday, 19 October at 01:41:

> In the meantime I've been able to trace the culprit. Could you please check
> whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
> for you?

Unfortunately not :(

Starting from 2.6.31.4 I did
git revert 373c0a7e
git revert 8aa7e847
and built a new kernel.

The problem persists. The kernel crashed again, this
time in "swapper".

Regards

Sven

--
"I'm a bastard, and proud of it"
(Linus Torvalds, Wednesday Sep 6, 2000)

/me is giggls@ircnet, http://sven.gegg.us/ on the Web
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
  2009-10-20 19:16       ` Sven Geggus
@ 2009-10-20 20:02         ` Mel Gorman
  0 siblings, 0 replies; 4+ messages in thread
From: Mel Gorman @ 2009-10-20 20:02 UTC
To: Sven Geggus; +Cc: Frans Pop, linux-kernel, linux-mm, Kernel Testers List

On Tue, Oct 20, 2009 at 09:16:57PM +0200, Sven Geggus wrote:
> Frans Pop wrote on Monday, 19 October at 01:41:
>
> > In the meantime I've been able to trace the culprit. Could you please check
> > whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
> > for you?
>
> Unfortunately not :(
>
> Starting from 2.6.31.4 I did
> git revert 373c0a7e
> git revert 8aa7e847
> and built a new kernel.
>
> The problem persists. The kernel crashed again, this
> time in "swapper".
>

Can you please try with this patch also applied, i.e. this patch together
with the two reverts? I'm looking for either allocation failures or the
WARN_ON triggering.

Thanks

==== CUT HERE ====
page-allocator: Always wake kswapd when restarting an allocation attempt after direct reclaim failed

If a direct reclaim makes no forward progress, it considers whether it
should go OOM or not. Whether OOM is triggered or not, it may retry the
allocation afterwards. In times past, this would always wake kswapd as well
but currently, kswapd is not woken up after direct reclaim fails. For order-0
allocations, this makes little difference but if there is a heavy mix of
higher-order allocations that direct reclaim is failing for, it might mean
that kswapd is not rewoken for higher orders as much as it did previously.

This patch wakes up kswapd when an allocation is being retried after a direct
reclaim failure. It would be expected that kswapd is already awake, but
this has the effect of telling kswapd to reclaim at the higher order as well.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 mm/page_alloc.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0b3c6cb..e07b2f2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1763,16 +1763,17 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (NUMA_BUILD && (gfp_mask & GFP_THISNODE) == GFP_THISNODE)
 		goto nopage;
 
-	wake_all_kswapd(order, zonelist, high_zoneidx);
-
 	/*
-	 * OK, we're below the kswapd watermark and have kicked background
-	 * reclaim. Now things get more complex, so set up alloc_flags according
-	 * to how we want to proceed.
+	 * OK, we're below the kswapd watermark and now things get more
+	 * complex, so set up alloc_flags according to how we want to
+	 * proceed.
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
 restart:
+	/* Kick background reclaim */
+	wake_all_kswapd(order, zonelist, high_zoneidx);
+
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
 			high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
@@ -1802,6 +1803,9 @@ rebalance:
 	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
 		goto nopage;
 
+	/* This shouldn't be possible but needs to be eliminated */
+	WARN_ON_ONCE(alloc_flags & ALLOC_NO_WATERMARKS);
+
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,
 					zonelist, high_zoneidx,
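
[Editorial sketch, not part of the original message.] The effect of moving
wake_all_kswapd() below the restart label can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the names
sim_stats, sim_wake_kswapd and sim_slowpath are hypothetical and are not
kernel APIs, and the model covers only passes through "restart". It ignores
the separate "rebalance" retry path, watermarks, OOM handling and allocation
order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters standing in for kernel state (not kernel APIs). */
struct sim_stats {
	int kswapd_wakeups;	/* times background reclaim was kicked */
	int attempts;		/* passes through the "restart" label */
};

/* Stand-in for wake_all_kswapd(): it only counts the call. */
static void sim_wake_kswapd(struct sim_stats *s)
{
	s->kswapd_wakeups++;
}

/*
 * Model of the retry loop in the allocation slow path.
 *
 * failed_passes:   passes that fail before one finally succeeds
 * wake_on_restart: false = wake once before "restart" (old placement),
 *                  true  = wake on every pass through "restart" (patched)
 */
static struct sim_stats sim_slowpath(int failed_passes, bool wake_on_restart)
{
	struct sim_stats s = { 0, 0 };
	int pass;

	assert(failed_passes >= 0);

	/* Old placement: a single wakeup before the restart label. */
	if (!wake_on_restart)
		sim_wake_kswapd(&s);

	/* Each iteration models one pass starting at "restart:". */
	for (pass = 0; pass <= failed_passes; pass++) {
		s.attempts++;
		/* Patched placement: kick background reclaim on each pass. */
		if (wake_on_restart)
			sim_wake_kswapd(&s);
	}

	return s;
}
```

With three failed passes, sim_slowpath(3, false) records a single wakeup
across four attempts, while sim_slowpath(3, true) records one wakeup per
attempt. That per-attempt wakeup is the behaviour the patch description
refers to as telling kswapd to reclaim again when an allocation is retried.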