* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
[not found] <hbd4dk$5ac$1@ultimate100.geggus.net>
@ 2009-10-17 20:30 ` Frans Pop
[not found] ` <hbd9v8$7rf$1@ultimate100.geggus.net>
0 siblings, 1 reply; 4+ messages in thread
From: Frans Pop @ 2009-10-17 20:30 UTC (permalink / raw)
To: Sven Geggus; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List
Hello Sven,
Sven Geggus wrote:
> I can reproducibly crash my machine by writing bulk data from a
> socket to an encrypted partition. It always crashes after a few
> gigabytes have been written.
>
> The affected partition uses dm-crypt with an xfs filesystem.
This is without any doubt related to an issue that's already being
investigated. I have to warn you that the thread is very long:
http://thread.gmane.org/gmane.linux.kernel/896714
What is the _exact_ command sequence you use to reproduce it? I already
have a testcase, but a second test case, or a simpler one, may be useful.
In all cases reported so far, and also in your case, networking is involved
in the actual allocation errors.
It would also be useful if you could try to bisect the issue independently.
For me bisection has proven difficult because the symptoms change
between .30 and .31. The suspicion is that more than one change is
involved in the regression.
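A bisection needs a repeatable good/bad verdict for each kernel tested. One possible way to get it, sketched here as an assumption and not something used in this thread, is to save the kernel log of each test run and search it for the failure message from the subject line. The function name verdict and the log file argument are made up for illustration.

```shell
# verdict LOGFILE
# Print "bad" if the saved kernel log contains a page allocation failure,
# "good" otherwise. The match string is taken from the subject of this
# thread; other symptoms (for example a plain hang) are not detected.
verdict() {
    if grep -q 'page allocation failure' "$1"; then
        echo bad
    else
        echo good
    fi
}
```

After "git bisect start v2.6.31 v2.6.30", each step would then be: boot the candidate kernel, run the reproducer, save the log (via serial console or netconsole if the machine crashes) and run git bisect "$(verdict run.log)". Because the symptoms differ between .30 and .31, a single match string may well be too coarse.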
Cheers,
FJP
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
[not found] ` <hbd9v8$7rf$1@ultimate100.geggus.net>
@ 2009-10-18 23:41 ` Frans Pop
2009-10-20 19:16 ` Sven Geggus
0 siblings, 1 reply; 4+ messages in thread
From: Frans Pop @ 2009-10-18 23:41 UTC (permalink / raw)
To: Sven Geggus; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List
(Sven: For kernel mailing lists please always do a "reply to all".
Although some other communities do not want that, it is standard for the
kernel community. It is needed because otherwise, with the huge amount of
traffic on the linux-kernel list, people are too likely to miss replies.)
Sven Geggus wrote:
> Frans Pop <elendil@planet.nl> wrote:
>
>> What is the _exact_ command sequence you use to reproduce it? I already
>> have a testcase, but a second test case, or a simpler one, may be
>> useful.
>
> Not a particularly easy test case. This is what I did:
>
> On the crashing machine with the dm-encrypted xfs volume:
> ionice -c 3 socat TCP4-LISTEN:5555 - >backup.tar
>
> On the source machine:
> tar cv dir |socat - TCP4:targetmachine:5555
>
> You will certainly not need to use tar.
>
> socat /dev/zero TCP4:targetmachine:5555 should work as well.
>
> I don't know if TCP traffic is really needed, but it probably is.
Thanks.
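The commands quoted above can be collected into two small shell functions, one per machine. This is only a sketch: the port, the output file name and the host name targetmachine come from the report, while the function names and the DRY_RUN switch (print the command instead of running it) are assumptions added for illustration.

```shell
# receive_stream OUTFILE [PORT]
# Run on the crashing machine. OUTFILE must live on the dm-crypt + xfs
# volume. With DRY_RUN set, only print the command.
receive_stream() {
    out=$1
    port=${2:-5555}
    if [ -n "${DRY_RUN:-}" ]; then
        printf 'ionice -c 3 socat TCP4-LISTEN:%s - >%s\n' "$port" "$out"
    else
        ionice -c 3 socat "TCP4-LISTEN:$port" - >"$out"
    fi
}

# send_zeros HOST [PORT]
# Run on the source machine. Streams /dev/zero instead of a tar archive,
# as suggested in the report. With DRY_RUN set, only print the command.
send_zeros() {
    host=$1
    port=${2:-5555}
    if [ -n "${DRY_RUN:-}" ]; then
        printf 'socat /dev/zero TCP4:%s:%s\n' "$host" "$port"
    else
        socat /dev/zero "TCP4:$host:$port"
    fi
}
```

Start receive_stream backup.tar on the target first, then send_zeros targetmachine on the source. The stream never ends by itself, so the sender has to be interrupted if the target survives.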
In the meantime I've been able to trace the culprit. Could you please check
whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
for you?
Cheers,
FJP
[1] The first commit is a build fix for the second.
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
2009-10-18 23:41 ` Frans Pop
@ 2009-10-20 19:16 ` Sven Geggus
2009-10-20 20:02 ` Mel Gorman
0 siblings, 1 reply; 4+ messages in thread
From: Sven Geggus @ 2009-10-20 19:16 UTC (permalink / raw)
To: Frans Pop; +Cc: linux-kernel, Mel Gorman, linux-mm, Kernel Testers List
Frans Pop wrote on Monday, 19 October at 01:41:
> In the meantime I've been able to trace the culprit. Could you please check
> whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
> for you?
Unfortunately not :(
Starting from 2.6.31.4 I did
git revert 373c0a7e
git revert 8aa7e847
and built a new kernel.
The problem persists. The kernel crashed again, this
time in "swapper".
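For anyone repeating this test, the revert sequence can be scripted roughly as follows. This is a sketch under assumptions: it expects a git tree that contains the v2.6.31.4 tag, and the function name revert_on, the branch name revert-test and the DRY_RUN switch are made up for illustration. The build fix (373c0a7e) is reverted before the commit it fixes, matching the order used above.

```shell
# revert_on BASE COMMIT...
# Create a throwaway branch at BASE and revert each COMMIT in the order
# given. With DRY_RUN set, only print the git commands.
revert_on() {
    base=$1
    shift
    run=
    if [ -n "${DRY_RUN:-}" ]; then
        run=echo
    fi
    $run git checkout -b revert-test "$base" || return 1
    for commit in "$@"; do
        $run git revert --no-edit "$commit" || return 1
    done
}
```

Calling revert_on v2.6.31.4 373c0a7e 8aa7e847 leaves the tree ready for the usual kernel build. If a revert does not apply cleanly, the function stops and the conflict has to be resolved by hand.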
Regards
Sven
--
"I'm a bastard, and proud of it"
(Linus Torvalds, Wednesday Sep 6, 2000)
/me is giggls@ircnet, http://sven.gegg.us/ on the Web
* Re: Kernel crash on 2.6.31.x (kcryptd: page allocation failure..)
2009-10-20 19:16 ` Sven Geggus
@ 2009-10-20 20:02 ` Mel Gorman
0 siblings, 0 replies; 4+ messages in thread
From: Mel Gorman @ 2009-10-20 20:02 UTC (permalink / raw)
To: Sven Geggus; +Cc: Frans Pop, linux-kernel, linux-mm, Kernel Testers List
On Tue, Oct 20, 2009 at 09:16:57PM +0200, Sven Geggus wrote:
> Frans Pop wrote on Monday, 19 October at 01:41:
>
> > In the meantime I've been able to trace the culprit. Could you please check
> > whether reverting 373c0a7e + 8aa7e847 [1] on top of 2.6.31 fixes the issue
> > for you?
>
> Unfortunately not :(
>
> Starting from 2.6.31.4 I did
> git revert 373c0a7e
> git revert 8aa7e847
> and built a new kernel.
>
> The problem persists. The kernel crashed again, this
> time in "swapper".
>
Can you please try with this patch also applied? i.e. this patch with
the two reverts. I'm looking for either allocation failures or the
WARN_ON triggering.
Thanks
==== CUT HERE ====
page-allocator: Always wake kswapd when restarting an allocation attempt after direct reclaim failed
If a direct reclaim makes no forward progress, it considers whether it
should go OOM or not. Whether OOM is triggered or not, it may retry the
allocation afterwards. In times past, this would always wake kswapd as well
but currently, kswapd is not woken up after direct reclaim fails. For order-0
allocations, this makes little difference but if there is a heavy mix of
higher-order allocations that direct reclaim is failing for, it might mean
that kswapd is not rewoken for higher orders as much as it did previously.
This patch wakes up kswapd when an allocation is being retried after a direct
reclaim failure. It would be expected that kswapd is already awake, but
this has the effect of telling kswapd to reclaim at the higher order as well.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
mm/page_alloc.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0b3c6cb..e07b2f2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1763,16 +1763,17 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (NUMA_BUILD && (gfp_mask & GFP_THISNODE) == GFP_THISNODE)
 		goto nopage;
 
-	wake_all_kswapd(order, zonelist, high_zoneidx);
-
 	/*
-	 * OK, we're below the kswapd watermark and have kicked background
-	 * reclaim. Now things get more complex, so set up alloc_flags according
-	 * to how we want to proceed.
+	 * OK, we're below the kswapd watermark and now things get more
+	 * complex, so set up alloc_flags according to how we want to
+	 * proceed.
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
 restart:
+	/* Kick background reclaim */
+	wake_all_kswapd(order, zonelist, high_zoneidx);
+
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
 			high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
@@ -1802,6 +1803,9 @@ rebalance:
 	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
 		goto nopage;
 
+	/* This shouldn't be possible but needs to be eliminated */
+	WARN_ON_ONCE(alloc_flags & ALLOC_NO_WATERMARKS);
+
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,
 					zonelist, high_zoneidx,
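
The effect of moving the wake_all_kswapd() call can be illustrated with a toy model. The sketch below is not kernel code. It only counts how often the call would run if the allocator passes through the restart label a given number of times, with the call placed either before the label (old code) or after it (patched code). The function name count_wakeups and its arguments are hypothetical, and the separate rebalance retry path is not modelled.

```shell
# count_wakeups PLACEMENT RETRIES
# PLACEMENT is "before" (call above the restart label, old code) or
# "after" (call below the label, patched code). RETRIES is the number of
# times the allocator jumps back to the label after the first pass.
count_wakeups() {
    placement=$1
    retries=$2
    wakes=0

    # Old code: the call sits above the label, so it runs exactly once.
    if [ "$placement" = before ]; then
        wakes=$((wakes + 1))
    fi

    pass=0
    while [ "$pass" -le "$retries" ]; do
        # Patched code: the call sits below the label, so every pass runs it.
        if [ "$placement" = after ]; then
            wakes=$((wakes + 1))
        fi
        pass=$((pass + 1))
    done

    echo "$wakes"
}
```

With three retries, count_wakeups before 3 prints 1 and count_wakeups after 3 prints 4, which is the behaviour the patch description asks for: kswapd is told about the requested order again on every restart.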