From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail172.messagelabs.com (mail172.messagelabs.com [216.82.254.3])
	by kanga.kvack.org (Postfix) with SMTP id D71F28D0039
	for ; Thu, 17 Feb 2011 13:58:09 -0500 (EST)
From: ebiederm@xmission.com (Eric W. Biederman)
References: <20110216185234.GA11636@tiehlicka.suse.cz>
	<20110216193700.GA6377@elte.hu>
	<20110217090910.GA3781@tiehlicka.suse.cz>
	<20110217163531.GF14168@elte.hu>
Date: Thu, 17 Feb 2011 10:57:54 -0800
In-Reply-To: <20110217163531.GF14168@elte.hu> (Ingo Molnar's message of "Thu, 17 Feb 2011 17:35:31 +0100")
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: BUG: Bad page map in process udevd (anon_vma: (null)) in 2.6.38-rc4
Sender: owner-linux-mm@kvack.org
List-ID:
To: Ingo Molnar
Cc: Linus Torvalds, Michal Hocko, linux-mm@kvack.org, LKML

Ingo Molnar writes:

> * Linus Torvalds wrote:
>
>> And in addition, I don't see why others wouldn't see it (I've got
>> DEBUG_PAGEALLOC and SLUB_DEBUG_ON turned on myself, and I know others
>> do too).
>
> I've done extensive randconfig testing and no crash triggers for typical workloads
> on a typical dual-core PC. If there's a generic crashes in there my tests tend to
> trigger them at least 10x as often as regular testers ;-) But the tests are still
> only statistical so the race could simply be special and missed by the tests.
>
>> So I'm wondering what triggers it. Must be something subtle.
>
> I think what Michal did before he got the corruption seemed somewhat atypical:
> suspend/resume and udevd wifi twiddling, right?
>
> Now, Eric's crashes look similar - and he does not seem to have done anything
> special to trigger the crashes.
>
> Eric, could you possibly describe your system in a bit more detail, does it do
> suspend and does the box use wifi actively? Anything atypical in your setup or usage
> that doesnt match a bog-standard whitebox PC with LAN? Swap to file? NFS? FUSE?
> Anything that is even just borderline atypical.

10G RAM
2G Swap
Dual socket system, 4 cores per socket, no hyperthreading.
Fedora 14
ext4 on all filesystems

The biggest difference is I beat the system to death with automated
builds.

I was about to say this happens with DEBUG_PAGEALLOC enabled, but it
appears that option keeps eluding my fingers when I have a few minutes
to play with it.  Perhaps this time will be the charm.

The biggest difference may be that I am constantly stressing the system
to the edge of triggering the OOM killer.  My builds and tests are
greedy when it comes to memory.

I guess also I only see the bad PMD on processes that exit.  So it may
be that it is a matter of timing to see it.

Eric

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .