From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx164.postini.com [74.125.245.164])
	by kanga.kvack.org (Postfix) with SMTP id 464636B005A
	for ; Tue, 13 Nov 2012 17:13:35 -0500 (EST)
Received: by mail-ee0-f41.google.com with SMTP id d41so252708eek.14
	for ; Tue, 13 Nov 2012 14:13:33 -0800 (PST)
MIME-Version: 1.0
From: Andy Lutomirski
Date: Tue, 13 Nov 2012 14:13:13 -0800
Message-ID:
Subject: [3.6 regression?] THP + migration/compaction livelock (I think)
Content-Type: text/plain; charset=ISO-8859-1
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org

I've seen an odd problem three times in the past two weeks. I suspect
a Linux 3.6 regression. I'm on 3.6.3-1.fc17.x86_64. I run a parallel
compilation, and no progress is made. All CPUs are pegged at 100%
system time by the respective cc1plus processes. Reading
/proc//stack shows either

[] __cond_resched+0x2a/0x40
[] isolate_migratepages_range+0xb2/0x620
[] compact_zone+0x144/0x410
[] compact_zone_order+0x82/0xc0
[] try_to_compact_pages+0xe1/0x130
[] __alloc_pages_direct_compact+0xaa/0x190
[] __alloc_pages_nodemask+0x526/0x990
[] alloc_pages_vma+0xb6/0x190
[] do_huge_pmd_anonymous_page+0x143/0x340
[] handle_mm_fault+0x27d/0x320
[] do_page_fault+0x15c/0x4b0
[] page_fault+0x25/0x30
[] 0xffffffffffffffff

or

[] 0xffffffffffffffff

seemingly at random (i.e. if I read that file twice in a row, I might
see different results). If I had to guess, I'd say that perf shows no
'faults'. The livelock resolved after several minutes (and before I
got far enough with perf to get more useful results). Every time this
happens, firefox hangs but everything else keeps working.

If I trigger it again, I'll try to grab /proc/zoneinfo and /proc/meminfo.

--Andy
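
Since the stack contents change from one read to the next, a single read says little; sampling the file many times and counting the top frame gives a rough picture of where a stuck process spends its time. Below is a minimal sketch of that, which also grabs /proc/meminfo and /proc/zoneinfo while the hang is still in progress. The function names (top_frame, parse_meminfo, sample_stacks) are made up for illustration, the PID is assumed to be passed on the command line, and reading a process's kernel stack file normally requires root.

```python
#!/usr/bin/env python3
"""Sample a stuck process's kernel stack and save memory state (illustrative sketch)."""
import collections
import sys
import time
from pathlib import Path


def top_frame(stack_text):
    """Return the symbol of the first frame in a kernel stack dump.

    Each line is expected to look like '[addr] symbol+0xoff/0xlen'.
    Returns None if the text contains no frames.
    """
    for line in stack_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            return parts[1].split("+")[0]
    return None


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def sample_stacks(pid, count=20, interval=0.1):
    """Read the process's kernel stack `count` times and tally the top frame."""
    counts = collections.Counter()
    stack_file = Path(f"/proc/{pid}/stack")
    for _ in range(count):
        counts[top_frame(stack_file.read_text())] += 1
        time.sleep(interval)
    return counts


# Only runs when invoked as: sudo python3 thisfile.py PID
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1].isdigit():
    pid = int(sys.argv[1])
    print(sample_stacks(pid).most_common())
    meminfo = parse_meminfo(Path("/proc/meminfo").read_text())
    print({key: meminfo.get(key) for key in ("MemFree", "AnonHugePages")})
    # Keep a raw copy of zoneinfo to attach to the report.
    Path(f"zoneinfo.{pid}.txt").write_text(Path("/proc/zoneinfo").read_text())
```

If most samples land in the compaction path (for example isolate_migratepages_range) rather than in user space, that would be consistent with the direct-compaction livelock suspected above, though it would not prove it.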