From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Thu, 7 Dec 2006 21:22:57 +0000 (GMT) From: Hugh Dickins Subject: Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124 In-Reply-To: <45782B32.6040401@cern.ch> Message-ID: References: <200612070355.kB73tGf4021820@fire-2.osdl.org> <20061206201246.be7fb860.akpm@osdl.org> <4577A36B.6090803@cern.ch> <20061206230338.b0bf2b9e.akpm@osdl.org> <45782B32.6040401@cern.ch> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Ramiro Voicu Cc: Andrew Morton , linux-mm@kvack.org, bugme-daemon@bugzilla.kernel.org List-ID: On Thu, 7 Dec 2006, Ramiro Voicu wrote: > Andrew Morton wrote: > > On Thu, 07 Dec 2006 06:15:23 +0100 > > Ramiro Voicu wrote: > >> > >> Dec 7 06:12:11 xxxx kernel: [ 319.720340] pte_val: 629025 > > > > hm. A valid, read-only, accessed user page with a sane-looking pfn. > > And this is repeatable, on two different machines. > > > > I don't know what to do, sorry. A bisection-search would have a good > > chance of finding the bug, but that would be pretty painful. It looks like > > you were able to hit the bug after five minutes uptime, which helps. Is it > > always that easy to hit? > > It depends ... It can take days or minutes until it happens. The program > is a simple FTP-like using multiple TCP Streams, implemented with Java NIO. Interesting. I think you needn't bother with that bisection. I can't say why this started happening to you only with recent releases (timings changed somehow I guess): it looks like reading /dev/zero has been using zeromap_page_range unsafely for years. First it zaps existing ptes, then it inserts the zero page ptes - but only while holding mmap_sem for read: could be racing against another thread doing the same, or against ordinary faulting. Now, it may well be that the program is buggy to be racing against itself in this way (which would fit with why this hasn't been observed before - buggy programs are exceedingly rare, aren't they ;-?) but of course it shouldn't trigger a kernel BUG (or leak, which preceded the BUG). Please try the simple patch below: I expect it to fix your problem. Whether it's the right patch, I'm not quite sure: we do commonly use zap_page_range and zeromap_page_range with mmap_sem held for write, but perhaps we'd want to avoid such serialization in this case? Hugh --- 2.6.19/drivers/char/mem.c 2006-11-29 21:57:37.000000000 +0000 +++ linux/drivers/char/mem.c 2006-12-07 20:21:46.000000000 +0000 @@ -631,7 +631,7 @@ static inline size_t read_zero_pagealign mm = current->mm; /* Oops, this was forgotten before. -ben */ - down_read(&mm->mmap_sem); + down_write(&mm->mmap_sem); /* For private mappings, just map in zero pages. */ for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) { @@ -655,7 +655,7 @@ static inline size_t read_zero_pagealign goto out_up; } - up_read(&mm->mmap_sem); + up_write(&mm->mmap_sem); /* The shared case is hard. Let's do the conventional zeroing. */ do { @@ -669,7 +669,7 @@ static inline size_t read_zero_pagealign return size; out_up: - up_read(&mm->mmap_sem); + up_write(&mm->mmap_sem); return size; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org