From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 8 Dec 2006 15:52:00 -0800
From: Andrew Morton
Subject: Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
Message-Id: <20061208155200.0e2794a1.akpm@osdl.org>
In-Reply-To:
References: <200612070355.kB73tGf4021820@fire-2.osdl.org>
	<20061206201246.be7fb860.akpm@osdl.org>
	<4577A36B.6090803@cern.ch>
	<20061206230338.b0bf2b9e.akpm@osdl.org>
	<45782B32.6040401@cern.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Hugh Dickins
Cc: Ramiro Voicu, linux-mm@kvack.org, bugme-daemon@bugzilla.kernel.org
List-ID:

On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
Hugh Dickins wrote:

> On Thu, 7 Dec 2006, Ramiro Voicu wrote:
> > Andrew Morton wrote:
> > > On Thu, 07 Dec 2006 06:15:23 +0100
> > > Ramiro Voicu wrote:
> > >>
> > >> Dec 7 06:12:11 xxxx kernel: [ 319.720340] pte_val: 629025
> > >
> > > hm. A valid, read-only, accessed user page with a sane-looking pfn.
> > > And this is repeatable, on two different machines.
> > >
> > > I don't know what to do, sorry. A bisection-search would have a good
> > > chance of finding the bug, but that would be pretty painful. It looks like
> > > you were able to hit the bug after five minutes uptime, which helps. Is it
> > > always that easy to hit?
> >
> > It depends ... It can take days or minutes until it happens. The program
> > is a simple FTP-like using multiple TCP Streams, implemented with Java NIO.
>
> Interesting. I think you needn't bother with that bisection. I can't
> say why this started happening to you only with recent releases (timings
> changed somehow I guess): it looks like reading /dev/zero has been using
> zeromap_page_range unsafely for years.
>
> First it zaps existing ptes, then it inserts the zero page ptes - but
> only while holding mmap_sem for read: could be racing against another
> thread doing the same, or against ordinary faulting. Now, it may well
> be that the program is buggy to be racing against itself in this way
> (which would fit with why this hasn't been observed before - buggy
> programs are exceedingly rare, aren't they ;-?) but of course it
> shouldn't trigger a kernel BUG (or leak, which preceded the BUG).
>
> Please try the simple patch below: I expect it to fix your problem.
> Whether it's the right patch, I'm not quite sure: we do commonly use
> zap_page_range and zeromap_page_range with mmap_sem held for write,
> but perhaps we'd want to avoid such serialization in this case?
>
> Hugh
>
> --- 2.6.19/drivers/char/mem.c	2006-11-29 21:57:37.000000000 +0000
> +++ linux/drivers/char/mem.c	2006-12-07 20:21:46.000000000 +0000
> @@ -631,7 +631,7 @@ static inline size_t read_zero_pagealign
>  
>  	mm = current->mm;
>  	/* Oops, this was forgotten before. -ben */
> -	down_read(&mm->mmap_sem);
> +	down_write(&mm->mmap_sem);
>  
>  	/* For private mappings, just map in zero pages. */
>  	for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
> @@ -655,7 +655,7 @@ static inline size_t read_zero_pagealign
>  			goto out_up;
>  	}
>  
> -	up_read(&mm->mmap_sem);
> +	up_write(&mm->mmap_sem);
>  
>  	/* The shared case is hard. Let's do the conventional zeroing. */
>  	do {
> @@ -669,7 +669,7 @@ static inline size_t read_zero_pagealign
>  
>  	return size;
>  out_up:
> -	up_read(&mm->mmap_sem);
> +	up_write(&mm->mmap_sem);
>  	return size;
>  }
>

Ramiro, have you had a chance to test this yet?

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .