Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
       [not found] <200612070355.kB73tGf4021820@fire-2.osdl.org>
@ 2006-12-07  4:12 ` Andrew Morton
  2006-12-07  5:15   ` Ramiro Voicu
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-12-07  4:12 UTC
  To: linux-mm; +Cc: bugme-daemon, Ramiro.Voicu

(switching to email - please retain all cc's).

On Wed, 6 Dec 2006 19:55:16 -0800
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7645
> 
>            Summary: Kernel BUG at mm/memory.c:1124
>     Kernel Version: 2.6.19
>             Status: NEW
>           Severity: high
>              Owner: akpm@osdl.org
>          Submitter: Ramiro.Voicu@cern.ch
> 
> 
> Most recent kernel where this bug did *NOT* occur: 2.6.17 ... as far as I
> remember ( for sure it works fine with 2.6.15.4 )
> 
> Distribution: Red Hat Enterprise Linux AS release 4 (Nahant Update 4) & Slackware 11
> 
> cat /proc/version
> Linux version 2.6.19-RH-server-250-lock (root@xxxx.cern.ch) (gcc version 3.4.6
> 20060404 (Red Hat 3.4.6-3)) #2 SMP Tue Dec 5 16:29:12 CET 2006
> 
> Hardware Environment:
> 
> CPU: 2CPU-s Dual core Opteron ( 4 entries in /proc/cpuinfo )
> <snip>
> model name      : Dual Core AMD Opteron(tm) Processor 275
> 
> cat /proc/modules
> myri10ge 41296 0 - Live 0xffffffff8806a000
> af_packet 19788 0 - Live 0xffffffff88064000
> binfmt_misc 10764 1 - Live 0xffffffff88060000
> dm_mirror 19776 0 - Live 0xffffffff8805a000
> dm_mod 55696 1 dm_mirror, Live 0xffffffff8804b000
> ohci_hcd 20292 0 - Live 0xffffffff88045000
> ehci_hcd 31304 0 - Live 0xffffffff8803c000
> usbcore 132840 3 ohci_hcd,ehci_hcd, Live 0xffffffff8801a000
> i2c_nforce2 7872 0 - Live 0xffffffff88017000
> i2c_core 20288 1 i2c_nforce2, Live 0xffffffff88011000
> floppy 62632 0 - Live 0xffffffff88000000
> 
> 
> Problem Description:
> 
> I am using a Java program ( based on NIO ) to do some data transfers. I have
> encountered the problem since 2.6.18 ( with all 2.6.18.x versions and all the
> -rc versions from 2.6.19 )
> 
> The problem appeared not only on the machine above, but also on my desktop
> machine with the same error in /var/log/messages. With 2.6.19-rc6 my machine
> freezes completely with no error reports.
> 
> When the kernel gets stuck I got the following message in the console:
> Message from syslogd@xxxxx at Thu Dec  7 04:17:21 2006 ...
> xxxxx kernel: [128531.708976] invalid opcode: 0000 [1] SMP
> 
> And in /var/log/messages:
> 
> Dec  7 04:17:21 xxxxx kernel: [128531.708947] ----------- [cut here ] ---------
> [please bite here ] ---------
> Dec  7 04:17:21 xxxxx kernel: [128531.708967] Kernel BUG at mm/memory.c:1124
> Dec  7 04:17:21 xxxxx kernel: [128531.708976] invalid opcode: 0000 [1] SMP
> Dec  7 04:17:21 xxxxx kernel: [128531.708988] CPU 0
> Dec  7 04:17:21 xxxxx kernel: [128531.708995] Modules linked in: myri10ge
> af_packet binfmt_misc dm_mirror dm_mod ohci_hcd ehci_hcd usbcore i2c_nforce2
> i2c_core floppy
> Dec  7 04:17:21 xxxxx kernel: [128531.709032] Pid: 21891, comm: java Not tainted
> 2.6.19-RH-server-250-lock #2
> Dec  7 04:17:21 xxxxx kernel: [128531.709045] RIP: 0010:[<ffffffff8026722b>] 
> [<ffffffff8026722b>] zeromap_page_range+0x2ab/0x330
> Dec  7 04:17:21 xxxxx kernel: [128531.709066] RSP: 0018:ffff81011639be38 
> EFLAGS: 00010202
> Dec  7 04:17:21 xxxxx kernel: [128531.709076] RAX: 0000000000000400 RBX:
> 8000000000629025 RCX: 000000000000001f
> Dec  7 04:17:21 xxxxx kernel: [128531.709090] RDX: ffff8100006a88f8 RSI:
> 00002aaaad781000 RDI: ffff8100006a88f8
> Dec  7 04:17:21 xxxxx kernel: [128531.709104] RBP: ffff8100006a88f8 R08:
> 0000000000000000 R09: 0000000000000000
> Dec  7 04:17:21 xxxxx kernel: [128531.709118] R10: 0000000000000002 R11:
> 0000000000000202 R12: ffff810062b20fa0
> Dec  7 04:17:21 xxxxx kernel: [128531.709161] R13: 00002aaaadbf4000 R14:
> ffff810065a81240 R15: ffff810069f21b68
> Dec  7 04:17:21 xxxxx kernel: [128531.709205] FS:  0000000043686960(0063)
> GS:ffffffff805b7000(0000) knlGS:00000000f7f1c6c0
> Dec  7 04:17:21 xxxxx kernel: [128531.709250] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> Dec  7 04:17:21 xxxxx kernel: [128531.709277] CR2: 00002b7df12d7f58 CR3:
> 0000000069888000 CR4: 00000000000006e0
> Dec  7 04:17:21 xxxxx kernel: [128531.709321] Process java (pid: 21891,
> threadinfo ffff81011639a000, task ffff81011bb7c100)
> Dec  7 04:17:21 xxxxx kernel: [128531.709365] Stack:  00002aaaadf80fff
> 00002aaaadf80fff 00002aaaadf80fff ffff810001c29f10
> Dec  7 04:17:21 xxxxx kernel: [128531.709413]  00002aaaadc00000 00002aaaadf81000
> ffff8100654ec550 00002aaaadf81000
> Dec  7 04:17:21 xxxxx kernel: [128531.709460]  00002aaaadf81000 ffff8100698882a8
> 8000000000000025 0000000000800000
> Dec  7 04:17:21 xxxxx kernel: [128531.709491] Call Trace:
> Dec  7 04:17:21 xxxxx kernel: [128531.709532]  [<ffffffff803b751f>]
> read_zero+0x14f/0x230
> Dec  7 04:17:21 xxxxx kernel: [128531.709561]  [<ffffffff802801f9>]
> vfs_read+0xe9/0x1b0
> Dec  7 04:17:21 xxxxx kernel: [128531.709587]  [<ffffffff802805e3>]
> sys_read+0x53/0x90
> Dec  7 04:17:21 xxxxx kernel: [128531.709615]  [<ffffffff80209b5e>]
> system_call+0x7e/0x83
> Dec  7 04:17:21 xxxxx kernel: [128531.709641]
> Dec  7 04:17:21 xxxxx kernel: [128531.709658]
> Dec  7 04:17:21 xxxxx kernel: [128531.709659] Code: 0f 0b 68 f0 9f 4e 80 c2 64
> 04 49 89 1c 24 49 81 c5 00 10 00
> Dec  7 04:17:21 xxxxx kernel: [128531.709737] RIP  [<ffffffff8026722b>]
> zeromap_page_range+0x2ab/0x330
> Dec  7 04:17:21 xxxxx kernel: [128531.709765]  RSP <ffff81011639be38>
> 
> 
>  My desktop machine is an Intel(R) Pentium(R) 4 CPU 3.20GHz with HT. The same
> application runs fine on Solaris10 ( also on my desktop ) and on older versions
> of Linux kernel.
> 


This is

	BUG_ON(!pte_none(*pte));

in zeromap_pte_range().

Could you please add this?

--- a/mm/memory.c~a
+++ a/mm/memory.c
@@ -1121,7 +1121,10 @@ static int zeromap_pte_range(struct mm_s
 		page_cache_get(page);
 		page_add_file_rmap(page);
 		inc_mm_counter(mm, file_rss);
-		BUG_ON(!pte_none(*pte));
+		if (!pte_none(*pte)) {
+			printk("pte_val: %lx\n", pte_val(*pte));
+			BUG();
+		}
 		set_pte_at(mm, addr, pte, zero_pte);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-07  4:12 ` [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124 Andrew Morton
@ 2006-12-07  5:15   ` Ramiro Voicu
  2006-12-07  7:03     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Ramiro Voicu @ 2006-12-07  5:15 UTC
  To: Andrew Morton; +Cc: linux-mm, bugme-daemon

Hi,

 Here is the stack trace after I've applied the patch


Dec  7 06:12:11 xxxx kernel: [  319.720340] pte_val: 629025
Dec  7 06:12:11 xxxx kernel: [  319.720422] ----------- [cut here ]
--------- [please bite here ] ---------
Dec  7 06:12:11 xxxx kernel: [  319.720467] Kernel BUG at mm/memory.c:1126
Dec  7 06:12:11 xxxx kernel: [  319.720505] invalid opcode: 0000 [1] SMP
Dec  7 06:12:11 xxxx kernel: [  319.720603] CPU 1
Dec  7 06:12:11 xxxx kernel: [  319.720666] Modules linked in: myri10ge
af_packet binfmt_misc dm_mirror dm_mod ohci_hcd ehci_hcd usbcore
i2c_nforce2 i2c_core floppy
Dec  7 06:12:11 xxxx kernel: [  319.721086] Pid: 4493, comm: java Not
tainted 2.6.19smp-250-lock-AM-patch #3
Dec  7 06:12:11 xxxx kernel: [  319.721131] RIP:
0010:[<ffffffff8026723b>]  [<ffffffff8026723b>]
zeromap_page_range+0x2bb/0x340
Dec  7 06:12:11 xxxx kernel: [  319.721213] RSP: 0018:ffff810121787e38
EFLAGS: 00010296
Dec  7 06:12:11 xxxx kernel: [  319.721254] RAX: 0000000000000022 RBX:
8000000000629025 RCX: ffffffff80541688
Dec  7 06:12:11 xxxx kernel: [  319.721299] RDX: ffffffff80541688 RSI:
0000000000000086 RDI: ffffffff80541680
Dec  7 06:12:11 xxxx kernel: [  319.721343] RBP: ffff8100006a88f8 R08:
0000000000000000 R09: 0000000000000064
Dec  7 06:12:11 xxxx kernel: [  319.721388] R10: 0000000000000080 R11:
0000000000000080 R12: ffff810078a7dae0
Dec  7 06:12:11 xxxx kernel: [  319.721433] R13: 00002aaab2f5c000 R14:
ffff81012c173240 R15: ffff81012aa31cb8
Dec  7 06:12:11 xxxx kernel: [  319.721478] FS:  0000000042676960(0063)
GS:ffff8100028dccc0(0000) knlGS:0000000000000000
Dec  7 06:12:11 xxxx kernel: [  319.721525] CS:  0010 DS: 0000 ES: 0000
CR0: 0000000080050033
Dec  7 06:12:11 xxxx kernel: [  319.721566] CR2: 00002b045ff65f2c CR3:
0000000125329000 CR4: 00000000000006e0
Dec  7 06:12:11 xxxx kernel: [  319.721625] Process java (pid: 4493,
threadinfo ffff810121786000, task ffff810122af88a0)
Dec  7 06:12:11 xxxx kernel: [  319.721696] Stack:  00002aaab350efff
00002aaab350efff 00002aaab350efff ffff8100020f7b68
Dec  7 06:12:11 xxxx kernel: [  319.721919]  00002aaab3000000
00002aaab350f000 ffff810123065550 00002aaab350f000
Dec  7 06:12:11 xxxx kernel: [  319.722109]  00002aaab350f000
ffff8101253292a8 8000000000000025 0000000000800000
Dec  7 06:12:11 xxxx kernel: [  319.722257] Call Trace:
Dec  7 06:12:11 xxxx kernel: [  319.722353]  [<ffffffff803b752f>]
read_zero+0x14f/0x230
Dec  7 06:12:11 xxxx kernel: [  319.722410]  [<ffffffff80280209>]
vfs_read+0xe9/0x1b0
Dec  7 06:12:11 xxxx kernel: [  319.722464]  [<ffffffff802805f3>]
sys_read+0x53/0x90
Dec  7 06:12:11 xxxx kernel: [  319.722520]  [<ffffffff80209b5e>]
system_call+0x7e/0x83
Dec  7 06:12:11 xxxx kernel: [  319.722575]
Dec  7 06:12:11 xxxx kernel: [  319.722620]
Dec  7 06:12:11 xxxx kernel: [  319.722621] Code: 0f 0b 68 f0 9f 4e 80
c2 66 04 49 89 1c 24 49 81 c5 00 10 00
Dec  7 06:12:11 xxxx kernel: [  319.723357] RIP  [<ffffffff8026723b>]
zeromap_page_range+0x2bb/0x340
Dec  7 06:12:11 xxxx kernel: [  319.723443]  RSP <ffff810121787e38>
Dec  7 06:12:17 xxxx ntpd[3057]: synchronized to LOCAL(0), stratum 10
Dec  7 06:12:17 xxxx ntpd[3057]: kernel time sync disabled 0041
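
A note on reading the two traces: the "Code:" bytes appear to follow the
x86-64 BUG() encoding used by kernels of that era, namely ud2, then a push
of the file-name pointer, then a ret whose 16-bit immediate carries the
source line. That layout is an assumption here, not something stated in
this thread, and the struct and helper names below are invented for
illustration. A minimal sketch of decoding the line number from the first
ten bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout of the x86-64 BUG() trap in 2.6-era kernels:
 *   0f 0b         ud2
 *   68 <imm32>    push $file  (low 32 bits of the file-name pointer)
 *   c2 <imm16>    ret  $line  (source line, little-endian)
 */
struct bug_info {
	uint32_t file_lo32;	/* low 32 bits of the file-name address */
	uint16_t line;		/* source line of the BUG() */
};

/* Hypothetical helper: 0 on success, -1 if the bytes do not match. */
static int decode_bug_bytes(const uint8_t *code, size_t len,
			    struct bug_info *out)
{
	assert(code != NULL && out != NULL);
	if (len < 10 || code[0] != 0x0f || code[1] != 0x0b ||
	    code[2] != 0x68 || code[7] != 0xc2)
		return -1;
	out->file_lo32 = (uint32_t)code[3] | (uint32_t)code[4] << 8 |
			 (uint32_t)code[5] << 16 | (uint32_t)code[6] << 24;
	out->line = (uint16_t)(code[8] | code[9] << 8);
	return 0;
}
```

Fed "0f 0b 68 f0 9f 4e 80 c2 64 04" from the first trace, this gives line
1124; fed the same bytes ending in "c2 66 04" from the trace above, it gives
1126. Both agree with the "Kernel BUG at mm/memory.c" lines.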




* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-07  5:15   ` Ramiro Voicu
@ 2006-12-07  7:03     ` Andrew Morton
  2006-12-07 14:54       ` Ramiro Voicu
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-12-07  7:03 UTC
  To: Ramiro Voicu; +Cc: linux-mm, bugme-daemon

On Thu, 07 Dec 2006 06:15:23 +0100
Ramiro Voicu <Ramiro.Voicu@cern.ch> wrote:

>  Here is the stack trace after I've applied the patch
> 
> 
> Dec  7 06:12:11 xxxx kernel: [  319.720340] pte_val: 629025

hm.  A valid, read-only, accessed user page with a sane-looking pfn.
And this is repeatable, on two different machines.

I don't know what to do, sorry.  A bisection-search would have a good
chance of finding the bug, but that would be pretty painful.  It looks like
you were able to hit the bug after five minutes uptime, which helps.  Is it
always that easy to hit?
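
For reference, the reading of pte_val 629025 above follows from the
architectural x86-64 pte layout. A minimal sketch, assuming 4 KiB pages; the
macro and helper names are made up for illustration and are not the
kernel's own definitions:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86-64 pte bits for a 4 KiB page (illustrative names). */
#define PTE_PRESENT	(UINT64_C(1) << 0)
#define PTE_WRITABLE	(UINT64_C(1) << 1)
#define PTE_USER	(UINT64_C(1) << 2)
#define PTE_ACCESSED	(UINT64_C(1) << 5)
#define PTE_DIRTY	(UINT64_C(1) << 6)
#define PTE_PFN_MASK	UINT64_C(0x000ffffffffff000)
#define PTE_PAGE_SHIFT	12

/* Hypothetical helper: page frame number held in a raw pte value. */
static uint64_t pte_pfn_of(uint64_t val)
{
	return (val & PTE_PFN_MASK) >> PTE_PAGE_SHIFT;
}

/* Hypothetical helper: format a one-line description of the pte into buf. */
static int pte_describe(uint64_t val, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s %s %s %s %s pfn=0x%llx",
			(val & PTE_PRESENT)  ? "present"  : "not-present",
			(val & PTE_WRITABLE) ? "writable" : "read-only",
			(val & PTE_USER)     ? "user"     : "kernel",
			(val & PTE_ACCESSED) ? "accessed" : "not-accessed",
			(val & PTE_DIRTY)    ? "dirty"    : "clean",
			(unsigned long long)pte_pfn_of(val));
}
```

For 0x629025 the low twelve bits are 0x025, so the present, user and
accessed bits are set while the writable and dirty bits are clear, and the
pfn is 0x629.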


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-07  7:03     ` Andrew Morton
@ 2006-12-07 14:54       ` Ramiro Voicu
  2006-12-07 21:22         ` Hugh Dickins
  0 siblings, 1 reply; 10+ messages in thread
From: Ramiro Voicu @ 2006-12-07 14:54 UTC
  To: Andrew Morton; +Cc: linux-mm, bugme-daemon

It depends ... It can take days or minutes until it happens. The program
is a simple FTP-like using multiple TCP Streams, implemented with Java NIO.

 What I have noticed is that if I do a lot of connects/disconnects from
the client, the kernel on the server machine gets stuck. It never happens
on the client, though. I will try, if I can, to isolate the
problem ... although it does not seem that there is a pattern to this.

Andrew Morton wrote:
> On Thu, 07 Dec 2006 06:15:23 +0100
> Ramiro Voicu <Ramiro.Voicu@cern.ch> wrote:
> 
>>  Here is the stack trace after I've applied the patch
>>
>>
>> Dec  7 06:12:11 xxxx kernel: [  319.720340] pte_val: 629025
> 
> hm.  A valid, read-only, accessed user page with a sane-looking pfn.
> And this is repeatable, on two different machines.
> 
> I don't know what to do, sorry.  A bisection-search would have a good
> chance of finding the bug, but that would be pretty painful.  It looks like
> you were able to hit the bug after five minutes uptime, which helps.  Is it
> always that easy to hit?


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-07 14:54       ` Ramiro Voicu
@ 2006-12-07 21:22         ` Hugh Dickins
  2006-12-08 23:52           ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2006-12-07 21:22 UTC
  To: Ramiro Voicu; +Cc: Andrew Morton, linux-mm, bugme-daemon

On Thu, 7 Dec 2006, Ramiro Voicu wrote:
> Andrew Morton wrote:
> > On Thu, 07 Dec 2006 06:15:23 +0100
> > Ramiro Voicu <Ramiro.Voicu@cern.ch> wrote:
> >>
> >> Dec  7 06:12:11 xxxx kernel: [  319.720340] pte_val: 629025
> > 
> > hm.  A valid, read-only, accessed user page with a sane-looking pfn.
> > And this is repeatable, on two different machines.
> > 
> > I don't know what to do, sorry.  A bisection-search would have a good
> > chance of finding the bug, but that would be pretty painful.  It looks like
> > you were able to hit the bug after five minutes uptime, which helps.  Is it
> > always that easy to hit?
> 
> It depends ... It can take days or minutes until it happens. The program
> is a simple FTP-like using multiple TCP Streams, implemented with Java NIO.

Interesting.  I think you needn't bother with that bisection.  I can't
say why this started happening to you only with recent releases (timings
changed somehow I guess): it looks like reading /dev/zero has been using
zeromap_page_range unsafely for years.

First it zaps existing ptes, then it inserts the zero page ptes - but
only while holding mmap_sem for read: could be racing against another
thread doing the same, or against ordinary faulting.  Now, it may well
be that the program is buggy to be racing against itself in this way
(which would fit with why this hasn't been observed before - buggy
programs are exceedingly rare, aren't they ;-?) but of course it
shouldn't trigger a kernel BUG (or leak, which preceded the BUG).
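
The interleaving described above can be replayed in a deliberately
simplified userspace model. Everything in it (struct toy_pte, toy_zap,
toy_zeromap and the two scenario functions) is hypothetical and stands in
for the real page-table code only loosely; it is a sketch of the ordering
problem, not of the kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one page-table slot; 0 plays the role of pte_none(). */
struct toy_pte {
	unsigned long val;
};

/* Models zap_page_range() on a single slot. */
static void toy_zap(struct toy_pte *pte)
{
	pte->val = 0;
}

/* Models the check in zeromap_pte_range(): never overwrite a live pte. */
static bool toy_zeromap(struct toy_pte *pte, unsigned long zero_pte)
{
	if (pte->val != 0)
		return false;	/* the 2.6.19 code would hit BUG_ON() here */
	pte->val = zero_pte;
	return true;
}

/* Readers A and B both hold the lock for read, so their steps may interleave. */
static bool toy_shared_lock_race(void)
{
	struct toy_pte pte = { 0x1234 };
	bool a_ok, b_ok;

	toy_zap(&pte);				/* A: zap */
	toy_zap(&pte);				/* B: zap */
	b_ok = toy_zeromap(&pte, 0x25);		/* B: installs the zero pte */
	a_ok = toy_zeromap(&pte, 0x25);		/* A: finds the slot populated */
	return b_ok && !a_ok;			/* true: A tripped the check */
}

/* With an exclusive lock each reader's zap + zeromap pair runs unbroken. */
static bool toy_exclusive_lock(void)
{
	struct toy_pte pte = { 0x1234 };
	bool a_ok, b_ok;

	toy_zap(&pte);
	a_ok = toy_zeromap(&pte, 0x25);		/* A completes first */
	toy_zap(&pte);
	b_ok = toy_zeromap(&pte, 0x25);		/* then B completes */
	return a_ok && b_ok;
}
```

In this model toy_shared_lock_race() returns true because the second
installer finds the slot already filled, while toy_exclusive_lock() returns
true because neither installer ever sees a populated slot. An ordinary page
fault landing between the zap and the install would have the same effect as
reader B.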

Please try the simple patch below: I expect it to fix your problem.
Whether it's the right patch, I'm not quite sure: we do commonly use
zap_page_range and zeromap_page_range with mmap_sem held for write,
but perhaps we'd want to avoid such serialization in this case?

Hugh

--- 2.6.19/drivers/char/mem.c	2006-11-29 21:57:37.000000000 +0000
+++ linux/drivers/char/mem.c	2006-12-07 20:21:46.000000000 +0000
@@ -631,7 +631,7 @@ static inline size_t read_zero_pagealign

 	mm = current->mm;
 	/* Oops, this was forgotten before. -ben */
-	down_read(&mm->mmap_sem);
+	down_write(&mm->mmap_sem);

 	/* For private mappings, just map in zero pages. */
 	for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
@@ -655,7 +655,7 @@ static inline size_t read_zero_pagealign
 			goto out_up;
 	}

-	up_read(&mm->mmap_sem);
+	up_write(&mm->mmap_sem);

 	/* The shared case is hard. Let's do the conventional zeroing. */ 
 	do {
@@ -669,7 +669,7 @@ static inline size_t read_zero_pagealign

 	return size;
 out_up:
-	up_read(&mm->mmap_sem);
+	up_write(&mm->mmap_sem);
 	return size;
 }


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-07 21:22         ` Hugh Dickins
@ 2006-12-08 23:52           ` Andrew Morton
  2006-12-09  4:34             ` Hugh Dickins
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-12-08 23:52 UTC
  To: Hugh Dickins; +Cc: Ramiro Voicu, linux-mm, bugme-daemon

On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
Hugh Dickins <hugh@veritas.com> wrote:

> On Thu, 7 Dec 2006, Ramiro Voicu wrote:
> > Andrew Morton wrote:
> > > On Thu, 07 Dec 2006 06:15:23 +0100
> > > Ramiro Voicu <Ramiro.Voicu@cern.ch> wrote:
> > >>
> > >> Dec  7 06:12:11 xxxx kernel: [  319.720340] pte_val: 629025
> > > 
> > > hm.  A valid, read-only, accessed user page with a sane-looking pfn.
> > > And this is repeatable, on two different machines.
> > > 
> > > I don't know what to do, sorry.  A bisection-search would have a good
> > > chance of finding the bug, but that would be pretty painful.  It looks like
> > > you were able to hit the bug after five minutes uptime, which helps.  Is it
> > > always that easy to hit?
> > 
> > It depends ... It can take days or minutes until it happens. The program
> > is a simple FTP-like using multiple TCP Streams, implemented with Java NIO.
> 
> Interesting.  I think you needn't bother with that bisection.  I can't
> say why this started happening to you only with recent releases (timings
> changed somehow I guess): it looks like reading /dev/zero has been using
> zeromap_page_range unsafely for years.
> 
> First it zaps existing ptes, then it inserts the zero page ptes - but
> only while holding mmap_sem for read: could be racing against another
> thread doing the same, or against ordinary faulting.  Now, it may well
> be that the program is buggy to be racing against itself in this way
> (which would fit with why this hasn't been observed before - buggy
> programs are exceedingly rare, aren't they ;-?) but of course it
> shouldn't trigger a kernel BUG (or leak, which preceded the BUG).
> 
> Please try the simple patch below: I expect it to fix your problem.
> Whether it's the right patch, I'm not quite sure: we do commonly use
> zap_page_range and zeromap_page_range with mmap_sem held for write,
> but perhaps we'd want to avoid such serialization in this case?
> 
> Hugh
> 
> --- 2.6.19/drivers/char/mem.c	2006-11-29 21:57:37.000000000 +0000
> +++ linux/drivers/char/mem.c	2006-12-07 20:21:46.000000000 +0000
> @@ -631,7 +631,7 @@ static inline size_t read_zero_pagealign
>  
>  	mm = current->mm;
>  	/* Oops, this was forgotten before. -ben */
> -	down_read(&mm->mmap_sem);
> +	down_write(&mm->mmap_sem);
>  
>  	/* For private mappings, just map in zero pages. */
>  	for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
> @@ -655,7 +655,7 @@ static inline size_t read_zero_pagealign
>  			goto out_up;
>  	}
>  
> -	up_read(&mm->mmap_sem);
> +	up_write(&mm->mmap_sem);
>  	
>  	/* The shared case is hard. Let's do the conventional zeroing. */ 
>  	do {
> @@ -669,7 +669,7 @@ static inline size_t read_zero_pagealign
>  
>  	return size;
>  out_up:
> -	up_read(&mm->mmap_sem);
> +	up_write(&mm->mmap_sem);
>  	return size;
>  }
>  

Ramiro, have you had a chance to test this yet?

Thanks.


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-08 23:52           ` Andrew Morton
@ 2006-12-09  4:34             ` Hugh Dickins
  2006-12-09 17:24               ` Ramiro Voicu
  0 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2006-12-09  4:34 UTC
  To: Andrew Morton; +Cc: Ramiro Voicu, linux-mm, bugme-daemon

On Fri, 8 Dec 2006, Andrew Morton wrote:
> On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
> Hugh Dickins <hugh@veritas.com> wrote:
> > 
> > Please try the simple patch below: I expect it to fix your problem.
> > Whether it's the right patch, I'm not quite sure: we do commonly use
> > zap_page_range and zeromap_page_range with mmap_sem held for write,
> > but perhaps we'd want to avoid such serialization in this case?
> 
> Ramiro, have you had a chance to test this yet?

Here's a bigger but better patch: if you wouldn't mind,
please try this one instead, Ramiro - thanks.


Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range:
kernel bugzilla 7645.  Right: read_zero_pagealigned uses down_read of
mmap_sem, but another thread's racing read of /dev/zero, or a normal
fault, can easily set that pte again, in between zap_page_range and
zeromap_page_range getting there.  It's been wrong ever since 2.4.3.

The simple fix is to use down_write instead, but that would serialize
reads of /dev/zero more than at present: perhaps some app would be
badly affected.  So instead let zeromap_page_range return the error
instead of BUG_ON, and read_zero_pagealigned break to the slower
clear_user loop in that case - there's no need to optimize for it.

Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other
user of zeromap_page_range), though it really isn't interesting there.
And since mmap_zero wants -EAGAIN for out-of-memory, the zeromaps
better return that than -ENOMEM.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
---

 drivers/char/mem.c |   12 ++++++++----
 mm/memory.c        |   32 +++++++++++++++++++++-----------
 2 files changed, 29 insertions(+), 15 deletions(-)

--- 2.6.19/drivers/char/mem.c	2006-11-29 21:57:37.000000000 +0000
+++ linux/drivers/char/mem.c	2006-12-08 14:09:51.000000000 +0000
@@ -646,7 +646,8 @@ static inline size_t read_zero_pagealign
 			count = size;
 
 		zap_page_range(vma, addr, count, NULL);
-        	zeromap_page_range(vma, addr, count, PAGE_COPY);
+        	if (zeromap_page_range(vma, addr, count, PAGE_COPY))
+			break;
 
 		size -= count;
 		buf += count;
@@ -713,11 +714,14 @@ out:
 
 static int mmap_zero(struct file * file, struct vm_area_struct * vma)
 {
+	int err;
+
 	if (vma->vm_flags & VM_SHARED)
 		return shmem_zero_setup(vma);
-	if (zeromap_page_range(vma, vma->vm_start, vma->vm_end - vma->vm_start, vma->vm_page_prot))
-		return -EAGAIN;
-	return 0;
+	err = zeromap_page_range(vma, vma->vm_start,
+			vma->vm_end - vma->vm_start, vma->vm_page_prot);
+	BUG_ON(err == -EEXIST);
+	return err;
 }
 #else /* CONFIG_MMU */
 static ssize_t read_zero(struct file * file, char * buf, 
--- 2.6.19/mm/memory.c	2006-11-29 21:57:37.000000000 +0000
+++ linux/mm/memory.c	2006-12-08 14:09:51.000000000 +0000
@@ -1110,23 +1110,29 @@ static int zeromap_pte_range(struct mm_s
 {
 	pte_t *pte;
 	spinlock_t *ptl;
+	int err = 0;
 
 	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
 	if (!pte)
-		return -ENOMEM;
+		return -EAGAIN;
 	arch_enter_lazy_mmu_mode();
 	do {
 		struct page *page = ZERO_PAGE(addr);
 		pte_t zero_pte = pte_wrprotect(mk_pte(page, prot));
+
+		if (unlikely(!pte_none(*pte))) {
+			err = -EEXIST;
+			pte++;
+			break;
+		}
 		page_cache_get(page);
 		page_add_file_rmap(page);
 		inc_mm_counter(mm, file_rss);
-		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, zero_pte);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
 	pte_unmap_unlock(pte - 1, ptl);
-	return 0;
+	return err;
 }
 
 static inline int zeromap_pmd_range(struct mm_struct *mm, pud_t *pud,
@@ -1134,16 +1140,18 @@ static inline int zeromap_pmd_range(stru
 {
 	pmd_t *pmd;
 	unsigned long next;
+	int err;
 
 	pmd = pmd_alloc(mm, pud, addr);
 	if (!pmd)
-		return -ENOMEM;
+		return -EAGAIN;
 	do {
 		next = pmd_addr_end(addr, end);
-		if (zeromap_pte_range(mm, pmd, addr, next, prot))
-			return -ENOMEM;
+		err = zeromap_pte_range(mm, pmd, addr, next, prot);
+		if (err)
+			break;
 	} while (pmd++, addr = next, addr != end);
-	return 0;
+	return err;
 }
 
 static inline int zeromap_pud_range(struct mm_struct *mm, pgd_t *pgd,
@@ -1151,16 +1159,18 @@ static inline int zeromap_pud_range(stru
 {
 	pud_t *pud;
 	unsigned long next;
+	int err;
 
 	pud = pud_alloc(mm, pgd, addr);
 	if (!pud)
-		return -ENOMEM;
+		return -EAGAIN;
 	do {
 		next = pud_addr_end(addr, end);
-		if (zeromap_pmd_range(mm, pud, addr, next, prot))
-			return -ENOMEM;
+		err = zeromap_pmd_range(mm, pud, addr, next, prot);
+		if (err)
+			break;
 	} while (pud++, addr = next, addr != end);
-	return 0;
+	return err;
 }
 
 int zeromap_page_range(struct vm_area_struct *vma,


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-09  4:34             ` Hugh Dickins
@ 2006-12-09 17:24               ` Ramiro Voicu
  2006-12-09 18:40                 ` Hugh Dickins
  0 siblings, 1 reply; 10+ messages in thread
From: Ramiro Voicu @ 2006-12-09 17:24 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, linux-mm, bugme-daemon

It seems that this patch fixed the problem. I tested on my desktop and
the problem seems gone.

Based on what Hugh supposed, I was able to write a small Java program to
test it ... and indeed it is very possible that there was a race in the
initial app.
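
The Java test program is not included in this thread. As a rough idea of
the access pattern Hugh described (some threads reading /dev/zero into a
page-aligned private buffer while others fault the same pages), a C sketch
might look like the one below. The function names, thread count and sizes
are arbitrary choices for illustration, not taken from the original test.
On an unpatched 2.6.19 kernel a program like this could hit the BUG, so it
should only be run on a test machine.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_PAGES  64	/* arbitrary sizes, for illustration only */
#define ITERATIONS 200
#define NTHREADS   4

static char *shared_buf;
static size_t shared_len;
static size_t page_size;

/* Repeatedly read /dev/zero into the shared, page-aligned buffer. */
static void *zero_reader(void *arg)
{
	int fd = open("/dev/zero", O_RDONLY);
	int i;

	(void)arg;
	if (fd < 0)
		return (void *)1;
	for (i = 0; i < ITERATIONS; i++) {
		if (read(fd, shared_buf, shared_len) < 0) {
			close(fd);
			return (void *)1;
		}
	}
	close(fd);
	return NULL;
}

/* Read one byte per page so that ordinary faults race with the readers. */
static void *page_toucher(void *arg)
{
	const volatile char *p = shared_buf;
	size_t off;
	int i;

	(void)arg;
	for (i = 0; i < ITERATIONS; i++)
		for (off = 0; off < shared_len; off += page_size)
			(void)p[off];
	return NULL;
}

/* Returns 0 if every thread finished and the buffer ended up all zero. */
int run_zero_race(void)
{
	pthread_t tid[NTHREADS];
	long ps = sysconf(_SC_PAGESIZE);
	int started = 0, failed = 0, i;
	size_t off;

	assert(ps > 0);
	page_size = (size_t)ps;
	shared_len = BUF_PAGES * page_size;
	shared_buf = mmap(NULL, shared_len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (shared_buf == MAP_FAILED)
		return -1;

	for (i = 0; i < NTHREADS; i++) {
		if (pthread_create(&tid[i], NULL,
				   (i % 2) ? page_toucher : zero_reader, NULL)) {
			failed = 1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++) {
		void *ret = NULL;

		if (pthread_join(tid[i], &ret) || ret != NULL)
			failed = 1;
	}
	for (off = 0; off < shared_len; off++)
		if (shared_buf[off] != 0)
			failed = 1;
	munmap(shared_buf, shared_len);
	return failed ? -1 : 0;
}
```

Calling run_zero_race() from main() and checking for a zero return is
enough to drive it; build with -pthread.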

I will try to test it tomorrow on the other machine ( it is unable to
boot now after a hard reboot ), but I think the bug can be closed now.

Thank you very much for your support!

Cheers,
Ramiro

Hugh Dickins wrote:
> On Fri, 8 Dec 2006, Andrew Morton wrote:
>> On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
>> Hugh Dickins <hugh@veritas.com> wrote:
>>> Please try the simple patch below: I expect it to fix your problem.
>>> Whether it's the right patch, I'm not quite sure: we do commonly use
>>> zap_page_range and zeromap_page_range with mmap_sem held for write,
>>> but perhaps we'd want to avoid such serialization in this case?
>> Ramiro, have you had a chance to test this yet?
> 
> Here's a bigger but better patch: if you wouldn't mind,
> please try this one instead, Ramiro - thanks.
> 
> 
> Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range:
> kernel bugzilla 7645.  Right: read_zero_pagealigned uses down_read of
> mmap_sem, but another thread's racing read of /dev/zero, or a normal
> fault, can easily set that pte again, in between zap_page_range and
> zeromap_page_range getting there.  It's been wrong ever since 2.4.3.
> 
> The simple fix is to use down_write instead, but that would serialize
> reads of /dev/zero more than at present: perhaps some app would be
> badly affected.  So instead let zeromap_page_range return the error
> instead of BUG_ON, and read_zero_pagealigned break to the slower
> clear_user loop in that case - there's no need to optimize for it.
> 
> Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other
> user of zeromap_page_range), though it really isn't interesting there.
> And since mmap_zero wants -EAGAIN for out-of-memory, the zeromaps
> better return that than -ENOMEM.
> 
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
> ---
> 
>  drivers/char/mem.c |   12 ++++++++----
>  mm/memory.c        |   32 +++++++++++++++++++++-----------
>  2 files changed, 29 insertions(+), 15 deletions(-)
> 
> --- 2.6.19/drivers/char/mem.c	2006-11-29 21:57:37.000000000 +0000
> +++ linux/drivers/char/mem.c	2006-12-08 14:09:51.000000000 +0000
> @@ -646,7 +646,8 @@ static inline size_t read_zero_pagealign
>  			count = size;
>  
>  		zap_page_range(vma, addr, count, NULL);
> -        	zeromap_page_range(vma, addr, count, PAGE_COPY);
> +        	if (zeromap_page_range(vma, addr, count, PAGE_COPY))
> +			break;
>  
>  		size -= count;
>  		buf += count;
> @@ -713,11 +714,14 @@ out:
>  
>  static int mmap_zero(struct file * file, struct vm_area_struct * vma)
>  {
> +	int err;
> +
>  	if (vma->vm_flags & VM_SHARED)
>  		return shmem_zero_setup(vma);
> -	if (zeromap_page_range(vma, vma->vm_start, vma->vm_end - vma->vm_start, vma->vm_page_prot))
> -		return -EAGAIN;
> -	return 0;
> +	err = zeromap_page_range(vma, vma->vm_start,
> +			vma->vm_end - vma->vm_start, vma->vm_page_prot);
> +	BUG_ON(err == -EEXIST);
> +	return err;
>  }
>  #else /* CONFIG_MMU */
>  static ssize_t read_zero(struct file * file, char * buf, 
> --- 2.6.19/mm/memory.c	2006-11-29 21:57:37.000000000 +0000
> +++ linux/mm/memory.c	2006-12-08 14:09:51.000000000 +0000
> @@ -1110,23 +1110,29 @@ static int zeromap_pte_range(struct mm_s
>  {
>  	pte_t *pte;
>  	spinlock_t *ptl;
> +	int err = 0;
>  
>  	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
>  	if (!pte)
> -		return -ENOMEM;
> +		return -EAGAIN;
>  	arch_enter_lazy_mmu_mode();
>  	do {
>  		struct page *page = ZERO_PAGE(addr);
>  		pte_t zero_pte = pte_wrprotect(mk_pte(page, prot));
> +
> +		if (unlikely(!pte_none(*pte))) {
> +			err = -EEXIST;
> +			pte++;
> +			break;
> +		}
>  		page_cache_get(page);
>  		page_add_file_rmap(page);
>  		inc_mm_counter(mm, file_rss);
> -		BUG_ON(!pte_none(*pte));
>  		set_pte_at(mm, addr, pte, zero_pte);
>  	} while (pte++, addr += PAGE_SIZE, addr != end);
>  	arch_leave_lazy_mmu_mode();
>  	pte_unmap_unlock(pte - 1, ptl);
> -	return 0;
> +	return err;
>  }
>  
>  static inline int zeromap_pmd_range(struct mm_struct *mm, pud_t *pud,
> @@ -1134,16 +1140,18 @@ static inline int zeromap_pmd_range(stru
>  {
>  	pmd_t *pmd;
>  	unsigned long next;
> +	int err;
>  
>  	pmd = pmd_alloc(mm, pud, addr);
>  	if (!pmd)
> -		return -ENOMEM;
> +		return -EAGAIN;
>  	do {
>  		next = pmd_addr_end(addr, end);
> -		if (zeromap_pte_range(mm, pmd, addr, next, prot))
> -			return -ENOMEM;
> +		err = zeromap_pte_range(mm, pmd, addr, next, prot);
> +		if (err)
> +			break;
>  	} while (pmd++, addr = next, addr != end);
> -	return 0;
> +	return err;
>  }
>  
>  static inline int zeromap_pud_range(struct mm_struct *mm, pgd_t *pgd,
> @@ -1151,16 +1159,18 @@ static inline int zeromap_pud_range(stru
>  {
>  	pud_t *pud;
>  	unsigned long next;
> +	int err;
>  
>  	pud = pud_alloc(mm, pgd, addr);
>  	if (!pud)
> -		return -ENOMEM;
> +		return -EAGAIN;
>  	do {
>  		next = pud_addr_end(addr, end);
> -		if (zeromap_pmd_range(mm, pud, addr, next, prot))
> -			return -ENOMEM;
> +		err = zeromap_pmd_range(mm, pud, addr, next, prot);
> +		if (err)
> +			break;
>  	} while (pud++, addr = next, addr != end);
> -	return 0;
> +	return err;
>  }
>  
>  int zeromap_page_range(struct vm_area_struct *vma,


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-09 17:24               ` Ramiro Voicu
@ 2006-12-09 18:40                 ` Hugh Dickins
  2006-12-11 18:56                   ` Ramiro Voicu
  0 siblings, 1 reply; 10+ messages in thread
From: Hugh Dickins @ 2006-12-09 18:40 UTC (permalink / raw)
  To: Ramiro Voicu; +Cc: Andrew Morton, linux-mm, bugme-daemon

On Sat, 9 Dec 2006, Ramiro Voicu wrote:
> Hugh Dickins wrote:
> > On Fri, 8 Dec 2006, Andrew Morton wrote:
> >> On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
> >> Ramiro, have you had a chance to test this yet?
> > 
> > Here's a bigger but better patch: if you wouldn't mind,
> > please try this one instead, Ramiro - thanks.
> 
> It seems that this patch fixed the problem. I tested on my desktop and
> the problem seems gone.

Great, thanks.  Well, actually it's trivial that it has fixed
the problem, in that it removed that particular BUG_ON: what's more
important is that it then allowed your program to work as usual, good.
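
A rough user-space model of that fallback, purely illustrative:
model_fast_zero stands in for the zap_page_range + zeromap_page_range
step and fails with -EEXIST on one chosen "raced" page, and the memset
loop stands in for the slower clear_user path. MODEL_PAGE and both
function names are invented for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE 16	/* hypothetical page size, tiny for illustration */

/* Fast path for one page; pretends another thread won the race on raced_page. */
static int model_fast_zero(char *page, size_t index, size_t raced_page)
{
	if (index == raced_page)
		return -EEXIST;
	memset(page, 0, MODEL_PAGE);
	return 0;
}

/*
 * Zeroes all npages of buf.  Returns how many pages the fast path handled
 * before it gave up; the remainder is cleared by the slow loop.
 */
static size_t model_read_zero(char *buf, size_t npages, size_t raced_page)
{
	size_t done = 0;
	size_t fast;

	assert(buf != NULL);
	while (done < npages) {
		if (model_fast_zero(buf + done * MODEL_PAGE, done, raced_page))
			break;		/* fall back instead of BUG */
		done++;
	}
	fast = done;
	for (; done < npages; done++)	/* models the clear_user loop */
		memset(buf + done * MODEL_PAGE, 0, MODEL_PAGE);
	return fast;
}
```

Either way the caller ends up with a fully zeroed buffer; only the
amount of work done on the fast path differs.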

> 
> Based on what Hugh supposed, I was able to write a small Java program to
> test it ... and indeed it is very possible that there was a race in the
> initial app.
> 
> I will try to test it tomorrow on the other machine ( it is unable to
> boot now after a hard reboot ), but I think the bug can be closed now.
> 
> Thank you very much for your support!

Thank _you_ very much for reporting and testing:
it's a pleasure to deal with bugs we can fix so easily!

Hugh


* Re: [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124
  2006-12-09 18:40                 ` Hugh Dickins
@ 2006-12-11 18:56                   ` Ramiro Voicu
  0 siblings, 0 replies; 10+ messages in thread
From: Ramiro Voicu @ 2006-12-11 18:56 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, linux-mm, bugme-daemon

Hi,

 I've tested the patch on the other machine also and it works as
expected, so the bug can be closed now.

 Thank you very much for your support!

Regards,
Ramiro

Hugh Dickins wrote:
> On Sat, 9 Dec 2006, Ramiro Voicu wrote:
>> Hugh Dickins wrote:
>>> On Fri, 8 Dec 2006, Andrew Morton wrote:
>>>> On Thu, 7 Dec 2006 21:22:57 +0000 (GMT)
>>>> Ramiro, have you had a chance to test this yet?
>>> Here's a bigger but better patch: if you wouldn't mind,
>>> please try this one instead, Ramiro - thanks.
>> It seems that this patch fixed the problem. I tested on my desktop and
>> the problem seems gone.
> 
> Great, thanks.  Well, actually it's trivial that it has fixed
> the problem, in that it removed that particular BUG_ON: what's more
> important is that it then allowed your program to work as usual, good.
> 
>> Based on what Hugh supposed, I was able to write a small Java program to
>> test it ... and indeed it is very possible that there was a race in the
>> initial app.
>>
>> I will try to test it tomorrow on the other machine ( it is unable to
>> boot now after a hard reboot ), but I think the bug can be closed now.
>>
>> Thank you very much for your support!
> 
> Thank _you_ very much for reporting and testing:
> it's a pleasure to deal with bugs we can fix so easily!
> 
> Hugh


end of thread, last message 2006-12-11 18:56 UTC

Thread overview: 10+ messages
     [not found] <200612070355.kB73tGf4021820@fire-2.osdl.org>
2006-12-07  4:12 ` [Bugme-new] [Bug 7645] New: Kernel BUG at mm/memory.c:1124 Andrew Morton
2006-12-07  5:15   ` Ramiro Voicu
2006-12-07  7:03     ` Andrew Morton
2006-12-07 14:54       ` Ramiro Voicu
2006-12-07 21:22         ` Hugh Dickins
2006-12-08 23:52           ` Andrew Morton
2006-12-09  4:34             ` Hugh Dickins
2006-12-09 17:24               ` Ramiro Voicu
2006-12-09 18:40                 ` Hugh Dickins
2006-12-11 18:56                   ` Ramiro Voicu
