From mboxrd@z Thu Jan 1 00:00:00 1970 Subject: Oops in kernel 2.4.19-pre10-ac2-preempt Message-ID: From: Mark_H_Johnson@Raytheon.com Date: Thu, 20 Jun 2002 11:01:22 -0500 MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii Sender: owner-linux-mm@kvack.org Return-Path: To: kpreempt-tech@lists.sourceforge.net, linux-mm@kvack.org Cc: Robert_Horton@Raytheon.com, James_P_Cassidy@Raytheon.com, Stanley_R_Allen@Raytheon.com List-ID: I'd like to put out a warning on using the preempt-kernel-2.4.19-pre10-ac2.patch. The best I can determine, the pre10-ac2 kernel includes the new rmap memory management. However, there was a message on the linux-mm mailing list from Bill Irwin stating... >On Wed, Jun 19, 2002 at 04:18:00AM -0700, Craig Kulesa wrote: >> Where: http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/ >> This patch implements Rik van Riel's patches for a reverse mapping VM >> atop the 2.5.23 kernel infrastructure. The principal sticky bits in > >There is a small bit of trouble here: pte_chain_lock() needs to >preempt_disable() and pte_chain_unlock() needs to preempt_enable(), >as they are meant to protect critical sections. This was in reference to a port of rmap to the 2.5 kernel series, but I think this also applies to rmap on 2.4 as well. I highly recommend avoiding applying the kernel preemption patches to this kernel (2.4.19-pre10-ac2) until the preemption cleanup is completed on rmap. Since I have two different Oopses, I can't tell if the fix posted on the linux-mm is complete or not. In testing yesterday, we got a pair of oops as follows. The first one is... kernel BUG at rmap.c:267! invalid operand: 0000 CPU: 1 EIP: 0010 [] not tainted EFLAGS: 00010202 eax: 01014099 ebx: c16c0e70 ecx: c16c0e70 edx: 00000010 esi: c16c0e70 edi: d1720000 ebp: 00000000 esp: d1721f44 ds: 0018 es: 0018 ss: 0018 Process kswapd (pid: 7, stack page = d1721000) Stack : c16c0e70 c16c0e8c d1720000 d1720000 00000000 c0288824 c013a054 0000107d 000001d0 c0288824 00000000 c16coe86 00000000 c0288854 d1720000 0000107d ... Call trace [] [] [] [] [] Code: 0f 0b 0b 01 40 0a 25 c0 8b 46 18 a8 01 75 08 0f 0b 0d 01 40 <3> kswapd[7] exited with preempt_count 1 The second one is... Unable to handle kernel NULL pointer deference at virtual address 00000078 printing EIP: c0141a4b *pde=00000000 Oops: 0000 CPU: 1 EIP: 0010 [] not tainted EFLAGS: 00010206 eax: 00000000 ebx: c1000030 ecx: d1d5dbe0 edx: 0001bd33 esi: c13f234c edi: d1720000 ebp: d1720000 esp: d1721f74 ds: 0018 es: 0018 ss: 0018 Process kswapd (pid: 7, stack page = d1721000) Stack: c13f2330 c013a9f6 00000000 000037f0 c0288824 0008e000 00000000 00000000 c028884c d1720000 00000000 00000000 00000000 00000000 00000000 00000000 d1720000 00000000 00000000 000037f0 000001e9 c013b43d c0288b24 00000006 Call trace [] [] [] Code: 8b 50 78 85 d2 74 05 39 50 64 77 09 31 c0 eb 10 90 8d 74 26 <3> kswapd[7] exited with preempt_count 1 I also had problems with unmounting disks at shutdown - occasionally the system shutdown script would get stuck at that step and require a hard reboot as well. It may be related to rmap race conditions if cache flushes from memory to disk were getting screwed up. For now, we have gone back to a 2.4.16 based kernel for further work w/ kernel preemption. Let me know if you need additional information to help diagnose the problems. I'm going on vacation for a few weeks "tonight", but I can check on messages & arrange for someone to forward additional data if needed. --Mark H Johnson -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/