Re: 2.5.66-mm1 - Mike Galbraith

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Mike Galbraith <efault@gmx.de>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: Ingo Molnar <mingo@elte.hu>, Andrew Morton <akpm@digeo.com>,
	Ed Tomlinson <tomlins@cam.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: 2.5.66-mm1
Date: Fri, 28 Mar 2003 17:01:21 +0100	[thread overview]
Message-ID: <5.2.0.9.2.20030328164837.019c0968@pop.gmx.net> (raw)
In-Reply-To: <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecen de.com>

[-- Attachment #1: Type: text/plain, Size: 3414 bytes --]

At 09:56 AM 3/28/2003 -0500, Zwane Mwaikambo wrote:
>On Fri, 28 Mar 2003, Mike Galbraith wrote:
>
> > >hm, this is an 'impossible' scenario from the scheduler code POV. Whenever
> > >we deactivate a task, we remove it from the runqueue and set p->array to
> > >NULL. Whenever we activate a task again, we set p->array to non-NULL. A
> > >double-deactivate is not possible. I tried to reproduce it with various
> > >scheduler workloads, but didnt succeed.
> > >
> > >Mike, do you have a backtrace of the crash you saw?
> >
> > No, I didn't save it due to "grubby fingerprints".
>
>Hmm i think i may have his this one but i never posted due to being unable
>to reproduce it on a vanilla kernel or the same kernel afterwards (which
>was hacked so i won't vouch for it's cleanliness). I think preempt
>might have bitten him in a bad place (mine is also CONFIG_PREEMPT), is it
>possible that when we did the task_rq_unlock we got preempted and when we
>got back we used the local variable requeue_waker which was set before
>dropping the lock, and therefore might not be valid anymore due to
>scheduler decisions done after dropping the runqueue lock?

Dunno.  I did have one lying around.  The attached one was while printing 
out array switch latency after starvation timeout.  Others happened while 
printing wakeup stats for p->state > 1 tasks in scheduler_tick() [under 
lock w/ wakeup disabled in printk.c].  It's nothing I did to the scheduler 
;-) I don't think, but this was in 65-mm3-twiddle-twiddle-twiddle.

>Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
>c011b8d9
>*pde = 00000000
>Oops: 0000 [#1]
>CPU:    0
>EIP:    0060:[<c011b8d9>]    Not tainted
>EFLAGS: 00010046
>EIP is at try_to_wake_up+0x1e9/0x4f0
>eax: c055a000   ebx: c04e5aa0   ecx: c0552fc0   edx: c04e5aa0
>esi: 00000000   edi: 00000000   ebp: c055bee4   esp: c055beb8
>ds: 007b   es: 007b   ss: 0068
>Process swapper (pid: 0, threadinfo=c055a000 task=c04e5aa0)
>Stack: 00000001 c055a000 c0552fc0 00000000 cb1a0000 00000001 00000001 
>00000002
>        00000000 c04e88e4 00000001 c055bf08 c011d172 c1694700 00000001 
> 00000000
>        c04e88e4 c04e88dc c055a000 00000001 c055bf3c c011d203 c04e88dc 
> 00000001
>Call Trace:
>  [<c011d172>] __wake_up_common+0x32/0x60
>  [<c011d203>] __wake_up+0x63/0xb0
>  [<c0122fb5>] release_console_sem+0x165/0x170
>  [<c0122d7b>] printk+0x1eb/0x270
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c01163f2>] smp_call_function_interrupt+0x42/0xb0
>  [<c015e210>] invalidate_bh_lru+0x0/0x60
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c010a41a>] call_function_interrupt+0x1a/0x20
>  [<c0106eb0>] default_idle+0x0/0x40
>  [<c0106ede>] default_idle+0x2e/0x40
>  [<c0106f6a>] cpu_idle+0x3a/0x50
>  [<c0105000>] rest_init+0x0/0x80
>
>Code: 8b 06 48 89 06 8b 4a 24 8b 42 20 89 01 89 48 04 8b 4a 18 8d
>
>0xc011b8d9 is in try_to_wake_up (kernel/sched.c:282).
>277     /*
>278      * Adding/removing a task to/from a priority array:
>279      */
>280     static inline void dequeue_task(struct task_struct *p, 
>prio_array_t *array)
>281     {
>282             array->nr_active--;
>283             list_del(&p->run_list);
>284             if (list_empty(array->queue + p->prio))
>285                     __clear_bit(p->prio, array->bitmap);
>286     }

Same spot.

         -Mike 

[-- Attachment #2: oops.txt --]
[-- Type: text/plain, Size: 3183 bytes --]

Loglevel set to 9
hmm.. 289 ms
hmm.. 6 ms
hmm.. 4 ms
hmm.. 7 ms
hmm.. 13 ms
hmm.. 15 ms
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c0114d0a
*pde = 00000000
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c0114d0a>]    Not tainted VLI
EFLAGS: 00010006
EIP is at try_to_wake_up+0x1e2/0x258
eax: 00000008   ebx: c02cb3c8   ecx: c0dcf360   edx: c0dcf360
esi: c0c24000   edi: 00000000   ebp: c0c25ed4   esp: c0c25eb8
ds: 007b   es: 007b   ss: 0068
Process gcc (pid: 592, threadinfo=c0c24000 task=c0dcf360)
Stack: 00000001 00000001 c0298ff4 c0c25ed0 00000001 00000001 00000002 c0c25ee8 
       c0115887 c7b8a0a0 00000003 00000000 c0c25f08 c01158c2 c2d81e5c 00000003 
       00000000 c0c24000 00000082 c0298fe8 c0c25f20 c011594a c0298ff0 00000003 
Call Trace:
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

Code: ff 48 14 8b 40 08 a8 08 74 07 e8 3e 0b 00 00 89 f6 85 f6 74 7e 8b 55 f0 9c 8f 02 fa be 00 e0 ff ff 21 e6 ff 46 14 8b 16 8b 7a 28 <ff> 0f 8b 42 20 8b 4a 24 89 48 04 89 01 8b 52 18 8d 44 d7 18 39 

 (gdb) list *try_to_wake_up+0x1e2
 0x26a is in try_to_wake_up (kernel/sched.c:310).
 305     /*
 306      * Adding/removing a task to/from a priority array:
 307      */
 308     static inline void dequeue_task(struct task_struct *p, prio_array_t *array)
 309     {
 310             array->nr_active--;
 311             list_del(&p->run_list);
 312             if (list_empty(array->queue + p->prio))
 313                     __clear_bit(p->prio, array->bitmap);
 314     }
 (gdb)
 <6>note: gcc[592] exited with preempt_count 5
bad: scheduling while atomic!
Call Trace:
 [<c01154f0>] schedule+0x3c/0x378
 [<c0134b26>] unmap_vmas+0xea/0x1e0
 [<c011647b>] __cond_resched+0x17/0x1c
 [<c0134b86>] unmap_vmas+0x14a/0x1e0
 [<c0137fb8>] exit_mmap+0x64/0x158
 [<c0116dbd>] mmput+0x55/0x74
 [<c011a368>] do_exit+0x158/0x3b4
 [<c0109267>] die+0x87/0x88
 [<c0114068>] do_page_fault+0x2d8/0x404
 [<c0113d90>] do_page_fault+0x0/0x404
 [<c011b91a>] do_softirq+0x5a/0xac
 [<c010a170>] do_IRQ+0xfc/0x118
 [<c012e173>] __rmqueue+0xa3/0x10c
 [<c012e21f>] rmqueue_bulk+0x43/0x6c
 [<c0108d69>] error_code+0x2d/0x38
 [<c0114d0a>] try_to_wake_up+0x1e2/0x258
 [<c0115887>] default_wake_function+0x17/0x1c
 [<c01158c2>] __wake_up_common+0x36/0x50
 [<c011594a>] __wake_up_locked+0xe/0x14
 [<c0107cdc>] __down_trylock+0x34/0x54
 [<c0107d1b>] __down_failed_trylock+0x7/0xc
 [<c011928b>] .text.lock.printk+0x5/0x2a
 [<c01155f0>] schedule+0x13c/0x378
 [<c011ab07>] sys_wait4+0xab/0x234
 [<c011ac5d>] sys_wait4+0x201/0x234
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0115870>] default_wake_function+0x0/0x1c
 [<c0108b5f>] syscall_call+0x7/0xb

hmm.. 42 ms
hmm.. 24 ms
hmm.. 33 ms
hmm.. 23 ms
hmm.. 31 ms
hmm.. 30 ms
hmm.. 30 ms

     prev parent reply	other threads:[~2003-03-28 16:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-03-26  9:38 2.5.66-mm1 Andrew Morton
2003-03-28  2:06 ` 2.5.66-mm1 Ed Tomlinson
2003-03-28  4:59   ` 2.5.66-mm1 Andrew Morton
2003-03-28 10:45     ` 2.5.66-mm1 Ingo Molnar
     [not found]     ` <Pine.LNX.4.44.0303281139500.6678-100000@localhost.localdom ain>
2003-03-28 14:26       ` 2.5.66-mm1 Mike Galbraith
2003-03-28 14:56         ` 2.5.66-mm1 Zwane Mwaikambo
2003-03-28 15:25           ` 2.5.66-mm1 Ingo Molnar
     [not found]           ` <Pine.LNX.4.44.0303281619530.9943-100000@localhost.localdom ain>
2003-03-28 16:05             ` 2.5.66-mm1 Mike Galbraith
     [not found]     ` <Pine.LNX.4.50.0303280942420.2884-100000@montezuma.mastecen de.com>
2003-03-28 16:01       ` Mike Galbraith [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5.2.0.9.2.20030328164837.019c0968@pop.gmx.net \
    --to=efault@gmx.de \
    --cc=akpm@digeo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=tomlins@cam.org \
    --cc=zwane@linuxpower.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox