linux-mm.kvack.org archive mirror
* 2.4.0-t9p7 and mmap002 - freeze
@ 2000-09-27 21:21 Roger Larsson
  2000-09-28  6:31 ` Mike Galbraith
From: Roger Larsson @ 2000-09-27 21:21 UTC
  To: linux-kernel, linux-mm, Rik van Riel

Hi,

Tried latest patch with the same result - freeze...

No extra patches added.

running from console as root
mmap002 from memtest-0.0.3
with RAMSIZE defined as 90 MB (I have 96MB)
after a while with heavy disk access (thrashing?) the drive
becomes silent - no more progress...
[if you cannot reproduce this, try with less memory (32 MB)...]

Magic works!

Magic memory
 Constantly LOW on inactive_clean (0 is the most common)
 lots of shared memory (almost equals active)
 [can be a normal condition, since mmap002 produces dirty
  mmapped pages]

Magic process:
  Manual samples gave the following locations.
  (NOTE: not a call trace)
  We are trying to clean pages, but do we make any
  progress since disk is silent?

Trace; c0127d85 <page_launder+3d/724>
Trace; c0126dad <deactivate_page_nolock+13d/248>
Trace; c0127e00 <page_launder+b8/724>
Trace; c0128035 <page_launder+2ed/724>
Trace; c0127dcc <page_launder+84/724>
Trace; c0127dd0 <page_launder+88/724>
Trace; c0127e00 <page_launder+b8/724>
Trace; c012fd38 <try_to_free_buffers+4/138>

Magic Sigterm (Alt+SysRq+E)
 Gives you a running system again.


Notes:
 Entry into this state is probably timing critical,
 since adding a few printk()s makes it happen less often.
 I have even had complete mmap002 runs succeed, but the
 disk runs too much and for too long: a lot more than
 10 min, whereas a normal run on previous testX kernels
 usually took less than 3 minutes.

/RogerL

--
Home page:
  http://www.norran.net/nra02596/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

* Re: 2.4.0-t9p7 and mmap002 - freeze
  2000-09-27 21:21 2.4.0-t9p7 and mmap002 - freeze Roger Larsson
@ 2000-09-28  6:31 ` Mike Galbraith
  2000-09-28 10:12   ` Rik van Riel
From: Mike Galbraith @ 2000-09-28  6:31 UTC
  To: Roger Larsson; +Cc: linux-kernel, linux-mm, Rik van Riel

On Wed, 27 Sep 2000, Roger Larsson wrote:

> Hi,
> 
> Tried latest patch with the same result - freeze...

Ditto.

> No extra patches added.

Ditto.

> running from console as root
> mmap002 from memtest-0.0.3
> with RAMSIZE defined as 90 MB (I have 96MB)
> after a while with heavy disk access (thrashing?) the drive
> becomes silent - no more progress...
> [if you cannot reproduce this, try with less memory (32 MB)...]

I'm using a little proggy from Christoph Rohland (swptst.c), and
do not have to jump through any hoops to reproduce the freeze.

> Magic works!
> 
> Magic memory
>  Constantly LOW on inactive_clean (0 is the most common)
>  lots of shared memory (almost equals active)
>  [can be a normal condition, since mmap002 produces dirty
>   mmapped pages]
> 
> Magic process:
>   Manual samples gave the following locations.
>   (NOTE: not a call trace)

If a kdb call trace will help (doubt it.. see below) I can post one.

>   We are trying to clean pages, but do we make any
>   progress since disk is silent?
> 
> Trace; c0127d85 <page_launder+3d/724>
> Trace; c0126dad <deactivate_page_nolock+13d/248>
> Trace; c0127e00 <page_launder+b8/724>
> Trace; c0128035 <page_launder+2ed/724>
> Trace; c0127dcc <page_launder+84/724>
> Trace; c0127dd0 <page_launder+88/724>
> Trace; c0127e00 <page_launder+b8/724>
> Trace; c012fd38 <try_to_free_buffers+4/138>
> 
> Magic Sigterm (Alt+SysRq+E)
>  Gives you a running system again.

Not here.  I looked at it with an IKD kernel, and here it's the same
loop as before.. __alloc_pages() running through try_again forever.
inactive_clean=0, a few pages bouncing between active and inactive_dirty.
__switch_to() never happens.  (though I can artificially yield and thus
make sysrq-e work.  Artificially scheduling only ensures that all other
tasks loop the same way. [coz inactive_clean=0.. page_launder() is always
failing to find something freeable])

> Notes:
>  Entry into this state is probably timing critical,
>  since adding a few printk()s makes it happen less often.
>  I have even had complete mmap002 runs succeed, but the
>  disk runs too much and for too long: a lot more than
>  10 min, whereas a normal run on previous testX kernels
>  usually took less than 3 minutes.

I'm still at < 1 minute survival.. with many seconds to spare ;-)
I have yet to have a run succeed, though virgin source _does_ last
a bit longer (odd) than a KDB-enabled kernel.

	-Mike


* Re: 2.4.0-t9p7 and mmap002 - freeze
  2000-09-28  6:31 ` Mike Galbraith
@ 2000-09-28 10:12   ` Rik van Riel
  2000-09-28 15:12     ` Mike Galbraith
From: Rik van Riel @ 2000-09-28 10:12 UTC
  To: Mike Galbraith; +Cc: Roger Larsson, linux-kernel, linux-mm

On Thu, 28 Sep 2000, Mike Galbraith wrote:
> On Wed, 27 Sep 2000, Roger Larsson wrote:
> 
> > Tried latest patch with the same result - freeze...
> 
> Ditto.

I'm finally back from Linux Kongress and Linux Expo and
will look at the latest tree and integrate the fixes I
made while on the road later today (after I get some
sleep).

I have fixed this particular bug, which was caused by
us moving unfreeable pages to the inactive_dirty list
and back again, while not accomplishing anything useful.

The fix for this is trivial and I'll post it later
today (cleaned up and working in the current source
tree).
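
To make the failure mode concrete, here is a toy user-space model of
the loop described above. It is only a sketch under stated assumptions,
not the 2.4.0-test9 code: every name (toy_page, toy_deactivate,
toy_launder, toy_alloc) is hypothetical, and the retry bound exists
only so the demonstration terminates. It does not show the actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code: each page sits on one list. */
enum toy_list { TOY_ACTIVE, TOY_INACTIVE_DIRTY, TOY_INACTIVE_CLEAN };

struct toy_page {
    enum toy_list list;
    bool freeable;  /* false models a page that cannot be cleaned or freed */
};

/* Models deactivation: move every active page to inactive_dirty. */
static void toy_deactivate(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].list == TOY_ACTIVE)
            pages[i].list = TOY_INACTIVE_DIRTY;
}

/*
 * Models laundering: freeable dirty pages become inactive_clean,
 * unfreeable ones bounce straight back to the active list.
 * Returns the number of pages that became clean.
 */
static size_t toy_launder(struct toy_page *pages, size_t n)
{
    size_t cleaned = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].list != TOY_INACTIVE_DIRTY)
            continue;
        if (pages[i].freeable) {
            pages[i].list = TOY_INACTIVE_CLEAN;
            cleaned++;
        } else {
            pages[i].list = TOY_ACTIVE;
        }
    }
    return cleaned;
}

/*
 * Retry loop with an explicit bound. Returns the pass on which a clean
 * page first appeared, or -1 if max_tries passes made no progress.
 */
static int toy_alloc(struct toy_page *pages, size_t n, int max_tries)
{
    for (int pass = 1; pass <= max_tries; pass++) {
        toy_deactivate(pages, n);
        if (toy_launder(pages, n) > 0)
            return pass;
    }
    return -1;
}
```

With only unfreeable pages, every pass moves the same pages to
inactive_dirty and back, the clean count stays at 0, and toy_alloc()
gives up only because of the artificial bound. A retry loop without
such a bound would spin forever on the same input.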

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/		http://www.surriel.com/


* Re: 2.4.0-t9p7 and mmap002 - freeze
  2000-09-28 10:12   ` Rik van Riel
@ 2000-09-28 15:12     ` Mike Galbraith
  2000-09-29 14:54       ` Rik van Riel
From: Mike Galbraith @ 2000-09-28 15:12 UTC
  To: Rik van Riel; +Cc: Roger Larsson, linux-kernel, linux-mm

On Thu, 28 Sep 2000, Rik van Riel wrote:

> On Thu, 28 Sep 2000, Mike Galbraith wrote:
> > On Wed, 27 Sep 2000, Roger Larsson wrote:
> > 
> > > Tried latest patch with the same result - freeze...
> > 
> > Ditto.
> 
> I'm finally back from Linux Kongress and Linux Expo and
> will look at the latest tree and integrate the fixes I
> made while on the road later today (after I get some
> sleep).
> 
> I have fixed this particular bug, which was caused by
> us moving unfreeable pages to the inactive_dirty list
> and back again, while not accomplishing anything useful.
> 
> The fix for this is trivial and I'll post it later
> today (cleaned up and working in the current source
> tree).

Cool!

I've had a tiny bit of success (swptst _passed_ once, and currently
locks with 1 inactive_clean page instead of always 0;) by fiddling
with __alloc_pages() a bit.

One thing that I _think_ may be a problem is using stale information.
direct_reclaim is set once, it's set without checking that a reclaim
is possible, and it's not updated as we proceed although the situation
may change.
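
The stale-information concern can be illustrated in isolation with a
small user-space sketch. This is an assumption-laden toy, not how
direct_reclaim is actually computed in __alloc_pages(): toy_zone,
toy_background_work, toy_alloc_stale and toy_alloc_fresh are all
hypothetical names. It only contrasts a decision sampled once before
a retry loop with one re-evaluated on every retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state, not a kernel structure. */
struct toy_zone {
    int inactive_clean;      /* pages that could be reclaimed right now */
    int passes_until_clean;  /* other work yields a clean page after this many passes */
};

/* Stand-in for other work (e.g. a laundering pass) that changes the state. */
static void toy_background_work(struct toy_zone *z)
{
    if (z->passes_until_clean > 0 && --z->passes_until_clean == 0)
        z->inactive_clean++;
}

/* Decision sampled once before the loop: it can go stale. */
static bool toy_alloc_stale(struct toy_zone *z, int max_tries)
{
    bool can_reclaim = z->inactive_clean > 0;  /* never refreshed */

    for (int i = 0; i < max_tries; i++) {
        if (can_reclaim && z->inactive_clean > 0) {
            z->inactive_clean--;
            return true;
        }
        toy_background_work(z);
    }
    return false;
}

/* Decision re-evaluated on every retry. */
static bool toy_alloc_fresh(struct toy_zone *z, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        bool can_reclaim = z->inactive_clean > 0;

        if (can_reclaim) {
            z->inactive_clean--;
            return true;
        }
        toy_background_work(z);
    }
    return false;
}
```

Starting from a zone with no clean pages where one appears after two
passes, the stale variant never takes it and fails, while the fresh
variant succeeds on the third pass.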

Another thing I'm curious about is increasing memory pressure in the
event of an allocation failure (retry).  Why do we do that?

Comments?

	-Mike (down periscope.. ahead dead slow;)

P.S.  in buffer.c, we do a LockPage(), but no UnlockPage() in the
case of no_buffer_head.. is that correct?


* Re: 2.4.0-t9p7 and mmap002 - freeze
  2000-09-28 15:12     ` Mike Galbraith
@ 2000-09-29 14:54       ` Rik van Riel
From: Rik van Riel @ 2000-09-29 14:54 UTC
  To: Mike Galbraith; +Cc: Roger Larsson, linux-kernel, linux-mm

On Thu, 28 Sep 2000, Mike Galbraith wrote:

> Another thing I'm curious about is increasing memory pressure in
> the event of an allocation failure (retry).  Why do we do that?

We were short on free memory, so kswapd should work /harder/
to keep up with the current load.

> P.S.  in buffer.c, we do a LockPage(), but no UnlockPage() in
> the case of no_buffer_head.. is that correct?

No it isn't ;)  Thanks for pointing out this one...
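
The general shape of this kind of bug, a lock taken and then an early
error exit that skips the unlock, can be sketched in user space. This
is not the buffer.c code: toy_locked_page, toy_lock, toy_unlock and
both toy_write_* functions are hypothetical, and the single-exit-label
pattern shown is one common way to keep lock and unlock balanced, not
necessarily the fix that was applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a lock bit. */
struct toy_locked_page {
    bool locked;
};

static void toy_lock(struct toy_locked_page *p)   { p->locked = true; }
static void toy_unlock(struct toy_locked_page *p) { p->locked = false; }

/* Buggy shape: the early error return leaves the page locked. */
static int toy_write_leaky(struct toy_locked_page *p, bool have_buffer_head)
{
    toy_lock(p);
    if (!have_buffer_head)
        return -1;  /* lock is never released on this path */
    /* ... work on the locked page would go here ... */
    toy_unlock(p);
    return 0;
}

/* Single exit label, so every path releases the lock. */
static int toy_write_balanced(struct toy_locked_page *p, bool have_buffer_head)
{
    int ret = 0;

    toy_lock(p);
    if (!have_buffer_head) {
        ret = -1;
        goto out;
    }
    /* ... work on the locked page would go here ... */
out:
    toy_unlock(p);
    return ret;
}
```

After a failed toy_write_leaky() call the page is still marked locked,
so any later locker would wait forever. toy_write_balanced() returns
the same error but leaves the page unlocked.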

regards,

Rik

