VM trouble, both 2.4 and 2.5

* VM trouble, both 2.4 and 2.5
From: Rene Herman @ 2002-11-15 22:21 UTC
  To: linux-mm; +Cc: Andrew Morton, Con Kolivas

Hi Andrew, all ...

All of 2.4.19, 2.4.19-rmap14b, 2.5.47 and 2.5.47-mm3 would appear to have a 
problem reclaiming memory. On all of these kernels a "dd" with a large 
blocksize "misplaces memory" here:

rene@7ixe4:~$ cat /proc/sys/vm/overcommit_memory
0

rene@7ixe4:~$ cat /proc/meminfo
MemTotal:       776156 kB
MemFree:        667416 kB
MemShared:           0 kB
Buffers:          7088 kB
Cached:          61564 kB
SwapCached:          0 kB
Active:          41652 kB
Inactive:        46584 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       776156 kB
LowFree:        667416 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:             104 kB
Writeback:           0 kB
Mapped:          34224 kB
Slab:             6068 kB
Committed_AS:    34864 kB
PageTables:        668 kB
ReverseMaps:     31359

rene@7ixe4:~$ dd if=/dev/zero of=/tmp/zero bs=512M count=1
1+0 records in
1+0 records out

rene@7ixe4:~$ dd if=/dev/zero of=/tmp/zero bs=512M count=1
dd: memory exhausted

rene@7ixe4:~$ cat /proc/meminfo
MemTotal:       776156 kB
MemFree:        412112 kB
MemShared:           0 kB
Buffers:          7668 kB
Cached:          61564 kB
SwapCached:          0 kB
Active:          42168 kB
Inactive:       296572 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       776156 kB
LowFree:        412112 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:             440 kB
Writeback:           0 kB
Mapped:          34228 kB
Slab:            10932 kB
Committed_AS:    34868 kB
PageTables:        668 kB
ReverseMaps:     31360

The first dd above ate some 250M that /proc/meminfo only accounts for under 
Inactive, and the second "dd" then fails to allocate its buffer (bs=512M 
large) and exits with "memory exhausted". That number varies wildly: I have 
also seen it eat 400M and more, and sometimes significantly less, in which 
case the second dd still succeeds but the third or fourth dies. You can 
continue this process, choosing a smaller bs= each time (< MemFree), until 
almost all memory is under "Inactive" and every non-tiny allocation fails.
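
The shift between the two /proc/meminfo snapshots can be checked with a
short script. This is a minimal sketch: the helper names parse_meminfo and
meminfo_delta are made up for illustration, and only a handful of fields
from the snapshots above are copied in.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def meminfo_delta(before, after):
    """Return only the fields whose value changed, as after - before."""
    return {
        key: after[key] - before[key]
        for key in before
        if key in after and after[key] != before[key]
    }


# Values in kB, copied from the two snapshots in the report.
BEFORE = """MemFree:        667416 kB
Buffers:          7088 kB
Cached:          61564 kB
Inactive:        46584 kB
Slab:             6068 kB
"""
AFTER = """MemFree:        412112 kB
Buffers:          7668 kB
Cached:          61564 kB
Inactive:       296572 kB
Slab:            10932 kB
"""

delta = meminfo_delta(parse_meminfo(BEFORE), parse_meminfo(AFTER))
# MemFree drops by 255304 kB, Inactive grows by 249988 kB,
# and Cached does not change at all.
print(delta)
```

Nearly all of the memory that left MemFree shows up under Inactive, and
none of it under Cached.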

Note: the above is without any swap enabled to show the problem more clearly, 
but it also happens with swap.

The real fun bit is that you can now get your memory back (putting it back in 
"Cached", where I guess it should have been in the first place?) by doing 
something like "ls -lR /". Upon hearing that, Rik van Riel noted that this 
probably meant that setting overcommit_memory=1 would be a workaround for 
the problem, and indeed it is. If, after having "run out" of memory in this 
way, you set overcommit_memory=1 and repeat the "dd"s, now giving a bs= that's 
slightly *larger* than MemFree each time, you can move everything back from 
Inactive to Cached in the same way as with the "ls -lR /".

dd allocates a buffer with size bs= (ie, large) to read/write from. Without 
overcommit, the system fails the allocation because it believes not enough 
memory is available (everything is under "Inactive"). With overcommit 
enabled, I assume the buffer is faulted in one or a few pages at a time. The 
"ls -lR" probably does many small allocations so it seems that those small 
allocations are what fix things up again.
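
The buffer handling described above can be sketched roughly as follows.
This is an illustration of the pattern, not dd's actual source: the
function name copy_blocks is hypothetical, and unlike a C malloc() Python
zero-fills the buffer, so every page is touched immediately rather than
faulted in lazily.

```python
import io


def copy_blocks(src, dst, bs, count):
    """Copy up to `count` blocks of `bs` bytes from src to dst.

    Returns (full, partial) block counts, like dd's "1+0 records".
    """
    # One allocation of the full block size up front. With bs=512M this
    # is the request that fails with "memory exhausted" in the report.
    buf = bytearray(bs)
    view = memoryview(buf)
    full = partial = 0
    for _ in range(count):
        n = src.readinto(buf)
        if not n:
            break  # end of input
        dst.write(view[:n])
        if n == bs:
            full += 1
        else:
            partial += 1
    return full, partial


# Small in-memory demonstration: 10 bytes of input, 4-byte blocks.
source = io.BytesIO(b"\0" * 10)
sink = io.BytesIO()
records = copy_blocks(source, sink, bs=4, count=5)
print("%d+%d records" % records)
```

The point is that the whole bs-sized buffer is requested in one go, which
is why the size of bs relative to what the kernel considers available
matters.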

I asked around (on IRC) if others were also seeing this behaviour and they 
were not. I assume though that they had overcommit enabled, which then masks 
the problem, since I can reproduce this completely consistently, as said on 
all of 2.4.19, 2.4.19-rmap14b, 2.5.47 and 2.5.47-mm3. To rule out GCC issues 
(my normal compiler is gcc-3.2) I also tried it with a gcc-2.95.3 compiled 
2.4.19. They all behave as described above.

Maybe significant (?): does *not* happen with of=/dev/null. Does happen both 
with ext2 and ext3 on /tmp.

Any and all comments much appreciated. And if anyone wants me to test out 
something else or more, please say so...

Rene.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

* Re: VM trouble, both 2.4 and 2.5
From: Andrew Morton @ 2002-11-15 22:44 UTC
  To: Rene Herman; +Cc: linux-mm, Con Kolivas

Rene Herman wrote:
> 
> ...
> rene@7ixe4:~$ cat /proc/meminfo
> MemTotal:       776156 kB
> MemFree:        412112 kB
> MemShared:           0 kB
> Buffers:          7668 kB
> Cached:          61564 kB
> SwapCached:          0 kB
> Active:          42168 kB
> Inactive:       296572 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:       776156 kB
> LowFree:        412112 kB
> SwapTotal:           0 kB
> SwapFree:            0 kB
> Dirty:             440 kB
> Writeback:           0 kB
> Mapped:          34228 kB
> Slab:            10932 kB
> Committed_AS:    34868 kB
> PageTables:        668 kB
> ReverseMaps:     31360

That looks like the ext3 truncate thing.
 
> ...
> Maybe significant (?): does *not* happen with of=/dev/null. Does happen both
> with ext2 and ext3 on /tmp.

Are you *sure* it happens with ext2?  Checked /proc/mounts to ensure that
/tmp is really ext2?

Because if you write a ton of memory to an ext3 file and then immediately
delete the file, that memory ends up on the inactive list, not in pagecache,
just as you have shown.

But ext2 won't do that, because truncate is able to take the buffers
away from the truncated pages.

I could certainly believe that the (weird) ext3 behaviour would upset
the overcommit beancounting though.  Hundreds of megabytes of memory
on the inactive list but not in pagecache probably looks like anonymous
memory to the overcommit logic.
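
A toy model of that accounting, not kernel code: it assumes the heuristic
overcommit mode (overcommit_memory=0) counts only free memory, buffers,
pagecache and free swap as available. The function name and the choice of
fields are assumptions made for illustration. Fed with the two
/proc/meminfo snapshots from the report, it admits the first 512M request
and refuses the second.

```python
def heuristic_allows(meminfo_kb, request_kb):
    """Toy overcommit check: is request_kb below what looks reclaimable?

    Assumption: only MemFree, Buffers, Cached and SwapFree count.
    Pages that sit on the inactive list without being in pagecache are
    therefore invisible to this check.
    """
    available_kb = (
        meminfo_kb["MemFree"]
        + meminfo_kb["Buffers"]
        + meminfo_kb["Cached"]
        + meminfo_kb["SwapFree"]
    )
    return request_kb < available_kb


# Values in kB from the report, before and after the first dd.
before_first_dd = {"MemFree": 667416, "Buffers": 7088,
                   "Cached": 61564, "SwapFree": 0}
after_first_dd = {"MemFree": 412112, "Buffers": 7668,
                  "Cached": 61564, "SwapFree": 0}

request_kb = 512 * 1024  # dd bs=512M

first_ok = heuristic_allows(before_first_dd, request_kb)   # 736068 kB available
second_ok = heuristic_allows(after_first_dd, request_kb)   # 481344 kB available
print(first_ok, second_ok)
```

Under these assumptions the roughly 250M parked under Inactive is exactly
what pushes the second request over the limit.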

* Re: VM trouble, both 2.4 and 2.5
From: Rene Herman @ 2002-11-16  0:18 UTC
  To: Andrew Morton; +Cc: linux-mm, Con Kolivas

On Friday 15 November 2002 23:44, Andrew Morton wrote:

> Are you *sure* it happens with ext2?  Checked /proc/mounts to ensure
> that /tmp is really ext2?

Darn it!

You are absolutely correct, /tmp was on /, ext3 builtin, ext2 as module, so 
it was really still ext3. /bin/mount lied to me. When I moved /tmp to its own 
partition, really ext2 this time, things stopped misbehaving. That ext2/ext3 
thing was the very first thing I tried, wasted a lot of time :-(
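
Asking /proc/mounts directly avoids that trap. A minimal sketch, assuming
the usual "device mountpoint fstype options ..." layout of /proc/mounts;
the helper name fstype_for and the sample device names are hypothetical,
and escaped characters in mount points (such as \040 for a space) are not
handled.

```python
def fstype_for(path, mounts_text):
    """Return the filesystem type of the mount that holds `path`.

    The longest matching mount point wins. Among equally long matches
    the later line wins, so a real root entry overrides "rootfs".
    """
    best = None
    best_len = -1
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        mount_point, fstype = parts[1], parts[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) >= best_len:
                best, best_len = fstype, len(mount_point)
    return best


# Hypothetical /proc/mounts contents: /tmp lives on the root filesystem.
TMP_ON_ROOT = """rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw 0 0
"""
# Same system after moving /tmp to its own ext2 partition.
TMP_SEPARATE = TMP_ON_ROOT + "/dev/hda3 /tmp ext2 rw 0 0\n"

print(fstype_for("/tmp/zero", TMP_ON_ROOT))
print(fstype_for("/tmp/zero", TMP_SEPARATE))
```

On a live system the second argument would be the contents of
/proc/mounts, for example open("/proc/mounts").read().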

> I could certainly believe that the (weird) ext3 behaviour would upset
> the overcommit beancounting though.  Hundreds of megabytes of memory
> on the inactive list but not in pagecache probably looks like anonymous
> memory to the overcommit logic.

Does this bit mean the report was still somewhat useful (for fixing either 
ext3 or the overcommit accounting) though, or was it already well-known?

Well, anyways, thanks heaps for the explanation, was going slowly mad here ...

Rene.

* Re: VM trouble, both 2.4 and 2.5
From: Andrew Morton @ 2002-11-16  0:39 UTC
  To: Rene Herman; +Cc: linux-mm, Con Kolivas

Rene Herman wrote:
> 
> On Friday 15 November 2002 23:44, Andrew Morton wrote:
> 
> > Are you *sure* it happens with ext2?  Checked /proc/mounts to ensure
> > that /tmp is really ext2?
> 
> Darn it!
> 
> You are absolutely correct, /tmp was on /, ext3 builtin, ext2 as module, so
> it was really still ext3. /bin/mount lied to me. When I moved /tmp to its own
> partition, really ext2 this time, things stopped misbehaving. That ext2/ext3
> thing was the very first thing I tried, wasted a lot of time :-(

heh.  That mount(8) thing really sucks.  Especially if you spend
time helping folk out with ext3 problems.

Maybe we should fix it...

> > I could certainly believe that the (weird) ext3 behaviour would upset
> > the overcommit beancounting though.  Hundreds of megabytes of memory
> > on the inactive list but not in pagecache probably looks like anonymous
> > memory to the overcommit logic.
> 
> Does this bit mean the report was still somewhat useful (for fixing either
> ext3 or the overcommit accounting) though, or was it already well-known?

Very useful thanks, no it's not well known.  Or at least, it wasn't.

It's at the "hm, that's funny.  Oh, I know what that is" stage.  The
pages are trivially reclaimable, but I hadn't thought about the
effect on overcommit's deadreckoning logic.

The problem got worse in 2.5 because truncate got better - the first
pass of truncate will zoom over the locked pages and shoot down all
the dirty pages which aren't under IO yet.  Then it will go back and
do the under-IO pages.  It's all the dirty pages which were reaped
by the first pass which cause this problem.
 
> Well, anyways, thanks heaps for the explanation, was going slowly mad here ...

Well.  What the heck am I going to do about it?  I guess change the
overcommit logic to look at page_states.nr_mapped or something.  Or
maybe take a look at fixing ext3.

* Re: VM trouble, both 2.4 and 2.5
From: Rene Herman @ 2002-11-16  0:59 UTC
  To: Andrew Morton; +Cc: linux-mm, Con Kolivas

On Saturday 16 November 2002 01:39, Andrew Morton wrote:

> heh.  That mount(8) thing really sucks.  Especially if you
> spend time helping folk out with ext3 problems.
>
> Maybe we should fix it...

Not before I get the chance to laugh at someone *else* being confused by it, 
I hope...

> > Does this bit mean the report was still somewhat useful (for fixing
> > either ext3 or the overcommit accounting) though, or was it already
> > well-known?
>
> Very useful thanks, no it's not well known.  Or at least, it wasn't.

Thanks, makes me feel much better :-)

> Well.  What the heck am I going to do about it?  I guess change the
> overcommit logic to look at page_states.nr_mapped or something.  Or
> maybe take a look at fixing ext3.

Do note that I haven't actually a clue what I'm talking about, but given that 
lack, the latter does sound better. It would seem to make sense to have those 
pages show up in the pagecache, regardless of any ability to work around them 
not doing so elsewhere?

Rene.