Re: useless report -- perhaps memory allocation problems in 2.1.12[678]

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
       [not found] <199811131746.LAA23512@mail.mankato.msus.edu>
@ 1998-11-16 14:27 ` Rik van Riel
  1998-11-17 11:21   ` Stephen C. Tweedie
  0 siblings, 1 reply; 9+ messages in thread
From: Rik van Riel @ 1998-11-16 14:27 UTC
  To: Jeffrey Hundstad; +Cc: Linux MM, Linus Torvalds

On Fri, 13 Nov 1998, Jeffrey Hundstad wrote:

> When I was recompiling gimp, with tkRat, and Netscape running it felt
> like the machine was running out of ram. (I've got 128m of ram 128m of

> happens on 2.1.125.  Something has changed for the worse, but it does
> FEEL peppier at the keyboard ;-)

In 2.1.127+ the freeing of memory is done in the context of
programs themselves too and the whole system is busy freeing
memory. This means that the kswapd-loop has now been migrated
into other contexts as well. This, together with the fact that
kswapd never blocks on disk access any more, has caused serious
trouble when a system runs out of memory.

I guess this means I'll have to update and clean up my
out-of-memory killer patch really soon now... :(

cheers,

Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-16 14:27 ` useless report -- perhaps memory allocation problems in 2.1.12[678] Rik van Riel
@ 1998-11-17 11:21   ` Stephen C. Tweedie
  1998-11-17 20:18     ` Rik van Riel
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-17 11:21 UTC
  To: Rik van Riel; +Cc: Jeffrey Hundstad, Linux MM, Linus Torvalds

Hi,

In article
<Pine.LNX.3.96.981116152322.20349E-100000@mirkwood.dummy.home>, Rik van
Riel <H.H.vanRiel@phys.uu.nl> writes:

> In 2.1.127+ the freeing of memory is done in the context of
> programs themselves too 

It always has done: it's just a bit better at it in some situations now.

> and the whole system is busy freeing memory. This means that the
> kswapd-loop has now been migrated into other contexts as well. This,
> together with the fact that kswapd never blocks on disk access any
> more,

Yes it does.  We don't pass GFP_WAIT to swap_out(), but that just means
that the swapout will be done asynchronously.  We are still free to
write stuff out to swap, and in fact once we hit the limit on
outstanding IOs we may well block in the write.

--Stephen

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 11:21   ` Stephen C. Tweedie
@ 1998-11-17 20:18     ` Rik van Riel
  1998-11-17 23:14       ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Rik van Riel @ 1998-11-17 20:18 UTC
  To: Stephen C. Tweedie; +Cc: Jeffrey Hundstad, Linux MM, Linus Torvalds

On Tue, 17 Nov 1998, Stephen C. Tweedie wrote:
> Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:
> 
> > and the whole system is busy freeing memory. This means that the
> > kswapd-loop has now been migrated into other contexts as well. This,
> > together with the fact that kswapd never blocks on disk access any
> > more,
> 
> Yes it does.  We don't pass GFP_WAIT to swap_out(), but that just
> means that the swapout will be done asynchronously.  We are still
> free to write stuff out to swap, and in fact once we hit the limit
> on outstanding IOs we may well block in the write. 

Whoops, I saw that run_task_queue(&tq_disk) had disappeared
from its original position but I couldn't find it in its
new place... /usr/bin/grep has been a real help now you pointed
it out, thanks to you both :)

cheers,

Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 20:18     ` Rik van Riel
@ 1998-11-17 23:14       ` Linus Torvalds
  1998-11-18  1:09         ` Stephen C. Tweedie
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 1998-11-17 23:14 UTC
  To: Rik van Riel; +Cc: Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

On Tue, 17 Nov 1998, Rik van Riel wrote:
> 
> Whoops, I saw that run_task_queue(&tq_disk) had disappeared
> from its original position but I couldn't find it in its
> new place... /usr/bin/grep has been a real help now you pointed
> it out, thanks to you both :)

I think it should be in the original position (inside the kswapd loop); I
think removing it was probably a mistake. I prefer Stephen's test there
rather than in page_io (setting "wait" in page_io.c has more ramifications
than just getting the IO started, I'm not sure we really actually want to
wait on the page). 

Hmm.. I could go either way on this. Arguments from all sides?

		Linus

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 23:14       ` Linus Torvalds
@ 1998-11-18  1:09         ` Stephen C. Tweedie
  1998-11-18  1:21           ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-18  1:09 UTC
  To: Linus Torvalds
  Cc: Rik van Riel, Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

Hi,

On Tue, 17 Nov 1998 15:14:10 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> I think it should be in the original position (inside the kswapd loop); I
> think removing it was probably a mistake. I prefer Stephen's test there
> rather than in page_io (setting "wait" in page_io.c has more ramifications
> than just getting the IO started, I'm not sure we really actually want to
> wait on the page). 

First, I think it's just a performance issue: I _think_ there are no
correctness issues, since the IO always has a chance to block anyway
(on the request queue if nothing else).  If anyone can spot a
correctness issue then shout!

The main benefit from having the nr_async_pages check in page_io.c is
that this way it also throttles the try_to_free_pages() loop during
normal allocations.

When we get a try_to_free_pages() from get_free_pages(), we are
basically saying "I want free memory, and I can't do anything until
you give it to me".  If we are in this state and don't set the io
wait, we can happily submit SWAP_CLUSTER_MAX pages to the IO request
layer and return without actually having freed up any memory.  That
doesn't help the allocation to succeed and in the worst case may cause
a swap IO flood.

It's not just kswapd which can have the problem of submitting massive
unreasonable swap activity: because get_free_pages() can also submit
async swapout, doing the nr_async_pages check in page_io.c makes sure
we catch both cases.  Andi Kleen has observed massive over-swap (to
the tune of 20 to 40MB at a time) when doing parallel makes: it
doesn't happen on single-threaded make, which suggests that it is not
only kswapd which can cause the swap floods.
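
The throttle described above can be sketched as a small user-space
model in C. This is only an illustration under stated assumptions, not
the page_io.c code: the struct swap_throttle, its async_limit field and
the three helper functions are hypothetical names invented here; only
nr_async_pages comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the kernel's real data structures. */
struct swap_throttle {
    int nr_async_pages;  /* async swap writes currently in flight */
    int async_limit;     /* assumed cap on in-flight async writes */
};

/* A write must wait if the caller asked for it or the cap is reached. */
static bool swap_write_must_wait(const struct swap_throttle *t,
                                 bool caller_wants_wait)
{
    assert(t->async_limit > 0);
    return caller_wants_wait || t->nr_async_pages >= t->async_limit;
}

/* Submit one page; returns true if the submitter had to wait on it. */
static bool submit_swap_write(struct swap_throttle *t, bool caller_wants_wait)
{
    bool wait = swap_write_must_wait(t, caller_wants_wait);

    if (!wait)
        t->nr_async_pages++;  /* page stays in flight until IO completes */
    return wait;
}

/* Called when one async write has completed. */
static void swap_write_done(struct swap_throttle *t)
{
    if (t->nr_async_pages > 0)
        t->nr_async_pages--;
}
```

Because the check sits at the point where the write is submitted, every
caller is throttled by it, whether the request came from kswapd or from
an allocating process.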

Linus, the reason I proposed the breakout on (nr_free_pages >
freepages.max + SWAP_CLUSTER_MAX) in try_to_free_pages() was because
as soon as you have a significant number of memory hungry processes
trying to allocate in a low memory situation, they all start swapping
out SWAP_CLUSTER_MAX pages.  That's a significant amount of memory.
Is there any particular reason you omitted that patch from
2.1.129-pre5?  It occurs to me that restoring this check would
actually be quite a good way of making sure that a normal
get_free_pages() doesn't enter a stalling try_to_free_pages()
unnecessarily, which would address some of the negative performance
implications of having the nr_async_pages stall in page_io.c.
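
A minimal sketch of the proposed breakout, again as a user-space model
in C rather than the actual patch. The function names, the "free one
page per step" loop body and the value 32 for SWAP_CLUSTER_MAX are
assumptions made for illustration; the condition itself is the one
quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32  /* value assumed for this illustration */

/* The breakout test: is there already comfortably enough free memory? */
static bool enough_free_already(int nr_free_pages, int freepages_max)
{
    assert(freepages_max >= 0);
    return nr_free_pages > freepages_max + SWAP_CLUSTER_MAX;
}

/*
 * Model of the freeing loop: free up to SWAP_CLUSTER_MAX pages, but stop
 * as soon as the breakout test passes. Returns the number of pages freed.
 */
static int try_to_free_pages_model(int *nr_free_pages, int freepages_max)
{
    int freed = 0;

    while (freed < SWAP_CLUSTER_MAX) {
        if (enough_free_already(*nr_free_pages, freepages_max))
            break;
        (*nr_free_pages)++;  /* stand-in for actually reclaiming a page */
        freed++;
    }
    return freed;
}
```

With many processes in the loop at once, each one stops as soon as the
shared free-page count crosses the threshold instead of each swapping
out a full cluster.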

--Stephen

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:09         ` Stephen C. Tweedie
@ 1998-11-18  1:21           ` Linus Torvalds
  1998-11-18  1:41             ` Linus Torvalds
  1998-11-18  9:19             ` Stephen C. Tweedie
  0 siblings, 2 replies; 9+ messages in thread
From: Linus Torvalds @ 1998-11-18  1:21 UTC
  To: Stephen C. Tweedie; +Cc: Rik van Riel, Jeffrey Hundstad, Linux MM

On Wed, 18 Nov 1998, Stephen C. Tweedie wrote:
> 
> When we get a try_to_free_pages() from get_free_pages(), we are
> basically saying "I want free memory, and I can't do anything until
> you give it to me".  If we are in this state and don't set the io
> wait, we can happily submit SWAP_CLUSTER_MAX pages to the IO request
> layer and return without actually having freed up any memory.  That
> doesn't help the allocation to succeed and in the worst case may cause
> a swap IO flood.

Yes. But in that case we already have __GFP_IO set, so in this case we
_will_ wait synchronously.

It's only kswapd that does this asynchronously as far as I can see, and
it's ok for kswapd to not be that asynchronous. It just must not be _too_
asynchronous - we must decide to start the requests at some point, to make
sure there aren't too many things in transit. 

So the difference in behaviour then becomes one of "does kswapd actually
start to synchronously wait on certain pages when it's done a lot of
asynchronous requests" or "should kswapd just make sure that the async
requests go out in an orderly manner"? 

I don't know. Maybe waiting synchronously every once in a while is the
right answer. 

> Linus, the reason I proposed the breakout on (nr_free_pages >
> freepages.max + SWAP_CLUSTER_MAX) in try_to_free_pages() was because
> as soon as you have a significant number of memory hungry processes
> trying to allocate in a low memory situation, they all start swapping
> out SWAP_CLUSTER_MAX pages.  That's a significant amount of memory.
> Is there any particular reason you omitted that patch from
> 2.1.129-pre5?

We shouldn't have gotten to try_to_free_pages() unless kswapd couldn't
keep up with the number of memory allocations, and in that case I think
the right answer _is_ to let everybody who wants to get memory free up
noticeably more memory than they need - we don't want to get into the
trickle situation where we are constantly trickling out a small amount of
swapspace. 

>		  It occurs to me that restoring this check would
> actually be quite a good way of making sure that a normal
> get_free_pages() doesn't enter a stalling try_to_free_pages()
> unnecessarily, which would address some of the negative performance
> implications of having the nr_async_pages stall in page_io.c.

I don't want a normal get_free_pages() ever to get even _close_ to calling
try_to_free_pages(). The normal action should be that kswapd happily
throws out pages at the same rate they are needed, so that any other
process never needs to get into try_to_free_pages() at all. 

Whenever you see processes that actually try to synchronously free memory,
you're much much too low on memory already. At least that's the idea, and
that's why I thought your patch was not right. 

I do know that my system feels a _lot_ better with recent kernels, now
that the main heavy lifting is done by kswapd. Interactive performance is
just great (and yes, I have half a gig of RAM, but I still page heavily
occasionally), so I'm fairly certain that this is basically the right
approach. 

But whether kswapd should go page-synchronous at some point? Maybe. I can
see arguments both for and against (the "for" argument is that we prefer
to have more intense bouts of IO followed by a nice clean wait, while the
"against" argument is that maybe we want to spread out the thing). 

Still looking for more arguments..

		Linus

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:21           ` Linus Torvalds
@ 1998-11-18  1:41             ` Linus Torvalds
  1998-11-18  8:58               ` Rik van Riel
  1998-11-18  9:19             ` Stephen C. Tweedie
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 1998-11-18  1:41 UTC
  To: Stephen C. Tweedie; +Cc: Rik van Riel, Jeffrey Hundstad, Linux MM



On Tue, 17 Nov 1998, Linus Torvalds wrote:
> 
> But whether kswapd should go page-synchronous at some point? Maybe. I can
> see arguments both for and against (the "for" argument is that we prefer
> to have more intense bouts of IO followed by a nice clean wait, while the
> "against" argument is that maybe we want to spread out the thing). 

Oh, well, I'm currently leaning for "for", which means your patch to
page_io.c is what I have now.. I don't like "trickling" pages by running
out of requests or something like that, so having the occasional nice wait
is probably best.

		Linus

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:41             ` Linus Torvalds
@ 1998-11-18  8:58               ` Rik van Riel
  0 siblings, 0 replies; 9+ messages in thread
From: Rik van Riel @ 1998-11-18  8:58 UTC
  To: Linus Torvalds; +Cc: Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

On Tue, 17 Nov 1998, Linus Torvalds wrote:
> On Tue, 17 Nov 1998, Linus Torvalds wrote:
> > 
> > But whether kswapd should go page-synchronous at some point? Maybe. I can
> > see arguments both for and against (the "for" argument is that we prefer
> > to have more intense bouts of IO followed by a nice clean wait, while the
> > "against" argument is that maybe we want to spread out the thing). 
> 
> Oh, well, I'm currently leaning for "for", which means your patch to
> page_io.c is what I have now.. I don't like "trickling" pages by
> running out of requests or something like that, so having the
> occasional nice wait is probably best. 

It seems like you decided for my point of view before I
woke up again, so I'll just let you know that this is
one of the reasons why I submitted the original (2.1.90?)
patch to you. The other reason was that async, clustered
swapouts have a much higher bandwidth than synchronous
swapouts. This means we can do more swap I/O without
getting into trouble.

The only request I have to make is that you use the
sysctl tuneable limit pager_daemon.swap_cluster as
the limit.  Doing this will enable people to optimize
their kswapd configuration for multiple swap partitions
or disks with loads of tagged queues (or a shortage
thereof).

I have found that setting that limit to SWAP_CLUSTER_MAX
* number_of_highest_priority_swap_areas doubled swapout
performance, leaving 50% extra I/O bandwidth for swapins.
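
The rule of thumb above can be written out as a short C sketch. The
struct swap_area type and both function names are hypothetical, and 32
for SWAP_CLUSTER_MAX is an assumed value; the code only shows the
arithmetic, not how the sysctl is actually set.

```c
#include <assert.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32  /* value assumed for this illustration */

/* Hypothetical description of one configured swap area. */
struct swap_area {
    int prio;  /* swap priority; higher values are used first */
};

/* Count how many areas share the highest priority in the list. */
static int count_highest_prio_areas(const struct swap_area *areas, size_t n)
{
    int best = 0, count = 0;

    assert(areas != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (count == 0 || areas[i].prio > best) {
            best = areas[i].prio;  /* new highest priority seen */
            count = 1;
        } else if (areas[i].prio == best) {
            count++;
        }
    }
    return count;
}

/* Suggested limit: one full cluster per highest-priority swap area. */
static int suggested_swap_cluster(const struct swap_area *areas, size_t n)
{
    return SWAP_CLUSTER_MAX * count_highest_prio_areas(areas, n);
}
```

For two swap partitions at equal top priority plus one lower-priority
area, this yields 2 * SWAP_CLUSTER_MAX.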

cheers,

Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+

* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:21           ` Linus Torvalds
  1998-11-18  1:41             ` Linus Torvalds
@ 1998-11-18  9:19             ` Stephen C. Tweedie
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-18  9:19 UTC
  To: Linus Torvalds
  Cc: Stephen C. Tweedie, Rik van Riel, Jeffrey Hundstad, Linux MM

Hi,

On Tue, 17 Nov 1998 17:21:23 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> On Wed, 18 Nov 1998, Stephen C. Tweedie wrote:

> Yes. But in that case we already have __GFP_IO set, so in this case we
> _will_ wait synchronously.

Right.

> It's only kswapd that does this asynchronously as far as I can see, and
> it's ok for kswapd to not be that asynchronous. It just must not be _too_
> asynchronous - we must decide to start the requests at some point, to make
> sure there aren't too many things in transit. 

That's exactly my concern, and if it's only kswapd which is using the
async code then I don't think it matters too much _where_ we do the
nr_async_pages check.

> So the difference in behaviour then becomes one of "does kswapd actually
> start to synchronously wait on certain pages when it's done a lot of
> asynchronous requests" or "should kswapd just make sure that the async
> requests go out in an orderly manner"? 

There's a related question: should kswapd keep on swapping at all once
it has submitted enough async IO?  Beyond a certain point we _know_ that
these pages will become free; swapping even more dirty pages won't help
us.  There's only any point in kswapd carrying on if we restrict
ourselves to unmapping clean pages: that's the only way we'll actually
increase the free page count right now.

So, should try_to_swap_out skip dirty pages if nr_async_pages is too
high?  This sounds like an attractive answer, if we are below
freepages.low, because it will let kswapd find free memory for interrupt
traffic.  If we aren't that low in memory then we don't want to be
unnecessarily unfair to clean pages.
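
One way to picture the policy being asked about is the predicate below,
a user-space C sketch and not a proposed patch. The struct vm_state
type, its async_limit field and the function name are hypothetical;
nr_async_pages and freepages.low are the quantities named above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the counters the decision depends on. */
struct vm_state {
    int nr_free_pages;   /* pages currently free */
    int freepages_low;   /* stands in for freepages.low */
    int nr_async_pages;  /* async swap writes in flight */
    int async_limit;     /* assumed threshold for "too high" */
};

/*
 * Skip a page only if it is dirty, enough async IO is already queued,
 * and free memory is below the low-water mark. Clean pages are never
 * skipped, since unmapping them frees memory immediately.
 */
static bool should_skip_page(const struct vm_state *vm, bool page_dirty)
{
    assert(vm->async_limit > 0);

    bool io_saturated = vm->nr_async_pages >= vm->async_limit;
    bool very_low = vm->nr_free_pages < vm->freepages_low;

    return page_dirty && io_saturated && very_low;
}
```

When memory is not that low, the predicate is false for every page, so
dirty and clean pages are treated alike.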

I'm off to SANE now --- back next week.

--Stephen
