* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
       [not found] <199811131746.LAA23512@mail.mankato.msus.edu>
@ 1998-11-16 14:27 ` Rik van Riel
  1998-11-17 11:21   ` Stephen C. Tweedie
  0 siblings, 1 reply; 9+ messages in thread
From: Rik van Riel @ 1998-11-16 14:27 UTC
To: Jeffrey Hundstad; +Cc: Linux MM, Linus Torvalds

On Fri, 13 Nov 1998, Jeffrey Hundstad wrote:

> When I was recompiling gimp, with tkRat, and Netscape running it felt
> like the machine was running out of ram. (I've got 128m of ram 128m of
> happens on 2.1.125. Something has changed for the worse, but it does
> FEEL peppier at the keyboard ;-)

In 2.1.127+ the freeing of memory is done in the context of
programs themselves too and the whole system is busy freeing
memory. This means that the kswapd-loop has now been migrated
into other contexts as well. This, together with the fact that
kswapd never blocks on disk access any more, has caused serious
trouble when a system runs out of memory.

I guess this means I'll have to update and clean my out of
memory killer patch really soon now... :(

cheers,

Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide.        H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader.      http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
--
This is a majordomo managed list. To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-16 14:27 ` useless report -- perhaps memory allocation problems in 2.1.12[678] Rik van Riel
@ 1998-11-17 11:21   ` Stephen C. Tweedie
  1998-11-17 20:18     ` Rik van Riel
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-17 11:21 UTC
To: Rik van Riel; +Cc: Jeffrey Hundstad, Linux MM, Linus Torvalds

Hi,

In article <Pine.LNX.3.96.981116152322.20349E-100000@mirkwood.dummy.home>,
Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:

> In 2.1.127+ the freeing of memory is done in the context of
> programs themselves too

It always has done: it's just a bit better at it in some situations now.

> and the whole system is busy freeing memory. This means that the
> kswapd-loop has now been migrated into other contexts as well. This,
> together with the fact that kswapd never blocks on disk access any
> more,

Yes it does. We don't pass GFP_WAIT to swap_out(), but that just
means that the swapout will be done asynchronously. We are still
free to write stuff out to swap, and in fact once we hit the limit
on outstanding IOs we may well block in the write.

--Stephen
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 11:21   ` Stephen C. Tweedie
@ 1998-11-17 20:18     ` Rik van Riel
  1998-11-17 23:14       ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Rik van Riel @ 1998-11-17 20:18 UTC
To: Stephen C. Tweedie; +Cc: Jeffrey Hundstad, Linux MM, Linus Torvalds

On Tue, 17 Nov 1998, Stephen C. Tweedie wrote:
> Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:
>
> > and the whole system is busy freeing memory. This means that the
> > kswapd-loop has now been migrated into other contexts as well. This,
> > together with the fact that kswapd never blocks on disk access any
> > more,
>
> Yes it does. We don't pass GFP_WAIT to swap_out(), but that just
> means that the swapout will be done asynchronously. We are still
> free to write stuff out to swap, and in fact once we hit the limit
> on outstanding IOs we may well block in the write.

Whoops, I saw that run_task_queue(&tq_disk) had disappeared
from its original position but I couldn't find it in its
new place... /usr/bin/grep has been a real help now you pointed
it out, thanks to you both :)

cheers,

Rik -- slowly getting used to dvorak kbd layout...
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 20:18     ` Rik van Riel
@ 1998-11-17 23:14       ` Linus Torvalds
  1998-11-18  1:09         ` Stephen C. Tweedie
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 1998-11-17 23:14 UTC
To: Rik van Riel; +Cc: Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

On Tue, 17 Nov 1998, Rik van Riel wrote:
>
> Whoops, I saw that run_task_queue(&tq_disk) had disappeared
> from its original position but I couldn't find it in its
> new place... /usr/bin/grep has been a real help now you pointed
> it out, thanks to you both :)

I think it should be in the original position (inside the kswapd loop), I
think removing it was probably a mistake. I prefer Stephen's test there
rather than in page_io (setting "wait" in page_io.c has more ramifications
than just getting the IO started, I'm not sure we really actually want to
wait on the page).

Hmm.. I could go either way on this. Arguments from all sides?

		Linus
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-17 23:14       ` Linus Torvalds
@ 1998-11-18  1:09         ` Stephen C. Tweedie
  1998-11-18  1:21           ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-18 1:09 UTC
To: Linus Torvalds
Cc: Rik van Riel, Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

Hi,

On Tue, 17 Nov 1998 15:14:10 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> I think it should be in the original position (inside the kswapd loop), I
> think removing it was probably a mistake. I prefer Stephen's test there
> rather than in page_io (setting "wait" in page_io.c has more ramifications
> than just getting the IO started, I'm not sure we really actually want to
> wait on the page).

First, I think it's just a performance issue: I _think_ there are no
correctness issues, since the IO always has a chance to block anyway
(on the request queue if nothing else). If anyone can spot a
correctness issue then shout!

The main benefit from having the nr_async_pages check in page_io.c is
that this way it also throttles the try_to_free_pages() loop during
normal allocations.

When we get a try_to_free_pages() from get_free_pages(), we are
basically saying "I want free memory, and I can't do anything until
you give it to me". If we are in this state and don't set the io
wait, we can happily submit SWAP_CLUSTER_MAX pages to the IO request
layer and return without actually having freed up any memory. That
doesn't help the allocation to succeed and in the worst case may cause
a swap IO flood.

It's not just kswapd which can have the problem of submitting massive
unreasonable swap activity: because get_free_pages() can also submit
async swapout, doing the nr_async_pages check in page_io.c makes sure
we catch both cases. Andi Kleen has observed massive over-swap (to
the tune of 20 to 40MB at a time) when doing parallel makes: it
doesn't happen on single-threaded make, which suggests that it is not
only kswapd which can cause the swap floods.

Linus, the reason I proposed the breakout on (nr_free_pages >
freepages.max + SWAP_CLUSTER_MAX) in try_to_free_pages() was because
as soon as you have a significant number of memory hungry processes
trying to allocate in a low memory situation, they all start swapping
out SWAP_CLUSTER_MAX pages. That's a significant amount of memory.
Is there any particular reason you omitted that patch from
2.1.129-pre5? It occurs to me that restoring this check would
actually be quite a good way of making sure that a normal
get_free_pages() doesn't enter a stalling try_to_free_pages()
unnecessarily, which would address some of the negative performance
implications of having the nr_async_pages stall in page_io.c.

--Stephen
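[Editorial sketch, not part of the original message.] The breakout test
Stephen describes can be modelled in a few lines of user-space C. This is
only an illustration under stated assumptions: try_to_free_pages_model()
and should_stop_freeing() are hypothetical names, the value 32 for
SWAP_CLUSTER_MAX is assumed, "freepages_max" simply follows the name used
in the mail, and one loop pass is pretended to free exactly one page. It
is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32   /* assumed value, for illustration only */

/* The proposed early exit: stop reclaiming once free memory is
 * comfortably above the upper free-pages threshold. */
bool should_stop_freeing(int nr_free_pages, int freepages_max)
{
    return nr_free_pages > freepages_max + SWAP_CLUSTER_MAX;
}

/* Toy reclaim loop (hypothetical): each pass "frees" one page, for at
 * most SWAP_CLUSTER_MAX passes. Returns the number of pages freed. */
int try_to_free_pages_model(int *nr_free_pages, int freepages_max)
{
    int freed = 0;

    assert(nr_free_pages != NULL);
    while (freed < SWAP_CLUSTER_MAX) {
        if (should_stop_freeing(*nr_free_pages, freepages_max))
            break;              /* enough memory already: do not swap more */
        (*nr_free_pages)++;
        freed++;
    }
    return freed;
}
```

In this model a caller that arrives when memory is already plentiful
returns at once, so several memory-hungry processes do not each push out
a full cluster of pages.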
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:09         ` Stephen C. Tweedie
@ 1998-11-18  1:21           ` Linus Torvalds
  1998-11-18  1:41             ` Linus Torvalds
  1998-11-18  9:19             ` Stephen C. Tweedie
  0 siblings, 2 replies; 9+ messages in thread
From: Linus Torvalds @ 1998-11-18 1:21 UTC
To: Stephen C. Tweedie; +Cc: Rik van Riel, Jeffrey Hundstad, Linux MM

On Wed, 18 Nov 1998, Stephen C. Tweedie wrote:
>
> When we get a try_to_free_pages() from get_free_pages(), we are
> basically saying "I want free memory, and I can't do anything until
> you give it to me". If we are in this state and don't set the io
> wait, we can happily submit SWAP_CLUSTER_MAX pages to the IO request
> layer and return without actually having freed up any memory. That
> doesn't help the allocation to succeed and in the worst case may cause
> a swap IO flood.

Yes. But in that case we already have __GFP_IO set, so in this case we
_will_ wait synchronously.

It's only kswapd that does this asynchronously as far as I can see, and
it's ok for kswapd to not be that asynchronous. It just must not be _too_
asynchronous - we must decide to start the requests at some point, to make
sure there aren't too many things in transit.

So the difference in behaviour then becomes one of "does kswapd actually
start to synchronously wait on certain pages when it's done a lot of
asynchronous requests" or "should kswapd just make sure that the async
requests go out in an orderly manner"?

I don't know. Maybe waiting synchronously every once in a while is the
right answer.

> Linus, the reason I proposed the breakout on (nr_free_pages >
> freepages.max + SWAP_CLUSTER_MAX) in try_to_free_pages() was because
> as soon as you have a significant number of memory hungry processes
> trying to allocate in a low memory situation, they all start swapping
> out SWAP_CLUSTER_MAX pages. That's a significant amount of memory.
> Is there any particular reason you omitted that patch from
> 2.1.129-pre5?

We shouldn't have gotten to try_to_free_pages() unless kswapd couldn't
keep up with the number of memory allocations, and in that case I think
the right answer _is_ to let everybody who wants to get memory free up
noticeably more memory than they need - we don't want to get into the
trickle situation where we are constantly trickling out a small amount of
swapspace.

> It occurs to me that restoring this check would
> actually be quite a good way of making sure that a normal
> get_free_pages() doesn't enter a stalling try_to_free_pages()
> unnecessarily, which would address some of the negative performance
> implications of having the nr_async_pages stall in page_io.c.

I don't want a normal get_free_pages() ever to get even _close_ to calling
try_to_free_pages(). The normal action should be that kswapd happily
throws out pages at the same rate they are needed, so that any other
process never needs to get into try_to_free_pages() at all. Whenever you
see processes that actually try to synchronously free memory, you're much
much too low on memory already.

At least that's the idea, and that's why I thought your patch was not
right.

I do know that my system feels a _lot_ better with recent kernels, now
that the main heavy lifting is done by kswapd. Interactive performance is
just great (and yes, I have half a gig of RAM, but I still page heavily
occasionally), so I'm fairly certain that this is basically the right
approach.

But whether kswapd should go page-synchronous at some point? Maybe. I can
see arguments both for and against (the "for" argument is that we prefer
to have more intense bouts of IO followed by a nice clean wait, while the
"against" argument is that maybe we want to spread out the thing).

Still looking for more argument..

		Linus
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:21           ` Linus Torvalds
@ 1998-11-18  1:41             ` Linus Torvalds
  1998-11-18  8:58               ` Rik van Riel
  1998-11-18  9:19             ` Stephen C. Tweedie
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 1998-11-18 1:41 UTC
To: Stephen C. Tweedie; +Cc: Rik van Riel, Jeffrey Hundstad, Linux MM

On Tue, 17 Nov 1998, Linus Torvalds wrote:
>
> But whether kswapd should go page-synchronous at some point? Maybe. I can
> see arguments both for and against (the "for" argument is that we prefer
> to have more intense bouts of IO followed by a nice clean wait, while the
> "against" argument is that maybe we want to spread out the thing).

Oh, well, I'm currently leaning for "for", which means your patch to
page_io.c is what I have now.. I don't like "trickling" pages by
running out of requests or something like that, so having the
occasional nice wait is probably best.

		Linus
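[Editorial sketch, not part of the original message.] The "occasional
nice wait" can be pictured as a small decision function: a swap write
stays asynchronous until too many async pages are in flight, and only
then does the writer wait on the page. The function name
swap_write_should_wait() and its parameters are hypothetical, and the
exact comparison used in the real page_io.c patch is not shown in this
thread, so treat the threshold test as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a swap write should be waited on.
 *   caller_wants_wait: the caller asked for synchronous IO
 *   nr_async_pages:    async swap pages currently in flight
 *   swap_cluster:      assumed cap on pages in flight
 */
bool swap_write_should_wait(bool caller_wants_wait,
                            int nr_async_pages, int swap_cluster)
{
    assert(swap_cluster > 0);

    if (caller_wants_wait)
        return true;            /* synchronous callers always wait */

    /* Too much already in transit: turn this write into a wait so
     * the submitter pauses instead of flooding the request queue. */
    return nr_async_pages > swap_cluster;
}
```

Under this model the writer alternates between bursts of async
submissions and a wait, which is the "for" side of the argument above.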
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:41             ` Linus Torvalds
@ 1998-11-18  8:58               ` Rik van Riel
  0 siblings, 0 replies; 9+ messages in thread
From: Rik van Riel @ 1998-11-18 8:58 UTC
To: Linus Torvalds; +Cc: Stephen C. Tweedie, Jeffrey Hundstad, Linux MM

On Tue, 17 Nov 1998, Linus Torvalds wrote:
> On Tue, 17 Nov 1998, Linus Torvalds wrote:
> >
> > But whether kswapd should go page-synchronous at some point? Maybe. I can
> > see arguments both for and against (the "for" argument is that we prefer
> > to have more intense bouts of IO followed by a nice clean wait, while the
> > "against" argument is that maybe we want to spread out the thing).
>
> Oh, well, I'm currently leaning for "for", which means your patch to
> page_io.c is what I have now.. I don't like "trickling" pages by
> running out of requests or something like that, so having the
> occasional nice wait is probably best.

It seems like you decided for my point of view before I woke
up again, so I'll just let you know that this is one of the
reasons why I submitted the original (2.1.90?) patch to you.

The other reason was that async, clustered swapouts have a
much higher bandwidth than synchronous swapouts. This means
we can do more swap I/O without getting into trouble.

The only request I have to make is that you use the sysctl
tuneable limit pager_daemon.swap_cluster as the limit. Doing
this will enable people to optimize their kswapd configuration
for multiple swap partitions or disks with loads of tagged
queues (or a shortage thereof).

I have found that setting that limit to
SWAP_CLUSTER_MAX * number_of_highest_priority_swap_areas
doubled swapout performance, leaving 50% extra I/O bandwidth
for swapins.

cheers,

Rik -- slowly getting used to dvorak kbd layout...
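[Editorial sketch, not part of the original message.] Rik's rule of
thumb is plain arithmetic: count the swap areas that share the highest
priority and multiply by SWAP_CLUSTER_MAX. The helper below works that
out in user-space C. The struct and function names are hypothetical, 32
is an assumed value for SWAP_CLUSTER_MAX, and the fallback for a system
with no swap areas is an added assumption, not something the mail
states.

```c
#include <assert.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32   /* assumed value, for illustration only */

/* Hypothetical stand-in for a configured swap area. */
struct swap_area_model {
    int prio;                 /* higher value = preferred area */
};

/* Count how many areas share the highest priority. */
size_t count_highest_prio_areas(const struct swap_area_model *areas, size_t n)
{
    size_t i, count = 0;
    int best = 0;

    for (i = 0; i < n; i++) {
        if (count == 0 || areas[i].prio > best) {
            best = areas[i].prio;   /* new highest priority seen */
            count = 1;
        } else if (areas[i].prio == best) {
            count++;                /* another area at that priority */
        }
    }
    return count;
}

/* Suggested value for the swap_cluster limit, per the rule of thumb. */
int suggested_swap_cluster(const struct swap_area_model *areas, size_t n)
{
    size_t top;

    assert(areas != NULL || n == 0);
    top = count_highest_prio_areas(areas, n);
    if (top == 0)
        return SWAP_CLUSTER_MAX;    /* assumption: no swap, keep default */
    return SWAP_CLUSTER_MAX * (int)top;
}
```

With two areas at the same top priority and one lower-priority area, the
suggested limit under these assumptions is 2 * 32 = 64 pages.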
* Re: useless report -- perhaps memory allocation problems in 2.1.12[678]
  1998-11-18  1:21           ` Linus Torvalds
  1998-11-18  1:41             ` Linus Torvalds
@ 1998-11-18  9:19             ` Stephen C. Tweedie
  1 sibling, 0 replies; 9+ messages in thread
From: Stephen C. Tweedie @ 1998-11-18 9:19 UTC
To: Linus Torvalds
Cc: Stephen C. Tweedie, Rik van Riel, Jeffrey Hundstad, Linux MM

Hi,

On Tue, 17 Nov 1998 17:21:23 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:

> On Wed, 18 Nov 1998, Stephen C. Tweedie wrote:
> Yes. But in that case we already have __GFP_IO set, so in this case we
> _will_ wait synchronously.

Right.

> It's only kswapd that does this asynchronously as far as I can see, and
> it's ok for kswapd to not be that asynchronous. It just must not be _too_
> asynchronous - we must decide to start the requests at some point, to make
> sure there aren't too many things in transit.

That's exactly my concern, and if it's only kswapd which is using the
async code then I don't think it matters too much _where_ we do the
nr_async_pages check.

> So the difference in behaviour then becomes one of "does kswapd actually
> start to synchronously wait on certain pages when it's done a lot of
> asynchronous requests" or "should kswapd just make sure that the async
> requests go out in an orderly manner"?

There's a related question: should kswapd keep on swapping at all once
it has submitted enough async IO? Beyond a certain point we _know_
that these pages will become free; swapping even more dirty pages
won't help us. There's only any point in kswapd carrying on if we
restrict ourselves to unmapping clean pages: that's the only way we'll
actually increase the free page count right now.

So, should try_to_swap_out skip dirty pages if nr_async_pages is too
high? This sounds like an attractive answer, if we are below
freepages.low, because it will let kswapd find free memory for
interrupt traffic. If we aren't that low in memory then we don't want
to be unnecessarily unfair to clean pages.

I'm off to SANE now --- back next week.

--Stephen
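[Editorial sketch, not part of the original message.] Stephen's closing
question can be written down as a predicate: skip a dirty page only when
plenty of async swap IO is already in flight and free memory is below
the low-water mark. The function should_skip_page() and all of its
parameters are hypothetical names for illustration; the thread does not
say whether or how this was implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Should the swap-out scan pass over this page?
 *   page_dirty:      writing it out would queue yet more IO
 *   nr_async_pages:  async swap pages already in flight
 *   async_limit:     assumed cap on pages in flight
 *   nr_free_pages:   current free page count
 *   freepages_low:   low-water mark for free pages
 */
bool should_skip_page(bool page_dirty, int nr_async_pages, int async_limit,
                      int nr_free_pages, int freepages_low)
{
    assert(async_limit >= 0 && freepages_low >= 0);

    if (!page_dirty)
        return false;   /* clean pages can be freed right now */
    if (nr_async_pages <= async_limit)
        return false;   /* room for more async IO: write it out */

    /* Only be unfair to clean pages when memory is really tight. */
    return nr_free_pages < freepages_low;
}
```

The last test reflects the caveat in the mail: when memory is not that
low, dirty pages are still written out so that clean pages are not
penalised unnecessarily.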
Thread overview: 9+ messages
     [not found] <199811131746.LAA23512@mail.mankato.msus.edu>
1998-11-16 14:27 ` useless report -- perhaps memory allocation problems in 2.1.12[678] Rik van Riel
1998-11-17 11:21   ` Stephen C. Tweedie
1998-11-17 20:18     ` Rik van Riel
1998-11-17 23:14       ` Linus Torvalds
1998-11-18  1:09         ` Stephen C. Tweedie
1998-11-18  1:21           ` Linus Torvalds
1998-11-18  1:41             ` Linus Torvalds
1998-11-18  8:58               ` Rik van Riel
1998-11-18  9:19             ` Stephen C. Tweedie