* [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Rik van Riel @ 2007-09-05 23:01 UTC
To: Linux kernel Mailing List; +Cc: linux-mm, akpm, safari-kernel
[-- Attachment #1: Type: text/plain, Size: 1281 bytes --]
The current VM can get itself into trouble fairly easily on systems
with a small ZONE_HIGHMEM, which is common on i686 computers with
1GB of memory.

On one side, page_alloc() will allocate down to zone->pages_low,
while on the other side, kswapd() and balance_pgdat() will try
to free memory from every zone, until every zone has more free
pages than zone->pages_high.

Highmem can be filled up to zone->pages_low with page tables,
ramfs, vmalloc allocations and other unswappable things quite
easily and without many bad side effects, since we still have
a huge ZONE_NORMAL to do future allocations from.

However, as long as the number of free pages in the highmem
zone is below zone->pages_high, kswapd will continue swapping
things out from ZONE_NORMAL, too!

Sami Farin managed to get his system into a state where kswapd
had freed about 700MB of low memory and was still "going strong".

The attached patch will make kswapd stop paging out data from
zones when there is more than enough memory free. We do go above
zone->pages_high in order to keep pressure between zones equal
in normal circumstances, but the patch should prevent the kind
of excesses that made Sami's computer totally unusable.

Please merge this into -mm.

Signed-off-by: Rik van Riel <riel@redhat.com>
[-- Attachment #2: linux-2.6-excessive-pageout.patch --]
[-- Type: text/x-patch, Size: 714 bytes --]
--- linux-2.6.22.noarch/mm/vmscan.c.excessive	2007-09-05 12:19:49.000000000 -0400
+++ linux-2.6.22.noarch/mm/vmscan.c	2007-09-05 12:21:40.000000000 -0400
@@ -1371,7 +1371,13 @@ loop_again:
 			temp_priority[i] = priority;
 			sc.nr_scanned = 0;
 			note_zone_scanning_priority(zone, priority);
-			nr_reclaimed += shrink_zone(priority, zone, &sc);
+			/*
+			 * We put equal pressure on every zone, unless one
+			 * zone has way too many pages free already.
+			 */
+			if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
+						end_zone, 0))
+				nr_reclaimed += shrink_zone(priority, zone, &sc);
 			reclaim_state->reclaimed_slab = 0;
 			nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
 						lru_pages);
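
For a concrete sense of what the new check does, here is a simplified
userspace model of the per-zone decision. The zone sizes and watermarks
are invented for the example, and zone_watermark_ok() below is a
stripped-down stand-in for the real helper in mm/page_alloc.c (no order
or lowmem_reserve handling); only the 8*pages_high threshold comes from
the patch.

/*
 * Simplified userspace model of the decision the patch adds; not
 * kernel code.  The zone sizes and watermarks are invented, and
 * zone_watermark_ok() here is a stripped-down stand-in for the
 * real mm/page_alloc.c helper.
 */
#include <stdio.h>

struct zone {
	const char *name;
	long free_pages;	/* pages currently free in this zone */
	long pages_high;	/* the zone's high watermark */
};

static int zone_watermark_ok(const struct zone *z, long mark)
{
	return z->free_pages > mark;
}

int main(void)
{
	/* roughly Sami's situation: lowmem already over-freed,
	 * highmem still short of its high watermark */
	struct zone zones[] = {
		{ "HighMem",   2000,  300 },
		{ "Normal",  180000, 2500 },
	};

	for (int i = 0; i < 2; i++) {
		const struct zone *z = &zones[i];
		if (!zone_watermark_ok(z, 8 * z->pages_high))
			printf("%-8s below 8*pages_high -> shrink_zone()\n",
			       z->name);
		else
			printf("%-8s far above 8*pages_high -> skipped\n",
			       z->name);
	}
	return 0;
}

With the check in place, the already over-freed Normal zone is skipped
while kswapd keeps putting pressure on HighMem; without it, kswapd
would keep shrinking Normal too, for as long as HighMem stayed below
its pages_high.
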
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Andrew Morton @ 2007-09-06 1:23 UTC
To: Rik van Riel; +Cc: linux-kernel, linux-mm, safari-kernel
> On Wed, 05 Sep 2007 19:01:25 -0400 Rik van Riel <riel@redhat.com> wrote:
> The current VM can get itself into trouble fairly easily on systems
> with a small ZONE_HIGHMEM, which is common on i686 computers with
> 1GB of memory.
>
> On one side, page_alloc() will allocate down to zone->pages_low,
> while on the other side, kswapd() and balance_pgdat() will try
> to free memory from every zone, until every zone has more free
> pages than zone->pages_high.
>
> Highmem can be filled up to zone->pages_low with page tables,
> ramfs, vmalloc allocations and other unswappable things quite
> easily and without many bad side effects, since we still have
> a huge ZONE_NORMAL to do future allocations from.
>
> However, as long as the number of free pages in the highmem
> zone is below zone->pages_high, kswapd will continue swapping
> things out from ZONE_NORMAL, too!
crap. I guess suitably-fashioned mlock could do the same thing.
> Sami Farin managed to get his system into a stage where kswapd
> had freed about 700MB of low memory and was still "going strong".
>
> The attached patch will make kswapd stop paging out data from
> zones when there is more than enough memory free.
hm. Did highmem's all_unreclaimable get set? If so perhaps we could use
that in some way.
> We do go above
> zone->pages_high in order to keep pressure between zones equal
> in normal circumstances, but the patch should prevent the kind
> of excesses that made Sami's computer totally unusable.
>
> Please merge this into -mm.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
>
> [linux-2.6-excessive-pageout.patch text/x-patch (715B)]
> --- linux-2.6.22.noarch/mm/vmscan.c.excessive	2007-09-05 12:19:49.000000000 -0400
> +++ linux-2.6.22.noarch/mm/vmscan.c	2007-09-05 12:21:40.000000000 -0400
> @@ -1371,7 +1371,13 @@ loop_again:
>  			temp_priority[i] = priority;
>  			sc.nr_scanned = 0;
>  			note_zone_scanning_priority(zone, priority);
> -			nr_reclaimed += shrink_zone(priority, zone, &sc);
> +			/*
> +			 * We put equal pressure on every zone, unless one
> +			 * zone has way too many pages free already.
> +			 */
> +			if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
> +						end_zone, 0))
> +				nr_reclaimed += shrink_zone(priority, zone, &sc);
>  			reclaim_state->reclaimed_slab = 0;
>  			nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
>  						lru_pages);
I guess for a very small upper zone and a very large lower zone this could
still put the scan balancing out of whack, fixable by a smarter version of
"8*zone->pages_high" but it doesn't seem very likely that this will affect
things much.
Why doesn't direct reclaim need similar treatment?
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Rik van Riel @ 2007-09-06 16:38 UTC
To: Andrew Morton; +Cc: linux-kernel, linux-mm, safari-kernel
Andrew Morton wrote:
> I guess for a very small upper zone and a very large lower zone this could
> still put the scan balancing out of whack, fixable by a smarter version of
> "8*zone->pages_high" but it doesn't seem very likely that this will affect
> things much.
>
> Why doesn't direct reclaim need similar treatment?
Because we only go into the direct reclaim path once
every zone is at or below zone->pages_low, and the
direct reclaim path will exit once we have freed more
than swap_cluster_max pages.
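
As a rough sketch of those two properties (the page counts below are
invented and the helpers are stand-ins, not the kernel code, though
SWAP_CLUSTER_MAX really is 32 in the kernel):

/*
 * Toy model of the two properties described above; not kernel code.
 * The page counts are invented; everything except SWAP_CLUSTER_MAX
 * is simplified for illustration.
 */
#include <stdio.h>

#define SWAP_CLUSTER_MAX 32

static long reclaim_pass(void)
{
	return 10;	/* pretend each pass frees 10 pages */
}

int main(void)
{
	long free_pages = 900;	/* invented */
	long pages_low = 1000;	/* invented */

	/* 1. Entry: the allocator only falls back to direct reclaim
	 *    once the zones it may use are at or below pages_low.    */
	if (free_pages > pages_low) {
		printf("enough memory free, no direct reclaim\n");
		return 0;
	}

	/* 2. Exit: direct reclaim returns as soon as it has freed
	 *    more than SWAP_CLUSTER_MAX pages, so a single call can
	 *    never strip hundreds of megabytes the way kswapd could. */
	long nr_reclaimed = 0;
	while (nr_reclaimed <= SWAP_CLUSTER_MAX)
		nr_reclaimed += reclaim_pass();

	printf("direct reclaim freed %ld pages and returned\n",
	       nr_reclaimed);
	return 0;
}

So even when a direct reclaimer does run, it frees on the order of a
few dozen pages per call, nothing like the hundreds of megabytes
kswapd managed to free in Sami's case.
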
--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Andrew Morton @ 2007-09-06 22:34 UTC
To: Rik van Riel; +Cc: linux-kernel, linux-mm, safari-kernel
> On Thu, 06 Sep 2007 12:38:13 -0400 Rik van Riel <riel@redhat.com> wrote:
> Andrew Morton wrote:
>
(What happened to the other stuff I said?)
> > I guess for a very small upper zone and a very large lower zone this could
> > still put the scan balancing out of whack, fixable by a smarter version of
> > "8*zone->pages_high" but it doesn't seem very likely that this will affect
> > things much.
> >
> > Why doesn't direct reclaim need similar treatment?
>
> Because we only go into the direct reclaim path once
> every zone is at or below zone->pages_low, and the
> direct reclaim path will exit once we have freed more
> than swap_cluster_max pages.
>
hm. Now I need to remember why direct-reclaim does that :(
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Rik van Riel @ 2007-09-06 22:47 UTC
To: Andrew Morton; +Cc: linux-kernel, linux-mm, safari-kernel
Andrew Morton wrote:
>> On Thu, 06 Sep 2007 12:38:13 -0400 Rik van Riel <riel@redhat.com> wrote:
>> Andrew Morton wrote:
>
> (What happened to the other stuff I said?)

Mlock can cause the problem too.  As for all_unreclaimable,
it is ignored when priority == DEF_PRIORITY, and balance_pgdat
always seems to start out at that priority.
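
A small illustration of that point: DEF_PRIORITY and all_unreclaimable
are the kernel's names, but everything else below is an invented
userspace stand-in for the priority loop in balance_pgdat(), not the
kernel code itself.

/*
 * Why all_unreclaimable does not stop the first pass: balance_pgdat()
 * restarts its priority loop at DEF_PRIORITY, and the skip only
 * applies at lower priorities.  Userspace stand-in, not kernel code.
 */
#include <stdio.h>

#define DEF_PRIORITY 12

int main(void)
{
	int all_unreclaimable = 1;	/* zone already marked hopeless */

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		if (all_unreclaimable && priority != DEF_PRIORITY) {
			printf("priority %d: zone skipped\n", priority);
			continue;
		}
		/* at DEF_PRIORITY the zone is scanned regardless */
		printf("priority %d: zone scanned\n", priority);
		break;	/* one round is enough for the illustration */
	}
	return 0;
}
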
>>> I guess for a very small upper zone and a very large lower zone this could
>>> still put the scan balancing out of whack, fixable by a smarter version of
>>> "8*zone->pages_high" but it doesn't seem very likely that this will affect
>>> things much.
>>>
>>> Why doesn't direct reclaim need similar treatment?
>> Because we only go into the direct reclaim path once
>> every zone is at or below zone->pages_low, and the
>> direct reclaim path will exit once we have freed more
>> than swap_cluster_max pages.
>>
>
> hm. Now I need to remember why direct-reclaim does that :(
This is done so the system does not end up with the first
process that goes into page reclaim staying there forever,
while the other processes in the system happily consume
the pages freed by that poor first process.
There may be other reasons, too.
--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Pavel Machek @ 2007-09-07 12:24 UTC
To: Rik van Riel; +Cc: Linux kernel Mailing List, linux-mm, akpm, safari-kernel
Hi!
> The current VM can get itself into trouble fairly easily on systems
> with a small ZONE_HIGHMEM, which is common on i686 computers with
> 1GB of memory.
>
> On one side, page_alloc() will allocate down to zone->pages_low,
> while on the other side, kswapd() and balance_pgdat() will try
> to free memory from every zone, until every zone has more free
> pages than zone->pages_high.
>
> Highmem can be filled up to zone->pages_low with page tables,
> ramfs, vmalloc allocations and other unswappable things quite
> easily and without many bad side effects, since we still have
> a huge ZONE_NORMAL to do future allocations from.
>
> However, as long as the number of free pages in the highmem
> zone is below zone->pages_high, kswapd will continue swapping
> things out from ZONE_NORMAL, too!
>
> Sami Farin managed to get his system into a state where kswapd
> had freed about 700MB of low memory and was still "going strong".
>
> The attached patch will make kswapd stop paging out data from
> zones when there is more than enough memory free. We do go above
> zone->pages_high in order to keep pressure between zones equal
> in normal circumstances, but the patch should prevent the kind
> of excesses that made Sami's computer totally unusable.
>
> Please merge this into -mm.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
> --- linux-2.6.22.noarch/mm/vmscan.c.excessive	2007-09-05 12:19:49.000000000 -0400
> +++ linux-2.6.22.noarch/mm/vmscan.c	2007-09-05 12:21:40.000000000 -0400
> @@ -1371,7 +1371,13 @@ loop_again:
>  			temp_priority[i] = priority;
>  			sc.nr_scanned = 0;
>  			note_zone_scanning_priority(zone, priority);
> -			nr_reclaimed += shrink_zone(priority, zone, &sc);
> +			/*
> +			 * We put equal pressure on every zone, unless one
> +			 * zone has way too many pages free already.
> +			 */

That does not seem right. Having empty HIGHMEM and full LOWMEM would
be very bad, right? We may stop freeing when there's enough LOWMEM
free, but not if there's only HIGHMEM free.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [PATCH] prevent kswapd from freeing excessive amounts of lowmem
From: Rik van Riel @ 2007-09-08 20:20 UTC
To: Pavel Machek; +Cc: Linux kernel Mailing List, linux-mm, akpm, safari-kernel
Pavel Machek wrote:
> Hi!
>
>> The current VM can get itself into trouble fairly easily on systems
>> with a small ZONE_HIGHMEM, which is common on i686 computers with
>> 1GB of memory.
>>
>> On one side, page_alloc() will allocate down to zone->pages_low,
>> while on the other side, kswapd() and balance_pgdat() will try
>> to free memory from every zone, until every zone has more free
>> pages than zone->pages_high.
>>
>> Highmem can be filled up to zone->pages_low with page tables,
>> ramfs, vmalloc allocations and other unswappable things quite
>> easily and without many bad side effects, since we still have
>> a huge ZONE_NORMAL to do future allocations from.
>>
>> However, as long as the number of free pages in the highmem
>> zone is below zone->pages_high, kswapd will continue swapping
>> things out from ZONE_NORMAL, too!
>>
>> Sami Farin managed to get his system into a state where kswapd
>> had freed about 700MB of low memory and was still "going strong".
>>
>> The attached patch will make kswapd stop paging out data from
>> zones when there is more than enough memory free. We do go above
>> zone->pages_high in order to keep pressure between zones equal
>> in normal circumstances, but the patch should prevent the kind
>> of excesses that made Sami's computer totally unusable.
>>
>> Please merge this into -mm.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>
>> --- linux-2.6.22.noarch/mm/vmscan.c.excessive	2007-09-05 12:19:49.000000000 -0400
>> +++ linux-2.6.22.noarch/mm/vmscan.c	2007-09-05 12:21:40.000000000 -0400
>> @@ -1371,7 +1371,13 @@ loop_again:
>>  			temp_priority[i] = priority;
>>  			sc.nr_scanned = 0;
>>  			note_zone_scanning_priority(zone, priority);
>> -			nr_reclaimed += shrink_zone(priority, zone, &sc);
>> +			/*
>> +			 * We put equal pressure on every zone, unless one
>> +			 * zone has way too many pages free already.
>> +			 */
>
> That does not seem right. Having empty HIGHMEM and full LOWMEM would
> be very bad, right? We may stop freeing when there's enough LOWMEM
> free, but not if there's only HIGHMEM free.

Please read the code this patch applies to.

The check I add makes the individual calls to shrink_zone()
conditional, so we do not call shrink_zone() for a zone that
already has a ton of free pages.  We still call shrink_zone()
for the other zones.
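
To make the per-zone behaviour concrete, here is a toy version of that
loop with numbers matching Pavel's worry: an almost empty HIGHMEM next
to a LOWMEM that is mostly free. The numbers are invented and this is
not the kernel loop, just an illustration of the shape of the check.

/*
 * Toy version of the per-zone decision for Pavel's scenario; the
 * zone sizes are invented and this is not the kernel loop itself.
 */
#include <stdio.h>

int main(void)
{
	const char *name[] = { "HighMem", "Normal" };
	long free_pages[]  = {       16,  200000 };	/* highmem nearly empty */
	long pages_high[]  = {      300,    2500 };

	for (int i = 0; i < 2; i++) {
		/* each zone is tested against its own watermark */
		if (free_pages[i] <= 8 * pages_high[i])
			printf("%-8s shrink_zone() still runs\n", name[i]);
		else
			printf("%-8s skipped, plenty of pages free\n", name[i]);
	}
	return 0;
}

A zone is only ever skipped because of its own free pages, so a nearly
empty HIGHMEM keeps getting reclaimed no matter how much free LOWMEM
there is.
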
--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is. Each group
calls the other unpatriotic.

Thread overview: 7 messages
2007-09-05 23:01 [PATCH] prevent kswapd from freeing excessive amounts of lowmem Rik van Riel
2007-09-06 1:23 ` Andrew Morton
2007-09-06 16:38 ` Rik van Riel
2007-09-06 22:34 ` Andrew Morton
2007-09-06 22:47 ` Rik van Riel
2007-09-07 12:24 ` Pavel Machek
2007-09-08 20:20 ` Rik van Riel