* [PATCH] mm: limit lowmem_reserve
[not found] ` <200604041235.59876.kernel@kolivas.org>
@ 2006-04-06 1:10 ` Con Kolivas
2006-04-06 1:29 ` Respin: " Con Kolivas
2006-04-07 6:25 ` Nick Piggin
0 siblings, 2 replies; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 1:10 UTC (permalink / raw)
To: Andrew Morton; +Cc: ck, Nick Piggin, linux list, linux-mm
It is possible with a low enough lowmem_reserve ratio to make
zone_watermark_ok always fail if the lower_zone is small enough.
Impose a lower limit on the ratio to only allow 1/4 of the lower_zone
size to be set as lowmem_reserve. This limit is hit in ZONE_DMA by changing
the default vmsplit on i386 even without changing the default sysctl values.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
---
mm/page_alloc.c | 24 +++++++++++++++++++++---
1 files changed, 21 insertions(+), 3 deletions(-)
Index: linux-2.6.17-rc1-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc1-mm1.orig/mm/page_alloc.c 2006-04-06 10:32:31.000000000 +1000
+++ linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-06 11:09:17.000000000 +1000
@@ -2566,14 +2566,32 @@ static void setup_per_zone_lowmem_reserv
 			zone->lowmem_reserve[j] = 0;

 			for (idx = j-1; idx >= 0; idx--) {
+				unsigned long max_reserve;
+				unsigned long reserve;
 				struct zone *lower_zone;

+				lower_zone = pgdat->node_zones + idx;
+				/*
+				 * Put an upper limit on the reserve at 1/4
+				 * the lower_zone size. This prevents large
+				 * zone size differences such as 3G VMSPLIT
+				 * or low sysctl values from making
+				 * zone_watermark_ok always fail. This
+				 * enforces a lower limit on the reserve_ratio
+				 */
+				max_reserve = lower_zone->present_pages / 4;
+
 				if (sysctl_lowmem_reserve_ratio[idx] < 1)
 					sysctl_lowmem_reserve_ratio[idx] = 1;
-
-				lower_zone = pgdat->node_zones + idx;
-				lower_zone->lowmem_reserve[j] = present_pages /
+				reserve = present_pages /
 					sysctl_lowmem_reserve_ratio[idx];
+				if (reserve > max_reserve) {
+					reserve = max_reserve;
+					sysctl_lowmem_reserve_ratio[idx] =
+						present_pages / max_reserve;
+				}
+
+				lower_zone->lowmem_reserve[j] = reserve;
 				present_pages += lower_zone->present_pages;
 			}
 		}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 1:10 ` [PATCH] mm: limit lowmem_reserve Con Kolivas
@ 2006-04-06 1:29 ` Con Kolivas
2006-04-06 2:43 ` Andrew Morton
2006-04-07 6:25 ` Nick Piggin
1 sibling, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 1:29 UTC (permalink / raw)
To: linux-kernel; +Cc: Andrew Morton, ck, Nick Piggin, linux-mm
Err, the zone needs to have some pages too, sorry.
Respin
---
It is possible with a low enough lowmem_reserve ratio to make
zone_watermark_ok fail repeatedly if the lower_zone is small enough.
Impose a lower limit on the ratio to only allow 1/4 of the lower_zone
size to be set as lowmem_reserve. This limit is hit in ZONE_DMA by changing
the default vmsplit on i386 even without changing the default sysctl values.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
---
mm/page_alloc.c | 24 +++++++++++++++++++++---
1 files changed, 21 insertions(+), 3 deletions(-)
Index: linux-2.6.17-rc1-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc1-mm1.orig/mm/page_alloc.c 2006-04-06 10:32:31.000000000 +1000
+++ linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-06 11:28:11.000000000 +1000
@@ -2566,14 +2566,32 @@ static void setup_per_zone_lowmem_reserv
 			zone->lowmem_reserve[j] = 0;

 			for (idx = j-1; idx >= 0; idx--) {
+				unsigned long max_reserve;
+				unsigned long reserve;
 				struct zone *lower_zone;

+				lower_zone = pgdat->node_zones + idx;
+				/*
+				 * Put an upper limit on the reserve at 1/4
+				 * the lower_zone size. This prevents large
+				 * zone size differences such as 3G VMSPLIT
+				 * or low sysctl values from making
+				 * zone_watermark_ok always fail. This
+				 * enforces a lower limit on the reserve_ratio
+				 */
+				max_reserve = lower_zone->present_pages / 4;
+
 				if (sysctl_lowmem_reserve_ratio[idx] < 1)
 					sysctl_lowmem_reserve_ratio[idx] = 1;
-
-				lower_zone = pgdat->node_zones + idx;
-				lower_zone->lowmem_reserve[j] = present_pages /
+				reserve = present_pages /
 					sysctl_lowmem_reserve_ratio[idx];
+				if (max_reserve && reserve > max_reserve) {
+					reserve = max_reserve;
+					sysctl_lowmem_reserve_ratio[idx] =
+						present_pages / max_reserve;
+				}
+
+				lower_zone->lowmem_reserve[j] = reserve;
 				present_pages += lower_zone->present_pages;
 			}
 		}
^ permalink raw reply [flat|nested] 19+ messages in thread
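A minimal userspace sketch of the respun hunk, for readers who want to try the numbers. The names capped_lowmem_reserve() and struct reserve_result are hypothetical and do not exist in the kernel; upper_pages stands in for present_pages in the patch. The max_reserve guard is the only code change from the first posting, and it presumably exists to avoid dividing by zero when the lower zone has fewer than four pages.

```c
#include <assert.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
struct reserve_result {
	unsigned long reserve;	/* pages held back in the lower zone */
	unsigned long ratio;	/* ratio after any adjustment */
};

/*
 * upper_pages: pages in the zone(s) above the lower zone
 *              (present_pages in the patch).
 * lower_pages: size of the lower zone in pages.
 */
static struct reserve_result
capped_lowmem_reserve(unsigned long upper_pages, unsigned long lower_pages,
		      unsigned long ratio)
{
	struct reserve_result r;
	unsigned long max_reserve = lower_pages / 4;

	if (ratio < 1)
		ratio = 1;
	r.reserve = upper_pages / ratio;
	r.ratio = ratio;

	/* max_reserve == 0 (empty or tiny zone) must skip the division. */
	if (max_reserve && r.reserve > max_reserve) {
		r.reserve = max_reserve;
		r.ratio = upper_pages / max_reserve;
	}

	/* The cap holds whenever the lower zone is big enough to have one. */
	assert(max_reserve == 0 || r.reserve <= max_reserve);
	return r;
}
```

With the illustrative values upper_pages = 786432, lower_pages = 4096 and ratio = 256, the reserve is capped at 1024 pages and the ratio is rewritten to 768.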
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 1:29 ` Respin: " Con Kolivas
@ 2006-04-06 2:43 ` Andrew Morton
2006-04-06 2:55 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2006-04-06 2:43 UTC (permalink / raw)
To: Con Kolivas; +Cc: linux-kernel, ck, nickpiggin, linux-mm
Con Kolivas <kernel@kolivas.org> wrote:
>
> It is possible with a low enough lowmem_reserve ratio to make
> zone_watermark_ok fail repeatedly if the lower_zone is small enough.
Is that actually a problem?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 2:43 ` Andrew Morton
@ 2006-04-06 2:55 ` Con Kolivas
2006-04-06 2:58 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 2:55 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, ck, nickpiggin, linux-mm
On Thursday 06 April 2006 12:43, Andrew Morton wrote:
> Con Kolivas <kernel@kolivas.org> wrote:
> > It is possible with a low enough lowmem_reserve ratio to make
> > zone_watermark_ok fail repeatedly if the lower_zone is small enough.
>
> Is that actually a problem?
Every single call to get_page_from_freelist will call on zone reclaim. It
seems a problem to me if every call to __alloc_pages will do that?
Cheers,
Con
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 2:55 ` Con Kolivas
@ 2006-04-06 2:58 ` Con Kolivas
2006-04-06 3:40 ` Andrew Morton
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 2:58 UTC (permalink / raw)
To: ck; +Cc: Andrew Morton, nickpiggin, linux-kernel, linux-mm
On Thursday 06 April 2006 12:55, Con Kolivas wrote:
> On Thursday 06 April 2006 12:43, Andrew Morton wrote:
> > Con Kolivas <kernel@kolivas.org> wrote:
> > > It is possible with a low enough lowmem_reserve ratio to make
> > > zone_watermark_ok fail repeatedly if the lower_zone is small enough.
> >
> > Is that actually a problem?
>
> Every single call to get_page_from_freelist will call on zone reclaim. It
> seems a problem to me if every call to __alloc_pages will do that?
every call to __alloc_pages of that zone I mean
Cheers,
Con
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 2:58 ` Con Kolivas
@ 2006-04-06 3:40 ` Andrew Morton
2006-04-06 4:36 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2006-04-06 3:40 UTC (permalink / raw)
To: Con Kolivas; +Cc: ck, nickpiggin, linux-kernel, linux-mm
Con Kolivas <kernel@kolivas.org> wrote:
>
> On Thursday 06 April 2006 12:55, Con Kolivas wrote:
> > On Thursday 06 April 2006 12:43, Andrew Morton wrote:
> > > Con Kolivas <kernel@kolivas.org> wrote:
> > > > It is possible with a low enough lowmem_reserve ratio to make
> > > > zone_watermark_ok fail repeatedly if the lower_zone is small enough.
> > >
> > > Is that actually a problem?
> >
> > Every single call to get_page_from_freelist will call on zone reclaim. It
> > seems a problem to me if every call to __alloc_pages will do that?
>
> every call to __alloc_pages of that zone I mean
>
One would need to check with the NUMA guys. zone_reclaim() has a
(lame-looking) timer in there to prevent it from doing too much work.
That, or I'm missing something. This problem wasn't particularly well
described, sorry.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 3:40 ` Andrew Morton
@ 2006-04-06 4:36 ` Con Kolivas
2006-04-06 4:52 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 4:36 UTC (permalink / raw)
To: Andrew Morton; +Cc: ck, nickpiggin, linux-kernel, linux-mm
On Thursday 06 April 2006 13:40, Andrew Morton wrote:
> Con Kolivas <kernel@kolivas.org> wrote:
> > On Thursday 06 April 2006 12:55, Con Kolivas wrote:
> > > On Thursday 06 April 2006 12:43, Andrew Morton wrote:
> > > > Con Kolivas <kernel@kolivas.org> wrote:
> > > > > It is possible with a low enough lowmem_reserve ratio to make
> > > > > zone_watermark_ok fail repeatedly if the lower_zone is small
> > > > > enough.
> > > >
> > > > Is that actually a problem?
> > >
> > > Every single call to get_page_from_freelist will call on zone reclaim.
> > > It seems a problem to me if every call to __alloc_pages will do that?
> >
> > every call to __alloc_pages of that zone I mean
>
> One would need to check with the NUMA guys. zone_reclaim() has a
> (lame-looking) timer in there to prevent it from doing too much work.
>
> That, or I'm missing something. This problem wasn't particularly well
> described, sorry.
Ah ok. This all came about because I'm trying to honour the lowmem_reserve
better in swap_prefetch at Nick's request. It's hard to honour a watermark
that on some configurations is never reached.
Cheers,
Con
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Respin: [PATCH] mm: limit lowmem_reserve
2006-04-06 4:36 ` Con Kolivas
@ 2006-04-06 4:52 ` Con Kolivas
0 siblings, 0 replies; 19+ messages in thread
From: Con Kolivas @ 2006-04-06 4:52 UTC (permalink / raw)
To: ck; +Cc: Andrew Morton, nickpiggin, linux-kernel, linux-mm
On Thursday 06 April 2006 14:36, Con Kolivas wrote:
> On Thursday 06 April 2006 13:40, Andrew Morton wrote:
> > Con Kolivas <kernel@kolivas.org> wrote:
> > > On Thursday 06 April 2006 12:55, Con Kolivas wrote:
> > > > On Thursday 06 April 2006 12:43, Andrew Morton wrote:
> > > > > Con Kolivas <kernel@kolivas.org> wrote:
> > > > > > It is possible with a low enough lowmem_reserve ratio to make
> > > > > > zone_watermark_ok fail repeatedly if the lower_zone is small
> > > > > > enough.
> > > > >
> > > > > Is that actually a problem?
> > > >
> > > > Every single call to get_page_from_freelist will call on zone
> > > > reclaim. It seems a problem to me if every call to __alloc_pages will
> > > > do that?
> > >
> > > every call to __alloc_pages of that zone I mean
> >
> > One would need to check with the NUMA guys. zone_reclaim() has a
> > (lame-looking) timer in there to prevent it from doing too much work.
> >
> > That, or I'm missing something. This problem wasn't particularly well
> > described, sorry.
>
> Ah ok. This all came about because I'm trying to honour the lowmem_reserve
> better in swap_prefetch at Nick's request. It's hard to honour a watermark
> that on some configurations is never reached.
Forget that. If the numa people don't care about it I shouldn't touch it. I
thought I was doing something helpful at the source but got no response from
Nick or the other numa_ids out there so they obviously don't care. I'll
tackle it differently in swap prefetch.
Cheers,
Con
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-06 1:10 ` [PATCH] mm: limit lowmem_reserve Con Kolivas
2006-04-06 1:29 ` Respin: " Con Kolivas
@ 2006-04-07 6:25 ` Nick Piggin
2006-04-07 9:02 ` Con Kolivas
1 sibling, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2006-04-07 6:25 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> It is possible with a low enough lowmem_reserve ratio to make
> zone_watermark_ok always fail if the lower_zone is small enough.
I don't see how this would happen?
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-07 6:25 ` Nick Piggin
@ 2006-04-07 9:02 ` Con Kolivas
2006-04-07 12:40 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-07 9:02 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, ck, linux list, linux-mm
On Friday 07 April 2006 16:25, Nick Piggin wrote:
> Con Kolivas wrote:
> > It is possible with a low enough lowmem_reserve ratio to make
> > zone_watermark_ok always fail if the lower_zone is small enough.
>
> I don't see how this would happen?
3GB lowmem and a reserve ratio of 180 is enough to do it.
Cheers,
Con
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-07 9:02 ` Con Kolivas
@ 2006-04-07 12:40 ` Nick Piggin
2006-04-08 0:15 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2006-04-07 12:40 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> On Friday 07 April 2006 16:25, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>It is possible with a low enough lowmem_reserve ratio to make
>>>zone_watermark_ok always fail if the lower_zone is small enough.
>>
>>I don't see how this would happen?
>
>
> 3GB lowmem and a reserve ratio of 180 is enough to do it.
>
How would zone_watermark_ok always fail though?
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-07 12:40 ` Nick Piggin
@ 2006-04-08 0:15 ` Con Kolivas
2006-04-08 0:55 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-08 0:15 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, ck, linux list, linux-mm
On Friday 07 April 2006 22:40, Nick Piggin wrote:
> Con Kolivas wrote:
> > On Friday 07 April 2006 16:25, Nick Piggin wrote:
> >>Con Kolivas wrote:
> >>>It is possible with a low enough lowmem_reserve ratio to make
> >>>zone_watermark_ok always fail if the lower_zone is small enough.
> >>
> >>I don't see how this would happen?
> >
> > 3GB lowmem and a reserve ratio of 180 is enough to do it.
>
> How would zone_watermark_ok always fail though?
Withdrew this patch a while back; ignore
--
-ck
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-08 0:15 ` Con Kolivas
@ 2006-04-08 0:55 ` Nick Piggin
2006-04-08 1:01 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2006-04-08 0:55 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> On Friday 07 April 2006 22:40, Nick Piggin wrote:
>
>>How would zone_watermark_ok always fail though?
>
>
> Withdrew this patch a while back; ignore
>
Well, whether or not that particular patch is a good idea, it
is definitely a bug if zone_watermark_ok could ever always
fail due to lowmem reserve and we should fix it.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-08 0:55 ` Nick Piggin
@ 2006-04-08 1:01 ` Con Kolivas
2006-04-08 1:25 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-04-08 1:01 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, ck, linux list, linux-mm
On Saturday 08 April 2006 10:55, Nick Piggin wrote:
> Con Kolivas wrote:
> > On Friday 07 April 2006 22:40, Nick Piggin wrote:
> >>How would zone_watermark_ok always fail though?
> >
> > Withdrew this patch a while back; ignore
>
> Well, whether or not that particular patch is a good idea, it
> is definitely a bug if zone_watermark_ok could ever always
> fail due to lowmem reserve and we should fix it.
Ok. I think I presented enough information for why I thought zone_watermark_ok
would fail (for ZONE_DMA). With 16MB ZONE_DMA and a vmsplit of 3GB we have a
lowmem_reserve of 12MB. It's pretty hard to keep that much ZONE_DMA free, I
don't think I've ever seen that much free on my ZONE_DMA on an ordinary
desktop without any particular ZONE_DMA users. Changing the tunable can make
the lowmem_reserve larger than ZONE_DMA is on any vmsplit too as far as I
understand the ratio.
--
-ck
^ permalink raw reply [flat|nested] 19+ messages in thread
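The 12MB figure can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: 4 KiB pages, a ratio of 256 for ZONE_DMA (assumed to be the default sysctl value; the thread does not state it), and the higher zone rounded up to the full 3GB. The helper names mib_to_pages() and lowmem_reserve_pages() are hypothetical.

```c
#include <assert.h>

#define PAGE_SIZE_BYTES 4096UL	/* assumption: 4 KiB pages, as on i386 */

/* Convert a size in MiB to a page count. */
static unsigned long mib_to_pages(unsigned long mib)
{
	return mib * (1024UL * 1024UL / PAGE_SIZE_BYTES);
}

/*
 * Reserve as computed by the unpatched loop: pages in the higher
 * zone(s) divided by the ratio of the lower zone.
 */
static unsigned long lowmem_reserve_pages(unsigned long higher_zone_pages,
					  unsigned long ratio)
{
	assert(ratio >= 1);	/* the kernel clamps the sysctl to at least 1 */
	return higher_zone_pages / ratio;
}
```

Under these assumptions lowmem_reserve_pages(mib_to_pages(3072), 256) is 3072 pages, i.e. 12 MiB held back out of the mib_to_pages(16) = 4096 pages of ZONE_DMA.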
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-08 1:01 ` Con Kolivas
@ 2006-04-08 1:25 ` Nick Piggin
2006-05-17 14:11 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2006-04-08 1:25 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> On Saturday 08 April 2006 10:55, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>On Friday 07 April 2006 22:40, Nick Piggin wrote:
>>>
>>>>How would zone_watermark_ok always fail though?
>>>
>>>Withdrew this patch a while back; ignore
>>
>>Well, whether or not that particular patch is a good idea, it
>>is definitely a bug if zone_watermark_ok could ever always
>>fail due to lowmem reserve and we should fix it.
>
>
> Ok. I think I presented enough information for why I thought zone_watermark_ok
> would fail (for ZONE_DMA). With 16MB ZONE_DMA and a vmsplit of 3GB we have a
> lowmem_reserve of 12MB. It's pretty hard to keep that much ZONE_DMA free, I
> don't think I've ever seen that much free on my ZONE_DMA on an ordinary
> desktop without any particular ZONE_DMA users. Changing the tunable can make
> the lowmem_reserve larger than ZONE_DMA is on any vmsplit too as far as I
> understand the ratio.
>
Umm, for ZONE_DMA allocations, ZONE_DMA isn't a lower zone. So that
12MB protection should never come into it (unless it is buggy?).
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
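Nick's point about the classzone can be illustrated with a simplified model of the watermark check. This is a sketch, not the kernel function: it handles order-0 requests only, ignores allocation flags, and struct model_zone, model_watermark_ok() and the page counts are all hypothetical. The reserve that applies depends on which zone the allocation was aimed at, so a request aimed at ZONE_DMA itself sees a reserve of 0.

```c
#include <assert.h>

#define MODEL_NR_ZONES 4	/* DMA, DMA32, NORMAL, HIGHMEM in this sketch */
enum { Z_DMA, Z_DMA32, Z_NORMAL, Z_HIGHMEM };

/* Hypothetical stand-in for struct zone; only the fields used here. */
struct model_zone {
	long free_pages;
	long lowmem_reserve[MODEL_NR_ZONES];
};

/*
 * Order-0 simplification: the zone passes only if its free pages
 * exceed the watermark plus the reserve it holds back against
 * allocations aimed at classzone_idx.
 */
static int model_watermark_ok(const struct model_zone *z, long mark,
			      int classzone_idx)
{
	assert(classzone_idx >= 0 && classzone_idx < MODEL_NR_ZONES);
	return z->free_pages > mark + z->lowmem_reserve[classzone_idx];
}
```

For a ZONE_DMA with 2000 free pages, a watermark of 32 and a reserve array of { 0, 0, 3072, 3072 }, the check passes with classzone_idx = Z_DMA and fails with classzone_idx = Z_NORMAL.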
* Re: [PATCH] mm: limit lowmem_reserve
2006-04-08 1:25 ` Nick Piggin
@ 2006-05-17 14:11 ` Con Kolivas
2006-05-18 7:11 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-05-17 14:11 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, ck, linux list, linux-mm
I hate to resuscitate this old thread, sorry, but I'm still not sure we
resolved it, and I want to check whether this issue really exists as I see it.
On Saturday 08 April 2006 11:25, Nick Piggin wrote:
> Con Kolivas wrote:
> > Ok. I think I presented enough information for why I thought
> > zone_watermark_ok would fail (for ZONE_DMA). With 16MB ZONE_DMA and a
> > vmsplit of 3GB we have a lowmem_reserve of 12MB. It's pretty hard to keep
> > that much ZONE_DMA free, I don't think I've ever seen that much free on
> > my ZONE_DMA on an ordinary desktop without any particular ZONE_DMA users.
> > Changing the tunable can make the lowmem_reserve larger than ZONE_DMA is
> > on any vmsplit too as far as I understand the ratio.
>
> Umm, for ZONE_DMA allocations, ZONE_DMA isn't a lower zone. So that
> 12MB protection should never come into it (unless it is buggy?).
An i386 PC with a 3GB split will have approx
4000 pages ZONE_DMA
and the lowmem_reserve for ZONE_DMA will be set to approx
0 0 3000 3000
So if we call zone_watermark_ok with zone of ZONE_DMA and a classzone_idx of a
ZONE_NORMAL we will fail a zone_watermark_ok test almost always since it's
almost impossible to have 3000 free ZONE_DMA pages. I believe it can happen
like this:
In balance_pgdat (vmscan.c:1116) if we end up with end_zone being a
ZONE_NORMAL zone, then during the scan below we (vmscan.c:1137) iterate over
all zones from 0 to end_zone and (vmscan.c:1147) we end up calling
if (!zone_watermark_ok(zone, order, zone->pages_high, end_zone, 0))
which would now call zone_watermark_ok with zone being a ZONE_DMA, and
end_zone being the idx of a ZONE_NORMAL.
So in summary if I'm not mistaken (and I'm good at being mistaken), if we
balance pgdat and find that ZONE_NORMAL or higher needs scanning, we'll end
up trying to flush the crap out of ZONE_DMA.
On my test case this indeed happens and my ZONE_DMA never goes below 3000
pages free. If I lower the reserve even further my pages free gets stuck at
3208 and can't free any more, and doesn't ever drop below that either.
Here is the patch I was proposing
---
It is possible with a low enough lowmem_reserve ratio to make
zone_watermark_ok fail repeatedly if the lower_zone is small enough.
Impose a lower limit on the ratio to only allow 1/4 of the lower_zone
size to be set as lowmem_reserve. This limit is hit in ZONE_DMA by changing
the default vmsplit on i386 even without changing the default sysctl values.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
---
mm/page_alloc.c | 24 +++++++++++++++++++++---
1 files changed, 21 insertions(+), 3 deletions(-)
Index: linux-2.6.17-rc1-mm1/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc1-mm1.orig/mm/page_alloc.c 2006-04-06 10:32:31.000000000 +1000
+++ linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-06 11:28:11.000000000 +1000
@@ -2566,14 +2566,32 @@ static void setup_per_zone_lowmem_reserv
 			zone->lowmem_reserve[j] = 0;

 			for (idx = j-1; idx >= 0; idx--) {
+				unsigned long max_reserve;
+				unsigned long reserve;
 				struct zone *lower_zone;

+				lower_zone = pgdat->node_zones + idx;
+				/*
+				 * Put an upper limit on the reserve at 1/4
+				 * the lower_zone size. This prevents large
+				 * zone size differences such as 3G VMSPLIT
+				 * or low sysctl values from making
+				 * zone_watermark_ok always fail. This
+				 * enforces a lower limit on the reserve_ratio
+				 */
+				max_reserve = lower_zone->present_pages / 4;
+
 				if (sysctl_lowmem_reserve_ratio[idx] < 1)
 					sysctl_lowmem_reserve_ratio[idx] = 1;
-
-				lower_zone = pgdat->node_zones + idx;
-				lower_zone->lowmem_reserve[j] = present_pages /
+				reserve = present_pages /
 					sysctl_lowmem_reserve_ratio[idx];
+				if (max_reserve && reserve > max_reserve) {
+					reserve = max_reserve;
+					sysctl_lowmem_reserve_ratio[idx] =
+						present_pages / max_reserve;
+				}
+
+				lower_zone->lowmem_reserve[j] = reserve;
 				present_pages += lower_zone->present_pages;
 			}
 		}
--
-ck
^ permalink raw reply [flat|nested] 19+ messages in thread
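The scan described in the message above can be modelled roughly as follows. This is a userspace sketch under stated assumptions, not the vmscan.c code: it covers order-0 only, skips zones with no pages, and struct model_zone, zones_failing_watermark() and all page counts are hypothetical. It shows how passing end_zone as the classzone index makes a small ZONE_DMA fail the check once end_zone is ZONE_NORMAL.

```c
#include <assert.h>

#define MODEL_NR_ZONES 4	/* DMA, DMA32, NORMAL, HIGHMEM in this sketch */
enum { Z_DMA, Z_DMA32, Z_NORMAL, Z_HIGHMEM };

/* Hypothetical stand-in for struct zone; only the fields used here. */
struct model_zone {
	long present_pages;
	long free_pages;
	long pages_high;
	long lowmem_reserve[MODEL_NR_ZONES];
};

/*
 * Returns a bitmask of the zones 0..end_zone that fail the order-0
 * watermark check when end_zone is used as the classzone index.
 */
static unsigned int zones_failing_watermark(const struct model_zone *zones,
					    int end_zone)
{
	unsigned int mask = 0;
	int i;

	assert(end_zone >= 0 && end_zone < MODEL_NR_ZONES);
	for (i = 0; i <= end_zone; i++) {
		const struct model_zone *z = &zones[i];

		if (z->present_pages == 0)
			continue;	/* unpopulated zone, nothing to scan */
		if (z->free_pages <= z->pages_high + z->lowmem_reserve[end_zone])
			mask |= 1u << i;
	}
	return mask;
}
```

With a 4096-page ZONE_DMA holding 2000 free pages and a reserve of 3072 against ZONE_NORMAL, end_zone = Z_NORMAL flags ZONE_DMA for reclaim while end_zone = Z_DMA does not.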
* Re: [PATCH] mm: limit lowmem_reserve
2006-05-17 14:11 ` Con Kolivas
@ 2006-05-18 7:11 ` Nick Piggin
2006-05-18 7:21 ` Con Kolivas
0 siblings, 1 reply; 19+ messages in thread
From: Nick Piggin @ 2006-05-18 7:11 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> I hate to resuscitate this old thread, sorry, but I'm still not sure we
> resolved it, and I want to check whether this issue really exists as I see it.
>
OK, reclaim is slightly different.
> On Saturday 08 April 2006 11:25, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>Ok. I think I presented enough information for why I thought
>>>zone_watermark_ok would fail (for ZONE_DMA). With 16MB ZONE_DMA and a
>>>vmsplit of 3GB we have a lowmem_reserve of 12MB. It's pretty hard to keep
>>>that much ZONE_DMA free, I don't think I've ever seen that much free on
>>>my ZONE_DMA on an ordinary desktop without any particular ZONE_DMA users.
>>>Changing the tunable can make the lowmem_reserve larger than ZONE_DMA is
>>>on any vmsplit too as far as I understand the ratio.
>>
>>Umm, for ZONE_DMA allocations, ZONE_DMA isn't a lower zone. So that
>>12MB protection should never come into it (unless it is buggy?).
>
>
> An i386 PC with a 3GB split will have approx
>
> 4000 pages ZONE_DMA
>
> and the lowmem_reserve for ZONE_DMA will be set to approx
>
> 0 0 3000 3000
>
> So if we call zone_watermark_ok with zone of ZONE_DMA and a classzone_idx of a
> ZONE_NORMAL we will fail a zone_watermark_ok test almost always since it's
> almost impossible to have 3000 free ZONE_DMA pages. I believe it can happen
> like this:
>
> In balance_pgdat (vmscan.c:1116) if we end up with end_zone being a
> ZONE_NORMAL zone, then during the scan below we (vmscan.c:1137) iterate over
> all zones from 0 to end_zone and (vmscan.c:1147) we end up calling
>
> if (!zone_watermark_ok(zone, order, zone->pages_high, end_zone, 0))
>
> which would now call zone_watermark_ok with zone being a ZONE_DMA, and
> end_zone being the idx of a ZONE_NORMAL.
>
> So in summary if I'm not mistaken (and I'm good at being mistaken), if we
> balance pgdat and find that ZONE_NORMAL or higher needs scanning, we'll end
> up trying to flush the crap out of ZONE_DMA.
If we're under memory pressure, kswapd will try to free up any candidate
zone, yes.
>
> On my test case this indeed happens and my ZONE_DMA never goes below 3000
> pages free. If I lower the reserve even further my pages free gets stuck at
> 3208 and can't free any more, and doesn't ever drop below that either.
>
> Here is the patch I was proposing
What problem does that fix though?
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-05-18 7:11 ` Nick Piggin
@ 2006-05-18 7:21 ` Con Kolivas
2006-05-18 7:26 ` Nick Piggin
0 siblings, 1 reply; 19+ messages in thread
From: Con Kolivas @ 2006-05-18 7:21 UTC (permalink / raw)
To: Nick Piggin; +Cc: Andrew Morton, ck, linux list, linux-mm
On Thursday 18 May 2006 17:11, Nick Piggin wrote:
> If we're under memory pressure, kswapd will try to free up any candidate
> zone, yes.
>
> > On my test case this indeed happens and my ZONE_DMA never goes below 3000
> > pages free. If I lower the reserve even further my pages free gets stuck
> > at 3208 and can't free any more, and doesn't ever drop below that either.
> >
> > Here is the patch I was proposing
>
> What problem does that fix though?
It's a generic concern and I honestly don't know how significant it is which
is why I'm asking if it needs attention. That concern being that any time
we're under any sort of memory pressure, ZONE_DMA will undergo intense
reclaim even though there may not really be anything specifically going on in
ZONE_DMA. It just seems a waste of cycles doing that.
--
-ck
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm: limit lowmem_reserve
2006-05-18 7:21 ` Con Kolivas
@ 2006-05-18 7:26 ` Nick Piggin
0 siblings, 0 replies; 19+ messages in thread
From: Nick Piggin @ 2006-05-18 7:26 UTC (permalink / raw)
To: Con Kolivas; +Cc: Andrew Morton, ck, linux list, linux-mm
Con Kolivas wrote:
> On Thursday 18 May 2006 17:11, Nick Piggin wrote:
>
>>If we're under memory pressure, kswapd will try to free up any candidate
>>zone, yes.
>>
>>
>>>On my test case this indeed happens and my ZONE_DMA never goes below 3000
>>>pages free. If I lower the reserve even further my pages free gets stuck
>>>at 3208 and can't free any more, and doesn't ever drop below that either.
>>>
>>>Here is the patch I was proposing
>>
>>What problem does that fix though?
>
>
> It's a generic concern and I honestly don't know how significant it is which
> is why I'm asking if it needs attention. That concern being that any time
> we're under any sort of memory pressure, ZONE_DMA will undergo intense
> reclaim even though there may not really be anything specifically going on in
> ZONE_DMA. It just seems a waste of cycles doing that.
>
If it doesn't have any/much pagecache or slab cache in it, there won't be
intense reclaim; if it does then it can be reclaimed and the memory used.
reclaim / allocation could be slightly smarter about scaling watermarks,
however I don't think it is much of an issue at the moment.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2006-05-18 7:26 UTC | newest]
Thread overview: 19+ messages
[not found] <200604021401.13331.kernel@kolivas.org>
[not found] ` <200604031248.13532.kernel@kolivas.org>
[not found] ` <200604041235.59876.kernel@kolivas.org>
2006-04-06 1:10 ` [PATCH] mm: limit lowmem_reserve Con Kolivas
2006-04-06 1:29 ` Respin: " Con Kolivas
2006-04-06 2:43 ` Andrew Morton
2006-04-06 2:55 ` Con Kolivas
2006-04-06 2:58 ` Con Kolivas
2006-04-06 3:40 ` Andrew Morton
2006-04-06 4:36 ` Con Kolivas
2006-04-06 4:52 ` Con Kolivas
2006-04-07 6:25 ` Nick Piggin
2006-04-07 9:02 ` Con Kolivas
2006-04-07 12:40 ` Nick Piggin
2006-04-08 0:15 ` Con Kolivas
2006-04-08 0:55 ` Nick Piggin
2006-04-08 1:01 ` Con Kolivas
2006-04-08 1:25 ` Nick Piggin
2006-05-17 14:11 ` Con Kolivas
2006-05-18 7:11 ` Nick Piggin
2006-05-18 7:21 ` Con Kolivas
2006-05-18 7:26 ` Nick Piggin