From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@skynet.ie>,
Nicolas Mailhot <nicolas.mailhot@laposte.net>,
Christoph Lameter <clameter@sgi.com>,
akpm@linux-foundation.org,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] Have kswapd keep a minimum order free other than order-0
Date: Fri, 18 May 2007 12:25:00 +1000
Message-ID: <464D0E7C.5050509@yahoo.com.au>
In-Reply-To: <464C48F1.3060903@shadowen.org>
Andy Whitcroft wrote:
> Nick Piggin wrote:
>>>order-0 alloc
>>>watermark hit => wake kswapd
>>>order-0 alloc kswapd reclaiming order 0
>>>order-0 alloc kswapd reclaiming order 0
>>>order-3 alloc => kick kswapd for order 3
>>>order-0 alloc kswapd reclaiming order 0
>>>order-3 alloc kswapd reclaiming order 0
>>>order-3 alloc kswapd reclaiming order 0
>>>order-3 alloc => highorder mark hit, fail
>>>
>>>kswapd will keep reclaiming at order-0 until it completes a reclaim cycle
>>>and spots the new order and starts over again. So there is a potentially
>>>sizable window there where problems can hit. Right?
>>
>>Take a look at the code. wakeup_kswapd and __alloc_pages.
>>
>>First, assume the zone is above high watermarks for order-0 and order-1.
>>order-0 allocs...
>>order-1 low watermark hit => don't care, not allocing order-1
>>order-0 low watermark hit => wake kswapd reclaim order 0
>>order-1 alloc => wakeup_kswapd raises kswapd_max_order to 1
>>order-1 allocs continue to succeed until the min watermark is hit
>>order-1 *atomic* allocs continue until the atomic reserve is hit
>>order-1 memalloc allocs continue until no more order-1 pages left.
>
>
> This represents the ideal. However we never consider the reserves at
> order-1 unless we get an order-1 allocation. With lots of order-0
> allocations (the norm) we can run the order-1 availability well below
> even the atomic reserve without anyone noticing, while the total reserve
> is above the order-0 low watermark.
Yes, but my reply was addressing the misconception that kswapd never
has its reclaim-order updated while it is reclaiming for a lower order.
It is by design that we don't make order-0 allocations notice order-1
watermarks, so if there is some problem with that, then that is what
should be changed, rather than randomly breaking the watermarking code.
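To illustrate how the reclaim order can be raised while kswapd is still
busy at a lower order, here is a minimal userspace sketch. It is loosely
modelled on wakeup_kswapd and the kswapd_max_order field mentioned in the
quoted text. struct node_model, wakeup_kswapd_model, kswapd_pick_order and
the below_low_mark flag are hypothetical names for illustration only, and
the real sleep/wake path is collapsed into a single helper.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-node state kswapd looks at. */
struct node_model {
	int kswapd_max_order;	/* highest order requested since last pickup */
};

/*
 * Allocator side: called when an allocation of 'order' finds the zone
 * below its low watermark for that order. Only ever raises the order.
 */
static void wakeup_kswapd_model(struct node_model *n, int order,
				int below_low_mark)
{
	assert(order >= 0);
	if (!below_low_mark)
		return;		/* enough free memory, nothing to do */
	if (n->kswapd_max_order < order)
		n->kswapd_max_order = order;
}

/*
 * kswapd side: between balancing passes, pick up the highest order
 * requested so far and reset the field for the next round.
 */
static int kswapd_pick_order(struct node_model *n, int current_order)
{
	int new_order = n->kswapd_max_order;

	n->kswapd_max_order = 0;
	return new_order > current_order ? new_order : current_order;
}
```

In this model an order-1 wakeup that arrives during an order-0 pass is
not lost: it is recorded immediately and takes effect on the next pickup.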
> Here kswapd has been idle as there
> is only order-0 activity and we have sufficient of those. THEN an
> order-1 comes in, we are below the order-1 low watermarks, we wake
> kswapd, and retry and discover we are below the atomic threshold and
> _fail_ the allocation.
And that is by design because we don't want to have order-1 pages free
if there are only order-0 allocations.
Anyway, atomic allocations are able to fail gracefully, in which case
kswapd will be kicked for next time. Non-atomic allocations can enter
direct reclaim, so it isn't the end of the world.
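The situation being discussed (plenty of order-0 pages, almost no order-1
pages) can be reproduced with a small userspace sketch of a per-order
watermark check. It is loosely modelled on the kernel's
zone_watermark_ok(); the lowmem reserve and the atomic/high-priority
adjustments are left out, and struct zone_model and its helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>

#define MAX_ORDER 11

/* Hypothetical, simplified stand-in for a zone's free lists. */
struct zone_model {
	unsigned long nr_free[MAX_ORDER];	/* free blocks at each order */
};

/* Total free pages across all orders. */
static long zone_free_pages(const struct zone_model *z)
{
	long total = 0;

	for (int o = 0; o < MAX_ORDER; o++)
		total += (long)(z->nr_free[o] << o);
	return total;
}

/* Returns 1 if an allocation of 'order' passes the watermark 'mark'. */
static int watermark_ok(const struct zone_model *z, int order, long mark)
{
	long min = mark;
	long free_pages;

	assert(order >= 0 && order < MAX_ORDER);
	free_pages = zone_free_pages(z) - (1L << order) + 1;

	if (free_pages <= min)
		return 0;
	for (int o = 0; o < order; o++) {
		/* Pages in smaller blocks cannot satisfy this order. */
		free_pages -= (long)(z->nr_free[o] << o);
		/* Require fewer free pages at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return 0;
	}
	return 1;
}
```

With 1000 free order-0 pages, two free order-1 blocks and a mark of 128,
the order-0 check passes while the order-1 check fails, because the
order-0 check never looks at how the free pages are distributed.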
>>There really is (or should be) a proper watermarking system in place that
>>provides the right buffering for higher order allocations.
>
>
> I think that this is "should be", not "is".
Well you also said earlier that our problems are due to higher order
watermarks being too aggressive. So I think what is needed is to
actually work out what the real problem is first.
>>>I believe it failed to work due to a combination of kswapd reclaiming at
>>>the wrong order for a while and the fact that the watermarks are pretty
>>>aggressive when it comes to higher orders. I'm trying to think of
>>>alternative fixes but keep coming back to the current fix using
>>>!(alloc_flags & ALLOC_CPUSET) to allow !wait allocations to succeed if
>>>the memory is there and above min watermarks at order-0.
>>
>>kswapd reclaiming at the wrong order should be a bug. It should start
>>reclaiming at the right order as soon as an allocation (atomic or not)
>>goes through the "start reclaiming now" watermark.
>>
>>Now this is just looking at mainline code that has the kswapd_max_order,
>>and kswapd doesn't actually reclaim "at" any order -- it just uses the
>>kswapd_max_order to know when the required "stop reclaiming now" marks
>>have been hit. If lumpy reclaim is not reclaiming at the right order,
>>then it means it isn't refreshing from kswapd_max_order enough.
>
>
> Yes I believe all of this is working as designed. The problem is that
> we treat order-0 and order-1 allocations as independent. We do not take
> into account that we split order-1's to make order-0. We do not check
> the order-1 reserve for order 0 and so do not wake kswapd early enough.
> It is very hard given the interdependent nature of the current calculation
> to detect transitions at _other_ orders when we allocate at any specific
> order.
Breaking the watermark code then adding a ridiculous hack to pin the
reclaim order to the highest created kmem cache is the wrong way to
go about this.
There are a number of right ways to help with this problem you describe.
One would be to *raise* higher order watermarks. Another would be to
have some decaying check-this-order-watermark-on-alloc counter in the
zone.
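As a rough illustration of the second idea, here is one possible reading
of a decaying check-this-order counter, written as a userspace sketch.
Nothing here comes from an existing kernel tree: struct zone_hint,
order_to_check and CHECK_ORDER_TTL are hypothetical names, and the decay
window of 64 allocations is an arbitrary assumption.

```c
#include <assert.h>

#define CHECK_ORDER_TTL 64	/* hypothetical decay window, in allocations */

/* Hypothetical per-zone hint remembering a recently requested order. */
struct zone_hint {
	int check_order;	/* order whose watermark should also be checked */
	int ttl;		/* lower-order allocations left before decay */
};

/*
 * Returns the order whose watermark an allocation of 'order' should be
 * checked against. A higher-order request refreshes the hint; later
 * lower-order requests keep checking that order until the hint decays.
 */
static int order_to_check(struct zone_hint *h, int order)
{
	int checked;

	assert(order >= 0);
	if (order > 0 && order >= h->check_order) {
		h->check_order = order;
		h->ttl = CHECK_ORDER_TTL;
		return order;
	}

	checked = h->check_order > order ? h->check_order : order;
	if (h->ttl > 0 && --h->ttl == 0)
		h->check_order = 0;	/* hint has decayed */
	return checked;
}
```

In this sketch a single order-1 allocation makes the next 64 order-0
allocations check the order-1 watermark as well, after which order-0
allocations go back to checking only their own.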
All this higher order allocation stuff had better _really_ be worth it...
--
SUSE Labs, Novell Inc.