linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@skynet.ie>,
	Nicolas Mailhot <nicolas.mailhot@laposte.net>,
	Christoph Lameter <clameter@sgi.com>,
	akpm@linux-foundation.org,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] Have kswapd keep a minimum order free other than order-0
Date: Fri, 18 May 2007 12:25:00 +1000	[thread overview]
Message-ID: <464D0E7C.5050509@yahoo.com.au> (raw)
In-Reply-To: <464C48F1.3060903@shadowen.org>

Andy Whitcroft wrote:
> Nick Piggin wrote:

>>>order-0 alloc
>>>watermark hit => wake kswapd
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-3 alloc => kick kswap for order 3
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-3 alloc            kswapd reclaiming order 0
>>>order-3 alloc            kswapd reclaiming order 0
>>>order-3 alloc => highorder mark hit, fail
>>>
>>>kswapd will keep reclaiming at order-0 until it completes a reclaim cycle
>>>and spots the new order and start over again. So there is a potentially
>>>sizable window there where problems can hit. Right?
>>
>>Take a look at the code. wakeup_kswapd and __alloc_pages.
>>
>>First, assume the zone is above high watermarks for order-0 and order-1.
>>order-0 allocs...
>>order-1 low watermark hit => don't care, not allocing order-1
>>order-0 low watermark hit => wake kswapd reclaim order 0
>>order-1 alloc => wakeup_kswapd raises kswapd_max_order to 1
>>order-1 allocs continue to succeed until the min watermark is hit
>>order-1 *atomic* allocs continue until the atomic reserve is hit
>>order-1 memalloc allocs continue until no more order-1 pages left.
> 
> 
> This represents the ideal.  However we never consider the reserves at
> order-1 unless we get an order-1 allocation.  With lots of order-0
> allocations (the norm) we can run the order-1 availability well below
> even the atomic reserve without anyone noticing, while the total reserve
> is above the order-0 low watermark.

Yes, but my reply was addressing the misconception that kswapd never
has its reclaim-order updated while it is reclaiming for a lower order.

It is by design that we don't make order-0 allocations notice order-1
watermarks, so if there is some problem with that, then that is what
should be changed. Not randomly break the watermarking code.


>  Here kswapd has been idle as there
> is only order-0 activity and we have sufficient of those.  THEN an
> order-1 comes in, we are below the order-1 low watermarks, we wake
> kswapd, and retry and discover we are below the atomic threshold and
> _fail_ the allocation.

And that is by design because we don't want to have order-1 pages free
if there are only order-0 allocations.

Anyway, atomic allocations are able to fail gracefully, in which case
kswapd will be kicked for next time. Non-atomic allocations can enter
direct reclaim, so it isn't the end of the world.


>>There really is (or should be) a proper watermarking system in place that
>>provides the right buffering for higher order allocations.
> 
> 
> I think that this is should be, not is.

Well you also said earlier that our problems are due to higher order
watermarks being too aggressive. So I think what is needed is to
actually work out what the real problem is first.


>>>I believe it failed to work due to a combination of kswapd reclaiming at
>>>the wrong order for a while and the fact that the watermarks are pretty
>>>agressive when it comes to higher orders. I'm trying to think of
>>>alternative fixes but keep coming back to the current fix using
>>>!(alloc_flags & ALLOC_CPUSET) to allow !wait allocations to succeed if
>>>the memory is there and above min watermarks at order-0.
>>
>>kswapd reclaiming at the wrong order should be a bug. It should start
>>reclaiming at the right order as soon as an allocation (atomic or not)
>>goes through the "start reclaiming now" watermark.
>>
>>Now this is just looking at mainline code that has the kswapd_max_order,
>>and kswapd doesn't actually reclaim "at" any order -- it just uses the
>>kswapd_max_order to know when the required "stop reclaiming now" marks
>>have been hit. If lumpy reclaim is not reclaiming at the right order,
>>then it means it isn't refreshing from kswapd_max_order enough.
> 
> 
> Yes I believe all of this is working as designed.  The problem is that
> we treat order-0 and order-1 allocations as independant.  We do not take
> into account that we split order-1's to make order-0.  We do not check
> the order-1 reserve for order 0 and so wake kswapd early enough.  It is
> very hard given the interdependant nature if the current calculation to
> detect transitions at _other_ orders when we allocate at any specific order.

Breaking the watermark code then adding a ridiculous hack to pin the
reclaim order to the highest created kmem cache is the wrong way to
go about this.

There are a number of right ways to help with this problem you describe.
One would be to *raise* higher order watermarks. Another would be to
have some decaying check-this-order-watermark-on-alloc counter in the
zone.

All this higher order allocation stuff had better _really_ be worth it...

-- 
SUSE Labs, Novell Inc.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-05-18  2:25 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-14 17:32 [PATCH 0/2] Two patches to address bug report in relation to high-order atomic allocations Mel Gorman
2007-05-14 17:32 ` [PATCH 1/2] Have kswapd keep a minimum order free other than order-0 Mel Gorman
2007-05-14 18:01   ` Christoph Lameter
2007-05-14 18:13     ` Christoph Lameter
2007-05-14 18:24       ` Mel Gorman
2007-05-14 18:52         ` Christoph Lameter
2007-05-15  8:42         ` Nicolas Mailhot
2007-05-15  9:16           ` Mel Gorman
2007-05-16  8:25             ` Nick Piggin
2007-05-16  9:03               ` Mel Gorman
2007-05-16  9:10                 ` Nick Piggin
2007-05-16  9:45                   ` Mel Gorman
2007-05-16 12:28                     ` Nick Piggin
2007-05-16 13:50                       ` Mel Gorman
2007-05-16 14:04                         ` Nick Piggin
2007-05-16 15:32                           ` Mel Gorman
2007-05-16 15:44                             ` Nick Piggin
2007-05-16 16:46                               ` Mel Gorman
2007-05-17  7:09                                 ` Nick Piggin
2007-05-17 12:22                                   ` Andy Whitcroft
2007-05-18  2:25                                     ` Nick Piggin [this message]
2007-05-16 15:46                             ` Nick Piggin
2007-05-16 14:20                         ` Nick Piggin
2007-05-16 15:06                           ` Nicolas Mailhot
2007-05-16 15:33                             ` Mel Gorman
2007-05-15 17:09           ` Christoph Lameter
2007-05-15  4:39       ` Christoph Lameter
2007-05-14 18:19     ` Mel Gorman
2007-05-14 17:32 ` [PATCH 2/2] Only check absolute watermarks for ALLOC_HIGH and ALLOC_HARDER allocations Mel Gorman
2007-05-16 12:14   ` Nick Piggin
2007-05-16 13:24     ` Mel Gorman
2007-05-16 13:35       ` Nick Piggin
2007-05-16 14:00         ` Mel Gorman
2007-05-16 14:11           ` Nick Piggin
2007-05-16 18:28             ` Andy Whitcroft
2007-05-16 18:48               ` Mel Gorman
2007-05-16 19:00                 ` Christoph Lameter
2007-05-17  7:34               ` Nick Piggin
2007-05-14 18:13 ` [PATCH 0/2] Two patches to address bug report in relation to high-order atomic allocations Nicolas Mailhot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=464D0E7C.5050509@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=akpm@linux-foundation.org \
    --cc=apw@shadowen.org \
    --cc=clameter@sgi.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@skynet.ie \
    --cc=nicolas.mailhot@laposte.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox