From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Mel Gorman <mel@skynet.ie>
Cc: Nicolas Mailhot <nicolas.mailhot@laposte.net>,
	Christoph Lameter <clameter@sgi.com>,
	Andy Whitcroft <apw@shadowen.org>,
	akpm@linux-foundation.org,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] Have kswapd keep a minimum order free other than order-0
Date: Thu, 17 May 2007 01:44:40 +1000
Message-ID: <464B26E8.3060404@yahoo.com.au>
In-Reply-To: <20070516153215.GB10225@skynet.ie>

Mel Gorman wrote:
> On (17/05/07 00:04), Nick Piggin didst pronounce:
> 
>>Mel Gorman wrote:

>>>I guess we should only set this for non kmalloc caches then. 
>>>So move the call into kmem_cache_create? Would make the min order 3 on
>>>most of my mm machines.
>>>===
>>
>>You do not *know* if the slab is going to be allocated from. Or maybe it
>>is a few times at bootup, or once every 10 minutes.
>>
> 
> 
> So is your primary issue that raise_kswapd_order() is called at the time a
> cache is opened for use, when it should instead be more selective?
> 
> 
>>>The second part of what you say is that there could be a non-slab user of
>>>high order allocs. That is true and expected. In that case, the existing
>>>mechanism informs kswapd of the higher order as it does today so it can
>>>reclaim at the higher order for a bit and enter direct reclaim if 
>>>necessary.
>>
>>You seem to have broken the existing mechanism though.
>>
> 
> 
> How is it broken exactly? What has changed in this patch is that there
> may be a minimum order that kswapd reclaims at. The same minimum number
> of pages are kept free.

I mean with patch 2.


> If the watermarks were totally ignored with the second patch, I would
> understand, but they are still obeyed. Even if it is an ALLOC_HIGH or
> ALLOC_HARDER allocation, the watermarks are obeyed for order-0 so memory
> does not get exhausted, as that could cause a host of problems. The
> difference is that if this is a HIGH or HARDER allocation and the memory can
> be granted without going below the order-0 watermarks, it'll succeed. Would
> it be better if the lack of ALLOC_CPUSET was used to determine when only
> order-0 watermarks should be obeyed?

But I don't know why you want to disobey higher order watermarks in the
first place. *Those* are exactly the things that are going to be helpful
to fix this problem of atomic higher order allocations failing or
non-atomic ones going into direct reclaim.


>>>It's not being replaced. That existing watermarking is still used. If it
>>>was being replaced, the for loop in zone_watermark_ok() would have been
>>>taken out.
>>
>>Patch 2 sure doesn't make it any better.
>>
> 
> 
> The second patch is simply saying "If you can satisfy the allocation without
> going below the watermarks for order-0, then do it". Again, if it used
> !(alloc_flags & ALLOC_CPUSET), would you be happier?

No ;)


>>>My point is that when it does, a caller is still likely to enter direct
>>>reclaim and kswapd can help prevent stalls if it pre-emptively reclaims at
>>>an order known to be commonly used when free pages are below the watermarks.
>>
>>So we should increase the watermarks, and keep the existing, working
>>code there and it will work for everyone, not just for slab, and it
>>will not keep higher orders free if they are not needed.
>>
> 
> 
> Raising watermarks is no guarantee that a high-order allocation that can sleep
> will occur at the right time to kick kswapd awake and that it'll get back from
> whatever it's doing in time to spot the new order and start reclaiming again.

You don't *need* a higher order allocation that can sleep in order
to kick kswapd. Crikey, I keep saying this.


>>>Well, if it could, order:3 allocation failure reports wouldn't occur
>>>periodically.
>>
>>They are reports of failures, not failure to handle the failures.
>>
> 
> 
> If the failures were being handled correctly, why would it be logging at
> all? They would have set __GFP_NOWARN and recovered silently.

Lots of places don't set __GFP_NOWARN but handle failures. Generally
you want to keep the warning even for atomic allocations if it is
a reasonably small order (0 or 1 or even 2).

The failures I have seen are not "networking stops working". They are
"e1000 gives page allocation failures", and the replies have always
been "that's not unexpected". Have you seen *any* of the former type?


>>>It already reserves and still occasionally hits the problem.
>>
>>e1000 reserves pages? It would have to use them in a manner that guaranteed
>>timely return to the reserve pool like mempools. If it did that then it
>>would not have a problem.
>>
> 
> 
> When I last looked, they kept a series of buffers in a ring buffer. My
> understanding at the time was that this buffer regularly gets depleted
> and refilled.

But refilled via the allocator, right? One which does not revert to a
private stash if it cannot get a page.
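To make the "private stash" distinction concrete, here is a minimal userspace sketch of a mempool-style reserve. It is an illustrative model only, not e1000 or kernel code: `struct reserve_pool`, `pool_init()`, `pool_alloc()`, `pool_free()`, `RESERVE_SLOTS` and the `alloc_fn` hook are all hypothetical names, and `malloc()` stands in for the page allocator. The point is the second step in `pool_alloc()`, which a refill path that only calls the allocator does not have, and the refill-first behaviour of `pool_free()`, which is what returns buffers to the reserve in a timely way.

```c
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_SLOTS 4	/* hypothetical reserve size */

/* Hypothetical private stash of preallocated buffers. */
struct reserve_pool {
	void *slot[RESERVE_SLOTS];
	int nr;			/* buffers currently held in reserve */
	size_t buf_size;
};

/* Allocator hook, so that an allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t);

/* Fill the reserve up front, while allocations are still expected to work. */
static int pool_init(struct reserve_pool *p, size_t buf_size)
{
	p->nr = 0;
	p->buf_size = buf_size;
	while (p->nr < RESERVE_SLOTS) {
		void *b = malloc(buf_size);
		if (!b) {
			while (p->nr > 0)
				free(p->slot[--p->nr]);
			return -1;
		}
		p->slot[p->nr++] = b;
	}
	return 0;
}

/* Try the normal allocator first; revert to the stash only on failure. */
static void *pool_alloc(struct reserve_pool *p, alloc_fn try_alloc)
{
	void *b = try_alloc(p->buf_size);

	if (b)
		return b;
	if (p->nr > 0)
		return p->slot[--p->nr];
	return NULL;
}

/* Refill the stash before handing memory back to the allocator. */
static void pool_free(struct reserve_pool *p, void *b)
{
	if (p->nr < RESERVE_SLOTS)
		p->slot[p->nr++] = b;
	else
		free(b);
}

/* Stand-in for an allocator that is currently failing. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

With `failing_alloc` passed in, `pool_alloc()` keeps succeeding until the `RESERVE_SLOTS` stashed buffers are handed out, and each `pool_free()` tops the stash back up before anything is released.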


>>>>All this stuff used to work properly :(
>>>>
>>>
>>>
>>>It only came to light recently that there might be issues.
>>
>>I mean kswapd asynchronously freeing higher order pages proactively. We
>>should get that working again first.
>>
> 
> 
> What do you suggest then?

Working out why it apparently isn't working, first. Then maybe look at
raising watermarks (they get reduced fairly rapidly as the order increases,
so it might just be that there is not enough at order-3).
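As a rough illustration of how quickly the requirement shrinks with order, here is a simplified userspace model of the per-order loop in zone_watermark_ok(). It is a sketch under assumptions, not the kernel function: `struct zone_model`, `total_free_pages()` and `watermark_ok_model()` are hypothetical names, and the ALLOC_HIGH/ALLOC_HARDER adjustments and the lowmem reserve are left out. At each step the pages held in blocks of the lower order are discounted and the minimum is halved.

```c
#include <stdbool.h>

#define MAX_ORDER 11

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long nr_free[MAX_ORDER];	/* free blocks at each order */
};

/* Total free pages: a block of order o contributes 2^o pages. */
static long total_free_pages(const struct zone_model *z)
{
	long pages = 0;

	for (int o = 0; o < MAX_ORDER; o++)
		pages += (long)(z->nr_free[o] << o);
	return pages;
}

/*
 * Model of the watermark check for an allocation of the given order.
 * 'mark' is the order-0 watermark in pages.
 */
static bool watermark_ok_model(const struct zone_model *z, int order, long mark)
{
	long min = mark;
	long free_pages = total_free_pages(z) - (1L << order) + 1;

	if (free_pages <= min)
		return false;

	for (int o = 0; o < order; o++) {
		/* Pages in order-o blocks cannot satisfy a larger request. */
		free_pages -= (long)(z->nr_free[o] << o);

		/* Require fewer free pages at each higher order. */
		min >>= 1;

		if (free_pages <= min)
			return false;
	}
	return true;
}
```

In this model, with a watermark of 128 pages the final check for an order-3 request compares against only 128 >> 3 = 16 pages, which is two order-3 blocks' worth.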

-- 
SUSE Labs, Novell Inc.
