From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <464AF8DB.9030000@yahoo.com.au>
Date: Wed, 16 May 2007 22:28:11 +1000
From: Nick Piggin
MIME-Version: 1.0
Subject: Re: [PATCH 1/2] Have kswapd keep a minimum order free other than order-0
References: <20070514173218.6787.56089.sendpatchset@skynet.skynet.ie>
 <20070514173238.6787.57003.sendpatchset@skynet.skynet.ie>
 <20070514182456.GA9006@skynet.ie>
 <1179218576.25205.1.camel@rousalka.dyndns.org>
 <464AC00E.10704@yahoo.com.au>
 <464ACA68.2040707@yahoo.com.au>
In-Reply-To:
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Mel Gorman
Cc: Nicolas Mailhot, Christoph Lameter, Andy Whitcroft, akpm@linux-foundation.org, Linux Memory Management List
List-ID:

Mel Gorman wrote:
> On Wed, 16 May 2007, Nick Piggin wrote:
>
>> Mel Gorman wrote:
>>
>>> On Wed, 16 May 2007, Nick Piggin wrote:
>>
>>
>>>> Hmm, so we require higher order pages be kept free even if nothing is
>>>> using them? That's not very nice :(
>>>>
>>>
>>> Not quite. We are already required to keep a minimum number of pages
>>> free even though nothing is using them. The difference is that if it
>>> is known high-order allocations are frequently required, the freed
>>> pages will be contiguous. If no one calls raise_kswapd_order(),
>>> kswapd will continue reclaiming at order-0.
>>
>>
>> And after they are stopped being used, it falls back to order-0?
>
>
> No, raise_kswapd_order() is used when it is known there are many
> high-order allocations of a particular value. It becomes the minimum
> value kswapd reclaims at. SLUB does not *require* high order allocations
> but can be configured to use them so it makes sense to keep
> min_free_kbytes at that order to reduce stalls due to direct reclaim.

The point is you still might not have anything performing those
allocations from those higher order caches.
Or you might have things that are doing higher order allocations, but
not via slab. Basically this is dumbing down the existing higher order
watermarking already there in favour of a worse special case AFAIKS.

>> Why
>> can't this use the infrastructure that is already in place for that?
>>
>
> The infrastructure there currently deals nicely with the situation where
> there are rarely allocations of a high order. This change is for when it
> is known there are frequent high-order (e.g. orders 1-4) allocations.
> While the callers often can direct reclaim, kswapd should help them
> avoid stalls because reducing stalls is one of it's functions. With this
> patch, kswapd still reclaims the same number of pages, just tries to
> reclaim contiguous ones.

kswapd already does reclaim on behalf of non-sleeping higher order
allocations (or at least it does in mainline).

>>> Arguably, e1000 should also be calling raise_kswapd_order() when it
>>> is using jumbo frames.
>>
>>
>> It should be able to handle higher order page allocation failures
>> gracefully.
>
>
> Has something changed recently that it can handle failures? It might
> have because it has been hinted that it's possible, just not very fast.

I don't know, but it is stupid if it can't. It should not be too hard
to keep it fast where it is fast today, and have it at least work where
it would otherwise fail... just by reserving some memory pages in case
none can be allocated.

>> kswapd will be notified of the attempts and go on and try
>> to free up some higher order pages for it for next time. What is wrong
>> with this process?
>
>
> It's reactive, it only occurs when a process has already entered direct
> reclaim.

No it should not be. It should be proactive even for higher order
allocations. All this stuff used to work properly :(

>> Are the higher order watermarks insufficient?
>>
>
> The high-order watermarks are still used to make a process that can
> sleep enter direct reclaim when the higher order watermarks are not
> being met.
>
>> (I would also add that non-arguably, e1000 should also be able to do
>> scatter gather with jumbo frames too.)
>>
>
> That's another football that has done the laps.

I think the hardware can do it.

-- 
SUSE Labs, Novell Inc.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org