linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Phillips <phillips@arcor.de>
To: Mel Gorman <mel@csn.ul.ie>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC] My research agenda for 2.7
Date: Fri, 27 Jun 2003 02:30:23 +0200	[thread overview]
Message-ID: <200306270222.27727.phillips@arcor.de> (raw)
In-Reply-To: <Pine.LNX.4.53.0306262030500.5910@skynet>

On Thursday 26 June 2003 22:01, Mel Gorman wrote:
> I think that finding pages like this together is unlikely, especially if
> the system has been running a long time. In the worst case you will have
> every easily-moved page adjactent to a near-impossible-to-move page.

Mel,

I addressed that in my previous post: "Most slab pages are hard to move ... we 
could just tell slab to use its own biggish chunks of memory, which it can 
play in as it sees fit".  Another way of putting this is, we isolate the type 
of data that's hard to defrag in its own space, so it can't fragment our 
precious movable space.  That separate space can expand and contract 
chunkwise.

To be sure, this solution means we still can't forcibly create higher order 
units in those undefraggable regions, but that is no worse than the current 
situation.  In the defraggable regions, hopefully the majority of memory, we 
have a much improved situation: we can forcibly create as many high order 
allocation units as we need, in order to handle, e.g., an Ext2/3 filesystem 
that's running with a large blocksize.

> Moving slab pages is probably not an option unless some quick way of
> updating all the pointers to the objects is found.

That was the part in my original post about a "new API".  If we choose to do 
so, we could fix up most kinds of slab objects to support some protocol to 
assist in the effort, or use a data structure better suited to relocation.  I 
doubt it's necessary to go that far.  As far as I can see, slab is not a big 
offender in terms of fragmentation at the moment.

> I also wonder if moving kernel pages is really worth the hassle.

That's the question of course.  The benefit is getting rid of high order 
allocation failures, and gaining some confidence that larger filesystem 
blocksizes will work reliably, however the workload evolves.  The cost is 
some additional complexity, no argument about that.  On the other hand, it 
would no longer be necessary to use voodoo to try to get the allocator to 
magically not fragment.

> It's perfectly possible that defragging only user pages will be enough.

Indeed.  That's an excellent place to start.  We'd include page cache pages in 
that, I'd expect.  Then it would be attractive to go on to handle page table 
pages, and with that we'd have the low-hanging fruit.

> The alternative sounds like it would be scanning mem_map looking for pages
> that can be moved which sounds expensive.

It looks linear to me, and it's not supposed to happen often.  It would 
typically be a response to a nasty situation like you'd get by mounting a 
volume that uses 32K blocks after running a long time with mostly 4K blocks.

> > Without updating pointers, active defragmentation is not possible.  But
> > perhaps you meant to say that active defragmentation is not needed?
>
> I am saying that full defragmentation would be a very expensive operation
> which might not work if it cannot find suitably freeable adjacent pages.

Full defragmentation, whatever that is, would be pointless for the 
applications I've described, though perhaps it could be needed by some 
bizarre application like software hotplug memory.  In the latter case you 
wouldn't be too worried about how long it takes.

In general though, the defragger would only run long enough to restore balance  
to the free counts across the various orders.  In rare cases, it would have 
to run synchronously in order to prevent an allocation from failing.

> If order0 pages were in slab, the whole searching problem becomes trivial
> (just go to the relevant cache and scan the slabs).

You might want to write a separate [rfc] to describe your idea.  For one 
thing, I don't see why you'd want to use slab for that.

Regards,

Daniel
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

  parent reply	other threads:[~2003-06-27  0:30 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-06-24 23:11 Daniel Phillips
2003-06-25  0:47 ` William Lee Irwin III
2003-06-25  1:07   ` Daniel Phillips
2003-06-25  1:10     ` William Lee Irwin III
2003-06-25  1:25       ` Daniel Phillips
2003-06-25  1:30         ` William Lee Irwin III
2003-06-25  9:29 ` Mel Gorman
2003-06-26 19:00   ` Daniel Phillips
2003-06-26 20:01     ` Mel Gorman
2003-06-26 20:10       ` Andrew Morton
2003-06-27  0:30       ` Daniel Phillips [this message]
2003-06-27 13:00         ` Mel Gorman
2003-06-27 14:38           ` Martin J. Bligh
2003-06-27 14:41           ` Daniel Phillips
2003-06-27 14:43           ` Martin J. Bligh
2003-06-27 14:54             ` Daniel Phillips
2003-06-27 15:04               ` Martin J. Bligh
2003-06-27 15:17                 ` Daniel Phillips
2003-06-27 15:22                   ` Mel Gorman
2003-06-27 15:50                     ` Daniel Phillips
2003-06-27 16:00                       ` Daniel Phillips
2003-06-29 19:25                         ` Mel Gorman
2003-06-28 21:06                           ` Daniel Phillips
2003-06-29 21:26                             ` Mel Gorman
2003-06-28 21:54                               ` Daniel Phillips
2003-06-29 22:07                                 ` William Lee Irwin III
2003-06-28 23:18                                   ` Daniel Phillips
2003-07-02 21:10           ` Mike Fedyk
2003-07-03  2:04             ` Larry McVoy
2003-07-03  2:20               ` William Lee Irwin III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200306270222.27727.phillips@arcor.de \
    --to=phillips@arcor.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox