From: Mel Gorman <mel@csn.ul.ie>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andy Whitcroft <apw@shadowen.org>, Andrew Morton <akpm@osdl.org>,
Nick Piggin <nickpiggin@yahoo.com.au>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Linux Memory Management List <linux-mm@kvack.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: Page allocator: Single Zone optimizations
Date: Fri, 3 Nov 2006 09:14:57 +0000 (GMT)
Message-ID: <Pine.LNX.4.64.0611030900480.9787@skynet.skynet.ie>
In-Reply-To: <Pine.LNX.4.64.0611021442210.10447@schroedinger.engr.sgi.com>
On Thu, 2 Nov 2006, Christoph Lameter wrote:
> On Thu, 2 Nov 2006, Mel Gorman wrote:
>
>>> Reclaim is a way of
>>> evicting pages from memory to avoid the move. This may be useful if memory
>>> is filled up because defragging can then do what swapping would have to
>>> do. However, evicting pages means that they have to be reread. Page
>>> migration can migrate pages at 1GB/sec which is certainly much higher
>>> than having to reread the page.
>
>> The reason why anti-frag currently reclaims is because reclaiming was easy and
>> happens under memory pressure not because I thought pageout was free. As a
>> proof-of-concept, I needed to show that pages clustered on reclaimability
>> would free contiguous blocks of pages later. There was no point starting with
>> defragmentation when I knew that unmovable pages would be with movable pages
>> in the same MAX_ORDER_NR_PAGES block.
>
> Could you go to defrag with what we have discussed now?
>
The defrag code would have to be developed first. So no, I can't go with
defrag "now"; it doesn't exist yet.

>>> 1. An mlocked page. This is a page that is movable but not reclaimable.
>>> How does defrag handle that case right now? It should really move the
>>> page if necessary.
>>>
>>
>> Defrag doesn't exist right now. If anti-frag got some traction, working on
>> using page migration to handle movable-but-not-reclaimable pages would be the
>> next step. Pages that are mlocked() will have been allocated with
>> __GFP_EASYRCLM so will be clustered together with other movable pages.
>
> But mlocked pages are not reclaimable.
>
I didn't say they were. I would mark them __GFP_EASYRCLM *when* defrag was
developed.
>>> 2. There are a number of unreclaimable page types that are easily movable.
>>> F.e. page table pages are movable if you take a write-lock on mmap_sem
>>> and handle the tree carefully. These pages again are not reclaimable but
>>> they are movable.
>>>
>>
>> Page tables are currently not allocated with __GFP_EASYRCLM because I knew I
>> couldn't reclaim them without killing processes. However, if page migration
>> within ranges was implemented, we'd start clustering based on movability
>> instead of reclaimability.
>
> There would have to be a separate function to move page table pages since
> they cannot be handled like regular pages. We would need some way of
> id'ing the mm struct the page belongs to in order to get to the top of
> the tree and to mmap_sem.
>
I know. This sort of thing would have to be written into page
migration before defrag for high-order allocations was developed. Even
then, defrag needs to sit on top of something like anti-frag to get the
clustering of movable pages.

>>> Various caching objects in the slab (cpucache align cache etc) are also
>>> easily movable. If we put them into a separate slab cache then we could
>>> make them movable.
>> As subsystems will have pointers to objects within the slab, I doubt they are
>> easily movable but I'll take your word on it for the moment.
>
> The slab already has these pointers in the page struct. They are needed to
> id the slab on kfree(). We already reallocate all caches when we tune the
> cpucaches. So there is not much new for the slab cache objects.
>
It wasn't the pointers in the struct page I was concerned about. It was
pointers found by void *someptr = kmem_cache_alloc(...). But if they can
be cleaned up, then sure, they are movable.
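
To make the concern concrete, here is a minimal userspace sketch, not
slab code. It assumes the simplest case, where exactly one owner holds
the raw pointer and the mover knows where that pointer lives. Every
name in it (sketch_obj, sketch_owner, sketch_move) is hypothetical, and
malloc()/free() stand in for the cache allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that would live in a slab cache. */
struct sketch_obj {
	int payload;
};

/*
 * Hypothetical subsystem holding a raw pointer to the object, as it
 * would after something like: void *someptr = kmem_cache_alloc(...);
 */
struct sketch_owner {
	struct sketch_obj *ptr;
};

/*
 * Move the object to a new location and fix up the one reference we
 * know about. Returns the new address, or NULL if no memory was
 * available, in which case nothing is changed.
 *
 * Assumption: owner->ptr is the only pointer to the object. Any other
 * copy of the old address would be left dangling, which is why the
 * object is only movable if all such pointers can be found and updated.
 */
static struct sketch_obj *sketch_move(struct sketch_owner *owner)
{
	struct sketch_obj *old = owner->ptr;
	struct sketch_obj *dst = malloc(sizeof(*dst));

	if (!dst)
		return NULL;

	memcpy(dst, old, sizeof(*dst));	/* copy contents to the new home */
	owner->ptr = dst;		/* update the known reference */
	free(old);			/* release the old location */
	return dst;
}
```

The move itself is trivial; the hard part is knowing every place the
old address was stored.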
>>> I would suggest to not categorize pages according to their reclaimability
>>> but according to their movability.
>>
>> ok, I see your point. However, reclaimability seems a reasonable starting
>> point. If I know pages of similar reclaimability are clustered together, I can
>> work on using page migration to move pages out of the blocks of known
>> reclaimability instead of paging them out. When that works, the __GFP_ flags
>> identifying reclaimability can be renamed to marking movability and flag page
>> table pages as well. This is a logical progression.
>
> I'd rather go direct to defrag instead of creating churn with
> fragmentation avoidance.
>
Even if I had defrag right now, we'd be looking to cluster pages by
movability, which would end up looking almost identical to the anti-frag
patches except that references to RECLAIM would look like MOVABLE.
This intermediate step would still exist but I'd like to start getting
data on its effectiveness now to help shape the development of defrag.

>> Agreed, but swapping them out was an easier starting point.
>
> I think this work is very valuable and the acceptance issues have probably
> dominated the design of the patch so far. But I sure wish we would now go
> to the full thing instead of an intermediate step that we then will have
> to undo later.
>
We'd be renaming a few defines, hardly a major undo.

> An intermediate step that would make sense is to start
> marking pages as unmovable and then reclaim movable pages. Then we can add
> more and more logic to make more pages movable on top. With marking
> pages for reclaim we won't get there.
>
Ok, I can make that renaming change now. The renaming will look like:

Movable - These are userspace pages that are easily moved. This
	flag is set when it is known that the pages will be trivially
	moved by using page migration or, if under significant
	memory pressure, by writing the page out to swap or syncing with
	backing storage.
	These allocations are marked with __GFP_MOVABLE.

Reclaimable - These are kernel allocations for caches that are
	reclaimable or allocations that are known to be very short-lived.
	These allocations are marked with __GFP_RECLAIMABLE.

Non-Movable - These are pages allocated by the kernel that
	are not trivially reclaimed. For example, the memory allocated for a
	loaded module would be in this category. By default, allocations are
	considered to be of this type.
	These are allocations that are not marked otherwise.

So, right now, page tables would not be marked __GFP_MOVABLE, but they
would be later when defrag was developed. Would that be any better?
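
As a rough illustration of the three categories described above, the
sketch below maps allocation flags to a category. It is not taken from
any patch. The flag bit values, the SKETCH_/sketch_ names and the rule
that the movable flag wins when both flags are set are all assumptions
made for the example.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only. */
#define SKETCH_GFP_MOVABLE	0x1u
#define SKETCH_GFP_RECLAIMABLE	0x2u

/* The three proposed categories. Non-movable is the default. */
enum sketch_mobility {
	SKETCH_NON_MOVABLE = 0,
	SKETCH_RECLAIMABLE,
	SKETCH_MOVABLE,
};

/*
 * Classify an allocation by its flags. An allocation that carries
 * neither flag falls through to the non-movable default, matching
 * "allocations that are not marked otherwise".
 *
 * Assumption: if both flags are set, movable takes precedence.
 */
static enum sketch_mobility sketch_classify(unsigned int gfp_flags)
{
	if (gfp_flags & SKETCH_GFP_MOVABLE)
		return SKETCH_MOVABLE;
	if (gfp_flags & SKETCH_GFP_RECLAIMABLE)
		return SKETCH_RECLAIMABLE;
	return SKETCH_NON_MOVABLE;
}
```

With this shape, marking page tables as movable later would mean
changing the flags at the allocation site and nothing in the classifier.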
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .