Re: [RFC] 0/4 Migration Cache Overview


From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: Christoph Lameter <clameter@engr.sgi.com>
Cc: linux-mm <linux-mm@kvack.org>,
	Christoph Lameter <clameter@sgi.com>,
	Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Subject: Re: [RFC] 0/4 Migration Cache Overview
Date: Fri, 17 Feb 2006 11:59:58 -0500
Message-ID: <1140195598.5219.77.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.64.0602170816530.30999@schroedinger.engr.sgi.com>

On Fri, 2006-02-17 at 08:22 -0800, Christoph Lameter wrote:
> On Fri, 17 Feb 2006, Lee Schermerhorn wrote:
> 
> > Marcelo Tosatti introduced the migration cache back in Oct04 to obviate use
> > of swap space for anon pages during page migration.  He posted the original
> > migration patch [let's call this V0] to the linux-mm list:
> 
> Could you add a justification of this feature? What is the benefit of having a 
> migration cache instead of using swap pte (current migration is not really 
> using swap space per se)?

I think Marcelo covered that in his original posts, which I linked.
I can go back and extract his arguments.  My primary interest is in
"lazy page migration", where anon pages can hang around in the cache
until the task faults them in [possibly migrating them] or exits, if ever.
I think the desire to avoid using swap for this case is even stronger.

> 
> > migration work has been submitted upstream, I have ported the migration
> > cache patches to work with his direct migration in 2.6.16-rc3-mm1. I'm
> > calling this "V8".
> 
> Direct migration is in Linus' tree and I am not aware of anything 
> necessary in mm.

Correct.  But I figured that if any testing were going to be done, it
would be against the mm tree, so I diffed against that.  I don't know how
much the mm tree differs from the corresponding 16-rc? tree in the areas
touched by these patches.

> 
> > One complication in all of this is when direct migration of an anon
> > page falls back to swapping out the pages.  If the page had not already
> > been in the swap cache, it will have been added to the migration 
> > cache.  To swap the page out, we need to move it from the migration
> > cache to the swap cache.  Note that this would also be required if
> > shrink_list() encounters a page in the migration cache.  Both the
> > page migration code and shrink_list() have been modified to call
> > a new function "migration_move_to_swap()" in these cases.  Marcelo
> > mentions the need to do this in his first migration cache post linked
> > above.
> 
> We could potentially remove the ability to fall back to swap or add an 
> option to disallow the fallback. This is also necessary if we want to 
> migrate mlocked memory. Maybe this could simplify the code?

Possibly.  I think we'd still want to be able to do this for vmscan.  
Again, because anon pages could languish in the migration cache
indefinitely.

> 
> > QUESTION:  what does this mean for tasks that fault on the 
> > migration cache pte while we're moving the page to the swap
> > cache?  I think that if they manage to look up the page in the
> > migration cache and get a reference on it, the current test
> > in do_swap_page() will work OK.  However, is there a potential
> > race between the time __handle_mm_fault() fetches the pte from
> > the page table and when do_swap_page() does the cache lookup?
> > [in a preemptible kernel?]
> 
> Yes there is since handle_mm_fault accesses the pte without locking.
> do_swap_page acquires the lock and will then recheck if the pte is the 
> same. If anything happened in between it should redo the page fault.
> 

I thought so, but hadn't thought of an efficient check for the fault
handlers.  I'm thinking that if the fault handler doesn't find the
page in the cache and the page's private data doesn't match the pte
[after appropriate conversion], the handler could return -EAGAIN,
causing handle_mm_fault to refetch the pte.  I guess if the handler
doesn't find the page in the cache, this is the slow path anyway,
so maybe efficiency isn't such a concern.  It will require conversion
of the pte to swp_entry or vice versa, right?
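
To make the idea concrete, here is a rough userspace sketch of the
recheck-and-retry check described above.  It is a toy model, not kernel
code: every name in it (toy_pte_t, toy_swp_entry_t, toy_lookup(),
toy_fault(), the bit layout of the entry) is hypothetical and made up for
illustration, and a pthread mutex stands in for the page table lock.  As a
simplifying assumption, it treats a cache miss or a mismatch between the
page's private data and the converted pte as a reason to return -EAGAIN.

```c
/* Toy userspace model of "recheck the pte under the lock, else retry".
 * All types and functions here are hypothetical, not kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t val; } toy_pte_t;       /* non-present pte */
typedef struct { uint64_t val; } toy_swp_entry_t; /* cache type + offset */

#define TOY_TYPE_SHIFT 58
enum { TOY_TYPE_SWAP = 0, TOY_TYPE_MIGRATION = 1 };

/* Pack a cache type and an offset (< 2^58) into one entry. */
static toy_swp_entry_t toy_swp_entry(unsigned type, uint64_t offset)
{
    toy_swp_entry_t e = { ((uint64_t)type << TOY_TYPE_SHIFT) | offset };
    return e;
}

/* The "appropriate conversion": bit 0 of the pte is the present bit. */
static toy_swp_entry_t toy_pte_to_entry(toy_pte_t pte)
{
    toy_swp_entry_t e = { pte.val >> 1 };
    return e;
}

static toy_pte_t toy_entry_to_pte(toy_swp_entry_t e)
{
    toy_pte_t pte = { e.val << 1 };
    return pte;
}

struct toy_page {
    toy_swp_entry_t private_entry; /* which cache slot the page occupies */
    int in_cache;
};

/* Linear search standing in for the swap/migration cache lookup. */
static struct toy_page *toy_lookup(struct toy_page *pages, size_t n,
                                   toy_swp_entry_t e)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].in_cache && pages[i].private_entry.val == e.val)
            return &pages[i];
    return NULL;
}

/* orig_pte was read without the lock.  Returns 0 if the fault can be
 * completed, or -EAGAIN if the caller should refetch the pte and retry. */
static int toy_fault(toy_pte_t *ptep, pthread_mutex_t *ptl,
                     toy_pte_t orig_pte, struct toy_page *pages, size_t n)
{
    toy_swp_entry_t entry = toy_pte_to_entry(orig_pte);
    struct toy_page *page = toy_lookup(pages, n, entry); /* unlocked */
    int ret = 0;

    pthread_mutex_lock(ptl);
    if (ptep->val != orig_pte.val)
        ret = -EAGAIN;  /* pte changed after the unlocked read */
    else if (page == NULL || page->private_entry.val != entry.val)
        ret = -EAGAIN;  /* page moved to another cache meanwhile */
    pthread_mutex_unlock(ptl);
    return ret;
}
```

A caller would read the pte, call toy_fault(), and on -EAGAIN simply read
the pte again and repeat.  If the page was moved from the migration cache
to the swap cache in between, the second pass sees the new pte and finds
the page under its new entry.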

