Re: [NUMA] Fix memory policy refcounting

From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>,
	linux-mm@kvack.org, Eric Whitney <eric.whitney@hp.com>,
	David Rientjes <rientjes@google.com>, Paul Jackson <pj@sgi.com>
Subject: Re: [NUMA] Fix memory policy refcounting
Date: Tue, 06 Nov 2007 15:08:11 -0500
Message-ID: <1194379691.5317.101.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0711061139230.30127@schroedinger.engr.sgi.com>

On Tue, 2007-11-06 at 11:43 -0800, Christoph Lameter wrote:
> On Tue, 6 Nov 2007, Lee Schermerhorn wrote:
> 
> > We always seem to rathole on that subject.  I just hoped to head that
> > off...
> 
> Well fix this and the rathole will be gone.

I'll hold you to that! :-)

> 
> > > What do you mean by in use? If a vma can potentially use a shared policy 
> > > in a rbtree then it is in use right?
> > 
> > Not really--not for shared policies.  Again, another task is allowed to
> > remove or replace the shared policies at any time, regardless of the
> > number of tasks attached to the segment.  We can't differentiate
> > between simple attachment and current use.  We need the lookup-time
> > ref/unref to know that the policy is actually in use.  We can still
> > replace it in the tree while it's "in use".  This will remove the tree's
> > reference on the policy, but the policy won't be freed until the task
> > holding the extra ref drops it.  
> 
> Still unclear as to why we need lookup time ref/unref. A task can replace 
> the shared policy at any time; you just need to update the refcounts. If 
> you have a pointer to the policy in the vma then it's possible to do so.

A pointer in the vma won't work.  Different tasks could apply policies
on different ranges and shared policy semantics dictate that all tasks
see the same policy for a particular offset in the region--modulo
set/get races.  The only way we could keep a pointer in the vma would be
to split the vmas in every task that has the shared region attached
whenever any task changes the policy of a range of the region, so that
all tasks have the same set of vmas all pointing to the same set of
policies in the tree.  I don't think we can be changing other tasks'
address space externally like this.  And it still wouldn't work, I
think, for shared policy semantics--again, except maybe with some sort
of rcu mechanism.  More below on what constitutes actual "use".
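
To make the offset-keyed semantics concrete, here is a rough userspace
sketch, not kernel code.  All names in it (shared_region, task_vma,
region_set, task_policy_at) are hypothetical, addresses and offsets are
in pages for brevity, and a small array stands in for the rbtree.  It
assumes the most recently set range covering an offset wins.  The point
is only that the policy is found from the offset into the shared region,
so each task's own vma layout does not matter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one shared policy range: [start, end) in pages. */
struct range_policy {
    unsigned long start, end;
    int mode;
};

/* Hypothetical shared region; an array replaces the kernel's rbtree. */
struct shared_region {
    struct range_policy ranges[8];
    size_t nr;
    int default_mode;   /* used when no range covers the offset */
};

/* Hypothetical per-task mapping: where the region is attached in that task. */
struct task_vma {
    unsigned long vm_start; /* task-local start address, in pages */
    unsigned long pgoff;    /* region offset mapped at vm_start, in pages */
};

/* Record a policy for [start, end).  Returns 0 on success, -1 if full. */
static int region_set(struct shared_region *r, unsigned long start,
                      unsigned long end, int mode)
{
    if (r->nr == sizeof(r->ranges) / sizeof(r->ranges[0]))
        return -1;
    r->ranges[r->nr].start = start;
    r->ranges[r->nr].end = end;
    r->ranges[r->nr].mode = mode;
    r->nr++;
    return 0;
}

/* Look up by region offset; scan newest first so the latest set wins. */
static int region_lookup(const struct shared_region *r, unsigned long off)
{
    for (size_t i = r->nr; i-- > 0; )
        if (off >= r->ranges[i].start && off < r->ranges[i].end)
            return r->ranges[i].mode;
    return r->default_mode;
}

/* Translate a task-local address to a region offset, then look it up. */
static int task_policy_at(const struct shared_region *r,
                          const struct task_vma *v, unsigned long addr)
{
    return region_lookup(r, v->pgoff + (addr - v->vm_start));
}
```

Two tasks that attach the region at different addresses get the same
answer for the same offset, without either task's vmas being split when
the other sets a policy.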

> 
> > I suppose we could stick any replaced mempolicy on a list associated
> > with the segment and keep them there until all tasks detach from the
> > shared segment.  Not too much of a memory leak, as long as a task
> 
> Well you have the refcount on the policy? Why keep the mempolicy around?

A non-zero ref count is what keeps the policy around.  It implies that
some structure has a pointer to the policy, or some task is actively
examining the policy and will drop the reference when finished with it.
[The latter is what's NOT happening now for shared policy.]  

> 
> > > AFAICT: If you take a reference on the shared policy for each 
> > > vma then you can tell from the references that the policy is in use.
> > 
> > See above.  A vma reference does not constitute use for a shared policy.
> 
> Why not? What does constitute "use" of a shared policy? A page that has 
> used the policy?

Currently, when you look up the policy [based on offset] in the rbtree
under spin_lock, the lookup function does an mpol_get() before dropping
the lock.  Now, you can use the policy to allocate a page or to report
via get_mempolicy(MPOL_F_ADDR) or show_numa_maps()/mpol_to_str().  When
you're finished with the policy, you mpol_free() to release the
reference.  While you're holding this ref, another task that has the
shared region attached can replace/delete the policy, removing it from
the rbtree and dropping the rbtree's reference via mpol_free().  Now,
the only reference to the policy is any reference held by a task that
has looked it up, but not yet mpol_free()ed it.  When the last task
holding such a reference releases it, we'll free it back to the kmem
cache.
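
The sequence above can be sketched in userspace C.  This is a
single-threaded illustration only, not the kernel implementation: struct
policy, struct shared_slot, policy_get(), policy_put(), slot_lookup()
and slot_replace() are hypothetical stand-ins for the mempolicy, the
rbtree node, mpol_get(), mpol_free(), and the lookup and replace paths.
The spin_lock is omitted and a plain int replaces the atomic count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory policy; not the kernel's struct. */
struct policy {
    int refcnt;   /* plain int here; the real count would need to be atomic */
    int mode;
    int *freed;   /* set to 1 when the last reference is dropped */
};

/* One slot standing in for the rbtree node covering some offset range. */
struct shared_slot {
    struct policy *pol;
};

static struct policy *policy_new(int mode, int *freed)
{
    struct policy *p = malloc(sizeof(*p));
    if (!p)
        abort();
    p->refcnt = 1;  /* initial reference, to be owned by the slot */
    p->mode = mode;
    p->freed = freed;
    return p;
}

static void policy_get(struct policy *p)
{
    p->refcnt++;
}

static void policy_put(struct policy *p)
{
    if (--p->refcnt == 0) {
        *p->freed = 1;
        free(p);
    }
}

/* Lookup takes an extra reference before the (omitted) lock is dropped. */
static struct policy *slot_lookup(struct shared_slot *s)
{
    struct policy *p = s->pol;
    if (p)
        policy_get(p);
    return p;
}

/* Replace installs the new policy and drops the slot's ref on the old one. */
static void slot_replace(struct shared_slot *s, struct policy *newpol)
{
    struct policy *old = s->pol;
    s->pol = newpol;
    if (old)
        policy_put(old);
}

/* Returns 1 if the old policy outlives replacement until its user drops it. */
static int demo_replace_while_in_use(void)
{
    int old_freed = 0, new_freed = 0;
    struct shared_slot slot = { 0 };

    slot_replace(&slot, policy_new(1, &old_freed));
    struct policy *in_use = slot_lookup(&slot);      /* old refcnt: 2 */
    slot_replace(&slot, policy_new(2, &new_freed));  /* old refcnt: 1 */
    int survived = !old_freed && in_use->mode == 1;

    policy_put(in_use);                              /* old refcnt: 0, freed */
    int freed_after = old_freed;

    slot_replace(&slot, NULL);                       /* drops the new policy */
    return survived && freed_after && new_freed;
}
```

The replaced policy stays valid only because the looker-up holds its own
reference; with just the slot's reference it would have been freed at
replace time while still being examined.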

This is the type of use that I can't infer from vma counts or even vma
pointer refs.  I should be able to replace the vma pointer/ref at any
time when the shared policy changes, and mpol_free() the policy for each
such vma pointer/ref.  That leaves no ref to hold the policy should it
be in use [as discussed above].

Lee

