From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>, Gleb Natapov <glebn@voltaire.com>
Cc: linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <ak@suse.de>
Subject: Re: [PATCH] Document Linux Memory Policy
Date: Thu, 31 May 2007 13:07:18 -0400	[thread overview]
Message-ID: <1180631238.5091.57.camel@localhost> (raw)
In-Reply-To: <Pine.LNX.4.64.0705310021380.6969@schroedinger.engr.sgi.com>

On Thu, 2007-05-31 at 00:24 -0700, Christoph Lameter wrote:
> On Thu, 31 May 2007, Gleb Natapov wrote:
> 
> > > 1. A shared range has multiple tasks that can fault pages in.
> > >    The policy of which task should control how the page is allocated?
> > >    Is it the last one that set the policy?
> 
> > How is it done for shmget? For my particular case I would prefer to get an error
> > from numa_setlocal_memory() if a process tries to set policy on the area
> > of the file that already has policy set. This may happen only as a
> > result of a bug in my app.
> 
> Hmmm.... That's an idea. Lee: Do we have some way of returning an error?
> We then need to have a function that clears memory policy. Maybe the
> default policy is the clear?

For shmem, mbind() of a range of the object [that's what
numa_setlocal_memory() does] replaces any existing policy in that range.
This is what I would expect--the last policy applied takes effect.
Multiple tasks attaching to a shmem segment, or mmap()ing the same file
shared, would, I hope, be cooperating tasks that know what they are
doing.  Typically--i.e., in the applications I'm familiar with--only the
one task that sets up the shmem or file mapping for the multi-task
application sets the policy.
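
To make the replacement semantics concrete, here's a minimal sketch.
The file name, region size and node numbers are made up for
illustration, it assumes a machine with at least two NUMA nodes, and it
uses the mbind() wrapper from <numaif.h> (link with -lnuma).  The second
mbind() on the range simply supersedes the first--no error, no merging:

/*
 * Sketch only: illustrative names/sizes; error checking on mbind()
 * omitted for brevity.
 */
#include <fcntl.h>
#include <numaif.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 4UL << 20;		/* 4MB region, page aligned */
	unsigned long node0 = 1UL << 0;	/* nodemask = { node 0 } */
	unsigned long node1 = 1UL << 1;	/* nodemask = { node 1 } */
	int fd;
	char *p;

	fd = open("/dev/shm/app-region", O_RDWR | O_CREAT, 0600);
	if (fd < 0 || ftruncate(fd, len) < 0)
		return 1;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	/* First policy on the range: bind future faults to node 0. */
	mbind(p, len, MPOL_BIND, &node0, sizeof(node0) * 8, 0);

	/* Same range again: this call replaces the node 0 binding;
	 * the last policy applied is the one in effect. */
	mbind(p, len, MPOL_BIND, &node1, sizeof(node1) * 8, 0);

	return 0;
}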

However, I agree that if I'm ever successful in getting policy attached
to shared file mappings, we'll need a way to delete the policy.  I'm
thinking of something like "MPOL_DELETE" that completely deletes the
policy--whether on a range of virtual addresses via mbind() or on the
task policy via set_mempolicy().  Of course, MPOL_DELETE would work for
shmem segments as well.
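
Until something like MPOL_DELETE exists, the closest thing today is
re-applying MPOL_DEFAULT, which drops back to the default policy--much
as Christoph suggests above.  A sketch, assuming addr and len are page
aligned and treating "reset to default" as a stand-in for true deletion:

#include <numaif.h>

void reset_policies(void *addr, unsigned long len)
{
	/*
	 * Drop any policy previously installed on [addr, addr+len);
	 * pages faulted in afterwards follow the task/system default.
	 */
	mbind(addr, len, MPOL_DEFAULT, NULL, 0, 0);

	/* Drop the calling task's policy as well. */
	set_mempolicy(MPOL_DEFAULT, NULL, 0);
}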

> 
> > > 2. Pagecache pages can be read and written by buffered I/O and
> > >    via mmap. Should there be different allocation semantics
> > >    depending on the way you got the page? Obviously no policy
> > >    for a memory range can be applied to a page allocated via
> > >    buffered I/O. Later it may be mapped via mmap but then
> > >    we never use policies if the page is already in memory.
> 
> > If the page is already in the pagecache, use it. Or return an error if a
> > strict policy is in use. Or something else :) In my case I make sure the
> > files are accessed only through the mmap() interface.

This is the model that I've been trying to support--tasks that have, as
a portion of their address space, a shared mapping of an
application-specific file that is only ever accessed via mmap().

> 
> On an mmap we cannot really return an error. If your program has just run 
> then pages may linger in memory. If you run it on another node then the 
> earlier used pages may be used.

It's true that a page of such a file [private to the application, only
accessed via mmap()] may be in the page cache in the wrong location,
either because a later run lands on another node, as Christoph says, or
because you've just done a backup or restored from one.  However, if
your application is the only one that mmap()s the file, and you apply
policy only from the "application initialization task", then that task
will be the only one mapping the file at that point.  You can then use
Christoph's excellent MPOL_MF_MOVE facility to ensure that the pages
follow your new policy.  If other tasks have the pages mapped, you'll
need MPOL_MF_MOVE_ALL, which requires special privilege [CAP_SYS_NICE].
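
A sketch of how that might look--the function name and the "shared with
other tasks" hint are made up for illustration, and target_node is
assumed to be a valid node on the system:

#include <numaif.h>

int rebind_and_migrate(void *addr, unsigned long len, int target_node,
		       int shared_with_other_tasks)
{
	unsigned long mask = 1UL << target_node;
	/*
	 * MPOL_MF_MOVE migrates pages mapped only by this task;
	 * MPOL_MF_MOVE_ALL also migrates pages other tasks have mapped,
	 * but requires CAP_SYS_NICE.
	 */
	unsigned flags = shared_with_other_tasks ? MPOL_MF_MOVE_ALL
						 : MPOL_MF_MOVE;

	return mbind(addr, len, MPOL_BIND, &mask, sizeof(mask) * 8, flags);
}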

Lee

