linux-mm.kvack.org archive mirror
From: glebn@voltaire.com (Gleb Natapov)
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <ak@suse.de>, Christoph Lameter <clameter@sgi.com>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] Document Linux Memory Policy
Date: Thu, 31 May 2007 20:41:16 +0300
Message-ID: <20070531174116.GB10459@minantech.com>
In-Reply-To: <1180625204.5091.55.camel@localhost>

On Thu, May 31, 2007 at 11:26:44AM -0400, Lee Schermerhorn wrote:
> On Thu, 2007-05-31 at 14:30 +0300, Gleb Natapov wrote:
> > On Thu, May 31, 2007 at 02:04:12PM +0300, Gleb Natapov wrote:
> > > On Thu, May 31, 2007 at 12:43:19PM +0200, Andi Kleen wrote:
> > > > 
> > > > > > The faulted page will use the memory policy of the task that faulted it 
> > > > > > in. If that process has numa_set_localalloc() set then the page will be 
> > > > > > located as closely as possible to the allocating thread.
> > > > > 
> > > > > Thanks. But I have to say this feels very unnatural.
> > > > 
> > > > What do you think is unnatural exactly? First one wins seems like a quite 
> > > > natural policy to me.
> > > No it is not (not always). I want to create shared memory for
> > > interprocess communication. Process A will write into the memory and
> > > process B will periodically poll it to see if there is a message there.
> > > On a NUMA system I want the physical memory for this VMA to be allocated
> > > from a node close to process B since it will use it much more frequently.
> > > But I don't want to pre-fault all pages in process B to achieve this
> > > because the region can be huge and because it doesn't guarantee much if
> > > swapping is involved. So numa_set_localalloc() looks like it achieves
> > > exactly this. Without this function I agree that "first one wins" is a
> > > very sensible assumption, but when each process has stated its preferences
> > > explicitly by calling the function it is no longer sensible to me as a
> > > user of the API. When you start to think about how memory policy may be
> > OK, now, rereading the man page, I see that numa_tonode_memory() can achieve
> > this without pre-faulting. A should know which CPU B is running on, but
> > this is a minor problem.
> 
> Gleb:    numa_tonode_memory() won't do what you want if the file is
> mapped shared.  The numa_*_memory() interfaces use mbind() which
> installs a VMA policy in the address space of the caller.  When a page
> is faulted in for a mmap'd file, the page will be allocated using the
> faulting task's task policy, if any, else system default.  
> 
Suppose I have two processes that want to communicate through shared memory.
They mmap the same file with MAP_SHARED. The first process calls
numa_setlocal_memory() on the region where it will receive messages and
numa_tonode_memory() with the second process's node ID on the region where
it will post messages for the second process. The second process does the
same thing. After that, no matter which process touches the memory first,
faulted-in pages should be allocated from the correct memory node. Am I
missing something here?
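
To make the scenario concrete, here is a toy Python model of the fault path
for a shared file mapping as Lee describes it in the quoted text above: the
VMA policy lives in each caller's own address space, and the page is placed
according to the faulting task's task policy, else the system default. This
is only a sketch, not kernel code. The names Task and fault_file_shared, the
page index, and the node numbers are all hypothetical, and the system default
is assumed to mean "allocate on the node the task is running on".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Task:
    name: str
    cpu_node: int                      # node the task is currently running on
    task_policy: Optional[int] = None  # task-wide preferred node, if any
    # VMA policies are private to this address space: page index -> node.
    vma_policy: Dict[int, int] = field(default_factory=dict)


def fault_file_shared(task: Task, page: int) -> int:
    """Node chosen when `task` first touches `page` of a MAP_SHARED file.

    Models the behaviour described in the quoted reply: the VMA policy is
    not consulted; the task policy is used, else the system default
    (assumed here to be local allocation).
    """
    if task.task_policy is not None:
        return task.task_policy
    return task.cpu_node


a = Task("A", cpu_node=0)
b = Task("B", cpu_node=1)
INBOX_B = 7  # hypothetical page in the region where B receives messages

# Both processes ask for B's inbox to live on B's node.
a.vma_policy[INBOX_B] = b.cpu_node   # numa_tonode_memory() in A
b.vma_policy[INBOX_B] = b.cpu_node   # numa_setlocal_memory() in B

print("B touches first -> node", fault_file_shared(b, INBOX_B))
print("A touches first -> node", fault_file_shared(a, INBOX_B))
```

In this model the placement still depends on who touches the page first, even
though both processes installed the same VMA policy, which is the gap the
quoted reply points at.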

> I've been proposing patches to generalize the shared policy support
> enjoyed by shmem segments for use with shared mmap'd files.  I was
> beginning to think that I'm the only one with applications [well, with
> customers with applications] that need this behavior.  Sounds like your
> requirements are very similar:  huge file [don't want to prefault nor
> wait for it to all be read into shmem before starting processing], only
> accesses via mmap, ...
I thought this was a pretty common use case, but Andi thinks differently. I
don't have any hard evidence one way or the other.

> > Man page states:
> >  Memory policy set for memory areas is shared by all threads of the
> >  process. Memory policy is also shared by other processes mapping the
> >  same memory using shmat(2) or mmap(2) from shmfs/hugetlbfs. It is not
> >  shared for disk backed file mappings right now although that may change
> >  in the future.
> > So what does this mean? If I set a local policy for a memory region in
> > process A, should it be obeyed by memory accesses in process B?
> 
> shmem does, indeed, work this way.  Policies installed on ranges of the
> shared segment via mbind() are stored with the shared object.
> 
> I think the future is now:  time to share policy for disk backed file
> mappings.
> 
At least it will be consistent with what you get when shared memory is
created via shmget(). It will be very surprising for a programmer if
his program's logic breaks just because he changes the way shared
memory is created.
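
For contrast, a toy Python sketch of the shmem behaviour described in the
quoted text, where policies installed with mbind() on ranges of the segment
are stored with the shared object. Again this is a model under stated
assumptions, not the kernel's data structure: SharedSegment and its methods
are hypothetical names, overlapping ranges are resolved by "latest wins", and
pages with no range policy are assumed to be allocated locally.

```python
from typing import List, Tuple


class SharedSegment:
    """Toy shmem segment: range policies belong to the object itself."""

    def __init__(self) -> None:
        # (start_page, end_page, node) entries; end_page is exclusive.
        self._ranges: List[Tuple[int, int, int]] = []

    def mbind(self, start: int, end: int, node: int) -> None:
        """Record a preferred node for pages [start, end)."""
        self._ranges.append((start, end, node))

    def fault(self, page: int, faulting_cpu_node: int) -> int:
        """Node chosen for `page`, whichever process faults it in."""
        # Simplification: the most recently installed covering range wins.
        for start, end, node in reversed(self._ranges):
            if start <= page < end:
                return node
        # No policy on this page: modelled as local allocation.
        return faulting_cpu_node


seg = SharedSegment()
seg.mbind(0, 100, node=1)  # any attached process may install this

print("fault from node 0 ->", seg.fault(7, faulting_cpu_node=0))
print("fault from node 1 ->", seg.fault(7, faulting_cpu_node=1))
```

Here the result for a covered page is the same regardless of which process
faults first, which is the behaviour I would expect a shared file mapping to
match.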

--
			Gleb.


