linux-mm.kvack.org archive mirror
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	nish.aravamudan@gmail.com
Subject: Re: [PATCH/RFC 0/8] Mapped File Policy Overview
Date: Fri, 25 May 2007 10:55:51 -0400
Message-ID: <1180104952.5730.28.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0705241417130.31587@schroedinger.engr.sgi.com>

On Thu, 2007-05-24 at 14:17 -0700, Christoph Lameter wrote:
> On Thu, 24 May 2007, Lee Schermerhorn wrote:
> 
> > Same use cases for using mbind() at all.  I want to specify the
> > placement of memory backing any of my address space.  A shared mapping
> > of a regular file is, IMO, morally equivalent to a shared memory region,
> > with the added semantic that it is automatically initialized from the
> > file contents, and any changes persist after the file is closed.  [One
> > related semantic that Linux is missing is to initialize the shared
> > mapping from the file, but not writeback any changes--e.g.,
> > MAP_NOWRITEBACK.  Some "enterprise unix" systems support this, presumably at
> > ISV/customer request.]
> 
> I think Andi was looking for an actual problem that is solved by this 
> patchset. Any user feedback that triggered this solution?

The question usually comes up in the context of migrating customers'
applications or benchmarks from our legacy Unix NUMA APIs to Linux.  I
don't know of the exact applications that install explicit policy on
shared mmap()ed files, but on, say, Tru64 Unix it just works.  As a
result, customers and ISVs have used it.  We try to make it easy for
customers to migrate to Linux--providing support, documentation and
such.  Having a one-for-one API replacement makes this easier.  In this
context, it's a glaring hole in Linux today, and I've had to explain to
colleagues that it's a "feature"--at which point they ask me when did I
transfer to marketing ;-).  

It's easy to fix.  The shared policy support is already there.  We just
need to generalize it for regular files.  In the process,
*page_cache_alloc() obeys "file policy", which will allow additional
features such as you mentioned:  global page cache policy as the default
"file policy".
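
For anyone who hasn't used the interface on a file mapping, here is a
minimal userspace sketch of the call sequence I have in mind.  It is
not taken from the patches.  The helper names map_file_shared() and
interleave_range() are made up for illustration, the mode value is
copied from the kernel's <linux/mempolicy.h> so the sketch builds
without libnuma's <numaif.h>, and mbind() is reached through syscall()
for the same reason.  The call may well fail with ENOSYS or EPERM on a
kernel or container without NUMA policy support, and whether the page
cache allocations then honor the policy is exactly what this thread is
about.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* MPOL_INTERLEAVE as defined in <linux/mempolicy.h>; renamed here to
 * make clear it is a local copy for this sketch. */
#define SKETCH_MPOL_INTERLEAVE 3

/* Hypothetical helper: map the first `len` bytes of `path` MAP_SHARED.
 * Returns NULL if the file cannot be opened or mapped. */
static void *map_file_shared(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    void *p;

    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping keeps the file open */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: request interleaving of [addr, addr+len) across
 * the nodes set in `nodemask`.  Returns 0 on success, -errno otherwise. */
static long interleave_range(void *addr, size_t len, unsigned long nodemask)
{
    long rc = syscall(SYS_mbind, addr, len, SKETCH_MPOL_INTERLEAVE,
                      &nodemask, 8 * sizeof(nodemask), 0);

    return rc == 0 ? 0 : -errno;
}
```

A caller would map the file, call interleave_range(p, len, 0x3UL) to
ask for nodes 0 and 1, and only then touch the pages, since the policy
can only affect pages that have not been allocated yet.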

Now, I understand the concern about any increase in size, even if it's
only ~2K, but I think this is mostly a concern for 32-bit systems, where
I expect the increase will be less than 2K.  I also understand that
there are still a few 32-bit NUMA systems out there [NUMAQ?] and that
some folks use fake NUMA and cpusets on 32-bit systems for
container-like resource management.  For those systems, we could gain
back some of the size increase by making numa_maps configurable.  A
quick test showed that for x86_64, eliminating /proc/<pid>/numa_maps
makes the kernel with my mapped file policy patches ~1.8K smaller than
the unpatched kernel with numa_maps.  I'm NOT proposing to eliminate
numa_maps in general, because I find it very useful.  But maybe 32-bit
fake NUMA systems don't need it?

By the way, I think we need the numa_maps fixes in any case because the
current implementation lies about shmem segments if you look at any task
that didn't install [all of] the policy on the segment, unless it
happens to be a child of the task that did install the policy and that
child was forked after the mbind() calls.  I really dislike all of those
"ifs" and "unlesses"--I found it humorous in the George Carlin routine,
but not in user/programming interface design.
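
If you want to see what a given task reports, something like the
following rough sketch will do.  It assumes the usual numa_maps layout
in which each line starts with the mapping's address followed by the
policy as the second whitespace-separated field; the exact spelling of
the policy string differs between kernel versions, so the code only
extracts the token and does not interpret it.  Both function names are
made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Copy the policy field (second whitespace-separated token) of one
 * numa_maps line into `out`.  Returns 0 on success, -1 if the line has
 * fewer than two fields. */
static int numa_maps_policy(const char *line, char *out, size_t outlen)
{
    char addr[32];
    char policy[64];

    if (sscanf(line, "%31s %63s", addr, policy) != 2)
        return -1;
    snprintf(out, outlen, "%s", policy);
    return 0;
}

/* Print "<address> <policy>" for each mapping of the calling task.
 * Returns the number of lines printed, or -1 if numa_maps cannot be
 * opened (e.g. a kernel built without it). */
static int print_own_policies(FILE *dst)
{
    FILE *f = fopen("/proc/self/numa_maps", "r");
    char *line = NULL;
    size_t cap = 0;
    int shown = 0;

    if (!f)
        return -1;
    while (getline(&line, &cap, f) != -1) {
        char addr[32];
        char policy[64];

        if (sscanf(line, "%31s", addr) == 1 &&
            numa_maps_policy(line, policy, sizeof(policy)) == 0) {
            fprintf(dst, "%s %s\n", addr, policy);
            shown++;
        }
    }
    free(line);
    fclose(f);
    return shown;
}
```

Running print_own_policies(stdout) in the task that did the mbind()
and again in an unrelated task attached to the same segment is a quick
way to compare what each one is shown.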

Anyway, I posted the patches in hopes of getting some additional eyes to
look at them and maybe getting some time in -mm to see whether they break
anything or impact performance adversely on systems that I don't have
access to.

Lee


