From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>, Gleb Natapov <glebn@voltaire.com>,
linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] Document Linux Memory Policy
Date: Fri, 01 Jun 2007 15:38:33 -0400
Message-ID: <1180726713.5278.80.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0706011140330.2643@schroedinger.engr.sgi.com>
On Fri, 2007-06-01 at 11:43 -0700, Christoph Lameter wrote:
> On Fri, 1 Jun 2007, Lee Schermerhorn wrote:
>
> > Like Gleb, I find the different behaviors for different memory regions
> > to be unnatural. Not because of the fraction of applications or
> > deployments that might use them, but because [speaking for customers] I
> > expect and want to be able to control placement of any object mapped
> > into an application's address space, subject to permissions and
> > privileges.
>
> Same here and I wish we had a clean memory region based implementation.
> But that is just what your patches do *not* provide. Instead they are file
> based. They should be memory region based.
>
> Would you please come up with such a solution?
Christoph:
I don't understand what you mean by "memory region based".
Linux does not have bona fide "memory objects" that sit between a task's
address space and the backing store--be it swap or regular files--like
some systems I've worked with. Rather, anonymous regions are described
by the vm_area_struct [vma], and pages backing those regions must be
referenced by one or more ptes or a swap cache entry, or both. For a
disk-backed file
mapped into a task address space, the vma points directly to the inode
+address_space structures via the file structure. Shmem regions attach
to a task address space much like regular files--via a pseudo-fs inode
+address_space. I don't know the rationale, but I suspect that Linux
dispenses with the extra memory object layer to conserve memory for
smaller systems. And that's a good thing, IMO.
So, for a shared memory mapped file, the inode+address_space--i.e., the
in-memory incarnation of the file--is as close to a "memory region" as
we have. It contains the mapping between [file/address] offset and
memory page. It's the only object representing the file and its
in-memory pages that gets shared between multiple task address spaces.
That seems, to me, to be the natural place to hang the shared policy.
Indeed, this is where we attach shared policy to shmem/tmpfs/hugetlbfs
pseudo-files.
Even if we had a layer between the vma's and the files/inodes, I don't
see what that would buy us. We'd still want to maintain coherency
between files accessed via file descriptor function calls and files
mapped via mmap(SHARED). That's one of the purposes of a shared page
cache. [I've seen unix variants where these weren't coherent. Now
THAT's unnatural ;-)!] So, yes any policy applied to the memory mapped
file affects the location of pages accessed via file descriptor access.
That's a good thing for applications that use shared mapped files.
The load/store access by the application that maps the file, and goes to
the trouble of specifying memory policy, takes precedence. Load/store
is the "fast path". File descriptor access system calls are the slow
path.
You're usually gung-ho about locality on a NUMA platform, avoiding
off-node access or page allocations, respecting the fast path, ... Why the
resistance here?
>
> > Then why does Christoph keep insisting that "page cache pages" must
> > always follow task policy, when shmem, tmpfs and anonymous pages don't
> > have to?
>
> No I just said that the page cache handling is consistently following task
> policy.
Well, not for anon, shmem, tmpfs, ... page cache pages. All of those
are page cache based, according to Andi, and they certainly aren't
constrained to "consistently follow task policy".
Of course, I'm just being facetious [and, no doubt, annoying] to make a
point. We're using the same words, sometimes referring to the same
concepts, but in slightly different contexts and "talking past each
other". I'm trying real hard to believe that this is what's happening
in this entire exchange. That's the most benign reason I can come up
with...
Lee