From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>,
linux-mm@kvack.org, akpm@linux-foundation.org,
nish.aravamudan@gmail.com
Subject: Re: [PATCH/RFC 0/8] Mapped File Policy Overview
Date: Fri, 25 May 2007 17:12:32 -0400
Message-ID: <1180127552.21879.15.camel@localhost>
In-Reply-To: <Pine.LNX.4.64.0705251156460.7281@schroedinger.engr.sgi.com>
On Fri, 2007-05-25 at 12:10 -0700, Christoph Lameter wrote:
> On Fri, 25 May 2007, Lee Schermerhorn wrote:
>
> > I knew that! There is no existing practice. However, I think it is in
> > our interests to ease the migration of applications to Linux. And,
> > again, [trying to choose words carefully], I see this as a
> > defect/oversight in the API. I mean, why provide mbind() at all, and
> > then say, "Oh, by the way, this only works for anonymous memory, SysV
> > shared memory and private file mappings. You can't use this if you
> > mmap() a file shared. For that you have to twiddle your task policy,
> > fault in and lock down the pages to make sure they don't get paged out,
> > because, if they do, and you've changed the task policy to place some
> > other mapped file that doesn't obey mbind(), the kernel doesn't remember
> > where you placed them. Oh, and for those private mappings--be sure to
> > write to each page in the range because if you just read, the kernel
> > will ignore your vma policy."
> >
> > Come on!
>
> Well if this patch would simplify things then I would agree but it
> introduces new cornercases.

I don't think that's the case, but I could have missed something. I've
kept the behavior identical, I think, for the default case when no
explicit shared policy is applied. The remaining corner case involves
those funky private mappings, and the behavior there is the same as the
current behavior.

I have a fix for that, but it involves forcing an early COW break when the
private mapping has a vma policy and the page cache page doesn't match
the policy. I haven't posted it because: 1) it DOES add additional
checks in the nopage fault path, and 2) it depends on the misplacement
check in my "migrate on fault" series. I didn't want to muddy the water
with that yet.

>
> The current scheme is logical if you consider the pagecache as something
> separate. It is after all already controlled via the memory spreading flag
> in cpusets. There is already limited control by the process.
Yes, but I have to treat some parts of my address space [mapped shared
files] differently, when it's unnecessary.
>
> Also allowing vma based memory policies to control shared mapping is
> problematic because they are shared. Concurrent processes may set
> different policies.
But with the shared policy infrastructure, all shared mappers see the
same policy [or policies]. The last one set on any given range of the
underlying file [address_space] is the one that is currently in
effect--just like shmem. If that wasn't clear from my description, I
need to fix that.
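
To make the "last one set wins" semantics concrete, here is a toy
user-space sketch. It is NOT the kernel's shared policy code: the names
(struct file_policy, file_set_policy, file_get_policy), the fixed 16-page
file and the integer policy ids are all made up for illustration. The only
point it shows is that the policy hangs off the file rather than off any
one task's vma, so every mapper gets the same answer for a given page
offset.

```c
#include <assert.h>
#include <stddef.h>

#define FILE_PAGES  16   /* hypothetical tiny file: 16 pages */
#define POL_DEFAULT 0    /* no explicit policy set on this page */

/* One policy table per file (think: per address_space), not per task. */
struct file_policy {
    int pol[FILE_PAGES]; /* policy id currently in effect per page offset */
};

void file_policy_init(struct file_policy *fp)
{
    for (size_t i = 0; i < FILE_PAGES; i++)
        fp->pol[i] = POL_DEFAULT;
}

/*
 * Set a policy on page offsets [start, end). Whatever an earlier caller
 * installed on that range is overwritten, regardless of which task set it.
 */
void file_set_policy(struct file_policy *fp, size_t start, size_t end, int pol)
{
    assert(start <= end && end <= FILE_PAGES);
    for (size_t i = start; i < end; i++)
        fp->pol[i] = pol;
}

/* Any mapper faulting on page offset 'pgoff' sees the same policy. */
int file_get_policy(const struct file_policy *fp, size_t pgoff)
{
    assert(pgoff < FILE_PAGES);
    return fp->pol[pgoff];
}
```

In this model, if task A sets policy 1 on pages 0-7 and task B then sets
policy 2 on pages 4-11, both tasks see policy 1 on pages 0-3, policy 2 on
pages 4-11, and the default everywhere else.
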
> This would make sense if the policy could be set at a
> filesystem level.
??? Why? Different processes could set different policies on the file
in the file system. The last one [before the file was mapped?] would
rule.
>
> > And as for fixing the numa_maps behavior, hey, I didn't post the
> > defective code. I'm just pointing out that my patches happen to fix
> > some existing suspect behavior along the way. But, if some patch
> > submittal standard exists that says one must fix all known outstanding
> > bugs before submitting anything else [Andrew would probably support
> > that ;-)], please point it out to me... and everyone else. And, as I've
> > said before, I see this patch set as one big fix to missing/broken
> > behavior.
>
> I still have not found a bug in there....
I'll send you a memtoy script to demonstrate the issue. Next week...
>
> Convention is that fixes precede enhancements in a patchset.
Seems like a lot of extra effort that could be applied to other tasks,
but you've worn me down. I'll debug the numa_maps hang with hugetlb
shmem segments with shared policy in the current code base, and reorder
the patch set to handle correct display of shmem policy from all tasks
first. Next week or so.
Later,
Lee
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .