linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, nacc@us.ibm.com, ak@suse.de,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	clameter@sgi.com
Subject: [PATCH/RFC 0/11] Shared Policy Overview
Date: Mon, 25 Jun 2007 15:52:24 -0400	[thread overview]
Message-ID: <20070625195224.21210.89898.sendpatchset@localhost> (raw)

[RFC] Shared Policy Fixex and Mapped File Policy 0/11 

Against 2.6.22-rc4-mm2

This is my former "Mapped File Policy" patch set, reordered to
move the fixes to numa_maps and the "hook up" of hugetlb shmem
policies to before the hook up of shared policy to shared mmap()ed
regular files.  ["Fixes first" per Christoph L.]

The 2 patches to fix up current behavior issues [#s 4 & 5] sit
atop 3 "cleanup" patches.  The clean up patches simplify the fixes
and, yes, support the generic mapped file policy patches [#s 6-11].

With patches 1-3 applied, external behavior is, AFAICT, exactly
the same as current behavior.  The internal differences are that
shared policy is now a pointer in the address_space structure.
A NULL value [the default] indicates default policy.  The shared
policy is allocated on demand--when one mbind()s a virtual
address range backed by a shmem memory object.

Patch #3 eliminates the need for a pseudo-vma on the stack to 
initialize policies for tmpfs inodes when the superblock has
a non-default policy by changing the interface to
mpol_set_shared_policy() to take a page offset and size in pages,
computed in the shmem set_policy vm_op.  This cleanup addresses
one complaint about the current shared policy infrastructure.

The other internal difference is that linear mappings that support
the 'set_policy' vm_op are mapped by a single VMA--not split on
policy boundaries.  numa_maps needs to be able to handle this
anyway because a task that attaches a shmem segment on which
another task has already installed multiple shared policies will
have a single vma mapping the entire segment.  Patch #4 fixes
numa_maps to display these properly.

Patch #5 hooks up SHM_HUGETLB segments to use the shmem get/set
policy vm_ops.  This "just works" with the fixes to numa_maps
in patch #4.  Without the numa_maps fixes, a cat of the numa_maps
of a task with a hugetlb shmem segment with shared policy attached
would hang.

Again, patches 6-11 define the generic file shared policy support,
They also prevent a private file mapping from affecting any shared
policy installed via a shared mapping, including preventing migrating
the shared pages to follow the address space private policy.
Policies installed via a private mapping apply only to the calling
task's address space--current behavior. 

Patches 6-8 add support for shared policy on regular files, factoring
alloc_page_vma() into vma policy lookup and allocation of a page
given a policy--alloc_page_pol().  Then, the page page cache alloc
function can lookup the shared file policy via page offset and use
the same alloc_page_pol() to allocate the page based on that policy.

Patch #9 defines an initial peristence model for shared policies on
shared mmap()ed files:   a shared policy can only be installed on generic
files via a shared mmap()ing.  Such a policy will persist as long as
any shared mmap()ings exist.  Shared mappings of a file are tracked
by the i_mmap_writable count in the struct address_space.  Patch #9
removes any existing policy when the i_mmap_writable count goes to zero.

Note that the existing shared policy persistence model for shmem segments
is different.  Once installed, the shared policies persist until the segment
is destroyed.  Because shmem goes through the same unmap path, shared
policies on shmem segments are marked with a SPOL_F_PERSIST flag to
prevent them from being removed on last detatch [unmap]--i.e., to preserve
existing behavior.

Also note that because we can remove a shared policy from a "live"
inode, we need to handle potential races with another task performing
a get_file_policy() on the same file via a file descriptor access
[read()/write()/...].  Patch #9 handles this by defining an RCU reader
critical region in get_file_policy() and by synchronizing with this
in mpol_free_shared_policy().

[I hope patch #9 will alleviate Andi's concerns about an unspecified
persistence model.  Note that the model implemented by patch #9 could
easily be enhanced to persist beyond the last shared mapping--e.g.,
via some additional mbind() flags, such as MPOL_MF_[NO]PERSIST--and
possibly enhancements to numactl to set/remove shared policy on files.
I didn't want to pursue that in this patch set because I don't have a
use for it, and it will require some tool to list files with persistent
shared policy--perhaps an enhancement to lsof(8).]

Patch #10 adds a per cpuset control file--shared_file_policy--to
explicitly enable/disable shared policy on shared file mappings.
Default is disabled--current behavior.  That is, even with all 11
patches applied, you'll have to explicitly enable shared file policy,
else the kernel will continue to ignore mbind() of address ranges backed
by a shared regular file mapping.  This preserves existing behavior for
applications that might currently be installing memory policies on
shared regular file mappings, not realizing that they are ignored.
Such applications might break or behave unexpectedly if the kernel
suddenly starts using the shared policy.   With the per cpuset control
defaulting to current behavior, an explicit action by a privileged 
user is required to enable the new behavior.

[I hope patch #10 alleviates Christoph's concern about unexpected
interaction of shared policies on mmap()ed files in one cpuset with
file descriptor access from another cpuset.  This can only happen if
the user/adminstrator explicitly enables shared file policies for an
application.]

Finally, patch #11 adds the generic file set|get_policy vm_ops to
actually hook up shared file mappings to memory policy.  Without this
patch, the shared policy infrastructure enhancements in the previous
patches remain mostly unused, except for the existing shmem and added
hugetlbfs usage.

---

Note:  testing/code sizes/... covered in previous posting:

	http://marc.info/?l=linux-mm&m=118002773528224&w=4

No sense in repeating this until we decide to go forward.
However, this series has been tested with 22-rc4-mm2 on ia64 and
x86_64 platforms.

Lee

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

             reply	other threads:[~2007-06-25 19:52 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-25 19:52 Lee Schermerhorn [this message]
2007-06-25 19:52 ` [PATCH/RFC 1/11] Shared Policy: move shared policy to inode/mapping Lee Schermerhorn
2007-06-25 19:52 ` [PATCH/RFC 2/11] Shared Policy: allocate shared policies as needed Lee Schermerhorn
2007-06-25 19:52 ` [PATCH/RFC 3/11] Shared Policy: let vma policy ops handle sub-vma policies Lee Schermerhorn
2007-06-25 19:52 ` [PATCH/RFC 4/11] Shared Policy: fix show_numa_maps() Lee Schermerhorn
2007-06-25 19:52 ` [PATCH/RFC 5/11] Shared Policy: Add hugepage shmem policy vm_ops Lee Schermerhorn
2007-06-25 19:53 ` [PATCH/RFC 6/11] Shared Policy: Factor alloc_page_pol routine Lee Schermerhorn
2007-06-25 19:53 ` [PATCH/RFC 7/11] Shared Policy: use shared policy for page cache allocations Lee Schermerhorn
2007-06-25 19:53 ` [PATCH/RFC 8/11] Shared Policy: fix migration of private mappings Lee Schermerhorn
2007-06-25 19:53 ` [PATCH/RFC 9/11] Shared Policy: mapped file policy persistence model Lee Schermerhorn
2007-06-25 19:53 ` [PATCH/RFC 10/11] Shared Policy: per cpuset shared file policy control Lee Schermerhorn
2007-06-25 21:10   ` Paul Jackson
2007-06-27 17:33     ` Lee Schermerhorn
2007-06-27 19:52       ` Paul Jackson
2007-06-27 20:22         ` Lee Schermerhorn
2007-06-27 20:36           ` Paul Jackson
2007-06-25 19:53 ` [PATCH/RFC 11/11] Shared Policy: add generic file set/get policy vm ops Lee Schermerhorn
2007-06-26 22:17 ` [PATCH/RFC 0/11] Shared Policy Overview Christoph Lameter
2007-06-27 13:43   ` Lee Schermerhorn
2007-06-26 22:21 ` Christoph Lameter
2007-06-26 22:42   ` Andi Kleen
2007-06-27  3:25     ` Christoph Lameter
2007-06-27 20:14       ` Lee Schermerhorn
2007-06-27 18:14   ` Lee Schermerhorn
2007-06-27 21:37     ` Christoph Lameter
2007-06-27 22:01       ` Andi Kleen
2007-06-27 22:08         ` Christoph Lameter
2007-06-27 23:46         ` Paul E. McKenney
2007-06-28  0:14           ` Andi Kleen
2007-06-29 21:47           ` Lee Schermerhorn
2007-06-28 13:42         ` Lee Schermerhorn
2007-06-28 22:02           ` Andi Kleen
2007-06-29 17:14             ` Lee Schermerhorn
2007-06-29 17:42               ` Andi Kleen
2007-06-30 18:34                 ` [PATCH/RFC] Fix Mempolicy Ref Counts - was " Lee Schermerhorn
2007-07-03 18:09                   ` Christoph Lameter
2007-06-29  1:39           ` Christoph Lameter
2007-06-29  9:01             ` Andi Kleen
2007-06-29 14:05               ` Christoph Lameter
2007-06-29 17:41                 ` Lee Schermerhorn
2007-06-29 20:15                   ` Christoph Lameter
2007-06-29 13:22             ` Lee Schermerhorn
2007-06-29 14:18               ` Christoph Lameter
2007-06-27 23:36       ` Lee Schermerhorn
2007-06-29  1:41         ` Christoph Lameter
2007-06-29 13:30           ` Lee Schermerhorn
2007-06-29 14:20             ` Andi Kleen
2007-06-29 21:40               ` Lee Schermerhorn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070625195224.21210.89898.sendpatchset@localhost \
    --to=lee.schermerhorn@hp.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=linux-mm@kvack.org \
    --cc=nacc@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox