linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Mel Gorman <mel@skynet.ie>
Cc: linux-mm@kvack.org, ak@suse.de,
	Nishanth Aravamudan <nacc@us.ibm.com>,
	pj@sgi.com, kxr@sgi.com, Christoph Lameter <clameter@sgi.com>,
	akpm@linux-foundation.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH/RFC] 2.6.23-rc1-mm1:  MPOL_PREFERRED fixups for preferred_node < 0
Date: Tue, 31 Jul 2007 11:58:07 -0400	[thread overview]
Message-ID: <1185897487.6240.29.camel@localhost> (raw)
In-Reply-To: <20070731153227.GB18506@skynet.ie>

On Tue, 2007-07-31 at 16:32 +0100, Mel Gorman wrote:
> On (30/07/07 18:00), Lee Schermerhorn didst pronounce:
> > On Mon, 2007-07-30 at 17:38 -0400, Lee Schermerhorn wrote:
> > > These are some "issues" that I came across working on the Memoryless
> > > Node series.  I'm using the same cc: list as that series as the issues
> > > are somewhat related.
> > > 
> > > Only boot tested at this point.
> > 
> > I sent the wrong patch--forgot to refresh before posting :-(.  Bogus
> > code in mpol_to_str() in previous patch.
> > 
> > Try this one.
> > 
> > Lee
> > 
> > > ---------------------------
> > 
> > PATCH/RFC - MPOL_PREFERRED fixups for "local allocation"
> > 
> > Here are a couple of potential "fixups" for MPOL_PREFERRED behavior
> > when v.preferred_node < 0 -- i.e., "local allocation":
> > 
> > 1)  [do_]get_mempolicy() calls the misnamed get_zonemask() to fetch the
> >     nodemask associated with a policy.  Currently, get_zonemask() returns
> >     the set of nodes with memory, when the policy 'mode' is 'PREFERRED,
> 
> Consider a cleanup that renames get_zonemask because the naming is
> misleading.

I can do that.  Wanted to hear from others, such as yourself first.

> 
> >     and the preferred_node is < 0.  Return the set of allowed nodes
> >     instead.  This will already have been masked to include only nodes
> >     with memory.
> > 
> > 2)  When a task is moved into a [new] cpuset, mpol_rebind_policy() is
> >     called to adjust any task and vma policy nodes to be valid in the
> >     new cpuset.  However, when the policy is MPOL_PREFERRED, and the
> >     preferred_node is <0, no rebind is necessary.  The "local allocation"
> >     indication is valid in any cpuset.
> > 
> > 3)  mpol_to_str() produces a printable, "human readable" string from a
> >     struct mempolicy.  For MPOL_PREFERRED with preferred_node <0,  show
> >     the entire set of valid nodes.  Although, technically, MPOL_PREFERRED
> >     takes only a single node, preferred_node <0 is a local allocation policy,
> >     with the preferred node determined by the context where the task
> >     is executing.  All of the allowed nodes are possible, as the task
> >     migrates amoung the nodes in the cpuset.
> > 
> > Comments?
> > 
> > Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
> > 
> >  mm/mempolicy.c |   31 ++++++++++++++++++++++++-------
> >  1 file changed, 24 insertions(+), 7 deletions(-)
> > 
> > Index: Linux/mm/mempolicy.c
> > ===================================================================
> > --- Linux.orig/mm/mempolicy.c	2007-07-30 17:32:06.000000000 -0400
> > +++ Linux/mm/mempolicy.c	2007-07-30 17:38:17.000000000 -0400
> > @@ -494,9 +494,11 @@ static void get_zonemask(struct mempolic
> >  		*nodes = p->v.nodes;
> >  		break;
> >  	case MPOL_PREFERRED:
> > -		/* or use current node instead of memory_map? */
> > +		/*
> > +		 * for "local policy", return allowed memories
> > +		 */
> >  		if (p->v.preferred_node < 0)
> > -			*nodes = node_states[N_MEMORY];
> > +			*nodes = cpuset_current_mems_allowed;
> >  		else
> 
> Is this actually a bugfix? From this context, it looks like memory
> policies using MPOL_PREFERRED can ignore cpusets.

Not a serious bug, if it is one.  More of a cleanup.   All this does is
return a node mask in the case where the application has a task memory
policy of 'PREFERRED with a node id of -1 [which happens when you
specify an empty nodemask to set_mempolicy() or mbind()].  This means
"local allocation"--the actual "current node id" is fetched at
allocation time.  This is a little know "feature" of get_mempolicy().
The results is misleading, but there isn't much the application can do
with it.  Node masks are ANDed with cpuset_current_mems_allowed when
installed via a syscall.

> 
> >  			node_set(p->v.preferred_node, *nodes);
> >  		break;
> > @@ -1650,6 +1652,7 @@ void mpol_rebind_policy(struct mempolicy
> >  {
> >  	nodemask_t *mpolmask;
> >  	nodemask_t tmp;
> > +	int nid;
> >  
> >  	if (!pol)
> >  		return;
> > @@ -1668,9 +1671,15 @@ void mpol_rebind_policy(struct mempolicy
> >  						*mpolmask, *newmask);
> >  		break;
> >  	case MPOL_PREFERRED:
> > -		pol->v.preferred_node = node_remap(pol->v.preferred_node,

Ultimately, node_remap() [the bitmap functions it calls] will return the
old value of "-1" because it's outside the valid range for a the node
bitmasks.  However, it doesn't seem right to be calling node_remap()
with an invalid node id.  I think it's clearer this way:

> > +		/*
> > +		 * no need to remap "local policy"
> > +		 */
> > +		nid = pol->v.preferred_node;
> > +		if (nid >= 0) {
> > +			pol->v.preferred_node = node_remap(nid,
> >  						*mpolmask, *newmask);
> > -		*mpolmask = *newmask;
> > +			*mpolmask = *newmask;
> > +		}
> >  		break;
> >  	case MPOL_BIND: {
> >  		nodemask_t nodes;
> > @@ -1745,7 +1754,7 @@ static const char * const policy_types[]
> >  static inline int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
> >  {
> >  	char *p = buffer;
> > -	int l;
> > +	int nid, l;
> >  	nodemask_t nodes;
> >  	int mode = pol ? pol->policy : MPOL_DEFAULT;
> >  
> > @@ -1755,8 +1764,16 @@ static inline int mpol_to_str(char *buff
> >  		break;
> >  
> >  	case MPOL_PREFERRED:
> > -		nodes_clear(nodes);
> > -		node_set(pol->v.preferred_node, nodes);

Here, I think set_bit() will set bit 31.  Again, misleading, IMO.

> > +		nid = pol->v.preferred_node;
> > +		/*
> > +		 * local interleave, show all valid nodes
> > +		 */
> > +		if (nid < 0 )
> > +			nodes = cpuset_current_mems_allowed;
> > +		else {
> > +			nodes_clear(nodes);
> > +			node_set(nid, nodes);
> > +		}
> >  		break;
> >  
> >  	case MPOL_BIND:
> > 
> 
> -- 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-07-31 15:58 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-27 19:43 [PATCH 00/14] NUMA: Memoryless node support V4 Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Lee Schermerhorn
2007-07-30 21:38   ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 Lee Schermerhorn
2007-07-30 22:00     ` Lee Schermerhorn
2007-07-31 15:32       ` Mel Gorman
2007-07-31 15:58         ` Lee Schermerhorn [this message]
2007-07-31 21:05     ` [PATCH/RFC] 2.6.23-rc1-mm1: MPOL_PREFERRED fixups for preferred_node < 0 - v2 Lee Schermerhorn
2007-08-01  2:22   ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Andrew Morton
2007-08-01  2:52     ` Christoph Lameter
2007-08-01  3:05       ` Andrew Morton
2007-08-01  3:14         ` Christoph Lameter
2007-08-01  3:32           ` Andrew Morton
2007-08-01  3:37             ` Christoph Lameter
     [not found]             ` <Pine.LNX.4.64.0707312151400.2894@schroedinger.engr.sgi.com>
2007-08-01  5:07               ` Andrew Morton
2007-08-01  5:11                 ` Andrew Morton
2007-08-01  5:22                 ` Christoph Lameter
2007-08-01 10:24                   ` Mel Gorman
2007-08-02 16:23                   ` Mel Gorman
2007-08-02 20:00                     ` Christoph Lameter
2007-08-01  5:36             ` Paul Mundt
2007-08-01  9:19             ` Andi Kleen
2007-08-01 14:03             ` Lee Schermerhorn
2007-08-01 17:41               ` Christoph Lameter
2007-08-01 17:54                 ` Lee Schermerhorn
2007-08-02 20:05                 ` [PATCH/RFC/WIP] cpuset-independent interleave policy Lee Schermerhorn
2007-08-02 20:34                   ` Christoph Lameter
2007-08-02 21:04                     ` Lee Schermerhorn
2007-08-03  0:31                       ` Christoph Lameter
2007-08-02 20:19                 ` Audit of "all uses of node_online()" Lee Schermerhorn
2007-08-02 20:26                   ` Christoph Lameter
2007-08-08 22:19                     ` Lee Schermerhorn
2007-08-08 23:40                       ` Christoph Lameter
2007-08-16 14:17                         ` [PATCH/RFC] memoryless nodes - fixup uses of node_online_map in generic code Lee Schermerhorn
2007-08-16 18:33                           ` Christoph Lameter
2007-08-16 19:15                             ` Lee Schermerhorn
2007-08-16 21:10                         ` Lee Schermerhorn
2007-08-16 21:13                           ` Christoph Lameter
2007-08-24 16:09                         ` [PATCH] 2.6.23-rc3-mm1 - Move setup of N_CPU node state mask Lee Schermerhorn
2007-09-06 13:56                           ` Mel Gorman
2007-08-02 20:33                   ` Audit of "all uses of node_online()" Andrew Morton
2007-08-02 20:45                     ` Lee Schermerhorn
2007-08-01 15:58           ` [PATCH 01/14] NUMA: Generic management of nodemasks for various purposes Nishanth Aravamudan
2007-08-01 16:09             ` Nishanth Aravamudan
2007-08-01 17:47             ` Christoph Lameter
2007-08-01 15:25         ` Nishanth Aravamudan
2007-07-27 19:43 ` [PATCH 02/14] Memoryless nodes: introduce mask of nodes with memory Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 03/14] Memoryless Nodes: Fix interleave behavior Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 04/14] OOM: use the N_MEMORY map instead of constructing one on the fly Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 05/14] Memoryless Nodes: No need for kswapd Lee Schermerhorn
2007-07-27 19:43 ` [PATCH 06/14] Memoryless Node: Slab support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 07/14] Memoryless nodes: SLUB support Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 08/14] Uncached allocator: Handle memoryless nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 09/14] Memoryless node: Allow profiling data to fall back to other nodes Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 10/14] Memoryless nodes: Update memory policy and page migration Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 11/14] Add N_CPU node state Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 12/14] Memoryless nodes: Fix GFP_THISNODE behavior Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 13/14] Memoryless Nodes: use "node_memory_map" for cpusets Lee Schermerhorn
2007-07-27 19:44 ` [PATCH 14/14] Memoryless nodes: drop one memoryless node boot warning Lee Schermerhorn
2007-07-27 20:59 ` [PATCH 00/14] NUMA: Memoryless node support V4 Nishanth Aravamudan
2007-07-30 13:48   ` Lee Schermerhorn
2007-07-29 12:35 ` Paul Jackson
2007-07-30 16:07   ` Lee Schermerhorn
2007-07-30 18:56     ` Paul Jackson
2007-07-30 21:19 ` Nishanth Aravamudan
2007-07-30 22:06   ` Christoph Lameter
2007-07-30 22:35     ` Andi Kleen
2007-07-30 22:36       ` Christoph Lameter
2007-07-31 23:18         ` Nishanth Aravamudan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1185897487.6240.29.camel@localhost \
    --to=lee.schermerhorn@hp.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kxr@sgi.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@skynet.ie \
    --cc=nacc@us.ibm.com \
    --cc=pj@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox