From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>, Eric Whitney <eric.whitney@hp.com>
Subject: Suspect use of "first_zones_zonelist()"
Date: Tue, 22 Apr 2008 11:17:24 -0400
Message-ID: <1208877444.5534.34.camel@localhost>

I was testing my "lazy migration" patches and noticed something
interesting about first_zones_zonelist().  I use this function to find
the target node for MPOL_BIND policy to determine if a page is
"misplaced" and should be migrated.  In my testing, I found that I was
always "off by one".  E.g., if my mempolicy nodemask contained only node
2, I'd migrate to node 3.  If it contained node 3, I'd migrate to node 0
[on a 4-node platform], etc.

Following the usage in slab_node(), I was doing something like:

	zr = first_zones_zonelist(node_zonelist(nid, ...), gfp_zone(...),
				  &pol->v.nodes, &dummy);
	newnid = zonelist_node_idx(zr);

Turns out that the return value is the NEXT zoneref in the zonelist,
AFTER the one of interest--i.e., after the first one that satisfies any
nodemask constraint.  The zone of interest itself comes back through the
last argument.  So I renamed 'dummy' to 'zone', ignored the return value,
and used:  newnid = zone->node.  [I guess I could use
zonelist_node_idx(zr - 1) as well.]  This results in page migration to
the expected node.
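
To make the off-by-one concrete, here is a small userspace sketch of the
behavior described above.  It is NOT the kernel code:  zref_model,
first_zones_model(), the two pick_node_*() helpers, model_list and
check_model() are all hypothetical names, the list holds one entry per
node [real zonelists hold one per zone], and the nodemask is modelled as
a plain bitmask.  The node order 1,2,3,0 is just an assumed fallback
order chosen to reproduce the symptoms on a 4-node box.

```c
#include <assert.h>

/* Hypothetical stand-in for struct zoneref: node id only; -1 terminates. */
struct zref_model {
	int node;
};

/* Assumed fallback order for some local node on a 4-node platform. */
static const struct zref_model model_list[] = {
	{ 1 }, { 2 }, { 3 }, { 0 }, { -1 }
};

/*
 * Model of the behavior described in the mail: skip entries whose node
 * is not in the mask, report the match via *found, and return a cursor
 * that already points one entry PAST the match.
 */
static const struct zref_model *
first_zones_model(const struct zref_model *z, unsigned long nodemask,
		  const struct zref_model **found)
{
	while (z->node >= 0 && !(nodemask & (1UL << z->node)))
		z++;
	*found = z;
	if (z->node >= 0)	/* do not step past the terminator */
		z++;
	return z;
}

/* What the callers in question do: trust the returned cursor. */
static int pick_node_from_return(unsigned long nodemask)
{
	const struct zref_model *found;
	const struct zref_model *next;

	next = first_zones_model(model_list, nodemask, &found);
	return next->node;
}

/* What the mail proposes: ignore the cursor, use the out-parameter. */
static int pick_node_from_found(unsigned long nodemask)
{
	const struct zref_model *found;

	(void)first_zones_model(model_list, nodemask, &found);
	return found->node;
}

/* Reproduces the symptoms: mask {2} lands on node 3, mask {3} on node 0. */
static void check_model(void)
{
	assert(pick_node_from_return(1UL << 2) == 3);
	assert(pick_node_from_return(1UL << 3) == 0);
	assert(pick_node_from_found(1UL << 2) == 2);
	assert(pick_node_from_found(1UL << 3) == 3);
}
```

Calling check_model() from a test harness or a trivial main() exercises
both variants.  In this model the cursor is only useful as the starting
point for the next search, which is the role the returned zoneref plays
for next_zones_zonelist().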

Anyway, after discovering this, I checked other usages of
first_zones_zonelist() outside of the iterator macros, and I THINK they
might be making the same mistake?

Here's a patch that "fixes" these.  Do you agree?  Or am I
misunderstanding this area [again!]?

Lee

PATCH fix off-by-one usage of first_zones_zonelist()

Against:  2.6.25-rc8-mm2

The return value of first_zones_zonelist() is actually the zoneref
AFTER the "requested zone", where the requested zone is the first zone
in the zonelist that satisfies any nodemask constraint.  The "requested
zone" itself is returned via the @zone parameter.  The returned zoneref
is intended to be passed to next_zones_zonelist() on subsequent
iterations.

Fix up slab_node() and get_page_from_freelist() to use the requested
zone, rather than the next one in the list.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>

 mm/mempolicy.c  |    9 ++++-----
 mm/page_alloc.c |    2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: linux-2.6.25-rc8-mm2/mm/mempolicy.c
===================================================================
--- linux-2.6.25-rc8-mm2.orig/mm/mempolicy.c	2008-04-22 10:06:29.000000000 -0400
+++ linux-2.6.25-rc8-mm2/mm/mempolicy.c	2008-04-22 10:11:22.000000000 -0400
@@ -1396,14 +1396,13 @@ unsigned slab_node(struct mempolicy *pol
 		 * first node.
 		 */
 		struct zonelist *zonelist;
-		struct zoneref *z;
-		struct zone *dummy;
+		struct zone *zone;
 		enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
 		zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
-		z = first_zones_zonelist(zonelist, highest_zoneidx,
+		(void)first_zones_zonelist(zonelist, highest_zoneidx,
 							&policy->v.nodes,
-							&dummy);
-		return zonelist_node_idx(z);
+							&zone);
+		return zone->node;
 	}
 
 	default:
Index: linux-2.6.25-rc8-mm2/mm/page_alloc.c
===================================================================
--- linux-2.6.25-rc8-mm2.orig/mm/page_alloc.c	2008-04-22 10:00:58.000000000 -0400
+++ linux-2.6.25-rc8-mm2/mm/page_alloc.c	2008-04-22 10:16:32.000000000 -0400
@@ -1414,7 +1414,7 @@ get_page_from_freelist(gfp_t gfp_mask, n
 
 	z = first_zones_zonelist(zonelist, high_zoneidx, nodemask,
 							&preferred_zone);
-	classzone_idx = zonelist_zone_idx(z);
+	classzone_idx = zonelist_zone_idx(z - 1); /* z is next zone in list */
 
 zonelist_scan:
 	/*


