From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>, Eric Whitney <eric.whitney@hp.com>
Subject: Suspect use of "first_zones_zonelist()"
Date: Tue, 22 Apr 2008 11:17:24 -0400
Message-ID: <1208877444.5534.34.camel@localhost>
I was testing my "lazy migration" patches and noticed something
interesting about first_zones_zonelist(). I use this function to find
the target node for MPOL_BIND policy to determine if a page is
"misplaced" and should be migrated. In my testing, I found that I was
always "off by one". E.g., if my mempolicy nodemask contained only node
2, I'd migrate to node 3. If it contained node 3, I'd migrate to node 0
[on a 4-node platform], etc.
Following the usage in slab_node(), I was doing something like:
	zr = first_zones_zonelist(node_zonelist(nid, ...), gfp_zone(...),
				  &pol->v.nodes, &dummy);
	newnid = zonelist_node_idx(zr);
It turns out that the return value is the NEXT zoneref in the zonelist,
AFTER the one of interest (i.e., after the first zoneref that satisfies
any nodemask constraint). I renamed 'dummy' to 'zone', and I now ignore
the return value and use: newnid = zone->node. [I guess I could use
zonelist_node_idx(zr - 1) as well.] This results in page migration to
the expected node.
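To illustrate the behaviour described above, here is a minimal userspace
sketch. It is NOT the kernel code: the struct layouts, the one-zone-per-node
zonelist, the bitmask standing in for a nodemask, and the helper name
first_match() are all assumptions made for illustration. It only models a
lookup that reports the match through an out-parameter and returns a cursor
that has already moved past it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct zone { int node; };
struct zoneref { struct zone *zone; };

/* Assumed layout: one zone per node, in node order, NULL-terminated. */
static struct zone zones[4] = { {0}, {1}, {2}, {3} };
static struct zoneref zonerefs[5] = {
	{ &zones[0] }, { &zones[1] }, { &zones[2] }, { &zones[3] }, { NULL }
};

/*
 * Model of the lookup: skip entries whose node is not in 'mask', report
 * the first match via *zone, and return the cursor AFTER the match.
 * Unlike the behaviour described in the mail, this sketch does not step
 * past the terminator, so the returned pointer stays dereferenceable.
 */
static struct zoneref *first_match(struct zoneref *z, unsigned int mask,
				   struct zone **zone)
{
	while (z->zone && !(mask & (1u << z->zone->node)))
		z++;
	*zone = z->zone;
	return z->zone ? z + 1 : z;
}
```

With a mask containing only node 2, *zone refers to node 2, while the
returned cursor refers to the entry for node 3 in this model. Reading the
node from the return value is therefore one entry too far, and reading it
from the out-parameter (or from the cursor minus one) gives the match.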
Anyway, after discovering this, I checked other usages of
first_zones_zonelist() outside of the iterator macros, and I THINK they
might be making the same mistake?
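For contrast, an iterator-style caller is not affected, because it takes
the zone from the out-parameter and only hands the returned cursor back to
the next lookup. The sketch below models that pattern in userspace. Again,
the types, the bitmask nodemask, and the names next_match() and
visit_nodes() are hypothetical stand-ins, not the kernel's iterator macros.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct zone { int node; };
struct zoneref { struct zone *zone; };

/* Assumed layout: one zone per node, in node order, NULL-terminated. */
static struct zone zones[4] = { {0}, {1}, {2}, {3} };
static struct zoneref zonerefs[5] = {
	{ &zones[0] }, { &zones[1] }, { &zones[2] }, { &zones[3] }, { NULL }
};

/* Resume the scan at cursor z; return the cursor after the match. */
static struct zoneref *next_match(struct zoneref *z, unsigned int mask,
				  struct zone **zone)
{
	while (z->zone && !(mask & (1u << z->zone->node)))
		z++;
	*zone = z->zone;
	return z->zone ? z + 1 : z;
}

/*
 * Visit every zone allowed by 'mask'. The loop body uses 'zone' (the
 * out-parameter); the cursor 'z' is only passed back to next_match().
 * Stores the visited node ids in nodes_out and returns how many.
 */
static int visit_nodes(unsigned int mask, int *nodes_out)
{
	struct zone *zone;
	struct zoneref *z;
	int n = 0;

	for (z = next_match(zonerefs, mask, &zone); zone;
	     z = next_match(z, mask, &zone))
		nodes_out[n++] = zone->node;
	return n;
}
```

In this model, a mask of nodes 1 and 3 visits exactly nodes 1 and 3, in
that order. The off-by-one only shows up when a caller interprets the
returned cursor itself as the match.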
Here's a patch that "fixes" these. Do you agree? Or am I
misunderstanding this area [again!]?
Lee
PATCH fix off-by-one usage of first_zones_zonelist()
Against: 2.6.25-rc8-mm2
The return value of first_zones_zonelist() is actually the zoneref
AFTER the "requested zone", where the requested zone is the first zone
in the zonelist that satisfies any nodemask constraint. The "requested
zone" itself is returned via the @zone parameter. The returned zoneref
is intended to be passed to next_zones_zonelist() on subsequent
iterations.
Fix up slab_node() and get_page_from_freelist() to use the requested
zone, rather than the next one in the list.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mempolicy.c | 9 ++++-----
mm/page_alloc.c | 2 +-
2 files changed, 5 insertions(+), 6 deletions(-)
Index: linux-2.6.25-rc8-mm2/mm/mempolicy.c
===================================================================
--- linux-2.6.25-rc8-mm2.orig/mm/mempolicy.c 2008-04-22 10:06:29.000000000 -0400
+++ linux-2.6.25-rc8-mm2/mm/mempolicy.c 2008-04-22 10:11:22.000000000 -0400
@@ -1396,14 +1396,13 @@ unsigned slab_node(struct mempolicy *pol
 		 * first node.
 		 */
 		struct zonelist *zonelist;
-		struct zoneref *z;
-		struct zone *dummy;
+		struct zone *zone;
 		enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
 		zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
-		z = first_zones_zonelist(zonelist, highest_zoneidx,
+		(void)first_zones_zonelist(zonelist, highest_zoneidx,
 							&policy->v.nodes,
-							&dummy);
-		return zonelist_node_idx(z);
+							&zone);
+		return zone->node;
 	}
 
 	default:
Index: linux-2.6.25-rc8-mm2/mm/page_alloc.c
===================================================================
--- linux-2.6.25-rc8-mm2.orig/mm/page_alloc.c 2008-04-22 10:00:58.000000000 -0400
+++ linux-2.6.25-rc8-mm2/mm/page_alloc.c 2008-04-22 10:16:32.000000000 -0400
@@ -1414,7 +1414,7 @@ get_page_from_freelist(gfp_t gfp_mask, n
 
 	z = first_zones_zonelist(zonelist, high_zoneidx, nodemask,
 							&preferred_zone);
-	classzone_idx = zonelist_zone_idx(z);
+	classzone_idx = zonelist_zone_idx(z - 1); /* z is next zone in list */
 
 zonelist_scan:
 	/*