linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>, Eric Whitney <eric.whitney@hp.com>
Subject: Re: Suspect use of "first_zones_zonelist()"
Date: Tue, 22 Apr 2008 18:49:02 +0100	[thread overview]
Message-ID: <20080422174901.GA7261@csn.ul.ie> (raw)
In-Reply-To: <1208884215.5534.57.camel@localhost>

On (22/04/08 13:10), Lee Schermerhorn didst pronounce:
> > > <SNIP>
> > >
> > > --i.e., the first that satisfies any nodemask
> > > constraint.  I renamed 'dummy' to 'zone', ignore the return value and
> > > use:  newnid = zone->node.  [I guess I could use zonelist_node_idx(zr
> > > -1) as well.] 
> > 
> > zr - 1 would be vunerable to the iterator implementation changing.
> 
> Ah, good point.  Shouldn't peek under the covers like that.
> 
> > 
> > >  This results in page migration to the expected node.
> > > 
> > 
> > This use of zone instead of the zoneref cursor should be made throughout.
> > 
> > > Anyway, after discovering this, I checked other usages of
> > > first_zones_zonelist() outside of the iterator macros, and I THINK they
> > > might be making the same mistake?
> > > 
> > 
> > Yes, you're right.
> > 
> > > Here's a patch that "fixes" these.  Do you agree?  Or am I
> > > misunderstanding this area [again!]?
> > > 
> > 
> > No, I screwed up with the use of cursors and didn't get caught for it as
> > the effect would be very difficult to spot normally. I extended your patch
> > slightly below to catch the other callers. Can you take a read-through please?
> 
> OK.  Looks good.  I see I missed one case.
> 
> A suggestion.  How about enhancing the comment [maybe a kernel doc
> block?] on first_zones_zonelist() to explain that it returns the zone
> via the zone parameter and that the return value is a cursor for
> iterators?  Perhaps similarly for next_zones_zonelist() in mmzone.c?
> 
> Or would you like me to take a cut at this?
> 

That's a good suggestion. How about the following?

=======
Subject: [PATCH] fix off-by-one usage of first_zones_zonelist()

Against:  2.6.25-mm1

The return value of first_zones_zonelist() is actually the zoneref
AFTER the "requested zone"--i.e., the first zone in the zonelist
that satisfies any nodemask constraint.  The "requested zone" is
returned via the @zone parameter.  The returned zoneref is intended
to be passed to next_zones_zonelist() on subsequent iterations.

This patch fixes first_zones_zonelist() callers appropriately and documents
first_zones_zonelist() to help avoid a repeat of the same mistake.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 fs/buffer.c            |    9 ++++-----
 include/linux/mmzone.h |   25 ++++++++++++++++++++++++-
 mm/mempolicy.c         |    9 ++++-----
 mm/page_alloc.c        |    4 ++--
 4 files changed, 34 insertions(+), 13 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/fs/buffer.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c
--- linux-2.6.25-mm1-clean/fs/buffer.c	2008-04-22 10:30:02.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c	2008-04-22 18:35:56.000000000 +0100
@@ -368,18 +368,17 @@ void invalidate_bdev(struct block_device
  */
 static void free_more_memory(void)
 {
-	struct zoneref *zrefs;
-	struct zone *dummy;
+	struct zone *zone;
 	int nid;
 
 	wakeup_pdflush(1024);
 	yield();
 
 	for_each_online_node(nid) {
-		zrefs = first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
+		(void)first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
 						gfp_zone(GFP_NOFS), NULL,
-						&dummy);
-		if (zrefs->zone)
+						&zone);
+		if (zone)
 			try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
 						GFP_NOFS);
 	}
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/include/linux/mmzone.h linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h
--- linux-2.6.25-mm1-clean/include/linux/mmzone.h	2008-04-22 10:30:03.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h	2008-04-22 18:45:37.000000000 +0100
@@ -742,12 +742,36 @@ static inline int zonelist_node_idx(stru
 #endif /* CONFIG_NUMA */
 }
 
+/**
+ * next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
+ * @z - The cursor used as a starting point for the search
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the next zone at or below a given zone index that is
+ * within the allowed nodemask using a cursor as the starting point for the
+ * search. The zoneref returned is a cursor that is used as the next starting
+ * point for future calls to next_zones_zonelist().
+ */
 struct zoneref *next_zones_zonelist(struct zoneref *z,
 					enum zone_type highest_zoneidx,
 					nodemask_t *nodes,
 					struct zone **zone);
 
-/* Returns the first zone at or below highest_zoneidx in a zonelist */
+/**
+ * first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
+ * @zonelist - The zonelist to search for a suitable zone
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the first zone at or below a given zone index that is
+ * within the allowed nodemask. The zoneref returned is a cursor that can be
+ * used to iterate the zonelist with next_zones_zonelist. The cursor should
+ * not be used by the caller as it does not match the value of the zone
+ * returned.
+ */
 static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
 					enum zone_type highest_zoneidx,
 					nodemask_t *nodes,
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/mempolicy.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c
--- linux-2.6.25-mm1-clean/mm/mempolicy.c	2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c	2008-04-22 18:35:56.000000000 +0100
@@ -1396,14 +1396,13 @@ unsigned slab_node(struct mempolicy *pol
 		 * first node.
 		 */
 		struct zonelist *zonelist;
-		struct zoneref *z;
-		struct zone *dummy;
+		struct zone *zone;
 		enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
 		zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
-		z = first_zones_zonelist(zonelist, highest_zoneidx,
+		(void)first_zones_zonelist(zonelist, highest_zoneidx,
 							&policy->v.nodes,
-							&dummy);
-		return zonelist_node_idx(z);
+							&zone);
+		return zone->node;
 	}
 
 	default:
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/page_alloc.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c
--- linux-2.6.25-mm1-clean/mm/page_alloc.c	2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c	2008-04-22 18:35:56.000000000 +0100
@@ -1412,9 +1412,9 @@ get_page_from_freelist(gfp_t gfp_mask, n
 	int zlc_active = 0;		/* set if using zonelist_cache */
 	int did_zlc_setup = 0;		/* just call zlc_setup() one time */
 
-	z = first_zones_zonelist(zonelist, high_zoneidx, nodemask,
+	(void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
 							&preferred_zone);
-	classzone_idx = zonelist_zone_idx(z);
+	classzone_idx = zone_idx(preferred_zone);
 
 zonelist_scan:
 	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-04-22 17:49 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-22 15:17 Lee Schermerhorn
2008-04-22 16:15 ` Mel Gorman
2008-04-22 17:10   ` Lee Schermerhorn
2008-04-22 17:49     ` Mel Gorman [this message]
2008-04-22 18:01       ` Lee Schermerhorn
2008-04-22 18:40         ` Lee Schermerhorn
2008-04-23  6:02           ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080422174901.GA7261@csn.ul.ie \
    --to=mel@csn.ul.ie \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=eric.whitney@hp.com \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox