From: Mel Gorman <mel@csn.ul.ie>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm <linux-mm@kvack.org>, Eric Whitney <eric.whitney@hp.com>
Subject: Re: Suspect use of "first_zones_zonelist()"
Date: Tue, 22 Apr 2008 18:49:02 +0100 [thread overview]
Message-ID: <20080422174901.GA7261@csn.ul.ie> (raw)
In-Reply-To: <1208884215.5534.57.camel@localhost>
On (22/04/08 13:10), Lee Schermerhorn didst pronounce:
> > > <SNIP>
> > >
> > > --i.e., the first that satisfies any nodemask
> > > constraint. I renamed 'dummy' to 'zone', ignore the return value and
> > > use: newnid = zone->node. [I guess I could use zonelist_node_idx(zr
> > > -1) as well.]
> >
> > zr - 1 would be vunerable to the iterator implementation changing.
>
> Ah, good point. Shouldn't peek under the covers like that.
>
> >
> > > This results in page migration to the expected node.
> > >
> >
> > This use of zone instead of the zoneref cursor should be made throughout.
> >
> > > Anyway, after discovering this, I checked other usages of
> > > first_zones_zonelist() outside of the iterator macros, and I THINK they
> > > might be making the same mistake?
> > >
> >
> > Yes, you're right.
> >
> > > Here's a patch that "fixes" these. Do you agree? Or am I
> > > misunderstanding this area [again!]?
> > >
> >
> > No, I screwed up with the use of cursors and didn't get caught for it as
> > the effect would be very difficult to spot normally. I extended your patch
> > slightly below to catch the other callers. Can you take a read-through please?
>
> OK. Looks good. I see I missed one case.
>
> A suggestion. How about enhancing the comment [maybe a kernel doc
> block?] on first_zones_zonelist() to explain that it returns the zone
> via the zone parameter and that the return value is a cursor for
> iterators? Perhaps similarly for next_zones_zonelist() in mmzone.c?
>
> Or would you like me to take a cut at this?
>
That's a good suggestion. How about the following?
=======
Subject: [PATCH] fix off-by-one usage of first_zones_zonelist()
Against: 2.6.25-mm1
The return value of first_zones_zonelist() is actually the zoneref
AFTER the "requested zone"--i.e., the first zone in the zonelist
that satisfies any nodemask constraint. The "requested zone" is
returned via the @zone parameter. The returned zoneref is intended
to be passed to next_zones_zonelist() on subsequent iterations.
This patch fixes first_zones_zonelist() callers appropriately and documents
first_zones_zonelist() to help avoid a repeat of the same mistake.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
fs/buffer.c | 9 ++++-----
include/linux/mmzone.h | 25 ++++++++++++++++++++++++-
mm/mempolicy.c | 9 ++++-----
mm/page_alloc.c | 4 ++--
4 files changed, 34 insertions(+), 13 deletions(-)
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/fs/buffer.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c
--- linux-2.6.25-mm1-clean/fs/buffer.c 2008-04-22 10:30:02.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/fs/buffer.c 2008-04-22 18:35:56.000000000 +0100
@@ -368,18 +368,17 @@ void invalidate_bdev(struct block_device
*/
static void free_more_memory(void)
{
- struct zoneref *zrefs;
- struct zone *dummy;
+ struct zone *zone;
int nid;
wakeup_pdflush(1024);
yield();
for_each_online_node(nid) {
- zrefs = first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
+ (void)first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
gfp_zone(GFP_NOFS), NULL,
- &dummy);
- if (zrefs->zone)
+ &zone);
+ if (zone)
try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
GFP_NOFS);
}
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/include/linux/mmzone.h linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h
--- linux-2.6.25-mm1-clean/include/linux/mmzone.h 2008-04-22 10:30:03.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/include/linux/mmzone.h 2008-04-22 18:45:37.000000000 +0100
@@ -742,12 +742,36 @@ static inline int zonelist_node_idx(stru
#endif /* CONFIG_NUMA */
}
+/**
+ * next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
+ * @z - The cursor used as a starting point for the search
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the next zone at or below a given zone index that is
+ * within the allowed nodemask using a cursor as the starting point for the
+ * search. The zoneref returned is a cursor that is used as the next starting
+ * point for future calls to next_zones_zonelist().
+ */
struct zoneref *next_zones_zonelist(struct zoneref *z,
enum zone_type highest_zoneidx,
nodemask_t *nodes,
struct zone **zone);
-/* Returns the first zone at or below highest_zoneidx in a zonelist */
+/**
+ * first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
+ * @zonelist - The zonelist to search for a suitable zone
+ * @highest_zoneidx - The zone index of the highest zone to return
+ * @nodes - An optional nodemask to filter the zonelist with
+ * @zone - The first suitable zone found is returned via this parameter
+ *
+ * This function returns the first zone at or below a given zone index that is
+ * within the allowed nodemask. The zoneref returned is a cursor that can be
+ * used to iterate the zonelist with next_zones_zonelist. The cursor should
+ * not be used by the caller as it does not match the value of the zone
+ * returned.
+ */
static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
enum zone_type highest_zoneidx,
nodemask_t *nodes,
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/mempolicy.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c
--- linux-2.6.25-mm1-clean/mm/mempolicy.c 2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/mempolicy.c 2008-04-22 18:35:56.000000000 +0100
@@ -1396,14 +1396,13 @@ unsigned slab_node(struct mempolicy *pol
* first node.
*/
struct zonelist *zonelist;
- struct zoneref *z;
- struct zone *dummy;
+ struct zone *zone;
enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
zonelist = &NODE_DATA(numa_node_id())->node_zonelists[0];
- z = first_zones_zonelist(zonelist, highest_zoneidx,
+ (void)first_zones_zonelist(zonelist, highest_zoneidx,
&policy->v.nodes,
- &dummy);
- return zonelist_node_idx(z);
+ &zone);
+ return zone->node;
}
default:
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.25-mm1-clean/mm/page_alloc.c linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c
--- linux-2.6.25-mm1-clean/mm/page_alloc.c 2008-04-22 10:30:04.000000000 +0100
+++ linux-2.6.25-mm1-fix-first_zone_zonelist-v1r2/mm/page_alloc.c 2008-04-22 18:35:56.000000000 +0100
@@ -1412,9 +1412,9 @@ get_page_from_freelist(gfp_t gfp_mask, n
int zlc_active = 0; /* set if using zonelist_cache */
int did_zlc_setup = 0; /* just call zlc_setup() one time */
- z = first_zones_zonelist(zonelist, high_zoneidx, nodemask,
+ (void)first_zones_zonelist(zonelist, high_zoneidx, nodemask,
&preferred_zone);
- classzone_idx = zonelist_zone_idx(z);
+ classzone_idx = zone_idx(preferred_zone);
zonelist_scan:
/*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-04-22 17:49 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-22 15:17 Lee Schermerhorn
2008-04-22 16:15 ` Mel Gorman
2008-04-22 17:10 ` Lee Schermerhorn
2008-04-22 17:49 ` Mel Gorman [this message]
2008-04-22 18:01 ` Lee Schermerhorn
2008-04-22 18:40 ` Lee Schermerhorn
2008-04-23 6:02 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080422174901.GA7261@csn.ul.ie \
--to=mel@csn.ul.ie \
--cc=Lee.Schermerhorn@hp.com \
--cc=akpm@linux-foundation.org \
--cc=eric.whitney@hp.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox