linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* bind_zonelist() - are we definitely sizing this correctly?
@ 2007-07-26 14:17 Mel Gorman
  2007-07-26 14:25 ` Mel Gorman
  0 siblings, 1 reply; 3+ messages in thread
From: Mel Gorman @ 2007-07-26 14:17 UTC (permalink / raw)
  To: Lee Schermerhorn, ak, Christoph Lameter, apw, kamezawa.hiroyu
  Cc: linux-mm, linux-kernel

I was looking closer at bind_zonelist() and it has the following snippet

        struct zonelist *zl;
        int num, max, nd;
        enum zone_type k;

        max = 1 + MAX_NR_ZONES * nodes_weight(*nodes);
        max++;                  /* space for zlcache_ptr (see mmzone.h) */
        zl = kmalloc(sizeof(struct zone *) * max, GFP_KERNEL);
        if (!zl)
                return ERR_PTR(-ENOMEM);

That set off alarm bells because we are allocating based on the size of a
zone, not the size of the zonelist.

This is the definition of struct zonelist

struct zonelist {
        struct zonelist_cache *zlcache_ptr;                  // NULL or &zlcache
        struct zone *zones[MAX_ZONES_PER_ZONELIST + 1];      // NULL delimited
#ifdef CONFIG_NUMA
        struct zonelist_cache zlcache;                       // optional ...
#endif
};

Important thing to note here is that zlcache is after *zones and it is
not a pointer. zlcache in turn is defined as

struct zonelist_cache {
        unsigned short z_to_n[MAX_ZONES_PER_ZONELIST];          /* zone->nid */
        DECLARE_BITMAP(fullzones, MAX_ZONES_PER_ZONELIST);      /* zone full? */
        unsigned long last_full_zap;            /* when last zap'd (jiffies) */
};

This is on NUMA only and it's a big structure.

The intention of bind_zonelist() appears to be that we only allocate enough
memory to hold all the zones in the active nodes. This was fine in 2.6.19
but now with zlcache after *zones[], I think we are in danger of allocating
too little memory and any reading of zlcache may be reading randomness when
MPOL_BIND is in use because it will be using the full offset within the
structure whether the memory is allocated or not.

At the risk of sounding stupid, what obvious thing am I missing that makes
this work?

If I'm right and this is broken and we still want to allocate as little memory
as possible, zlcache has to move before zones and the call to kmalloc needs
to take the size of zlcache into account.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2007-07-26 17:58 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-26 14:17 bind_zonelist() - are we definitely sizing this correctly? Mel Gorman
2007-07-26 14:25 ` Mel Gorman
2007-07-26 17:58   ` Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox