From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 4/5] mm: page_alloc: Reduce cost of the fair zone allocation policy
Date: Mon, 30 Jun 2014 10:41:42 -0400 [thread overview]
Message-ID: <20140630144142.GA1369@cmpxchg.org> (raw)
In-Reply-To: <20140627192537.GM10819@suse.de>
On Fri, Jun 27, 2014 at 08:25:37PM +0100, Mel Gorman wrote:
> On Fri, Jun 27, 2014 at 02:57:00PM -0400, Johannes Weiner wrote:
> > On Fri, Jun 27, 2014 at 09:14:39AM +0100, Mel Gorman wrote:
> > > And the number of pages allocated from each zone is comparable
> > >
> > > 3.16.0-rc2 3.16.0-rc2
> > > checklow fairzone
> > > DMA allocs 0 0
> > > DMA32 allocs 7374217 7920241
> > > Normal allocs 999277551 996568115
> >
> > Wow, the DMA32 zone gets less than 1% of the allocations. What are
> > the zone sizes in this machine?
> >
>
> managed 3976
> managed 755409
> managed 1281601
Something seems way off with this. On my system here, the DMA32 zone
makes up for 20% of managed pages and it gets roughly 20% of the page
allocations, as I would expect.
Your DMA32 zone makes up for 37% of the managed pages and receives
merely 0.7% of the page allocations. Unless a large portion of that
zone is somehow unreclaimable, fairness seems completely obliberated
in both kernels.
Is that checklow's doing?
> > > @@ -3287,10 +3287,18 @@ void show_free_areas(unsigned int filter)
> > > show_swap_cache_info();
> > > }
> > >
> > > -static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
> > > +static int zoneref_set_zone(pg_data_t *pgdat, struct zone *zone,
> > > + struct zoneref *zoneref, struct zone *preferred_zone)
> > > {
> > > + int zone_type = zone_idx(zone);
> > > + bool fair_enabled = zone_local(zone, preferred_zone);
> > > + if (zone_type == 0 &&
> > > + zone->managed_pages < (pgdat->node_present_pages >> 4))
> > > + fair_enabled = false;
> >
> > This needs a comment.
> >
>
> /*
> * Do not count the lowest zone as of relevance to the fair zone
> * allocation policy if it's a small percentage of the node
> */
>
> However, as I write this I'll look at getting rid of this entirely. It
> made some sense when fair_eligible was tracked on a per-zone basis but
> it's more complex than necessary.
>
> > > zoneref->zone = zone;
> > > - zoneref->zone_idx = zone_idx(zone);
> > > + zoneref->zone_idx = zone_type;
> > > + return fair_enabled;
> > > }
> > >
> > > /*
> > > @@ -3303,17 +3311,26 @@ static int build_zonelists_node(pg_data_t *pgdat, struct zonelist *zonelist,
> > > {
> > > struct zone *zone;
> > > enum zone_type zone_type = MAX_NR_ZONES;
> > > + struct zone *preferred_zone = NULL;
> > > + int nr_fair = 0;
> > >
> > > do {
> > > zone_type--;
> > > zone = pgdat->node_zones + zone_type;
> > > if (populated_zone(zone)) {
> > > - zoneref_set_zone(zone,
> > > - &zonelist->_zonerefs[nr_zones++]);
> > > + if (!preferred_zone)
> > > + preferred_zone = zone;
> > > +
> > > + nr_fair += zoneref_set_zone(pgdat, zone,
> > > + &zonelist->_zonerefs[nr_zones++],
> > > + preferred_zone);
> >
> > Passing preferred_zone to determine locality seems pointless when you
> > walk the zones of a single node.
> >
>
> True.
>
> > And the return value of zoneref_set_zone() is fairly unexpected.
> >
>
> How so?
Given the name zoneref_set_zone(), I wouldn't expect any return value,
or a success/failure type return value at best - certainly not whether
the passed zone is eligible for the fairness policy.
> > It's probably better to determine fair_enabled in the callsite, that
> > would fix both problems, and write a separate helper that tests if a
> > zone is eligible for fair treatment (type && managed_pages test).
> >
>
> Are you thinking of putting that into the page allocator fast path? I'm
> trying to take stuff out of there :/.
Not at all, I was just suggesting to restructure the code for building
the zonelists, and move the fairness stuff out of zoneref_set_zone().
If you remove the small-zone exclusion as per above, this only leaves
the locality check when building the zonelist in zone order and that
can easily be checked inline in build_zonelists_in_zone_order().
build_zonelists_node() can just count every populated zone in nr_fair.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2014-06-30 14:41 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-27 8:14 [PATCH 0/5] Improve sequential read throughput v3 Mel Gorman
2014-06-27 8:14 ` [PATCH 1/5] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
2014-06-27 8:14 ` [PATCH 2/5] mm: Rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Mel Gorman
2014-06-27 8:14 ` [PATCH 3/5] mm: vmscan: Do not reclaim from lower zones if they are balanced Mel Gorman
2014-06-27 17:26 ` Johannes Weiner
2014-06-27 18:42 ` Mel Gorman
2014-06-27 8:14 ` [PATCH 4/5] mm: page_alloc: Reduce cost of the fair zone allocation policy Mel Gorman
2014-06-27 18:57 ` Johannes Weiner
2014-06-27 19:25 ` Mel Gorman
2014-06-30 14:41 ` Johannes Weiner [this message]
2014-06-27 8:14 ` [PATCH 5/5] mm: page_alloc: Reduce cost of dirty zone balancing Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140630144142.GA1369@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox