Hi Mel,

On Mon, Sep 8, 2014 at 2:57 PM, Mel Gorman wrote:
> Commit 4ffeaf35 (mm: page_alloc: reduce cost of the fair zone allocation
> policy) arguably broke the fair zone allocation policy on UP with these
> hunks.
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1612,6 +1612,9 @@ again:
> 	}
>
> 	__mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
> +	if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 &&
> +	    !zone_is_fair_depleted(zone))
> +		zone_set_flag(zone, ZONE_FAIR_DEPLETED);
>
> 	__count_zone_vm_events(PGALLOC, zone, 1 << order);
> 	zone_statistics(preferred_zone, zone, gfp_flags);
>
> @@ -1966,8 +1985,10 @@ zonelist_scan:
> 		if (alloc_flags & ALLOC_FAIR) {
> 			if (!zone_local(preferred_zone, zone))
> 				break;
> -			if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0)
> +			if (zone_is_fair_depleted(zone)) {
> +				nr_fair_skipped++;
> 				continue;
> +			}
> 		}
>
> A <= check was replaced with a ==. On SMP it doesn't matter, because
> negative values are returned as zero due to per-CPU drift, which is not
> possible in the UP case. Vlastimil Babka correctly pointed out that this
> can wrap negative due to high-order allocations.
>
> However, Leon Romanovsky pointed out that a <= check on zone_page_state
> was never correct, as zone_page_state returns unsigned long, so the root
> cause of the breakage was the <= check in the first place.
>
> zone_page_state is an API hazard because the difference in behaviour
> between SMP and UP is very surprising. There is a good reason to allow
> NR_ALLOC_BATCH to go negative -- when the counter is reset, the negative
> value takes recent activity into account. This patch makes zone_page_state
> behave the same on SMP and UP, as saving one branch on UP is not likely to
> make a measurable performance difference.
>
> Reported-by: Vlastimil Babka
> Reported-by: Leon Romanovsky
> Signed-off-by: Mel Gorman
> ---
>  include/linux/vmstat.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 82e7db7..cece0f0 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -131,10 +131,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
>  					enum zone_stat_item item)
>  {
>  	long x = atomic_long_read(&zone->vm_stat[item]);
> -#ifdef CONFIG_SMP
>  	if (x < 0)
>  		x = 0;
> -#endif
>  	return x;
>  }

Since you are changing vmstat.h, what do you think about making the same
change in all similar places?

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 82e7db7..88d3d3e 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -120,10 +120,8 @@ static inline void zone_page_state_add(long x, struct zone *zone,
 static inline unsigned long global_page_state(enum zone_stat_item item)
 {
 	long x = atomic_long_read(&vm_stat[item]);
-#ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
-#endif
 	return x;
 }

@@ -131,10 +129,8 @@ static inline unsigned long zone_page_state(struct zone *zone,
 					enum zone_stat_item item)
 {
 	long x = atomic_long_read(&zone->vm_stat[item]);
-#ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
-#endif
 	return x;
 }

@@ -153,10 +149,9 @@ static inline unsigned long zone_page_state_snapshot(struct zone *zone,
 	int cpu;

 	for_each_online_cpu(cpu)
 		x += per_cpu_ptr(zone->pageset, cpu)->vm_stat_diff[item];
-
+#endif
 	if (x < 0)
 		x = 0;
-#endif
 	return x;
 }

-- 
Leon Romanovsky | Independent Linux Consultant
www.leon.nu | leon@leon.nu