From mboxrd@z Thu Jan  1 00:00:00 1970
From: Leon Romanovsky <leon@leon.nu>
Date: Wed, 10 Sep 2014 07:32:20 +0300
Subject: Re: [patch resend] mm: page_alloc: fix zone allocation fairness on UP
In-Reply-To: <20140909131540.GA10568@cmpxchg.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton, Mel Gorman, Vlastimil Babka, Linux-MM,
 "linux-kernel@vger.kernel.org"

Hi Johannes,

On Tue, Sep 9, 2014 at 4:15 PM, Johannes Weiner wrote:
> The zone allocation batches can easily underflow due to higher-order
> allocations or spills to remote nodes.  On SMP that's fine, because
> underflows are expected from concurrency and dealt with by returning
> 0.  But on UP, zone_page_state will just return a wrapped unsigned
> long, which will get past the <= 0 check and then consider the zone
> eligible until its watermarks are hit.
>
> 3a025760fc15 ("mm: page_alloc: spill to remote nodes before waking
> kswapd") already made the counter-resetting use atomic_long_read() to
> accommodate underflows from remote spills, but it didn't go all the way
> with it.
> Make it clear that these batches are expected to go negative
> regardless of concurrency, and use atomic_long_read() everywhere.
>
> Fixes: 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy")
> Reported-by: Vlastimil Babka <vbabka@suse.cz>
> Reported-by: Leon Romanovsky <leon@leon.nu>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Cc: "3.12+" <stable@kernel.org>
> ---
>  mm/page_alloc.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> Sorry I forgot to CC you, Leon.  Resend with updated tags.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 18cee0d4c8a2..eee961958021 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1612,7 +1612,7 @@ again:
>         }
>
>         __mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
> -       if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 &&
> +       if (atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]) <= 0 &&
>             !zone_is_fair_depleted(zone))
>                 zone_set_flag(zone, ZONE_FAIR_DEPLETED);
>
> @@ -5701,9 +5701,8 @@ static void __setup_per_zone_wmarks(void)
>                 zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + (tmp >> 1);
>
>                 __mod_zone_page_state(zone, NR_ALLOC_BATCH,
> -                                     high_wmark_pages(zone) -
> -                                     low_wmark_pages(zone) -
> -                                     zone_page_state(zone, NR_ALLOC_BATCH));
> +                       high_wmark_pages(zone) - low_wmark_pages(zone) -
> +                       atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]));
>
>                 setup_zone_migrate_reserve(zone);
>                 spin_unlock_irqrestore(&zone->lock, flags);
> --
> 2.0.4

I think the better way would be to apply Mel's patch
https://lkml.org/lkml/2014/9/8/214, which fixes the zone_page_state
shadow casting issue, and then convert every
atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]) back to
zone_page_state(zone, NR_ALLOC_BATCH). That move would unify access to
vm_stat.
--
Leon Romanovsky | Independent Linux Consultant
www.leon.nu | leon@leon.nu
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org