From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail144.messagelabs.com (mail144.messagelabs.com [216.82.254.51]) by kanga.kvack.org (Postfix) with SMTP id 4666F6B02A5 for ; Tue, 13 Jul 2010 05:32:27 -0400 (EDT) Received: from m6.gw.fujitsu.co.jp ([10.0.50.76]) by fgwmail7.fujitsu.co.jp (Fujitsu Gateway) with ESMTP id o6D9WNx3014434 for (envelope-from kosaki.motohiro@jp.fujitsu.com); Tue, 13 Jul 2010 18:32:23 +0900 Received: from smail (m6 [127.0.0.1]) by outgoing.m6.gw.fujitsu.co.jp (Postfix) with ESMTP id 0BD4F45DE4F for ; Tue, 13 Jul 2010 18:32:23 +0900 (JST) Received: from s6.gw.fujitsu.co.jp (s6.gw.fujitsu.co.jp [10.0.50.96]) by m6.gw.fujitsu.co.jp (Postfix) with ESMTP id E543F45DE4C for ; Tue, 13 Jul 2010 18:32:22 +0900 (JST) Received: from s6.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s6.gw.fujitsu.co.jp (Postfix) with ESMTP id CB6D81DB8014 for ; Tue, 13 Jul 2010 18:32:22 +0900 (JST) Received: from m106.s.css.fujitsu.com (m106.s.css.fujitsu.com [10.249.87.106]) by s6.gw.fujitsu.co.jp (Postfix) with ESMTP id 724951DB8012 for ; Tue, 13 Jul 2010 18:32:22 +0900 (JST) From: KOSAKI Motohiro Subject: Re: [PATCH v2 1/2] vmscan: don't subtraction of unsined In-Reply-To: <20100709152851.330bf2b2.akpm@linux-foundation.org> References: <20100709090956.CD51.A69D9226@jp.fujitsu.com> <20100709152851.330bf2b2.akpm@linux-foundation.org> Message-Id: <20100713182918.EA67.A69D9226@jp.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable Date: Tue, 13 Jul 2010 18:32:21 +0900 (JST) Sender: owner-linux-mm@kvack.org To: Andrew Morton , Christoph Lameter Cc: kosaki.motohiro@jp.fujitsu.com, LKML , linux-mm , Mel Gorman , Rik van Riel , Minchan Kim , Johannes Weiner List-ID: > On Fri, 9 Jul 2010 10:16:33 +0900 (JST) > KOSAKI Motohiro wrote: >=20 > > --- a/mm/vmscan.c > > +++ b/mm/vmscan.c > > @@ -2588,7 +2588,7 @@ static int __zone_reclaim(struct zone *zone, gfp_= t gfp_mask, unsigned int order) > > .swappiness =3D vm_swappiness, > > .order =3D order, > > }; > > - unsigned long slab_reclaimable; > > + unsigned long nr_slab_pages0, nr_slab_pages1; > > =20 > > disable_swap_token(); > > cond_resched(); > > @@ -2615,8 +2615,8 @@ static int __zone_reclaim(struct zone *zone, gfp_= t gfp_mask, unsigned int order) > > } while (priority >=3D 0 && sc.nr_reclaimed < nr_pages); > > } > > =20 > > - slab_reclaimable =3D zone_page_state(zone, NR_SLAB_RECLAIMABLE); > > - if (slab_reclaimable > zone->min_slab_pages) { > > + nr_slab_pages0 =3D zone_page_state(zone, NR_SLAB_RECLAIMABLE); > > + if (nr_slab_pages0 > zone->min_slab_pages) { > > /* > > * shrink_slab() does not currently allow us to determine how > > * many pages were freed in this zone. >=20 > Well no, but it could do so, with some minor changes to struct > reclaim_state and its handling. Put a zone* and a counter in > reclaim_state, handle them in sl?b.c. >=20 > > So we take the current > > @@ -2628,16 +2628,17 @@ static int __zone_reclaim(struct zone *zone, gf= p_t gfp_mask, unsigned int order) > > * take a long time. > > */ > > while (shrink_slab(sc.nr_scanned, gfp_mask, order) && > > - zone_page_state(zone, NR_SLAB_RECLAIMABLE) > > > - slab_reclaimable - nr_pages) > > + (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > > > + nr_slab_pages0)) > > ; > > =20 > > /* > > * Update nr_reclaimed by the number of slab pages we > > * reclaimed from this zone. > > */ > > - sc.nr_reclaimed +=3D slab_reclaimable - > > - zone_page_state(zone, NR_SLAB_RECLAIMABLE); > > + nr_slab_pages1 =3D zone_page_state(zone, NR_SLAB_RECLAIMABLE); > > + if (nr_slab_pages1 < nr_slab_pages0) > > + sc.nr_reclaimed +=3D nr_slab_pages0 - nr_slab_pages1; >=20 > My, that's horrible. The whole expression says "this number is > basically a pile of random junk. Let's add it in anyway". >=20 >=20 > > } > > =20 > > p->reclaim_state =3D NULL; How's this? Christoph, Can we hear your opinion about to add new branch in slab-free pa= th? I think this is ok, because reclaim makes a lot of cache miss then branch mistaken is relatively minor penalty. thought? =46rom 9f7d7a9bd836b7373ade3056e6a3d2a3d82ac7ce Mon Sep 17 00:00:00 2001 From: KOSAKI Motohiro Date: Tue, 13 Jul 2010 14:43:21 +0900 Subject: [PATCH] vmscan: count reclaimed slab pages properly Andrew Morton pointed out __zone_reclaim() shouldn't compare old and new zone_page_state(NR_SLAB_RECLAIMABLE) result. Instead, it have to account number of free slab pages by to enhance reclaim_state. This patch does it. Signed-off-by: KOSAKI Motohiro --- include/linux/swap.h | 3 ++- mm/slab.c | 4 +++- mm/slob.c | 4 +++- mm/slub.c | 7 +++++-- mm/vmscan.c | 44 ++++++++++++++++---------------------------- 5 files changed, 29 insertions(+), 33 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index ff4acea..b8d3f33 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -107,7 +107,8 @@ typedef struct { * memory reclaim */ struct reclaim_state { - unsigned long reclaimed_slab; + unsigned long reclaimed_slab; + struct zone *zone; }; =20 #ifdef __KERNEL__ diff --git a/mm/slab.c b/mm/slab.c index 4e9c46f..aac9306 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1741,7 +1741,9 @@ static void kmem_freepages(struct kmem_cache *cachep,= void *addr) page++; } if (current->reclaim_state) - current->reclaim_state->reclaimed_slab +=3D nr_freed; + if (!current->reclaim_state->zone || + current->reclaim_state->zone =3D=3D page_zone(page)) + current->reclaim_state->reclaimed_slab +=3D nr_freed; free_pages((unsigned long)addr, cachep->gfporder); } =20 diff --git a/mm/slob.c b/mm/slob.c index 3f19a34..192d05c 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -260,7 +260,9 @@ static void *slob_new_pages(gfp_t gfp, int order, int n= ode) static void slob_free_pages(void *b, int order) { if (current->reclaim_state) - current->reclaim_state->reclaimed_slab +=3D 1 << order; + if (!current->reclaim_state->zone || + current->reclaim_state->zone =3D=3D page_zone(page)) + current->reclaim_state->reclaimed_slab +=3D 1 << order; free_pages((unsigned long)b, order); } =20 diff --git a/mm/slub.c b/mm/slub.c index 7bb7940..f510b14 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1204,8 +1204,11 @@ static void __free_slab(struct kmem_cache *s, struct= page *page) =20 __ClearPageSlab(page); reset_page_mapcount(page); - if (current->reclaim_state) - current->reclaim_state->reclaimed_slab +=3D pages; + if (current->reclaim_state) { + if (!current->reclaim_state->zone || + current->reclaim_state->zone =3D=3D page_zone(page)) + current->reclaim_state->reclaimed_slab +=3D pages; + } __free_pages(page, order); } =20 diff --git a/mm/vmscan.c b/mm/vmscan.c index 1bf9f72..8faef0c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2571,7 +2571,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gf= p_mask, unsigned int order) /* Minimum pages needed in order to stay on node */ const unsigned long nr_pages =3D 1 << order; struct task_struct *p =3D current; - struct reclaim_state reclaim_state; int priority; struct scan_control sc =3D { .may_writepage =3D !!(zone_reclaim_mode & RECLAIM_WRITE), @@ -2583,8 +2582,10 @@ static int __zone_reclaim(struct zone *zone, gfp_t g= fp_mask, unsigned int order) .swappiness =3D vm_swappiness, .order =3D order, }; - unsigned long nr_slab_pages0, nr_slab_pages1; - + struct reclaim_state reclaim_state =3D { + .reclaimed_slab =3D 0, + .zone =3D zone, + }; =20 cond_resched(); /* @@ -2594,7 +2595,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gf= p_mask, unsigned int order) */ p->flags |=3D PF_MEMALLOC | PF_SWAPWRITE; lockdep_set_current_reclaim_state(gfp_mask); - reclaim_state.reclaimed_slab =3D 0; p->reclaim_state =3D &reclaim_state; =20 if (zone_pagecache_reclaimable(zone) > zone->min_unmapped_pages) { @@ -2610,34 +2610,22 @@ static int __zone_reclaim(struct zone *zone, gfp_t = gfp_mask, unsigned int order) } while (priority >=3D 0 && sc.nr_reclaimed < nr_pages); } =20 - nr_slab_pages0 =3D zone_page_state(zone, NR_SLAB_RECLAIMABLE); - if (nr_slab_pages0 > zone->min_slab_pages) { + if (zone_page_state(zone, NR_SLAB_RECLAIMABLE) > zone->min_slab_pages) { unsigned long lru_pages =3D zone_reclaimable_pages(zone); =20 - /* - * shrink_slab() does not currently allow us to determine how - * many pages were freed in this zone. So we take the current - * number of slab pages and shake the slab until it is reduced - * by the same nr_pages that we used for reclaiming unmapped - * pages. - * - * Note that shrink_slab will free memory on all zones and may - * take a long time. - */ - while (shrink_slab(sc.nr_scanned, gfp_mask, lru_pages) && - (zone_page_state(zone, NR_SLAB_RECLAIMABLE) + nr_pages > - nr_slab_pages0)) - ; - - /* - * Update nr_reclaimed by the number of slab pages we - * reclaimed from this zone. - */ - nr_slab_pages1 =3D zone_page_state(zone, NR_SLAB_RECLAIMABLE); - if (nr_slab_pages1 < nr_slab_pages0) - sc.nr_reclaimed +=3D nr_slab_pages0 - nr_slab_pages1; + for(;;) { + /* + * Note that shrink_slab will free memory on all zones + * and may take a long time. + */ + if (!shrink_slab(sc.nr_scanned, gfp_mask, lru_pages)) + break; + if (reclaim_state.reclaimed_slab >=3D nr_pages) + break; + } } =20 + sc.nr_reclaimed +=3D reclaim_state.reclaimed_slab; p->reclaim_state =3D NULL; current->flags &=3D ~(PF_MEMALLOC | PF_SWAPWRITE); lockdep_clear_current_reclaim_state(); --=20 1.6.5.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org