linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: kosaki.motohiro@jp.fujitsu.com, Simon Kirby <sim@hostway.ca>,
	Shaohua Li <shaohua.li@intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>
Subject: Re: Free memory never fully used, swapping
Date: Tue, 30 Nov 2010 15:31:03 +0900 (JST)	[thread overview]
Message-ID: <20101130152002.8307.A69D9226@jp.fujitsu.com> (raw)
In-Reply-To: <20101126111122.GK26037@csn.ul.ie>

Hi
> On Fri, Nov 26, 2010 at 08:03:04PM +0900, KOSAKI Motohiro wrote:
> > Two points.
> > 
> > > @@ -2310,10 +2324,12 @@ loop_again:
> > >  				 * spectulatively avoid congestion waits
> > >  				 */
> > >  				zone_clear_flag(zone, ZONE_CONGESTED);
> > > +				if (i <= pgdat->high_zoneidx)
> > > +					any_zone_ok = 1;
> > >  			}
> > >  
> > >  		}
> > > -		if (all_zones_ok)
> > > +		if (all_zones_ok || (order && any_zone_ok))
> > >  			break;		/* kswapd: all done */
> > >  		/*
> > >  		 * OK, kswapd is getting into trouble.  Take a nap, then take
> > > @@ -2336,7 +2352,7 @@ loop_again:
> > >  			break;
> > >  	}
> > >  out:
> > > -	if (!all_zones_ok) {
> > > +	if (!(all_zones_ok || (order && any_zone_ok))) {
> > 
> > This doesn't work ;)
> > kswapd have to clear ZONE_CONGESTED flag before enter sleeping.
> > otherwise nobody can clear it.
> > 
> 
> Does it not do it earlier in balance_pgdat() here
> 
>                                 /*
>                                  * If a zone reaches its high watermark,
>                                  * consider it to be no longer congested. It's
>                                  * possible there are dirty pages backed by
>                                  * congested BDIs but as pressure is
>                                  * relieved, spectulatively avoid congestion waits
>                                  */
>                                 zone_clear_flag(zone, ZONE_CONGESTED);
>                                 if (i <= pgdat->high_zoneidx)
>                                         any_zone_ok = 1;

zone_clear_flag(zone, ZONE_CONGESTED) only clear one zone status. other
zone remain old status.



> > Say, we have to fill below condition.
> >  - All zone are successing zone_watermark_ok(order-0)
> 
> We should loop around at least once with order == 0 where all_zones_ok
> is checked.

But no gurantee. IOW kswapd early stopping increase GFP_ATOMIC allocation
failure risk, I think.


> >  - At least one zone are successing zone_watermark_ok(high-order)
> 
> This is preferable but it's possible for kswapd to go to sleep without
> this condition being satisified.

Yes.

> 
> > 
> > 
> > > @@ -2417,6 +2439,7 @@ static int kswapd(void *p)
> > >  		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
> > >  		new_order = pgdat->kswapd_max_order;
> > >  		pgdat->kswapd_max_order = 0;
> > > +		pgdat->high_zoneidx = MAX_ORDER;
> > 
> > I don't think MAX_ORDER is correct ;)
> > 
> >         high_zoneidx = pgdat->high_zoneidx;
> >         pgdat->high_zoneidx = pgdat->nr_zones - 1;
> > 
> > ?
> > 
> 
> Bah. It should have been MAX_NR_ZONES. This happens to still work because
> MAX_ORDER will always be higher than MAX_NR_ZONES but it's wrong.

Well, no. balance_pgdat() shuldn't read pgdat->high_zoneidx. please remember why
balance balance_pgdat() don't read pgdat->kswapd_max_order directly. wakeup_kswapd()
change pgdat->kswapd_max_order and pgdat->high_zoneidx without any lock. so, 
we need to afraid following bad scenario.


T1: wakeup_kswapd(order=0, HIGHMEM)
T2: enter balance_kswapd()
T1: wakeup_kswapd(order=1, DMA32)
T2: exit balance_kswapd()
      kswapd() erase pgdat->high_zoneidx and decide to don't sleep (because
      old-order=0, new-order=1). So now we will start unnecessary HIGHMEM
      reclaim.



> 
> > And, we have another kswapd_max_order reading place. (after kswapd_try_to_sleep)
> > We need it too.
> > 
> 
> I'm not quite sure what you mean here. kswapd_max_order is read again
> after kswapd tries to sleep (or wakes for that matter) but it'll be in
> response to another caller having tried to wake kswapd indicating that
> those high orders really are needed.

My expected bad scenario was written above. 

Thanks.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-11-30  6:31 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20101115195246.GB17387@hostway.ca>
2010-11-22 23:44 ` Andrew Morton
2010-11-23  1:34   ` Simon Kirby
2010-11-23  8:35   ` Dave Hansen
2010-11-24  8:46     ` Simon Kirby
2010-11-25  1:07       ` Shaohua Li
2010-11-25  9:03         ` Simon Kirby
2010-11-25 10:18           ` KOSAKI Motohiro
2010-11-25 17:13             ` Simon Kirby
2010-11-26  0:33               ` KOSAKI Motohiro
2010-11-25 10:51           ` KOSAKI Motohiro
2010-11-25 16:15             ` Mel Gorman
2010-11-26  2:00               ` Shaohua Li
2010-11-26  2:31                 ` KOSAKI Motohiro
2010-11-26  2:40                   ` Shaohua Li
2010-11-26  9:18                     ` KOSAKI Motohiro
2010-11-29  1:03                       ` Shaohua Li
2010-11-29  1:13                         ` KOSAKI Motohiro
2010-11-26  0:07             ` KOSAKI Motohiro
2010-11-25 16:12           ` Mel Gorman
2010-11-26  1:05             ` Shaohua Li
2010-11-26  1:25               ` Mel Gorman
2010-11-26  2:05                 ` Shaohua Li
2010-11-26 11:03             ` KOSAKI Motohiro
2010-11-26 11:11               ` Mel Gorman
2010-11-30  6:31                 ` KOSAKI Motohiro [this message]
2010-11-30 10:41                   ` Mel Gorman
2010-11-30 11:19                     ` KOSAKI Motohiro
2010-11-30  8:22             ` Simon Kirby
2010-11-29  9:31       ` KOSAKI Motohiro
2010-11-23 10:04   ` Mel Gorman
2010-11-24  6:43     ` Simon Kirby
2010-11-24  9:27       ` Mel Gorman
2010-11-24 19:17         ` Simon Kirby
2010-11-25  1:18           ` KOSAKI Motohiro
2010-11-26 15:48             ` Christoph Lameter
2010-11-30  0:25               ` KOSAKI Motohiro
2010-11-30 19:10                 ` Christoph Lameter
2010-12-01 10:17                   ` KOSAKI Motohiro
2010-12-01 15:29                     ` Christoph Lameter
2010-12-02  2:44                       ` KOSAKI Motohiro
2010-12-02 14:39                         ` Christoph Lameter
2010-11-30  9:13               ` Simon Kirby
2010-11-30 19:13                 ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101130152002.8307.A69D9226@jp.fujitsu.com \
    --to=kosaki.motohiro@jp.fujitsu.com \
    --cc=dave@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=shaohua.li@intel.com \
    --cc=sim@hostway.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox