From: Minchan Kim <minchan.kim@gmail.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Linux Kernel List <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Wu Fengguang <fengguang.wu@intel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [PATCH 8/8] writeback: Do not sleep on the congestion queue if there are no congested BDIs or if significant congestion is not being encountered in the current zone
Date: Thu, 16 Sep 2010 23:11:47 +0900
Message-ID: <20100916141147.GC16115@barrios-desktop>
In-Reply-To: <20100916091824.GB15709@csn.ul.ie>

On Thu, Sep 16, 2010 at 10:18:24AM +0100, Mel Gorman wrote:
> On Thu, Sep 16, 2010 at 05:13:38PM +0900, Minchan Kim wrote:
> > On Wed, Sep 15, 2010 at 01:27:51PM +0100, Mel Gorman wrote:
> > > If wait_iff_congested() is called with no BDI congested, the function simply
> > > calls cond_resched(). In the event there is significant writeback happening
> > > in the zone that is being reclaimed, this can be a poor decision as reclaim
> > > would succeed once writeback was completed. Without any backoff logic,
> > > younger clean pages can be reclaimed resulting in more reclaim overall and
> > > poor performance.
> > 
> > I agree. 
> > 
> > > 
> > > This patch tracks how many pages backed by a congested BDI were found during
> > > scanning. If all the dirty pages encountered on a list isolated from the
> > > LRU belong to a congested BDI, the zone is marked congested until the zone
> > 
> > I am not sure it works well. 
> 
> Check the completion times for the micro-mapped-file-stream benchmark in
> the leader mail. Backing off like this is faster overall for some
> workloads.
> 
> > We met the condition only once, but we back off until the high watermark is reached.
> 
> Reaching the high watermark is considered to be a relieving of pressure.
> 
> > (e.g., 32 isolated dirty pages == 32 pages on a congested BDI)
> > My first impression is that this is rather _aggressive_.
> > 
> 
> Yes, it is. I intended to start with something quite aggressive that is
> close to existing behaviour and then experiment with alternatives.

Agree. 

> 
> For example, I considered clearing zone congestion when nr_bdi_congested
> drops to 0. This would be less aggressive in terms of congestion waiting but
> it is further from today's behaviour. I felt it would be best to introduce
> wait_iff_congested() in one kernel cycle but wait until a later cycle to deviate
> a lot from congestion_wait().

Fair enough. 

> 
> > How about more checking?
> > For example, if the above pattern repeats more than some threshold number
> > of times, we can regard the zone as congested, and if the pattern does not
> > recur for some threshold period, we can regard the zone as no longer congested.
> > 
> 
> I also considered these options and got stuck at what the "some
> threshold" is and how to record the history. Should it be recorded on a
> per BDI basis for example? I think all these questions can be answered
> but should be in a different cycle.
> 
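
To make the threshold idea quoted above a bit more concrete, one possible
shape is a pair of streak counters kept per zone. This is only a sketch
under assumptions: struct zone_congestion_history, zone_note_scan() and
both threshold values are hypothetical names and numbers, not anything
from the patch, and the open question of recording history per BDI is
not addressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds; the thread leaves "some threshold" open. */
#define CONGESTED_STREAK_THRESHOLD	3
#define CLEAR_STREAK_THRESHOLD		3

/* Hypothetical per-zone history, not a field of the real struct zone. */
struct zone_congestion_history {
	unsigned int congested_streak;	/* consecutive all-congested scans */
	unsigned int clear_streak;	/* consecutive scans without the pattern */
	bool congested;			/* current verdict for the zone */
};

/*
 * Record one scan of an isolated list. The "pattern" is the condition
 * from the patch description: every dirty page seen belongs to a
 * congested BDI.
 */
static void zone_note_scan(struct zone_congestion_history *h,
			   unsigned long nr_dirty, unsigned long nr_congested)
{
	assert(h != NULL);

	bool all_congested = nr_dirty != 0 && nr_dirty == nr_congested;

	if (all_congested) {
		h->clear_streak = 0;
		/* Saturate at the threshold so the counter cannot wrap. */
		if (h->congested_streak < CONGESTED_STREAK_THRESHOLD)
			h->congested_streak++;
		if (h->congested_streak == CONGESTED_STREAK_THRESHOLD)
			h->congested = true;
	} else {
		h->congested_streak = 0;
		if (h->clear_streak < CLEAR_STREAK_THRESHOLD)
			h->clear_streak++;
		if (h->clear_streak == CLEAR_STREAK_THRESHOLD)
			h->congested = false;
	}
}
```

With these assumed values, a single scan of 32 dirty pages that are all
on a congested BDI would not mark the zone; it takes three such scans in
a row, and three scans without the pattern to clear the mark again.
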
> > > reaches the high watermark.  wait_iff_congested() then checks both the
> > > number of congested BDIs and whether the current zone has encountered
> > > congestion recently. If both hold, it will sleep on the congestion queue.
> > > Otherwise it will call cond_resched() to yield the processor if necessary.
> > > 
> > > The end result is that waiting on the congestion queue is avoided when
> > > it is unnecessary, but when significant congestion is being encountered,
> > > reclaimers and page allocators will back off.
> > > 
> > > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > > ---
> > >  include/linux/backing-dev.h |    2 +-
> > >  include/linux/mmzone.h      |    8 ++++
> > >  mm/backing-dev.c            |   23 ++++++++----
> > >  mm/page_alloc.c             |    4 +-
> > >  mm/vmscan.c                 |   83 +++++++++++++++++++++++++++++++++++++------
> > >  5 files changed, 98 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> > > index 72bb510..f1b402a 100644
> > > --- a/include/linux/backing-dev.h
> > > +++ b/include/linux/backing-dev.h
> > > +static enum bdi_queue_status may_write_to_queue(struct backing_dev_info *bdi,
> > 
> > <snip>
> > 
> > >  			      struct scan_control *sc)
> > >  {
> > > +	enum bdi_queue_status ret = QUEUEWRITE_DENIED;
> > > +
> > >  	if (current->flags & PF_SWAPWRITE)
> > > -		return 1;
> > > +		return QUEUEWRITE_ALLOWED;
> > >  	if (!bdi_write_congested(bdi))
> > > -		return 1;
> > > +		return QUEUEWRITE_ALLOWED;
> > > +	else
> > > +		ret = QUEUEWRITE_CONGESTED;
> > >  	if (bdi == current->backing_dev_info)
> > > -		return 1;
> > > +		return QUEUEWRITE_ALLOWED;
> > >  
> > >  	/* lumpy reclaim for hugepage often need a lot of write */
> > >  	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> > > -		return 1;
> > > -	return 0;
> > > +		return QUEUEWRITE_ALLOWED;
> > > +	return ret;
> > >  }
> > 
> > The function can't return QUEUEWRITE_DENIED.
> > That can affect disable_lumpy_reclaim_mode().
> > 
> 
> Yes, but that change was made in "vmscan: Narrow the scenarios lumpy
> reclaim uses synchrounous reclaim". Maybe I am misunderstanding your
> objection.

I mean that the current may_write_to_queue() never returns QUEUEWRITE_DENIED.
What is its role?
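
To illustrate the point, here is a small userspace model of the control
flow in the hunk quoted above. It is a sketch, not kernel code: struct
queue_inputs and the model_/count_/verify_ function names are
hypothetical stand-ins for the state the real function reads, and the
order of the enum values is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Return values named in the quoted patch; their order is assumed. */
enum bdi_queue_status {
	QUEUEWRITE_DENIED,
	QUEUEWRITE_CONGESTED,
	QUEUEWRITE_ALLOWED,
};

/* Hypothetical stand-in for the kernel state the real function tests. */
struct queue_inputs {
	bool pf_swapwrite;	/* current->flags & PF_SWAPWRITE */
	bool bdi_congested;	/* bdi_write_congested(bdi) */
	bool own_bdi;		/* bdi == current->backing_dev_info */
	bool costly_order;	/* sc->order > PAGE_ALLOC_COSTLY_ORDER */
};

/* Same branch structure as the patched may_write_to_queue(). */
static enum bdi_queue_status model_may_write_to_queue(struct queue_inputs in)
{
	enum bdi_queue_status ret = QUEUEWRITE_DENIED;

	if (in.pf_swapwrite)
		return QUEUEWRITE_ALLOWED;
	if (!in.bdi_congested)
		return QUEUEWRITE_ALLOWED;
	else
		ret = QUEUEWRITE_CONGESTED;	/* replaces DENIED on every path that goes on */
	if (in.own_bdi)
		return QUEUEWRITE_ALLOWED;
	if (in.costly_order)
		return QUEUEWRITE_ALLOWED;
	return ret;
}

/* Count how many of the 16 input combinations yield QUEUEWRITE_DENIED. */
static int count_denied_results(void)
{
	int denied = 0;

	for (int bits = 0; bits < 16; bits++) {
		struct queue_inputs in = {
			.pf_swapwrite = bits & 1,
			.bdi_congested = bits & 2,
			.own_bdi = bits & 4,
			.costly_order = bits & 8,
		};
		if (model_may_write_to_queue(in) == QUEUEWRITE_DENIED)
			denied++;
	}
	return denied;
}

/* Self-check: the initial QUEUEWRITE_DENIED value is never returned. */
static void verify_denied_unreachable(void)
{
	assert(count_denied_results() == 0);
}
```

In this model every path that reaches the final return has already
overwritten ret with QUEUEWRITE_CONGESTED, so the initial value is dead.
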

In addition, we don't need disable_lumpy_reclaim_mode() in pageout().
That's because both PAGE_KEEP and PAGE_KEEP_CONGESTED go to keep_locked,
which calls disable_lumpy_reclaim_mode() in the end.
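
Roughly, the flow I have in mind looks like the model below. It is a
simplified userspace sketch and not the code from the patch: struct
scan_model, the MODEL_ enum values and the model_ functions are
hypothetical, and counting congested pages in the PAGE_KEEP_CONGESTED
case is my assumption based on the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of pageout() results; MODEL_PAGE_OTHER covers the rest. */
enum model_pageout {
	MODEL_PAGE_KEEP,
	MODEL_PAGE_KEEP_CONGESTED,
	MODEL_PAGE_OTHER,
};

/* Hypothetical stand-in for the parts of scan_control that matter here. */
struct scan_model {
	bool lumpy_mode;		/* lumpy reclaim mode still enabled? */
	unsigned int disable_calls;	/* times the mode was disabled */
	unsigned long nr_congested;	/* pages kept because of congestion */
};

static void model_disable_lumpy_reclaim_mode(struct scan_model *sc)
{
	sc->lumpy_mode = false;
	sc->disable_calls++;
}

/* Caller-side handling of one pageout() result. */
static void model_handle_pageout(struct scan_model *sc, enum model_pageout res)
{
	assert(sc != NULL);

	switch (res) {
	case MODEL_PAGE_KEEP_CONGESTED:
		sc->nr_congested++;
		/* fall through */
	case MODEL_PAGE_KEEP:
		goto keep_locked;
	default:
		return;	/* other results are outside this sketch */
	}

keep_locked:
	/* The real code would unlock the page here before keeping it. */
	model_disable_lumpy_reclaim_mode(sc);
}
```

Since both results already end at the single call under keep_locked, an
extra call inside pageout() would only repeat it.
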

-- 
Kind regards,
Minchan Kim
