linux-mm.kvack.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Wu Fengguang <fengguang.wu@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Christoph Hellwig <hch@infradead.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [RFC PATCH 0/2] Prioritise inodes and zones for writeback required by page reclaim
Date: Thu, 5 Aug 2010 14:42:23 +0100
Message-ID: <20100805134223.GB25688@csn.ul.ie>
In-Reply-To: <20100804155610.2a0d5e1f.akpm@linux-foundation.org>

On Wed, Aug 04, 2010 at 03:56:10PM -0700, Andrew Morton wrote:
> On Wed,  4 Aug 2010 15:38:29 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
> 
> > Commenting on the series "Reduce writeback from page reclaim context V6"
> > Andrew Morton noted;
> > 
> >   direct-reclaim wants to write a dirty page because that page is in the
> >   zone which the caller wants to allocate from!  Telling the flusher threads
> >   to perform generic writeback will sometimes cause them to just gum the
> >   disk up with pages from different zones, making it even harder/slower to
> >   allocate a page from the zones we're interested in, no?
> > 
> > On the machines used to test the series, there were relatively few zones
> > and only one BDI so the scenario described is a possibility. This series is
> > a very early prototype series aimed at mitigating the problem.
> > 
> > Patch 1 adds wakeup_flusher_threads_pages() which takes a list of pages
> > from page reclaim. Each inode belonging to a page on the list is marked
> > I_DIRTY_RECLAIM. When the flusher thread wakes, inodes with this tag are
> > unconditionally moved to the wb->b_io list for writing.
> > 
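To make the Patch 1 description concrete, here is a minimal userspace
sketch of the tagging step. It is a model only, not the patch itself:
every model_* name, the MODEL_I_DIRTY_RECLAIM value and the on_b_io
boolean (standing in for membership of wb->b_io) are assumptions made
for illustration. Clearing the tag once the inode is queued is also an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value; the real I_DIRTY_RECLAIM is defined by the patch. */
#define MODEL_I_DIRTY_RECLAIM 0x1u

struct model_inode {
	unsigned int state;	/* holds the reclaim tag */
	bool on_b_io;		/* stands in for being on wb->b_io */
	int ino;
};

struct model_page {
	struct model_inode *host;	/* NULL models a page with no inode */
};

/*
 * Models wakeup_flusher_threads_pages(): tag the inode behind every page
 * that reclaim handed over. Returns the number of newly tagged inodes.
 */
static size_t tag_inodes_for_reclaim(struct model_page *pages, size_t nr)
{
	size_t tagged = 0;

	assert(pages != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		struct model_inode *inode = pages[i].host;

		if (inode == NULL)
			continue;
		if (!(inode->state & MODEL_I_DIRTY_RECLAIM)) {
			inode->state |= MODEL_I_DIRTY_RECLAIM;
			tagged++;
		}
	}
	return tagged;
}

/*
 * Models the flusher waking up: every tagged inode is queued for IO
 * unconditionally. Clearing the tag here is an assumption of this sketch.
 */
static size_t queue_reclaim_inodes(struct model_inode *inodes, size_t nr)
{
	size_t queued = 0;

	for (size_t i = 0; i < nr; i++) {
		if (inodes[i].state & MODEL_I_DIRTY_RECLAIM) {
			inodes[i].on_b_io = true;
			inodes[i].state &= ~MODEL_I_DIRTY_RECLAIM;
			queued++;
		}
	}
	return queued;
}
```

Several pages from the same inode tag it only once, so the flusher
queues each inode a single time no matter how many of its pages reclaim
encountered.
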
> > Patch 2 notes that writing back inodes does not necessarily write back
> > pages belonging to the zone page reclaim is concerned with. In response, it
> > adds a zone and counter to wb_writeback_work. As pages from the target zone
> > are written, the zone-specific counter is updated. The flusher thread
> > then checks the zone counters if a specific zone is being targeted. While
> > more pages may be written than necessary, the assumption is that the pages
> > need cleaning eventually, the inode must be relatively old to have pages at
> > the end of the LRU, the IO will be relatively efficient due to less random
> > seeks and that pages from the target zone will still be cleaned.
> > 
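The zone accounting described for Patch 2 can be sketched the same way.
Again this is a userspace model under stated assumptions: the model_*
types and the nr_zone_pages field name are hypothetical, and the real
layout of wb_writeback_work in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_zone {
	int id;
};

struct model_page {
	const struct model_zone *zone;
	bool dirty;
};

/* Hypothetical stand-in for wb_writeback_work with the added fields. */
struct model_wb_work {
	long nr_pages;			/* generic writeback budget */
	const struct model_zone *zone;	/* NULL: no zone is targeted */
	long nr_zone_pages;		/* target-zone pages still wanted */
};

/* Update the counters as one page is written. */
static void account_page_written(struct model_wb_work *work,
				 const struct model_page *page)
{
	work->nr_pages--;
	if (work->zone != NULL && page->zone == work->zone)
		work->nr_zone_pages--;
}

/* When a zone is targeted, only the zone counter decides completion. */
static bool writeback_done(const struct model_wb_work *work)
{
	if (work->zone != NULL)
		return work->nr_zone_pages <= 0;
	return work->nr_pages <= 0;
}

/* Write the dirty pages of one inode in order until the work is done. */
static long write_inode_pages(struct model_wb_work *work,
			      struct model_page *pages, size_t nr)
{
	long written = 0;

	assert(work != NULL);
	for (size_t i = 0; i < nr && !writeback_done(work); i++) {
		if (!pages[i].dirty)
			continue;
		pages[i].dirty = false;
		account_page_written(work, &pages[i]);
		written++;
	}
	return written;
}
```

Pages from other zones are still written while the target-zone counter
is outstanding, which is the "more pages may be written than necessary"
trade-off mentioned above.
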
> > Testing did not show any significant differences in terms of reducing dirty
> > file pages being written back but the lack of multiple BDIs and NUMA nodes in
> > the test rig is a problem. Maybe someone else has access to a more suitable
> > test rig.
> > 
> > Any comment as to the suitability for such a direction?
> 
> um.  Might work.  Isn't pretty though.
> 

No, it's not.

> But until we can demonstrate the problem or someone reports it, we
> probably have more important issues to be looking at ;) I think that a
> better approach is to try to trigger this problem as we develop and
> test reclaim. 

That's a reasonable plan, as we'll then know for sure whether this is the
right direction. I'll put the patches on the back-burner for now. Hopefully
someone will remember them if a bug is reported about large stalls under
memory pressure that is specific to machines with many nodes and many disks.

> And if we _can't_ demonstrate it, work out why the heck
> not - either the code's smarter than we thought it was or the test is
> no good.
> 

It's always possible that we won't be able to demonstrate it because the
right file pages are getting cleaned more often than not by the time
reclaim happens :/

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

