From: Mel Gorman <mel@csn.ul.ie>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	yanmin.zhang@intel.com, Wu Fengguang <fengguang.wu@intel.com>,
	linuxram@us.ibm.com, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] Reintroduce zone_reclaim_interval for when zone_reclaim() scans and fails to avoid CPU spinning at 100% on NUMA
Date: Wed, 10 Jun 2009 11:00:40 +0100
Message-ID: <20090610100040.GE25943@csn.ul.ie>
In-Reply-To: <20090609222301.8da002ae.akpm@linux-foundation.org>

On Tue, Jun 09, 2009 at 10:23:01PM -0700, Andrew Morton wrote:
> On Mon, 8 Jun 2009 16:11:51 +0100 Mel Gorman <mel@csn.ul.ie> wrote:
> 
> > On Mon, Jun 08, 2009 at 10:55:55AM -0400, Christoph Lameter wrote:
> > > On Mon, 8 Jun 2009, Mel Gorman wrote:
> > > 
> > > > > The tmpfs pages are unreclaimable and therefore should not be on the anon
> > > > > lru.
> > > > >
> > > >
> > > > tmpfs pages can be swap-backed so can be reclaimable. Regardless of what
> > > > list they are on, we still need to know how many of them there are if
> > > > this patch is to be avoided.
> > > 
> > > If they are reclaimable then why does it matter? They can be pushed out if
> > > you configure zone reclaim to be that aggressive.
> > > 
> > 
> > Because they are reclaimable by kswapd or normal direct reclaim but *not*
> > reclaimable by zone_reclaim() if the zone_reclaim_mode is not configured
> > appropriately.
> 
> Ah.  (zone_reclaim_mode & RECLAIM_SWAP) == 0.  That was important info.
> 

Yes, zone_reclaim() is a different beast to kswapd or traditional direct
reclaim.

> Couldn't the lack of RECLAIM_WRITE cause a similar problem?
> 

Potentially, yes.
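
To make the two cases concrete, here is a minimal sketch of how the
zone_reclaim_mode bits decode. The bit values are the ones documented
for the vm.zone_reclaim_mode sysctl; the two helper functions are
hypothetical names used only for illustration and are not kernel code.

```c
#include <stdbool.h>

/* Bit values as documented for the vm.zone_reclaim_mode sysctl. */
#define RECLAIM_ZONE  (1 << 0)  /* try zone_reclaim() before going off-node */
#define RECLAIM_WRITE (1 << 1)  /* zone reclaim may write out dirty pages */
#define RECLAIM_SWAP  (1 << 2)  /* zone reclaim may unmap and swap pages */

/* Hypothetical helper: may zone_reclaim() touch swap-backed pages? */
static bool mode_may_swap(int mode)
{
	return (mode & RECLAIM_SWAP) != 0;
}

/* Hypothetical helper: may zone_reclaim() write out dirty file pages? */
static bool mode_may_write(int mode)
{
	return (mode & RECLAIM_WRITE) != 0;
}
```

With the default mode of 1, both helpers return false, so neither
swap-backed tmpfs pages nor dirty file pages can be freed by
zone_reclaim() even though they sit on the LRU. With mode 7, both
return true.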

> > I briefly considered setting zone_reclaim_mode to 7 instead of
> > 1 by default for large NUMA distances but that has other serious consequences
> > such as paging in preference to going off-node as a default out-of-box
> > behaviour.
> 
> Maybe we should consider that a bit harder.  At what stage does
> zone_reclaim decide to give up and try a different node?  Perhaps it's
> presently too reluctant to do that?
> 

It decides to give up if it can't reclaim a number of pages
(SWAP_CLUSTER_MAX usually) with the current reclaim_mode. In practice,
that means it will go off-node if there are not enough clean unmapped
pages on the LRU list for that node.

That is a relatively short delay. If the request had to clean filesystem-backed
pages or unmap+swap pages, the cost would likely exceed the sum of all
remote-node accesses for that page.
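
As a rough model of that decision (a simplified sketch, not the code in
mm/vmscan.c): the struct and both functions below are hypothetical, and
SWAP_CLUSTER_MAX is assumed to be 32 here. The real code attempts the
reclaim and looks at how many pages were actually freed, whereas this
sketch only counts the pages the current mode would allow it to free.

```c
#include <stdbool.h>

#define RECLAIM_WRITE (1 << 1)
#define RECLAIM_SWAP  (1 << 2)
#define SWAP_CLUSTER_MAX 32UL	/* assumed value for this sketch */

/* Hypothetical per-node LRU breakdown, for illustration only. */
struct node_lru_counts {
	unsigned long clean_unmapped;		/* always eligible */
	unsigned long dirty_file;		/* needs RECLAIM_WRITE */
	unsigned long mapped_or_swapbacked;	/* needs RECLAIM_SWAP, e.g. tmpfs */
};

/* Pages that zone reclaim would be allowed to free under this mode. */
static unsigned long reclaimable_under_mode(const struct node_lru_counts *c,
					    int mode)
{
	unsigned long n = c->clean_unmapped;

	if (mode & RECLAIM_WRITE)
		n += c->dirty_file;
	if (mode & RECLAIM_SWAP)
		n += c->mapped_or_swapbacked;
	return n;
}

/* Give up on the local node if a full batch cannot be reclaimed. */
static bool should_go_off_node(const struct node_lru_counts *c, int mode)
{
	return reclaimable_under_mode(c, mode) < SWAP_CLUSTER_MAX;
}
```

For a node with 8 clean unmapped pages, 100 dirty file pages and 5000
swap-backed pages, mode 1 goes off-node while mode 7 stays local and
pays for writeback and swap instead.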

I think in principle, the zone_reclaim_mode default of 1 is sensible and
the biggest thing this patchset needs to get right is the scan-avoidance
heuristic.

> > The point of the patch is that the heuristics that avoid the scan are not
> > perfect. In the event they are wrong and a useless scan occurs, the response
> > of the kernel after a useless scan should not be to uselessly scan a load
> > more times around the LRU lists making no progress.
> 
> It would be sad to bring back a jiffies-based thing into page reclaim. 
> Wall time has little correlation with the rate of page allocation and
> reclaim activity.
> 

Agreed. If it turns out a patch like this is needed, I'm going to build
on Wu's suggestion of auto-selecting the zone_reclaim_interval based on
scan frequency and how long it takes to do the scan. I'm still hoping that
neither is necessary because we'll be able to guess the number of tmpfs
pages in advance.
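
To illustrate what an auto-selected interval could look like, here is a
purely hypothetical sketch. It is not Wu's proposal (which is not
reproduced in this mail) and not code from the patch; every name and
the 1% budget are assumptions. The idea shown is only that the back-off
after a failed scan scales with what the scan cost, so that useless
scans are bounded to a fixed share of time. Counter wraparound is
ignored for brevity.

```c
#include <stdbool.h>

/* Hypothetical budget: failed scans may use at most 1% of elapsed time. */
#define MAX_SCAN_SHARE_PERCENT 1UL

/* Hypothetical per-zone state; "time" is any monotonic tick counter. */
struct zone_scan_state {
	unsigned long next_scan_allowed;
};

/* After a scan that freed nothing, back off in proportion to its cost. */
static void record_failed_scan(struct zone_scan_state *s,
			       unsigned long now, unsigned long scan_cost)
{
	unsigned long interval = scan_cost * (100UL / MAX_SCAN_SHARE_PERCENT);

	s->next_scan_allowed = now + interval;
}

/* Skip the scan entirely while the back-off window is still open. */
static bool scan_allowed(const struct zone_scan_state *s, unsigned long now)
{
	return now >= s->next_scan_allowed;
}
```

A failed scan costing 5 ticks at time 1000 would block further scans
until time 1500; a more expensive scan backs off for proportionally
longer.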

-- 
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
