From: Christoph Lameter <cl@linux.com>
To: Robert Haas <robertmhaas@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>, Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Josh Berkus <josh@agliodbs.com>,
	Andres Freund <andres@2ndquadrant.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	sivanich@sgi.com
Subject: Re: [PATCH 0/2] Disable zone_reclaim_mode by default
Date: Tue, 8 Apr 2014 17:58:21 -0500 (CDT)
Message-ID: <alpine.DEB.2.10.1404081752390.16708@nuc>
In-Reply-To: <CA+TgmoY=vUdtdnJUEK1h-UcaNoqqLUctt44S8vj2B7EVUXUOyA@mail.gmail.com>

On Tue, 8 Apr 2014, Robert Haas wrote:

> Well, as Josh quite rightly said, the hit from accessing remote memory
> is never going to be as large as the hit from disk.  If and when there
> is a machine where remote memory is more expensive to access than
> disk, that's a good argument for zone_reclaim_mode.  But I don't
> believe that's anywhere close to being true today, even on an 8-socket
> machine with an SSD.

I am not sure how disk figures into this.

The tradeoff is zone reclaim vs. the aggregate performance
degradation of the remote memory accesses. That depends on the
cacheability of the app and the scale of memory accesses.
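
As a rough illustration of that tradeoff, here is a minimal sketch of a
break-even estimate. Everything in it is an assumption made for the sake
of the example: the function name, the idea of treating zone reclaim as a
single one-time cost, and the latency figures in the usage note are all
hypothetical and not measurements from any real system.

```python
def break_even_accesses(reclaim_cost_ns, local_ns, remote_ns, miss_rate=1.0):
    """Estimate how many memory accesses it takes before paying a one-time
    reclaim cost for local memory beats leaving the page on a remote node.

    miss_rate models cacheability: only the fraction of accesses that miss
    the CPU caches actually pays the remote-memory penalty.
    All parameters are hypothetical inputs, not kernel-provided values.
    """
    penalty_per_access = miss_rate * (remote_ns - local_ns)
    if penalty_per_access <= 0:
        # Remote memory is no slower (or nothing misses the cache):
        # reclaim never pays for itself.
        return float("inf")
    return reclaim_cost_ns / penalty_per_access
```

For example, with made-up figures of 1 ms for reclaim, 100 ns local and
150 ns remote latency, `break_even_accesses(1_000_000, 100, 150)` gives
the number of uncached accesses needed to recoup the reclaim cost. A
lower `miss_rate` (a more cacheable app) pushes the break-even point
further out, and a larger remote penalty pulls it in.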

The reason that zone reclaim is on by default is that off-node accesses
are a big performance hit on large-scale NUMA systems (like ScaleMP and
SGI). Zone reclaim was written *because* those systems experienced severe
performance degradation.
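
For readers who want to check what a given machine is doing, the setting
is exposed as the vm.zone_reclaim_mode sysctl. The following is a minimal
sketch that decodes its bitmask, using the bit meanings given in the
kernel's sysctl documentation. The helper names are hypothetical, and the
reader function assumes a Linux system with /proc mounted.

```python
# Bit meanings as listed in the kernel's vm sysctl documentation.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim_mode(value):
    """Return a list of human-readable descriptions for the bits set in value."""
    if value == 0:
        return ["zone reclaim off"]
    return [desc for bit, desc in sorted(ZONE_RECLAIM_BITS.items()) if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current mode from procfs (assumes Linux with /proc mounted)."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux box, `describe_zone_reclaim_mode(read_zone_reclaim_mode())`
would report the current policy; a value of 0 means allocations fall back
to remote nodes rather than reclaiming locally first.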

From what I hear, there does not seem to be a benefit on the tightly
coupled 4- and 8-node systems.

> Now, perhaps the fear is that if we access that remote memory
> *repeatedly* the aggregate cost will exceed what it would have cost to
> fault that page into the local node just once.  But it takes a lot of
> accesses for that to be true, and most of the time you won't get them.
>  Even if you do, I bet many workloads will prefer even performance
> across all the accesses over a very slow first access followed by
> slightly faster subsequent accesses.

Many HPC workloads prefer the opposite.

> In an ideal world, the kernel would put the hottest pages on the local
> node and the less-hot pages on remote nodes, moving pages around as
> the workload shifts.  In practice, that's probably pretty hard.
> Fortunately, it's not nearly as important as making sure we don't
> unnecessarily hit the disk, which is infinitely slower than any memory
> bank.

Shifting pages involves tradeoffs similar to those of zone reclaim vs.
remote allocations.
