From: Martin Hicks <mort@sgi.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Ray Bryant <raybry@engr.sgi.com>, linux-mm@kvack.org, ak@suse.de
Subject: Re: [PATCH/RFC 0/4] VM: Manual and Automatic page cache reclaim
Date: Thu, 12 May 2005 14:53:02 -0400
Message-ID: <20050512185302.GO19244@localhost>
In-Reply-To: <20050503010846.508bbe62.akpm@osdl.org>

On Tue, May 03, 2005 at 01:08:46AM -0700, Andrew Morton wrote:
>
> Yup. But we could add a knob to each zone which says, during page
> allocation "be more reluctant to advance onto the next node - do some
> direct reclaim instead"
>
> And the good thing about that is that it is an easier merge because it's a
> simpler patch and because it's useful to more machines. People can tune it
> and get better (or worse) performance from existing apps on NUMA.
>
> Yes, if it's a "simple" patch then it _might_ do a bit of swapout or
> something. But the VM does prefer to reclaim clean pagecache first (as
> well as slab, which is a bonus for this approach).
>
> Worth trying, at least?

So, I did this as an exercise. A few things came up:

1) If you just call directly into the reclaim code then it swaps a LOT.
I stuck my "don't swap" flag back in, just to see what would happen. It
works a lot better if you can tell it to just not swap.
2) With a per-zone on/off flag for reclaim, I then run into the
problem that the allocator always reclaims pages, even when it
shouldn't. Filling pagecache with files will start reclaiming from the
preferred zone as soon as the zone fills, leaving the rest of the zones
unused.
My last patch, using mempolicies, got this right because the core
kernel, which wasn't set to use reclaim, would just allocate off-node
for stuff like page cache pages.
3) This patch has no code that limits the amount of scanning that is done
under really heavy memory stress. A "make -j" kernel build takes more
time to complete than I'm willing to wait, while a stock kernel does
complete the run in 15-20 minutes.
Scanning too much is really the biggest problem. I want to keep using
refill_inactive_list(), so that I don't futz with the LRU ordering or
resort to reclaiming active pages like I was doing in my old patch.
4) Under trivial tests, this patch helps NUMA machines get local memory
more often. The silly test was to just fill node 0 with page cache and
then run a "make -j8" kernbench test on node 0 (a 2-CPU node).
Without zone reclaiming turned on, all memory allocations go to node 1.
With the reclaiming on, page cache is reclaimed and gcc gets all local
memory.
This is a real problem. We even see it on modest 8p/32G build servers
because there is lots of pagecache kicking around and a lot of the
allocations end up being remote.

zone reclaiming on:

Average Optimal -j 8 Load Run:
Elapsed Time        703.87
User Time           1337.77
System Time         47.94
Percent CPU         196
Context Switches    73669
Sleeps              58874

zone reclaiming off:

Average Optimal -j 8 Load Run:
Elapsed Time        741.22
User Time           1396.97
System Time         65.14
Percent CPU         197
Context Switches    73211
Sleeps              58996

mh
--
Martin Hicks || Silicon Graphics Inc. || mort@sgi.com