From: Paul Mackerras <paulus@samba.org>
To: Andrew Morton <akpm@osdl.org>
Cc: clameter@sgi.com, linux-ia64@vger.kernel.org, torvalds@osdl.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Prezeroing V2 [0/3]: Why and When it works
Date: Fri, 24 Dec 2004 10:00:20 +1100 [thread overview]
Message-ID: <16843.19972.17026.69228@cargo.ozlabs.ibm.com> (raw)
In-Reply-To: <20041223133745.1d95bb08.akpm@osdl.org>
Andrew Morton writes:
> When the workload is a gcc run, the pagefault handler dominates the system
> time. That's the page zeroing.
For a program which uses a lot of heap and doesn't fork, that sounds
reasonable.
> x86's movnta instructions provide a way of initialising memory without
> trashing the caches and it has pretty good bandwidth, I believe. We should
> wire that up to these patches and see if it speeds things up.
Yes. I don't know the movnta instruction, but surely, whatever scheme
is used, there has to be a snoop for every cache line's worth of
memory that is zeroed.
The other point is that having the page hot in the cache may well be a
benefit to the program. Using any sort of cache-bypassing zeroing
might not actually make things faster, when the user time as well as
the system time is taken into account.
> > I did some measurements once on my G5 powermac (running a ppc64 linux
> > kernel) of how long clear_page takes, and it only takes 96ns for a 4kB
> > page.
>
> 40GB/s. Is that straight into L1 or does the measurement include writeback?
It is the average elapsed time in clear_page, so it would include the
writeback of any cache lines displaced by the zeroing, but not the
writeback of the newly-zeroed cache lines (which we hope will be
modified by the program before they get written back anyway).
This is using the dcbz (data cache block zero) instruction, which
establishes a cache line in modified state with zero contents without
any memory traffic other than a cache line kill transaction sent to
the other CPUs and possible writeback of a dirty cache line displaced
by the newly-zeroed cache line. The new cache line is established in
the L2 cache, because the L1 is write-through on the G5, and all
stores and dcbz instructions have to go to the L2 cache.
Thus, on the G5 (and POWER4, which is similar) I don't think there
will be much if any benefit from having pre-zeroed cache-cold pages.
We can establish the zero lines in cache much faster using dcbz than
we can by reading them in from main memory. If the program uses only
a few cache lines out of each new page, then reading them from memory
might be faster, but that seems unlikely.
Paul.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next prev parent reply other threads:[~2004-12-23 23:00 UTC|newest]
Thread overview: 80+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <B8E391BBE9FE384DAA4C5C003888BE6F02900FBD@scsmsx401.amr.corp.intel.com>
[not found] ` <41C20E3E.3070209@yahoo.com.au>
2004-12-21 19:55 ` Increase page fault rate by prezeroing V1 [0/3]: Overview Christoph Lameter
2004-12-21 19:56 ` Increase page fault rate by prezeroing V1 [1/3]: Introduce __GFP_ZERO Christoph Lameter
2004-12-21 19:56 ` Christoph Lameter
2004-12-21 19:57 ` Increase page fault rate by prezeroing V1 [2/3]: zeroing and scrubd Christoph Lameter
2004-12-21 19:57 ` Christoph Lameter
2004-12-21 19:57 ` Increase page fault rate by prezeroing V1 [3/3]: Altix SN2 BTE Zeroing Christoph Lameter
2004-12-21 19:57 ` Christoph Lameter
2004-12-23 19:29 ` Prezeroing V2 [0/3]: Why and When it works Christoph Lameter
2004-12-23 19:33 ` Prezeroing V2 [1/4]: __GFP_ZERO / clear_page() removal Christoph Lameter
2004-12-23 19:33 ` Prezeroing V2 [2/4]: add second parameter to clear_page() for all arches Christoph Lameter
2004-12-24 8:33 ` Pavel Machek
2004-12-24 16:18 ` Christoph Lameter
2004-12-24 16:27 ` Pavel Machek
2004-12-24 17:02 ` David S. Miller
2004-12-24 17:05 ` David S. Miller
2004-12-27 22:48 ` David S. Miller
2005-01-03 17:52 ` Christoph Lameter
2005-01-01 10:24 ` Geert Uytterhoeven
2005-01-04 23:12 ` Prezeroing V3 [0/4]: Discussion and i386 performance tests Christoph Lameter
2005-01-04 23:13 ` Prezeroing V3 [1/4]: Allow request for zeroed memory Christoph Lameter
2005-01-04 23:45 ` Dave Hansen
2005-01-05 1:16 ` Christoph Lameter
2005-01-05 1:26 ` Linus Torvalds
2005-01-05 23:11 ` Christoph Lameter
2005-01-05 0:34 ` Linus Torvalds
2005-01-05 0:47 ` Andrew Morton
2005-01-05 1:15 ` Christoph Lameter
2005-01-08 21:12 ` Hugh Dickins
2005-01-08 21:56 ` David S. Miller
2005-01-21 20:09 ` alloc_zeroed_user_highpage to fix the clear_user_highpage issue Christoph Lameter
2005-02-09 9:58 ` [Patch] Fix oops in alloc_zeroed_user_highpage() when page is NULL Michael Ellerman
2005-01-21 20:12 ` Extend clear_page by an order parameter Christoph Lameter
2005-01-21 22:29 ` Paul Mackerras
2005-01-21 23:48 ` Christoph Lameter
2005-01-22 0:35 ` Paul Mackerras
2005-01-22 0:43 ` Andrew Morton
2005-01-22 1:08 ` Paul Mackerras
2005-01-22 1:20 ` Roman Zippel
2005-01-22 1:25 ` Paul Mackerras
2005-01-22 1:54 ` Christoph Lameter
2005-01-22 2:53 ` Paul Mackerras
2005-01-23 7:45 ` Andrew Morton
2005-01-24 16:37 ` Christoph Lameter
2005-01-24 20:23 ` David S. Miller
2005-01-24 20:33 ` Christoph Lameter
2005-01-21 20:15 ` A scrub daemon (prezeroing) Christoph Lameter
2005-01-10 17:16 ` Prezeroing V3 [1/4]: Allow request for zeroed memory Christoph Lameter
2005-01-10 18:13 ` Linus Torvalds
2005-01-10 20:17 ` Christoph Lameter
2005-01-10 23:53 ` Prezeroing V4 [0/4]: Overview Christoph Lameter
2005-01-10 23:54 ` Prezeroing V4 [1/4]: Arch specific page zeroing during page fault Christoph Lameter
2005-01-11 0:41 ` Chris Wright
2005-01-11 0:46 ` Christoph Lameter
2005-01-11 0:49 ` Chris Wright
2005-01-10 23:55 ` Prezeroing V4 [2/4]: Zeroing implementation Christoph Lameter
2005-01-10 23:55 ` Prezeroing V4 [3/4]: Altix SN2 BTE zero driver Christoph Lameter
2005-01-10 23:56 ` Prezeroing V4 [4/4]: Extend clear_page to take an order parameter Christoph Lameter
2005-01-04 23:14 ` Prezeroing V3 [2/4]: Extension of " Christoph Lameter
2005-01-05 23:25 ` Christoph Lameter
2005-01-04 23:15 ` Prezeroing V3 [3/4]: Page zeroing through kscrubd Christoph Lameter
2005-01-04 23:16 ` Prezeroing V3 [4/4]: Driver for hardware zeroing on Altix Christoph Lameter
2004-12-23 19:34 ` Prezeroing V2 [3/4]: Add support for ZEROED and NOT_ZEROED free maps Christoph Lameter
2004-12-23 19:35 ` Prezeroing V2 [4/4]: Hardware Zeroing through SGI BTE Christoph Lameter
2004-12-23 20:08 ` Prezeroing V2 [1/4]: __GFP_ZERO / clear_page() removal Brian Gerst
2004-12-24 16:24 ` Christoph Lameter
2004-12-23 19:49 ` Prezeroing V2 [0/3]: Why and When it works Arjan van de Ven
2004-12-23 20:57 ` Matt Mackall
2004-12-23 21:01 ` Paul Mackerras
2004-12-23 21:11 ` Paul Mackerras
2004-12-23 21:37 ` Andrew Morton
2004-12-23 23:00 ` Paul Mackerras [this message]
2004-12-23 21:48 ` Linus Torvalds
2004-12-23 22:34 ` Zwane Mwaikambo
2004-12-24 9:14 ` Arjan van de Ven
2004-12-24 18:21 ` Linus Torvalds
2004-12-24 18:57 ` Arjan van de Ven
2004-12-27 22:50 ` David S. Miller
2004-12-28 11:53 ` Marcelo Tosatti
2004-12-24 16:17 ` Christoph Lameter
2004-12-21 19:55 ` Increase page fault rate by prezeroing V1 [0/3]: Overview Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=16843.19972.17026.69228@cargo.ozlabs.ibm.com \
--to=paulus@samba.org \
--cc=akpm@osdl.org \
--cc=clameter@sgi.com \
--cc=linux-ia64@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox