From: Brent Casavant <bcasavan@sgi.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] /dev/zero page fault scaling
Date: Fri, 16 Jul 2004 17:35:33 -0500
Message-ID: <Pine.SGI.4.58.0407161639110.118146@kzerza.americas.sgi.com>
In-Reply-To: <Pine.LNX.4.44.0407152038160.8010-100000@localhost.localdomain>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2753 bytes --]
On Thu, 15 Jul 2004, Hugh Dickins wrote:
> I'm as likely to find a 512P machine as a basilisk, so scalability
> testing I leave to you.
OK, I managed to grab some time on the machine today. Parallel
page faulting for /dev/zero and SysV shared memory has definitely
improved in the first few test cases I have.
The test we have is a program that specifically targets page faulting.
This test program was written after observing some of these issues in
MPI and OpenMP applications.
The test program does this:
1. Forks N child processes, or creates N Pthreads.
2. Each child/thread creates a memory object via malloc,
   mmap of /dev/zero, or shmget.
3. Each child/thread touches each page of the memory object
   by writing a single byte to the page.
4. Time to perform step 3 is measured.
5. The results are aggregated by the main process/thread
   and a report is generated, including statistics such as
   pagefaults per CPU per wallclock second.
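To make that concrete, here is a minimal sketch of the sort of thing
each child does in the /dev/zero case. This is not the actual test
program -- NCHILD and MEM_PER_CHILD are made-up placeholders, and the
malloc/shmget and pthread variants are omitted:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/wait.h>

#define NCHILD        4
#define MEM_PER_CHILD (100UL << 20)    /* 100MB per child, as in our runs */

static double now(void)
{
	struct timeval tv;
	gettimeofday(&tv, NULL);
	return tv.tv_sec + tv.tv_usec / 1e6;
}

static void child(void)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	int fd = open("/dev/zero", O_RDWR);
	char *mem;
	unsigned long off;
	double t0, t1;

	if (fd < 0) {
		perror("open /dev/zero");
		_exit(1);
	}

	/* Step 2: each child creates its own memory object. */
	mem = mmap(NULL, MEM_PER_CHILD, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE, fd, 0);
	if (mem == MAP_FAILED) {
		perror("mmap");
		_exit(1);
	}

	/* Steps 3-4: write one byte per page, timing the faults. */
	t0 = now();
	for (off = 0; off < MEM_PER_CHILD; off += pagesize)
		mem[off] = 1;
	t1 = now();

	printf("pid %d: %lu faults in %.3f sec\n",
	       (int)getpid(), MEM_PER_CHILD / pagesize, t1 - t0);
	_exit(0);
}

int main(void)
{
	int i;

	/* Step 1: fork N children (the pthread variant is analogous). */
	for (i = 0; i < NCHILD; i++)
		if (fork() == 0)
			child();

	/* Step 5: the real program aggregates per-child results here. */
	for (i = 0; i < NCHILD; i++)
		wait(NULL);
	return 0;
}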
Another variant has the main thread/process create the memory object
and assign each child/thread a range to touch; the children/threads
then skip the object creation stage and go straight to step 3. We
call this the "preallocate" option.
In our case we typically run with 100MB per child/thread, and run a
sequence of CPU counts in powers of 2, up to 512. All of the work
to this point has been on the fork-without-preallocation variants.
I'm now looking at the fork-with-preallocation variants, and find
that we're hammering *VERY* hard on the shmem_inode_info i_lock,
mostly in the shmem_getpage code. In fact, performance drops off
significantly even at 4P, and gets positively horrible by 32P
(you don't even want to know about >32P -- but things get 2-4x
worse with each doubling of CPUs).
Just so you can see it, I've attached the most recent run output
from the program. Take a look at the next-to-last column of numbers.
In the past few days the last few rows of the second and third test
cases have gone from 2-3 digit numbers to 5-digit numbers -- that's
what we've been concentrating on.
Note that due to hardware failures the machine is only running 510 CPUs
in the attached output, and that things got so miserably slow that I
didn't even let the runs finish. The last column is meaningless and
always 0. Also, the label "shared" means "preallocate" from the
discussion above. Oh, and this is a 2.6.7-based kernel -- I'll change
to 2.6.8 sometime soon.
Anyway, the i_lock is my next vic^H^H^Hsubject of investigation.
Cheers, and have a great weekend,
Brent
--
Brent Casavant bcasavan@sgi.com Forget bright-eyed and
Operating System Engineer http://www.sgi.com/ bushy-tailed; I'm red-
Silicon Graphics, Inc. 44.8562N 93.1355W 860F eyed and bushy-haired.
[-- Attachment #2: shmem scaling test case results --]
[-- Type: TEXT/PLAIN, Size: 6448 bytes --]
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork 1 0.059 0.059 0.059 0.000 0.059 0.059 104174 104174 104174 0
fork 2 0.091 0.090 0.178 0.003 0.181 0.090 134419 68686 67209 0
fork 4 0.091 0.089 0.354 0.007 0.360 0.090 268838 69066 67209 0
fork 8 0.091 0.089 0.710 0.013 0.723 0.090 537677 68781 67209 0
fork 16 0.092 0.089 1.420 0.021 1.440 0.090 1063914 68781 66494 0
fork 32 0.092 0.089 2.835 0.048 2.883 0.090 2127828 68899 66494 0
fork 64 0.092 0.088 5.697 0.073 5.771 0.090 4255657 68569 66494 0
fork 128 0.092 0.089 11.381 0.163 11.544 0.090 8511314 68651 66494 0
fork 256 0.117 0.058 22.773 0.314 23.088 0.090 13334391 68616 52087 0
fork 510 0.094 0.057 45.409 0.657 46.066 0.090 33205760 68555 65109 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:zero 1 0.064 0.064 0.064 0.000 0.064 0.064 94704 94704 94704 0
fork:zero 2 0.094 0.094 0.186 0.001 0.187 0.093 130218 65794 65109 0
fork:zero 4 0.094 0.092 0.368 0.003 0.371 0.093 260437 66318 65109 0
fork:zero 8 0.095 0.092 0.729 0.012 0.741 0.093 515504 66939 64438 0
fork:zero 16 0.094 0.091 1.450 0.025 1.476 0.092 1041749 67345 65109 0
fork:zero 32 0.095 0.091 2.923 0.037 2.960 0.092 2062019 66827 64438 0
fork:zero 64 0.095 0.091 5.814 0.092 5.906 0.092 4124038 67187 64438 0
fork:zero 128 0.097 0.090 11.831 0.181 12.012 0.094 8081449 66039 63136 0
fork:zero 256 0.107 0.068 24.208 0.326 24.534 0.096 14546609 64549 56822 0
fork:zero 510 0.475 0.054 173.469 0.683 174.151 0.341 6559162 17945 12861 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:shmem 1 0.063 0.063 0.062 0.002 0.063 0.063 96161 99214 96161 0
fork:shmem 2 0.094 0.093 0.185 0.002 0.187 0.093 130218 66142 65109 0
fork:shmem 4 0.094 0.091 0.363 0.007 0.370 0.093 260437 67209 65109 0
fork:shmem 8 0.094 0.092 0.726 0.013 0.738 0.092 520874 67300 65109 0
fork:shmem 16 0.094 0.092 1.452 0.022 1.475 0.092 1041749 67254 65109 0
fork:shmem 32 0.094 0.091 2.906 0.045 2.951 0.092 2083498 67209 65109 0
fork:shmem 64 0.096 0.092 5.823 0.090 5.913 0.092 4081956 67085 63780 0
fork:shmem 128 0.096 0.091 11.659 0.179 11.838 0.092 8163913 67012 63780 0
fork:shmem 256 0.098 0.063 23.380 0.348 23.728 0.093 16001270 66836 62504 0
fork:shmem 510 0.489 0.048 177.804 0.656 178.460 0.350 6362780 17508 12476 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:zero:shared 1 0.064 0.064 0.062 0.002 0.064 0.064 94704 97664 94704 0
fork:zero:shared 2 0.096 0.095 0.189 0.001 0.190 0.095 127561 64438 63780 0
fork:zero:shared 4 0.213 0.210 0.845 0.003 0.848 0.212 114688 28904 28672 0
fork:zero:shared 8 0.985 0.935 5.220 0.009 5.229 0.654 49557 9355 6194 0
fork:zero:shared 16 3.213 2.811 17.494 0.021 17.516 1.095 30397 5582 1899 0
fork:zero:shared 32 6.832 5.795 44.188 0.052 44.240 1.383 28590 4420 893 0
fork:zero:shared 64 14.677 11.181 128.041 0.082 128.123 2.002 26617 3051 415 0
fork:zero:shared 128 29.026 3.561 282.180 0.172 282.352 2.206 26917 2768 210 0