From: Brent Casavant <bcasavan@sgi.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] /dev/zero page fault scaling
Date: Fri, 16 Jul 2004 17:35:33 -0500
Message-ID: <Pine.SGI.4.58.0407161639110.118146@kzerza.americas.sgi.com>
In-Reply-To: <Pine.LNX.4.44.0407152038160.8010-100000@localhost.localdomain>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2753 bytes --]
On Thu, 15 Jul 2004, Hugh Dickins wrote:
> I'm as likely to find a 512P machine as a basilisk, so scalability
> testing I leave to you.
OK, I managed to grab some time on the machine today. Parallel
page faulting for /dev/zero and SysV shared memory has definitely
improved in the first few test cases I have.
The test we have is a program that specifically targets page faulting.
This test program was written after observing some of these issues in
MPI and OpenMP applications.
The test program does this:
1. Forks N child processes, or creates N Pthreads.
2. Each child/thread creates a memory object via malloc,
   mmap of /dev/zero, or shmget.
3. Each child/thread touches each page of the memory object
   by writing a single byte to the page.
4. Time to perform step 3 is measured.
5. The results are aggregated by the main process/thread
   and a report is generated, including statistics such as
   pagefaults per CPU per wallclock second.
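To make that concrete, here is a minimal sketch of the sort of thing
each child does in the /dev/zero case. This is not the actual test
program -- NCHILD and MEM_PER_CHILD are made-up placeholders, and the
malloc/shmget and pthread variants are omitted:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/wait.h>

#define NCHILD        4
#define MEM_PER_CHILD (100UL << 20)    /* 100MB per child, as in our runs */

static double now(void)
{
	struct timeval tv;
	gettimeofday(&tv, NULL);
	return tv.tv_sec + tv.tv_usec / 1e6;
}

static void child(void)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	int fd = open("/dev/zero", O_RDWR);
	char *mem;
	unsigned long off;
	double t0, t1;

	if (fd < 0) {
		perror("open /dev/zero");
		_exit(1);
	}

	/* Step 2: each child creates its own memory object. */
	mem = mmap(NULL, MEM_PER_CHILD, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE, fd, 0);
	if (mem == MAP_FAILED) {
		perror("mmap");
		_exit(1);
	}

	/* Steps 3-4: write one byte per page, timing the faults. */
	t0 = now();
	for (off = 0; off < MEM_PER_CHILD; off += pagesize)
		mem[off] = 1;
	t1 = now();

	printf("pid %d: %lu faults in %.3f sec\n",
	       (int)getpid(), MEM_PER_CHILD / pagesize, t1 - t0);
	_exit(0);
}

int main(void)
{
	int i;

	/* Step 1: fork N children (the pthread variant is analogous). */
	for (i = 0; i < NCHILD; i++)
		if (fork() == 0)
			child();

	/* Step 5: the real program aggregates per-child results here. */
	for (i = 0; i < NCHILD; i++)
		wait(NULL);
	return 0;
}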
Another variant has the main thread/process create the memory object
and assign each child/thread a range to touch; the children/threads
then skip the object creation stage and go straight to step 3. We
call this the "preallocate" option.
In our case we typically run with 100MB per child/thread, and run a
sequence of CPU counts in powers of 2, up to 512. All of the work
to this point has been on the fork-without-preallocation variants.
I'm now looking at the fork-with-preallocation variants, and find
that we're hammering *VERY* hard on the shmem_inode_info i_lock,
mostly in the shmem_getpage code. In fact, performance drops off
significantly even at 4P, and gets positively horrible by 32P
(you don't even want to know about >32P -- but things get 2-4x
worse with each doubling of CPUs).
Just so you can see it, I've attached the most recent run output
from the program. Take a look at the next-to-last column of numbers.
In the past few days the last few rows of the second and third test
cases have gone from 2-3 digit numbers to 5-digit numbers -- that's
what we've been concentrating on.
Note that due to hardware failures the machine is only running 510 CPUs
in the attached output, and that things got so miserably slow that I
didn't even let the runs finish. The last column is meaningless and
always 0. Also, the label "shared" means "preallocate" from the
discussion above. Oh, and this is a 2.6.7-based kernel -- I'll change
to 2.6.8 sometime soon.
Anyway, the i_lock is my next vic^H^H^Hsubject of investigation.
Cheers, and have a great weekend,
Brent
--
Brent Casavant bcasavan@sgi.com Forget bright-eyed and
Operating System Engineer http://www.sgi.com/ bushy-tailed; I'm red-
Silicon Graphics, Inc. 44.8562N 93.1355W 860F eyed and bushy-haired.
[-- Attachment #2: shmem scaling test case results --]
[-- Type: TEXT/PLAIN, Size: 6448 bytes --]
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork 1 0.059 0.059 0.059 0.000 0.059 0.059 104174 104174 104174 0
fork 2 0.091 0.090 0.178 0.003 0.181 0.090 134419 68686 67209 0
fork 4 0.091 0.089 0.354 0.007 0.360 0.090 268838 69066 67209 0
fork 8 0.091 0.089 0.710 0.013 0.723 0.090 537677 68781 67209 0
fork 16 0.092 0.089 1.420 0.021 1.440 0.090 1063914 68781 66494 0
fork 32 0.092 0.089 2.835 0.048 2.883 0.090 2127828 68899 66494 0
fork 64 0.092 0.088 5.697 0.073 5.771 0.090 4255657 68569 66494 0
fork 128 0.092 0.089 11.381 0.163 11.544 0.090 8511314 68651 66494 0
fork 256 0.117 0.058 22.773 0.314 23.088 0.090 13334391 68616 52087 0
fork 510 0.094 0.057 45.409 0.657 46.066 0.090 33205760 68555 65109 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:zero 1 0.064 0.064 0.064 0.000 0.064 0.064 94704 94704 94704 0
fork:zero 2 0.094 0.094 0.186 0.001 0.187 0.093 130218 65794 65109 0
fork:zero 4 0.094 0.092 0.368 0.003 0.371 0.093 260437 66318 65109 0
fork:zero 8 0.095 0.092 0.729 0.012 0.741 0.093 515504 66939 64438 0
fork:zero 16 0.094 0.091 1.450 0.025 1.476 0.092 1041749 67345 65109 0
fork:zero 32 0.095 0.091 2.923 0.037 2.960 0.092 2062019 66827 64438 0
fork:zero 64 0.095 0.091 5.814 0.092 5.906 0.092 4124038 67187 64438 0
fork:zero 128 0.097 0.090 11.831 0.181 12.012 0.094 8081449 66039 63136 0
fork:zero 256 0.107 0.068 24.208 0.326 24.534 0.096 14546609 64549 56822 0
fork:zero 510 0.475 0.054 173.469 0.683 174.151 0.341 6559162 17945 12861 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:shmem 1 0.063 0.063 0.062 0.002 0.063 0.063 96161 99214 96161 0
fork:shmem 2 0.094 0.093 0.185 0.002 0.187 0.093 130218 66142 65109 0
fork:shmem 4 0.094 0.091 0.363 0.007 0.370 0.093 260437 67209 65109 0
fork:shmem 8 0.094 0.092 0.726 0.013 0.738 0.092 520874 67300 65109 0
fork:shmem 16 0.094 0.092 1.452 0.022 1.475 0.092 1041749 67254 65109 0
fork:shmem 32 0.094 0.091 2.906 0.045 2.951 0.092 2083498 67209 65109 0
fork:shmem 64 0.096 0.092 5.823 0.090 5.913 0.092 4081956 67085 63780 0
fork:shmem 128 0.096 0.091 11.659 0.179 11.838 0.092 8163913 67012 63780 0
fork:shmem 256 0.098 0.063 23.380 0.348 23.728 0.093 16001270 66836 62504 0
fork:shmem 510 0.489 0.048 177.804 0.656 178.460 0.350 6362780 17508 12476 0
TYPE: CPUS MAX_WALL MIN_WALL SYS USER TOTCPU TOTCPU/CPU TOT_PF/WALL_SEC TOT_PF/SYS_SEC PF/WSEC/CPU NODES
fork:zero:shared 1 0.064 0.064 0.062 0.002 0.064 0.064 94704 97664 94704 0
fork:zero:shared 2 0.096 0.095 0.189 0.001 0.190 0.095 127561 64438 63780 0
fork:zero:shared 4 0.213 0.210 0.845 0.003 0.848 0.212 114688 28904 28672 0
fork:zero:shared 8 0.985 0.935 5.220 0.009 5.229 0.654 49557 9355 6194 0
fork:zero:shared 16 3.213 2.811 17.494 0.021 17.516 1.095 30397 5582 1899 0
fork:zero:shared 32 6.832 5.795 44.188 0.052 44.240 1.383 28590 4420 893 0
fork:zero:shared 64 14.677 11.181 128.041 0.082 128.123 2.002 26617 3051 415 0
fork:zero:shared 128 29.026 3.561 282.180 0.172 282.352 2.206 26917 2768 210 0