From: Christoph Lameter <clameter@sgi.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: akpm@linux-foundation.org, linux-ia64@vger.kernel.org,
	linux-kernel@vger.kernel.org, Martin Bligh <mbligh@google.com>,
	linux-mm@kvack.org, Andi Kleen <ak@suse.de>,
	Dave Hansen <hansendc@us.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: RE: [PATCH 4/4] IA64: SPARSE_VIRTUAL 16M page size support
Date: Fri, 6 Apr 2007 10:16:33 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0704061004140.25652@schroedinger.engr.sgi.com>
In-Reply-To: <617E1C2C70743745A92448908E030B2A0153594A@scsmsx411.amr.corp.intel.com>

On Thu, 5 Apr 2007, Luck, Tony wrote:

> > This implements granule page sized vmemmap support for IA64.
> 
> Christoph,
> 
> Your calculations here are all based on a granule size of 16M, but
> it is possible to configure 64M granules.

Hmm... Maybe we need a separate size for the vmemmap mappings instead of 
tying it to the granule size?

> With current sizeof(struct page) == 56, a 16M page will hold enough
> page structures for about 4.5G of physical space (assuming 16K pages),
> so a 64M page would cover 18G.

Yes, that is far too much.
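
For reference, here is the arithmetic behind those figures as a quick
back-of-the-envelope sketch (a standalone test program with made-up names,
not kernel code; it only assumes sizeof(struct page) == 56 and 16K base
pages as quoted above):

#include <stdio.h>

/*
 * Back-of-the-envelope check of the coverage figures above: how much
 * physical memory the page structs held in one vmemmap block can
 * describe, given sizeof(struct page) == 56 and a 16K base page size.
 */
int main(void)
{
	const unsigned long long page_struct = 56;		/* sizeof(struct page) */
	const unsigned long long base_page = 16ULL << 10;	/* 16K base pages */
	const unsigned long long blocks[] = { 16ULL << 20, 64ULL << 20 };
	unsigned int i;

	for (i = 0; i < sizeof(blocks) / sizeof(blocks[0]); i++) {
		unsigned long long structs = blocks[i] / page_struct;
		unsigned long long covered = structs * base_page;

		printf("%lluM block: %llu page structs, ~%.1fG covered\n",
		       blocks[i] >> 20, structs,
		       covered / (double)(1ULL << 30));
	}
	return 0;	/* ~4.6G for a 16M block, ~18.3G for 64M */
}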

> Maybe a granule is not the right unit of allocation ... perhaps 4M
> would work better (4M/56 ~= 75000 pages ~= 1.1G)?  But if this is
> too small, then a hard-coded 16M would be better than a granule,
> because 64M is (IMHO) too big.

I have some measurements, 1M vs. 16M, that I took last year when I first 
developed this approach:

1. 16k vmm page size

Tasks    jobs/min  jti  jobs/min/task      real       cpu
    1     2434.08  100      2434.0771      2.46      0.02   Thu Oct 12 03:22:20 2006
  100   178784.27   93      1787.8427      3.36      7.14   Thu Oct 12 03:22:34 2006
  200   279199.63   94      1395.9981      4.30     14.70   Thu Oct 12 03:22:52 2006
  300   340909.09   92      1136.3636      5.28     22.55   Thu Oct 12 03:23:14 2006
  400   381133.87   90       952.8347      6.30     30.64   Thu Oct 12 03:23:40 2006
  500   408942.20   93       817.8844      7.34     38.90   Thu Oct 12 03:24:10 2006
  600   430673.53   89       717.7892      8.36     47.15   Thu Oct 12 03:24:45 2006
  700   445859.87   92       636.9427      9.42     55.59   Thu Oct 12 03:25:23 2006
  800   460564.19   94       575.7052     10.42     63.57   Thu Oct 12 03:26:06 2006

2. 1M vmm page size

Tasks    jobs/min  jti  jobs/min/task      real       cpu
    1     2435.06  100      2435.0649      2.46      0.02   Thu Oct 12 03:08:25 2006
  100   178041.54   93      1780.4154      3.37      7.18   Thu Oct 12 03:08:39 2006
  200   278035.22   96      1390.1761      4.32     14.85   Thu Oct 12 03:08:57 2006
  300   338536.77   96      1128.4559      5.32     22.90   Thu Oct 12 03:09:19 2006
  400   377180.58   89       942.9514      6.36     31.19   Thu Oct 12 03:09:46 2006
  500   407000.41   96       814.0008      7.37     39.21   Thu Oct 12 03:10:16 2006
  600   428979.98   91       714.9666      8.39     47.43   Thu Oct 12 03:10:51 2006
  700   444209.41   94       634.5849      9.46     55.86   Thu Oct 12 03:11:30 2006
  800   455753.89   93       569.6924     10.53     64.59   Thu Oct 12 03:12:13 2006

4M would be right in the middle and maybe not so bad.

Note that these numbers were based on a more complex TLB handler.
See http://marc.info/?l=linux-ia64&m=116069969308257&w=2 (variable
kernel page size handler).

The problem with a different page size is that it would require a 
redesign of the TLB lookup logic. We could go back to my variable kernel 
page size patch quoted above, but then we would have to walk the complete 
page table.

As far as I can tell, the 1 level lookup only works well with 16M.

If we tried to use a 1 level lookup with a 4M page size, we would need a 
linear lookup table that takes up about 4MB to support 1 petabyte of 
physical memory.
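
To spell out that sizing (again just a standalone back-of-the-envelope
sketch with made-up names; the conceptual lookup in the comment and the
4 byte entry size are my assumptions, not necessarily what the actual
fault handler would do):

#include <stdio.h>

/*
 * Rough sizing of a 1 level (linear) lookup table for the vmemmap,
 * i.e. one table entry per vmemmap block, conceptually something like
 *
 *	block = table[(vmemmap_addr - VMEMMAP_START) / block_size];
 *
 * Assumes 1 petabyte of physical memory, 16K base pages and
 * sizeof(struct page) == 56; the 4 byte entry size is only for
 * illustration.
 */
int main(void)
{
	const unsigned long long phys = 1ULL << 50;		/* 1 PB */
	const unsigned long long base_page = 16ULL << 10;	/* 16K */
	const unsigned long long page_struct = 56;
	const unsigned long long block = 4ULL << 20;		/* 4M vmemmap blocks */
	const unsigned long long entry_size = 4;		/* bytes per entry (assumed) */

	unsigned long long vmemmap_bytes = phys / base_page * page_struct;
	unsigned long long entries = vmemmap_bytes / block;

	printf("vmemmap for 1PB: ~%.1fT, entries: %llu, table: ~%.1fM\n",
	       vmemmap_bytes / (double)(1ULL << 40), entries,
	       entries * entry_size / (double)(1ULL << 20));
	/*
	 * -> ~3.5T of vmemmap, ~900K entries, a table in the 3-4MB range
	 *    (twice that with 8 byte entries); with 16M blocks the table
	 *    would shrink by a factor of four.
	 */
	return 0;
}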

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org
