From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Nishanth Aravamudan <nacc@us.ibm.com>,
Christoph Lameter <clameter@sgi.com>
Cc: William Lee Irwin III <wli@holomorphy.com>,
anton@samba.org, akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH v4][RFC] hugetlb: add per-node nr_hugepages sysfs attribute
Date: Wed, 13 Jun 2007 16:05:10 -0400
Message-ID: <1181765111.6148.98.camel@localhost>
In-Reply-To: <20070613191908.GR3798@us.ibm.com>
On Wed, 2007-06-13 at 12:19 -0700, Nishanth Aravamudan wrote:
> On 13.06.2007 [14:23:47 -0400], Lee Schermerhorn wrote:
> > On Wed, 2007-06-13 at 08:28 -0700, Nishanth Aravamudan wrote:
> > <snip>
> > >
> > > commit 05a7edb8c909c674cdefb0323348825cf3e2d1d0
> > > Author: Nishanth Aravamudan <nacc@us.ibm.com>
> > > Date: Thu Jun 7 08:54:48 2007 -0700
> > >
> > > hugetlb: add per-node nr_hugepages sysfs attribute
> > >
> > > Allow specifying the number of hugepages to allocate on a particular
> > > node. Our current global sysctl will try its best to put hugepages
> > > equally on each node, but that may not always be desired. This allows
> > > the admin to control the layout of hugepage allocation at a finer level
> > > (while not breaking the existing interface). Add callbacks in the sysfs
> > > node registration and unregistration functions into hugetlb to add the
> > > nr_hugepages attribute, which is a no-op if !NUMA or !HUGETLB.
> > >
> > > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> > > Cc: William Lee Irwin III <wli@holomorphy.com>
> > > Cc: Christoph Lameter <clameter@sgi.com>
> > > Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> > > Cc: Anton Blanchard <anton@samba.org>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > >
> > > ---
> > > Do the dummy function definitions need to be (void)0?
> > >
> >
> > <snip>
I tested hugepage allocation on my HP rx8620 platform [16 cpu ia64, 32GB
in 4 "real" nodes and one pseudo-node containing only DMA memory]. As
expected, I don't get a balanced distribution across the real nodes.
Here's what I see:
# before allocating huge pages:
root@gwydyr(root):cat /sys/devices/system/node/node*/meminfo | grep HugeP
Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 2 HugePages_Total: 0
Node 2 HugePages_Free: 0
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 4 HugePages_Total: 0
Node 4 HugePages_Free: 0
# Now allocate 64 256MB pages. Only nodes 0-3 have NORMAL memory.
# Node 4 contains ~512MB of DMA memory. Some has already been
# used, so I doubt that even 1 256MB [aligned] huge page is available.
root@gwydyr(root):echo 64 >/proc/sys/vm/nr_hugepages
root@gwydyr(root):cat /sys/devices/system/node/node*/meminfo | grep HugeP
Node 0 HugePages_Total: 13 <---???
Node 0 HugePages_Free: 26 <---???
Node 1 HugePages_Total: 12
Node 1 HugePages_Free: 12
Node 2 HugePages_Total: 13
Node 2 HugePages_Free: 13
Node 3 HugePages_Total: 13
Node 3 HugePages_Free: 13
Node 4 HugePages_Total: 13 <---???
Node 4 HugePages_Free: 0
# 13 of the pages say they're from Node 4, but I know that node has only
~512MB of memory, of which some is already used. It's unlikely that I can
allocate even 1 256MB huge page there because of alignment. Note that the
free pages are accounted on Node 0, where they actually reside.
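A quick way to spot this kind of mismatch is to parse the per-node meminfo lines and flag any node whose free count exceeds its total. The sketch below is illustrative only: the helper names are made up for this example, and the embedded text is the post-allocation output shown above with the "<---???" annotations dropped.

```python
import re

# Post-allocation per-node counters, copied from the output above.
MEMINFO = """\
Node 0 HugePages_Total: 13
Node 0 HugePages_Free: 26
Node 1 HugePages_Total: 12
Node 1 HugePages_Free: 12
Node 2 HugePages_Total: 13
Node 2 HugePages_Free: 13
Node 3 HugePages_Total: 13
Node 3 HugePages_Free: 13
Node 4 HugePages_Total: 13
Node 4 HugePages_Free: 0
"""

LINE = re.compile(r"Node (\d+) HugePages_(Total|Free):\s+(\d+)")


def parse_hugepages(text):
    """Return {node: {"Total": n, "Free": n}} from per-node meminfo lines."""
    counts = {}
    for node, field, value in LINE.findall(text):
        counts.setdefault(int(node), {})[field] = int(value)
    return counts


def inconsistent_nodes(counts):
    """Nodes reporting more free huge pages than total huge pages."""
    return sorted(n for n, c in counts.items() if c["Free"] > c["Total"])


counts = parse_hugepages(MEMINFO)
grand_total = sum(c["Total"] for c in counts.values())
grand_free = sum(c["Free"] for c in counts.values())
surplus_on_node0 = counts[0]["Free"] - counts[0]["Total"]
```

The global sums agree (64 total, 64 free), so no pages are lost. Only the per-node attribution is off: node 0's surplus of free pages equals the 13 pages credited to node 4.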
Here's some zoneinfo after the allocation above [forgot to snap it
before].
# zoneinfo shell function contains:
# cat /proc/zoneinfo | egrep '^Node|^ pages |^ *present|^ *spanned'
# results after allocating huge pages
root@gwydyr(root):zoneinfo
Node 0, zone Normal
pages free 36157
spanned 486400
present 484738
Node 1, zone Normal
pages free 318034
spanned 520192
present 518413
Node 2, zone Normal
pages free 301526
spanned 520192
present 518414
Node 3, zone Normal
pages free 301932
spanned 520182
present 518362
Node 4, zone DMA
pages free 31706
spanned 32767
present 32656
^^^^^^^^^^^^^^^^^^^^^^ Nope! no huge pages allocated from here!
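As a rough cross-check of the zoneinfo figures, the number of huge pages resident in a zone can be bounded by (present - free) divided by the number of base pages per huge page. This is only a sketch. The 16KB base page size is an assumption: it is not stated in this message, but it is consistent with node 4's 32656 present pages amounting to ~512MB.

```python
BASE_PAGE_KB = 16    # assumption: base page size, not stated in the message
HUGE_PAGE_MB = 256   # huge page size given in the message
BASE_PAGES_PER_HUGE = HUGE_PAGE_MB * 1024 // BASE_PAGE_KB  # 16384

# node: (pages free, present), copied from the zoneinfo output above.
ZONEINFO = {
    0: (36157, 484738),
    1: (318034, 518413),
    2: (301526, 518414),
    3: (301932, 518362),
    4: (31706, 32656),
}


def max_huge_pages(free, present):
    """Upper bound on huge pages resident in a zone.

    Every in-use base page is attributed to huge pages, so the real
    number can only be lower than this.
    """
    return (present - free) // BASE_PAGES_PER_HUGE


bounds = {node: max_huge_pages(free, present)
          for node, (free, present) in ZONEINFO.items()}
```

Under that assumption node 4 has only 950 base pages in use, far short of the 16384 needed for a single huge page, while node 0 has room for the 26 pages it reports as free.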
# now try to free the huge pages.
root@gwydyr(root):echo 0 >/proc/sys/vm/nr_hugepages
root@gwydyr(root):cat /sys/devices/system/node/node*/meminfo | grep HugeP
Node 0 HugePages_Total: 4294967283 <--- ???
Node 0 HugePages_Free: 0
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 2 HugePages_Total: 0
Node 2 HugePages_Free: 0
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 4 HugePages_Total: 13 <---??? they weren't really there to begin with!
Node 4 HugePages_Free: 0
# Apparently on remove, the pages were decremented from node 0 instead
of node 4 where they were accounted for on allocation, resulting in a
negative count on node 0 and the original 13 count still on node 4.
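The odd value on node 0 fits that explanation: 4294967283 is 2^32 - 13, i.e. -13 stored in an unsigned 32-bit counter. Here is a small model of the accounting, assuming (this is a guess, not something checked against the code) that the per-node counter is unsigned and 32 bits wide, that allocation credits the requested node, and that freeing debits the node where the page actually resides.

```python
COUNTER_BITS = 32                 # assumption: width of the per-node counter
MASK = (1 << COUNTER_BITS) - 1

# Counters after allocation, as credited to the requested node.
totals = {0: 13, 1: 12, 2: 13, 3: 13, 4: 13}

# Where the pages actually reside: node 4's 13 pages are really on node 0.
resident = {0: 26, 1: 12, 2: 13, 3: 13, 4: 0}


def free_all(totals, resident):
    """Debit each node's counter by the pages resident on it, with wraparound."""
    return {node: (totals[node] - resident.get(node, 0)) & MASK
            for node in totals}


after = free_all(totals, resident)
```

This reproduces the output above: node 0 wraps to 4294967283 and node 4 keeps its original 13.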
------------------
I tried to "tighten up" alloc_pages_node() to check the location of the
first zone in the selected zonelist, as discussed in the previous exchange.
When I do this, I hit a BUG() in slub.c in
early_kmem_cache_node_alloc(), as it apparently can't handle new_slab()
returning a NULL page, even tho' it calls it with GFP_THISNODE. Slub
should be able to handle memoryless nodes, right? I'm looking for a
workaround to this now.
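To make the "tighten up" idea concrete, here is a toy model, not kernel code. The zonelists are hypothetical: they assume that node 4, having no Normal zone, gets a zonelist whose first entry is on node 0, which matches where its huge pages ended up. The function name and the strict flag are invented for the example.

```python
# Hypothetical zonelists: for each node, the ordered (node, zone) fallbacks.
ZONELISTS = {
    0: [(0, "Normal"), (1, "Normal"), (2, "Normal"), (3, "Normal")],
    1: [(1, "Normal"), (2, "Normal"), (3, "Normal"), (0, "Normal")],
    # Assumption: node 4 has only DMA memory, so its list starts off-node.
    4: [(0, "Normal"), (1, "Normal"), (2, "Normal"), (3, "Normal")],
}


def alloc_node(nid, strict):
    """Return the node a page would come from, or None if refused.

    With strict=True the allocation is refused when the first zone in
    the requested node's zonelist belongs to a different node.
    """
    first_node, _zone = ZONELISTS[nid][0]
    if strict and first_node != nid:
        return None
    return first_node
```

In this model alloc_node(4, strict=False) silently returns node 0, while alloc_node(4, strict=True) returns None, so a caller that cannot cope with a failed allocation has to be fixed as well.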
Lee