From: Nishanth Aravamudan <nacc@us.ibm.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: William Lee Irwin III <wli@holomorphy.com>,
Christoph Lameter <clameter@sgi.com>,
anton@samba.org, akpm@linux-foundation.org, linux-mm@kvack.org
Subject: [PATCH v4][RFC] hugetlb: add per-node nr_hugepages sysfs attribute
Date: Wed, 13 Jun 2007 12:19:08 -0700 [thread overview]
Message-ID: <20070613191908.GR3798@us.ibm.com> (raw)
In-Reply-To: <1181759027.6148.77.camel@localhost>
On 13.06.2007 [14:23:47 -0400], Lee Schermerhorn wrote:
> On Wed, 2007-06-13 at 08:28 -0700, Nishanth Aravamudan wrote:
> <snip>
> >
> > commit 05a7edb8c909c674cdefb0323348825cf3e2d1d0
> > Author: Nishanth Aravamudan <nacc@us.ibm.com>
> > Date: Thu Jun 7 08:54:48 2007 -0700
> >
> > hugetlb: add per-node nr_hugepages sysfs attribute
> >
> > Allow specifying the number of hugepages to allocate on a particular
> > node. Our current global sysctl will try its best to put hugepages
> > equally on each node, but htat may not always be desired. This allows
> > the admin to control the layout of hugepage allocation at a finer level
> > (while not breaking the existing interface). Add callbacks in the sysfs
> > node registration and unregistration functions into hugetlb to add the
> > nr_hugepages attribute, which is a no-op if !NUMA or !HUGETLB.
> >
> > Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
> > Cc: William Lee Irwin III <wli@holomorphy.com>
> > Cc: Christoph Lameter <clameter@sgi.com>
> > Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> > Cc: Anton Blanchard <anton@sambar.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> >
> > ---
> > Do the dummy function definitions need to be (void)0?
> >
>
> <snip>
>
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index aa0dc9b..e9f5928 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -5,6 +5,7 @@
> >
> > #include <linux/mempolicy.h>
> > #include <linux/shm.h>
> > +#include <linux/sysdev.h>
> > #include <asm/tlbflush.h>
> >
> > struct ctl_table;
> > @@ -23,6 +24,11 @@ void __unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned lon
> > int hugetlb_prefault(struct address_space *, struct vm_area_struct *);
> > int hugetlb_report_meminfo(char *);
> > int hugetlb_report_node_meminfo(int, char *);
> > +int hugetlb_register_node(struct sys_device *);
> > +void hugetlb_unregister_node(struct sys_device *);
>
> The parameter type for the two functions above need to be "struct
> node". You'll need to include <linux/node.h> after <linux/sysdev.h>,
> as well. Otherwise, doesn't build.
Sigh... Actually a few fixes worth doing. Make stuff static, since it's
now all in hugetlb.c and only compile if NUMA. And don't export the
nr_hugepages functions any more via hugetlb.h, as they are now private.
Compile-tested with HUGETLB && NUMA, HUGETLB && !NUMA, !HUGETLB && NUMA,
!HUGETLB && !NUMA.
Will throw it at the machines I ran the previous set on, to verify
everything runs as expected, but for review:
hugetlb: add per-node nr_hugepages sysfs attribute
Allow specifying the number of hugepages to allocate on a particular
node. Our current global sysctl will try its best to put hugepages
equally on each node, but htat may not always be desired. This allows
the admin to control the layout of hugepage allocation at a finer level
(while not breaking the existing interface). Add callbacks in the sysfs
node registration and unregistration functions into hugetlb to add the
nr_hugepages attribute, which is a no-op if !NUMA or !HUGETLB.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
diff --git a/drivers/base/node.c b/drivers/base/node.c
index cae346e..c9d531f 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -151,6 +151,7 @@ int register_node(struct node *node, int num, struct node *parent)
sysdev_create_file(&node->sysdev, &attr_meminfo);
sysdev_create_file(&node->sysdev, &attr_numastat);
sysdev_create_file(&node->sysdev, &attr_distance);
+ hugetlb_register_node(node);
}
return error;
}
@@ -168,6 +169,7 @@ void unregister_node(struct node *node)
sysdev_remove_file(&node->sysdev, &attr_meminfo);
sysdev_remove_file(&node->sysdev, &attr_numastat);
sysdev_remove_file(&node->sysdev, &attr_distance);
+ hugetlb_unregister_node(node);
sysdev_unregister(&node->sysdev);
}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index aa0dc9b..7872031 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -4,7 +4,9 @@
#ifdef CONFIG_HUGETLB_PAGE
#include <linux/mempolicy.h>
+#include <linux/node.h>
#include <linux/shm.h>
+#include <linux/sysdev.h>
#include <asm/tlbflush.h>
struct ctl_table;
@@ -23,6 +25,13 @@ void __unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned lon
int hugetlb_prefault(struct address_space *, struct vm_area_struct *);
int hugetlb_report_meminfo(char *);
int hugetlb_report_node_meminfo(int, char *);
+#ifdef CONFIG_NUMA
+int hugetlb_register_node(struct node *);
+void hugetlb_unregister_node(struct node *);
+#else
+#define hugetlb_register_node(node) 0
+#define hugetlb_unregister_node(node) 0
+#endif
unsigned long hugetlb_total_pages(void);
int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, int write_access);
@@ -114,6 +123,8 @@ static inline unsigned long hugetlb_total_pages(void)
#define unmap_hugepage_range(vma, start, end) BUG()
#define hugetlb_report_meminfo(buf) 0
#define hugetlb_report_node_meminfo(n, buf) 0
+#define hugetlb_register_node(node) 0
+#define hugetlb_unregister_node(node) 0
#define follow_huge_pmd(mm, addr, pmd, write) NULL
#define prepare_hugepage_range(addr,len,pgoff) (-EINVAL)
#define pmd_huge(x) 0
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c4a966e..e6ba07d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -137,6 +137,9 @@ static int alloc_fresh_huge_page(struct mempolicy *policy)
nid = start_nid;
do {
+ /*
+ * this allocation will fail for unpopulated nodes
+ */
page = alloc_fresh_huge_page_node(nid);
nid = interleave_nodes(policy);
} while (!page && nid != start_nid);
@@ -217,7 +220,6 @@ static unsigned int cpuset_mems_nr(unsigned int *array)
return nr;
}
-#ifdef CONFIG_SYSCTL
static void update_and_free_page(int nid, struct page *page)
{
int i;
@@ -270,6 +272,7 @@ static inline void try_to_free_low(unsigned long count)
}
#endif
+#ifdef CONFIG_SYSCTL
static unsigned long set_max_huge_pages(unsigned long count)
{
struct mempolicy *pol;
@@ -343,6 +346,67 @@ int hugetlb_report_node_meminfo(int nid, char *buf)
nid, free_huge_pages_node[nid]);
}
+#ifdef CONFIG_NUMA
+static ssize_t hugetlb_read_nr_hugepages_node(struct sys_device *dev,
+ char *buf)
+{
+ return sprintf(buf, "%u\n", nr_huge_pages_node[dev->id]);
+}
+
+static ssize_t hugetlb_write_nr_hugepages_node(struct sys_device *dev,
+ const char *buf, size_t count)
+{
+ int nid = dev->id;
+ unsigned long target;
+ unsigned long free_on_other_nodes;
+ unsigned long nr_huge_pages_req = simple_strtoul(buf, NULL, 10);
+
+ while (nr_huge_pages_req > nr_huge_pages_node[nid]) {
+ if (!alloc_fresh_huge_page_node(nid))
+ return count;
+ }
+ if (nr_huge_pages_req >= nr_huge_pages_node[nid])
+ return count;
+
+ /* need to ensure that our counts are accurate */
+ spin_lock(&hugetlb_lock);
+ free_on_other_nodes = free_huge_pages - free_huge_pages_node[nid];
+ if (free_on_other_nodes >= resv_huge_pages) {
+ /* other nodes can satisfy reserve */
+ target = nr_huge_pages_req;
+ } else {
+ /* this node needs some free to satisfy reserve */
+ target = max((resv_huge_pages - free_on_other_nodes),
+ nr_huge_pages_req);
+ }
+ try_to_free_low_node(nid, target);
+ while (target < nr_huge_pages_node[nid]) {
+ struct page *page = dequeue_huge_page_node(nid);
+ if (!page)
+ break;
+ update_and_free_page(nid, page);
+ }
+ spin_unlock(&hugetlb_lock);
+
+ return count;
+}
+
+static SYSDEV_ATTR(nr_hugepages, S_IRUGO | S_IWUSR,
+ hugetlb_read_nr_hugepages_node,
+ hugetlb_write_nr_hugepages_node);
+
+int hugetlb_register_node(struct node *node)
+{
+ return sysdev_create_file(&node->sysdev, &attr_nr_hugepages);
+}
+
+void hugetlb_unregister_node(struct node *node)
+{
+ sysdev_remove_file(&node->sysdev, &attr_nr_hugepages);
+}
+
+#endif
+
/* Return the number pages of memory we physically have, in PAGE_SIZE units. */
unsigned long hugetlb_total_pages(void)
{
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-06-13 19:19 UTC|newest]
Thread overview: 140+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-06-11 20:27 [PATCH] Add populated_map to account for memoryless nodes Nishanth Aravamudan, Lee Schermerhorn
2007-06-11 21:25 ` Christoph Lameter
2007-06-11 22:10 ` [PATCH v2] " Nishanth Aravamudan
2007-06-11 22:42 ` Christoph Lameter
2007-06-11 22:52 ` [PATCH v3] " Nishanth Aravamudan
2007-06-11 23:00 ` Christoph Lameter
2007-06-11 23:41 ` [PATCH v4] " Nishanth Aravamudan
2007-06-11 23:45 ` Christoph Lameter
2007-06-12 0:07 ` [PATCH] populated_map: fix !NUMA case, remove comment Nishanth Aravamudan
2007-06-12 0:41 ` Christoph Lameter
2007-06-12 1:43 ` Nishanth Aravamudan
2007-06-12 1:45 ` Christoph Lameter
2007-06-12 1:52 ` Nishanth Aravamudan
2007-06-12 2:39 ` Nishanth Aravamudan
2007-06-12 2:02 ` Nishanth Aravamudan
2007-06-12 2:20 ` Christoph Lameter
2007-06-12 2:32 ` Nishanth Aravamudan
2007-06-12 2:54 ` Christoph Lameter
2007-06-12 3:20 ` Nishanth Aravamudan
2007-06-12 3:21 ` Christoph Lameter
2007-06-12 3:31 ` Nishanth Aravamudan
2007-06-12 15:06 ` Lee Schermerhorn
2007-06-12 17:28 ` Nishanth Aravamudan
2007-06-12 18:43 ` Christoph Lameter
2007-06-12 18:48 ` Lee Schermerhorn
2007-06-12 18:51 ` Christoph Lameter
2007-06-12 19:44 ` Lee Schermerhorn
2007-06-12 19:48 ` Christoph Lameter
2007-06-12 19:58 ` Christoph Lameter
2007-06-12 20:01 ` Nishanth Aravamudan
2007-06-13 15:30 ` Lee Schermerhorn
2007-06-13 17:58 ` Nishanth Aravamudan
2007-06-13 18:21 ` Lee Schermerhorn
2007-06-13 19:01 ` Nishanth Aravamudan
2007-06-13 22:51 ` Christoph Lameter
2007-06-14 15:50 ` Lee Schermerhorn
2007-06-14 15:57 ` Christoph Lameter
2007-06-14 16:54 ` Lee Schermerhorn
2007-06-14 16:09 ` Nishanth Aravamudan
2007-06-14 16:15 ` Christoph Lameter
2007-06-14 17:07 ` Lee Schermerhorn
2007-06-14 17:16 ` Christoph Lameter
2007-06-14 18:04 ` Lee Schermerhorn
2007-06-14 22:35 ` Nishanth Aravamudan
2007-06-13 22:50 ` Christoph Lameter
2007-06-13 23:09 ` Nishanth Aravamudan
2007-06-13 23:12 ` Christoph Lameter
2007-06-13 23:18 ` Nishanth Aravamudan
2007-06-13 23:26 ` Christoph Lameter
2007-06-13 23:56 ` Nishanth Aravamudan
2007-06-14 14:23 ` Lee Schermerhorn
2007-06-13 22:49 ` Christoph Lameter
2007-06-12 19:55 ` Nishanth Aravamudan
2007-06-12 18:41 ` Christoph Lameter
2007-06-12 19:07 ` Lee Schermerhorn
2007-06-12 19:13 ` Christoph Lameter
2007-06-11 23:08 ` [PATCH][RFC] Fix INTERLEAVE with memoryless nodes Nishanth Aravamudan
2007-06-11 23:10 ` [PATCH v6][RFC] Fix hugetlb pool allocation with empty nodes Nishanth Aravamudan
2007-06-11 23:11 ` [PATCH][RFC] hugetlb: numafy several functions Nishanth Aravamudan
2007-06-11 23:13 ` [PATCH][RFC] hugetlb: add per-node nr_hugepages sysfs attribute Nishanth Aravamudan
2007-06-11 23:40 ` Christoph Lameter
2007-06-11 23:42 ` Christoph Lameter
2007-06-12 0:19 ` Nishanth Aravamudan
2007-06-12 0:43 ` Christoph Lameter
2007-06-12 2:19 ` Nishanth Aravamudan
2007-06-12 2:22 ` Christoph Lameter
2007-06-12 2:34 ` Nishanth Aravamudan
2007-06-11 23:38 ` [PATCH][RFC] hugetlb: numafy several functions Christoph Lameter
2007-06-11 23:17 ` [PATCH v6][RFC] Fix hugetlb pool allocation with empty nodes Christoph Lameter
2007-06-12 0:15 ` Nishanth Aravamudan
2007-06-12 0:47 ` Christoph Lameter
2007-06-12 2:12 ` Nishanth Aravamudan
2007-06-12 2:21 ` Christoph Lameter
2007-06-12 2:25 ` Christoph Lameter
2007-06-12 2:34 ` Nishanth Aravamudan
2007-06-12 2:55 ` Christoph Lameter
2007-06-12 3:17 ` Nishanth Aravamudan
2007-06-12 3:19 ` Christoph Lameter
2007-06-12 3:30 ` Nishanth Aravamudan
2007-06-12 3:48 ` Christoph Lameter
2007-06-12 5:07 ` Nishanth Aravamudan
2007-06-12 18:47 ` Christoph Lameter
2007-06-12 17:43 ` Nishanth Aravamudan
2007-06-12 18:49 ` Christoph Lameter
2007-06-12 2:33 ` Nishanth Aravamudan
2007-06-12 3:44 ` William Lee Irwin III
2007-06-12 3:50 ` Christoph Lameter
2007-06-12 3:53 ` William Lee Irwin III
2007-06-12 3:53 ` Christoph Lameter
2007-06-12 4:14 ` William Lee Irwin III
2007-06-12 5:09 ` Nishanth Aravamudan
2007-06-12 5:15 ` William Lee Irwin III
2007-06-12 17:36 ` Nishanth Aravamudan
2007-06-12 18:50 ` Christoph Lameter
2007-06-12 17:45 ` Nishanth Aravamudan
2007-06-12 19:13 ` William Lee Irwin III
2007-06-13 0:04 ` [PATCH v7][RFC] " Nishanth Aravamudan
2007-06-13 15:26 ` [PATCH v3][RFC] hugetlb: numafy several functions Nishanth Aravamudan
2007-06-13 15:28 ` [PATCH v3][RFC] hugetlb: add per-node nr_hugepages sysfs attribute Nishanth Aravamudan
2007-06-13 18:23 ` Lee Schermerhorn
2007-06-13 19:19 ` Nishanth Aravamudan [this message]
2007-06-13 20:05 ` [PATCH v4][RFC] " Lee Schermerhorn
2007-06-13 20:29 ` Nishanth Aravamudan
2007-06-13 21:02 ` Lee Schermerhorn
2007-07-23 19:23 ` Christoph Lameter
2007-07-23 20:14 ` Lee Schermerhorn
2007-06-13 21:04 ` [PATCH v7][RFC] Fix hugetlb pool allocation with empty nodes Lee Schermerhorn
2007-06-13 21:50 ` [PATCH v7][UPDATE][RFC] " Nishanth Aravamudan
2007-06-12 14:28 ` [PATCH v6][RFC] " Lee Schermerhorn
2007-06-11 23:15 ` [PATCH][RFC] Fix INTERLEAVE with memoryless nodes Christoph Lameter
2007-06-12 0:14 ` [PATCH v2][RFC] " Nishanth Aravamudan
2007-06-12 0:42 ` Christoph Lameter
2007-06-12 0:57 ` Andrew Morton
2007-06-12 1:12 ` Christoph Lameter
2007-06-12 1:41 ` Nishanth Aravamudan
2007-06-12 1:52 ` Andrew Morton
2007-06-12 2:03 ` Nishanth Aravamudan
2007-06-12 14:19 ` [PATCH v2] Add populated_map to account for " Lee Schermerhorn
2007-06-12 17:32 ` Nishanth Aravamudan
2007-06-12 18:45 ` Christoph Lameter
2007-06-12 19:17 ` Lee Schermerhorn
2007-06-12 19:22 ` Christoph Lameter
2007-06-12 19:49 ` Nishanth Aravamudan
2007-06-12 19:51 ` Christoph Lameter
2007-06-12 20:00 ` Nishanth Aravamudan
2007-06-12 20:03 ` Christoph Lameter
2007-06-12 20:10 ` Christoph Lameter
2007-06-12 19:52 ` Christoph Lameter
2007-06-12 19:58 ` Christoph Lameter
2007-06-12 20:00 ` Nishanth Aravamudan
2007-06-12 20:06 ` Christoph Lameter
2007-06-12 14:10 ` [PATCH] " Lee Schermerhorn
2007-06-12 17:35 ` Nishanth Aravamudan
2007-06-12 18:39 ` Christoph Lameter
2007-06-12 18:54 ` Lee Schermerhorn
2007-06-12 19:00 ` Christoph Lameter
2007-06-12 2:27 ` KAMEZAWA Hiroyuki
2007-06-12 2:46 ` Nishanth Aravamudan
2007-06-12 2:53 ` Christoph Lameter
2007-06-12 3:04 ` KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070613191908.GR3798@us.ibm.com \
--to=nacc@us.ibm.com \
--cc=Lee.Schermerhorn@hp.com \
--cc=akpm@linux-foundation.org \
--cc=anton@samba.org \
--cc=clameter@sgi.com \
--cc=linux-mm@kvack.org \
--cc=wli@holomorphy.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox