linux-mm.kvack.org archive mirror
* [PATCH 0/3] Add pseudo-anonymous huge page mappings V3
@ 2009-08-14 14:08 Eric B Munson
  2009-08-14 14:08 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 Eric B Munson
  2009-08-17 13:53 ` [PATCH 0/3] Add pseudo-anonymous huge page mappings V3 Andi Kleen
  0 siblings, 2 replies; 8+ messages in thread
From: Eric B Munson @ 2009-08-14 14:08 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: linux-man, akpm, mtk.manpages, Eric B Munson

This patch set adds a flag to mmap that allows the user to request
a mapping to be backed with huge pages.  This mapping will borrow
functionality from the huge page shm code to create a file on the
kernel internal mount and uses it to approximate an anonymous
mapping.  The MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS
and will not work without both flags being present.

A new flag is necessary because there is no other way to hook into
huge pages without creating a file on a hugetlbfs mount which
wouldn't be MAP_ANONYMOUS.

To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.

Eric B Munson (3):
  hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on
    the vfs internal mount
  Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions
  Add MAP_HUGETLB example

 Documentation/vm/00-INDEX         |    2 +
 Documentation/vm/hugetlbpage.txt  |   14 ++++---
 Documentation/vm/map_hugetlb.c    |   77 +++++++++++++++++++++++++++++++++++++
 fs/hugetlbfs/inode.c              |   22 +++++++++--
 include/asm-generic/mman-common.h |    1 +
 include/linux/hugetlb.h           |   17 ++++++++-
 ipc/shm.c                         |    3 +-
 mm/mmap.c                         |   16 ++++++++
 8 files changed, 140 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/vm/map_hugetlb.c

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3
  2009-08-14 14:08 [PATCH 0/3] Add pseudo-anonymous huge page mappings V3 Eric B Munson
@ 2009-08-14 14:08 ` Eric B Munson
  2009-08-14 14:08   ` [PATCH 2/3] Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions V3 Eric B Munson
  2009-08-14 19:19   ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 David Rientjes
  2009-08-17 13:53 ` [PATCH 0/3] Add pseudo-anonymous huge page mappings V3 Andi Kleen
  1 sibling, 2 replies; 8+ messages in thread
From: Eric B Munson @ 2009-08-14 14:08 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: linux-man, akpm, mtk.manpages, Eric B Munson

There are two means of creating mappings backed by huge pages:

        1. mmap() a file created on hugetlbfs
        2. Use shm which creates a file on an internal mount which essentially
           maps it MAP_SHARED

The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlb_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.

Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
---
Changes from V2:
 Rebase to newest linux-2.6 tree
 Use base 10 value for HUGETLB_SHMFS_INODE instead of hex

Changes from V1:
 Rebase to newest linux-2.6 tree

 fs/hugetlbfs/inode.c    |   22 ++++++++++++++++++----
 include/linux/hugetlb.h |   10 +++++++++-
 ipc/shm.c               |    3 ++-
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 941c842..361f536 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -506,6 +506,13 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, uid_t uid,
 		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
 		INIT_LIST_HEAD(&inode->i_mapping->private_list);
 		info = HUGETLBFS_I(inode);
+		/*
+		 * The policy is initialized here even if we are creating a
+		 * private inode because initialization simply creates an
+		 * an empty rb tree and calls spin_lock_init(), later when we
+		 * call mpol_free_shared_policy() it will just return because
+		 * the rb tree will still be empty.
+		 */
 		mpol_shared_policy_init(&info->policy, NULL);
 		switch (mode & S_IFMT) {
 		default:
@@ -930,12 +937,19 @@ static struct file_system_type hugetlbfs_fs_type = {
 
 static struct vfsmount *hugetlbfs_vfsmount;
 
-static int can_do_hugetlb_shm(void)
+static int can_do_hugetlb_shm(int creat_flags)
 {
-	return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
+	if (!(creat_flags & HUGETLB_SHMFS_INODE))
+		return 0;
+	if (capable(CAP_IPC_LOCK))
+		return 1;
+	if (in_group_p(sysctl_hugetlb_shm_group))
+		return 1;
+	return 0;
 }
 
-struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag)
+struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
+				int creat_flags)
 {
 	int error = -ENOMEM;
 	int unlock_shm = 0;
@@ -948,7 +962,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag)
 	if (!hugetlbfs_vfsmount)
 		return ERR_PTR(-ENOENT);
 
-	if (!can_do_hugetlb_shm()) {
+	if (!can_do_hugetlb_shm(creat_flags)) {
 		if (user_shm_lock(size, user)) {
 			unlock_shm = 1;
 			WARN_ONCE(1,
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2723513..3c48a63 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -109,6 +109,14 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
 
 #endif /* !CONFIG_HUGETLB_PAGE */
 
+enum {
+	/*
+	 * The file will be used as an shm file so shmfs accounting rules
+	 * apply
+	 */
+	HUGETLB_SHMFS_INODE     = 1,
+};
+
 #ifdef CONFIG_HUGETLBFS
 struct hugetlbfs_config {
 	uid_t   uid;
@@ -146,7 +154,7 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
 
 extern const struct file_operations hugetlbfs_file_operations;
 extern struct vm_operations_struct hugetlb_vm_ops;
-struct file *hugetlb_file_setup(const char *name, size_t, int);
+struct file *hugetlb_file_setup(const char *name, size_t, int, int);
 int hugetlb_get_quota(struct address_space *mapping, long delta);
 void hugetlb_put_quota(struct address_space *mapping, long delta);
 
diff --git a/ipc/shm.c b/ipc/shm.c
index 15dd238..801c68a 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -369,7 +369,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
 		/* hugetlb_file_setup applies strict accounting */
 		if (shmflg & SHM_NORESERVE)
 			acctflag = VM_NORESERVE;
-		file = hugetlb_file_setup(name, size, acctflag);
+		file = hugetlb_file_setup(name, size, acctflag,
+					HUGETLB_SHMFS_INODE);
 		shp->mlock_user = current_user();
 	} else {
 		/*
-- 
1.6.3.2


* [PATCH 2/3] Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions V3
  2009-08-14 14:08 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 Eric B Munson
@ 2009-08-14 14:08   ` Eric B Munson
  2009-08-14 14:08     ` [PATCH 3/3] Add MAP_HUGETLB example V3 Eric B Munson
  2009-08-14 19:19   ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 David Rientjes
  1 sibling, 1 reply; 8+ messages in thread
From: Eric B Munson @ 2009-08-14 14:08 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: linux-man, akpm, mtk.manpages, Eric B Munson

This patch adds a flag for mmap that will be used to request a huge
page region that will look like anonymous memory to user space.  This
is accomplished by using a file on the internal vfsmount.  MAP_HUGETLB
is a modifier of MAP_ANONYMOUS and so must be specified with it.  The
region will behave the same as a MAP_ANONYMOUS region using small pages.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
---
Changes from V2:
 Rebase to newest linux-2.6 tree
 Use base 10 value for HUGETLB_ANONHUGE_INODE instead of hex
 Return error value from hugetlb_file_setup instead of hard coding ENOMEM

Changes from V1:
 Rebase to newest linux-2.6 tree
 Rename MAP_LARGEPAGE to MAP_HUGETLB to match flag name for huge page shm

 include/asm-generic/mman-common.h |    1 +
 include/linux/hugetlb.h           |    7 +++++++
 mm/mmap.c                         |   16 ++++++++++++++++
 3 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 3b69ad3..12f5982 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,7 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
+#define MAP_HUGETLB	0x40		/* create a huge page mapping */
 
 #define MS_ASYNC	1		/* sync memory asynchronously */
 #define MS_INVALIDATE	2		/* invalidate the caches */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 3c48a63..e2b01e6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -109,12 +109,19 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
 
 #endif /* !CONFIG_HUGETLB_PAGE */
 
+#define HUGETLB_ANON_FILE "anon_hugepage"
+
 enum {
 	/*
 	 * The file will be used as an shm file so shmfs accounting rules
 	 * apply
 	 */
 	HUGETLB_SHMFS_INODE     = 1,
+	/*
+	 * The file is being created on the internal vfs mount and shmfs
+	 * accounting rules do not apply
+	 */
+	HUGETLB_ANONHUGE_INODE  = 2,
 };
 
 #ifdef CONFIG_HUGETLBFS
diff --git a/mm/mmap.c b/mm/mmap.c
index 34579b2..69dbe99 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -29,6 +29,7 @@
 #include <linux/rmap.h>
 #include <linux/mmu_notifier.h>
 #include <linux/perf_counter.h>
+#include <linux/hugetlb.h>
 
 #include <asm/uaccess.h>
 #include <asm/cacheflush.h>
@@ -954,6 +955,21 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
 	if (mm->map_count > sysctl_max_map_count)
 		return -ENOMEM;
 
+	if (flags & MAP_HUGETLB) {
+		if (file)
+			return -EINVAL;
+
+		/*
+		 * VM_NORESERVE is used because the reservations will be
+		 * taken when vm_ops->mmap() is called
+		 */
+		len = ALIGN(len, huge_page_size(&default_hstate));
+		file = hugetlb_file_setup(HUGETLB_ANON_FILE, len, VM_NORESERVE,
+						HUGETLB_ANONHUGE_INODE);
+		if (IS_ERR(file))
+			return PTR_ERR(file);
+	}
+
 	/* Obtain the address to map to. we verify (or select) it and ensure
 	 * that it represents a valid section of the address space.
 	 */
-- 
1.6.3.2


* [PATCH 3/3] Add MAP_HUGETLB example V3
  2009-08-14 14:08   ` [PATCH 2/3] Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions V3 Eric B Munson
@ 2009-08-14 14:08     ` Eric B Munson
  2009-08-14 19:20       ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: Eric B Munson @ 2009-08-14 14:08 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: linux-man, akpm, mtk.manpages, Eric B Munson

This patch adds an example of how to use the MAP_HUGETLB flag to the
vm documentation directory and a reference to the example in
hugetlbpage.txt.

Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
---
Changes from V2:
 Rebase to newest linux-2.6 tree
 Fix comment in example referencing MAP_LARGEPAGE
 Move example code to its own file
 Update hugetlbpage.txt with MAP_HUGETLB information and example reference
 Add map_hugetlb.c to 00-INDEX

Changes from V1:
 Rebase to newest linux-2.6 tree
 Change MAP_LARGEPAGE to MAP_HUGETLB to match flag name in huge page shm

 Documentation/vm/00-INDEX        |    2 +
 Documentation/vm/hugetlbpage.txt |   14 ++++---
 Documentation/vm/map_hugetlb.c   |   77 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/vm/map_hugetlb.c

diff --git a/Documentation/vm/00-INDEX b/Documentation/vm/00-INDEX
index 2f77ced..aabd973 100644
--- a/Documentation/vm/00-INDEX
+++ b/Documentation/vm/00-INDEX
@@ -20,3 +20,5 @@ slabinfo.c
 	- source code for a tool to get reports about slabs.
 slub.txt
 	- a short users guide for SLUB.
+map_hugetlb.c
+	- an example program that uses the MAP_HUGETLB mmap flag.
diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt
index ea8714f..6a8feab 100644
--- a/Documentation/vm/hugetlbpage.txt
+++ b/Documentation/vm/hugetlbpage.txt
@@ -146,12 +146,14 @@ Regular chown, chgrp, and chmod commands (with right permissions) could be
 used to change the file attributes on hugetlbfs.
 
 Also, it is important to note that no such mount command is required if the
-applications are going to use only shmat/shmget system calls.  Users who
-wish to use hugetlb page via shared memory segment should be a member of
-a supplementary group and system admin needs to configure that gid into
-/proc/sys/vm/hugetlb_shm_group.  It is possible for same or different
-applications to use any combination of mmaps and shm* calls, though the
-mount of filesystem will be required for using mmap calls.
+applications are going to use only shmat/shmget system calls or mmap with
+MAP_HUGETLB.  Users who wish to use hugetlb page via shared memory segment
+should be a member of a supplementary group and system admin needs to
+configure that gid into /proc/sys/vm/hugetlb_shm_group.  It is possible for
+same or different applications to use any combination of mmaps and shm*
+calls, though the mount of filesystem will be required for using mmap calls
+without MAP_HUGETLB.  For an example of how to use mmap with MAP_HUGETLB see
+map_hugetlb.c.
 
 *******************************************************************
 
diff --git a/Documentation/vm/map_hugetlb.c b/Documentation/vm/map_hugetlb.c
new file mode 100644
index 0000000..e2bdae3
--- /dev/null
+++ b/Documentation/vm/map_hugetlb.c
@@ -0,0 +1,77 @@
+/*
+ * Example of using hugepage memory in a user application using the mmap
+ * system call with MAP_HUGETLB flag.  Before running this program make
+ * sure the administrator has allocated enough default sized huge pages
+ * to cover the 256 MB allocation.
+ *
+ * For ia64 architecture, Linux kernel reserves Region number 4 for hugepages.
+ * That means the addresses starting with 0x800000... will need to be
+ * specified.  Specifying a fixed address is not required on ppc64, i386
+ * or x86_64.
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+
+#define LENGTH (256UL*1024*1024)
+#define PROTECTION (PROT_READ | PROT_WRITE)
+
+#ifndef MAP_HUGETLB
+#define MAP_HUGETLB 0x40
+#endif
+
+/* Only ia64 requires this */
+#ifdef __ia64__
+#define ADDR (void *)(0x8000000000000000UL)
+#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED)
+#else
+#define ADDR (void *)(0x0UL)
+#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
+#endif
+
+void check_bytes(char *addr)
+{
+	printf("First hex is %x\n", *((unsigned int *)addr));
+}
+
+void write_bytes(char *addr)
+{
+	unsigned long i;
+
+	for (i = 0; i < LENGTH; i++)
+		*(addr + i) = (char)i;
+}
+
+void read_bytes(char *addr)
+{
+	unsigned long i;
+
+	check_bytes(addr);
+	for (i = 0; i < LENGTH; i++)
+		if (*(addr + i) != (char)i) {
+			printf("Mismatch at %lu\n", i);
+			break;
+		}
+}
+
+int main(void)
+{
+	void *addr;
+
+	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap");
+		exit(1);
+	}
+
+	printf("Returned address is %p\n", addr);
+	check_bytes(addr);
+	write_bytes(addr);
+	read_bytes(addr);
+
+	munmap(addr, LENGTH);
+
+	return 0;
+}
-- 
1.6.3.2


* Re: [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3
  2009-08-14 14:08 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 Eric B Munson
  2009-08-14 14:08   ` [PATCH 2/3] Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions V3 Eric B Munson
@ 2009-08-14 19:19   ` David Rientjes
  1 sibling, 0 replies; 8+ messages in thread
From: David Rientjes @ 2009-08-14 19:19 UTC (permalink / raw)
  To: Eric B Munson; +Cc: linux-kernel, linux-mm, linux-man, akpm, mtk.manpages

On Fri, 14 Aug 2009, Eric B Munson wrote:

> There are two means of creating mappings backed by huge pages:
> 
>         1. mmap() a file created on hugetlbfs
>         2. Use shm which creates a file on an internal mount which essentially
>            maps it MAP_SHARED
> 
> The internal mount is only used for shared mappings but there is very
> little that stops it being used for private mappings. This patch extends
> hugetlb_file_setup() to deal with the creation of files that will be
> mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
> used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
> 
> Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
> ---
> Changes from V2:
>  Rebase to newest linux-2.6 tree
>  Use base 10 value for HUGETLB_SHMFS_INODE instead of hex
> 
> Changes from V1:
>  Rebase to newest linux-2.6 tree
> 
>  fs/hugetlbfs/inode.c    |   22 ++++++++++++++++++----
>  include/linux/hugetlb.h |   10 +++++++++-
>  ipc/shm.c               |    3 ++-
>  3 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 941c842..361f536 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -506,6 +506,13 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, uid_t uid,
>  		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
>  		INIT_LIST_HEAD(&inode->i_mapping->private_list);
>  		info = HUGETLBFS_I(inode);
> +		/*
> +		 * The policy is initialized here even if we are creating a
> +		 * private inode because initialization simply creates an
> +		 * empty rb tree and calls spin_lock_init(), later when we
> +		 * call mpol_free_shared_policy() it will just return because
> +		 * the rb tree will still be empty.
> +		 */
>  		mpol_shared_policy_init(&info->policy, NULL);
>  		switch (mode & S_IFMT) {
>  		default:
> @@ -930,12 +937,19 @@ static struct file_system_type hugetlbfs_fs_type = {
>  
>  static struct vfsmount *hugetlbfs_vfsmount;
>  
> -static int can_do_hugetlb_shm(void)
> +static int can_do_hugetlb_shm(int creat_flags)
>  {
> -	return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
> +	if (!(creat_flags & HUGETLB_SHMFS_INODE))
> +		return 0;

That should be

	if (creat_flags != HUGETLB_SHMFS_INODE)
		return 0;


* Re: [PATCH 3/3] Add MAP_HUGETLB example V3
  2009-08-14 14:08     ` [PATCH 3/3] Add MAP_HUGETLB example V3 Eric B Munson
@ 2009-08-14 19:20       ` David Rientjes
  0 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2009-08-14 19:20 UTC (permalink / raw)
  To: Eric B Munson
  Cc: linux-kernel, linux-mm, linux-man, Andrew Morton, mtk.manpages,
	Randy Dunlap

On Fri, 14 Aug 2009, Eric B Munson wrote:

> This patch adds an example of how to use the MAP_HUGETLB flag to the
> vm documentation directory and a reference to the example in
> hugetlbpage.txt.
> 
> Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

Adding Randy Dunlap to the cc.




* Re: [PATCH 0/3] Add pseudo-anonymous huge page mappings V3
  2009-08-14 14:08 [PATCH 0/3] Add pseudo-anonymous huge page mappings V3 Eric B Munson
  2009-08-14 14:08 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount V3 Eric B Munson
@ 2009-08-17 13:53 ` Andi Kleen
  2009-08-18 10:53   ` Eric B Munson
  1 sibling, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2009-08-17 13:53 UTC (permalink / raw)
  To: Eric B Munson; +Cc: linux-kernel, linux-mm, linux-man, akpm, mtk.manpages

Eric B Munson <ebmunson@us.ibm.com> writes:

> This patch set adds a flag to mmap that allows the user to request
> a mapping to be backed with huge pages.  This mapping will borrow
> functionality from the huge page shm code to create a file on the
> kernel internal mount and uses it to approximate an anonymous
> mapping.  The MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS
> and will not work without both flags being present.


You seem to have forgotten to describe WHY you want this?

From my guess, this seems to be another step into turning hugetlb.c
into another parallel VM implementation. Instead of basically
developing two parallel VMs wouldn't it be better to unify the two?

I think extending hugetlb.c forever without ever thinking about
that is not the right approach.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: [PATCH 0/3] Add pseudo-anonymous huge page mappings V3
  2009-08-17 13:53 ` [PATCH 0/3] Add pseudo-anonymous huge page mappings V3 Andi Kleen
@ 2009-08-18 10:53   ` Eric B Munson
  0 siblings, 0 replies; 8+ messages in thread
From: Eric B Munson @ 2009-08-18 10:53 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux-mm, linux-man, akpm, mtk.manpages

On Mon, 17 Aug 2009, Andi Kleen wrote:

> Eric B Munson <ebmunson@us.ibm.com> writes:
> 
> > This patch set adds a flag to mmap that allows the user to request
> > a mapping to be backed with huge pages.  This mapping will borrow
> > functionality from the huge page shm code to create a file on the
> > kernel internal mount and uses it to approximate an anonymous
> > mapping.  The MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS
> > and will not work without both flags being present.
> 
> 
> You seem to have forgotten to describe WHY you want this?
> 
> From my guess, this seems to be another step into turning hugetlb.c
> into another parallel VM implementation. Instead of basically
> developing two parallel VMs wouldn't it be better to unify the two?
> 
> I think extending hugetlb.c forever without ever thinking about
> that is not the right approach.
> 
> -Andi
> 
> -- 
> ak@linux.intel.com -- Speaking for myself only.
> 

This patch is meant to simplify the programming model because presently
there is a large chunk of boilerplate code required to create private,
hugepage backed mappings.  This patch would allow use of huge pages
without linking to libhugetlbfs or having hugetlbfs mounted.

Unification would provide these same benefits, but it has been resisted
each time that it has been suggested for several reasons.  It would
break PAGE_SIZE assumptions across the kernel.  It makes page-table
abstractions really expensive.  And it does not provide any benefit on
architectures that do not support huge pages, where it would still incur
fast path penalties.

Eric

-- 
Eric B Munson
IBM Linux Technology Center
ebmunson@us.ibm.com


