* [PATCH 0/3] Add pseudo-anonymous huge page mappings
@ 2009-08-11 22:13 Eric B Munson
2009-08-11 22:13 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-11 22:13 UTC (permalink / raw)
To: linux-kernel, linux-mm; +Cc: linux-man, mtk.manpages, Eric B Munson
This patch set adds a flag to mmap that allows the user to request
a mapping to be backed with huge pages. This mapping will borrow
functionality from the huge page shm code to create a file on the
kernel internal mount and use it to approximate an anonymous
mapping. The MAP_LARGEPAGE flag is a modifier to MAP_ANONYMOUS
and will not work unless both flags are present.
A new flag is necessary because there is no other way to hook into
huge pages without creating a file on a hugetlbfs mount which
wouldn't be MAP_ANONYMOUS.
To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.
Eric B Munson (3):
hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on
the vfs internal mount
Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions
Add MAP_LARGEPAGE example to vm/hugetlbpage.txt
Documentation/vm/hugetlbpage.txt | 80 +++++++++++++++++++++++++++++++++++++
fs/hugetlbfs/inode.c | 22 ++++++++--
include/asm-generic/mman-common.h | 1 +
include/linux/hugetlb.h | 17 +++++++-
ipc/shm.c | 3 +-
mm/mmap.c | 16 +++++++
6 files changed, 133 insertions(+), 6 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-11 22:13 [PATCH 0/3] Add pseudo-anonymous huge page mappings Eric B Munson
@ 2009-08-11 22:13 ` Eric B Munson
2009-08-11 22:13 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Eric B Munson
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-11 22:13 UTC (permalink / raw)
To: linux-kernel, linux-mm; +Cc: linux-man, mtk.manpages, Eric B Munson
There are two means of creating mappings backed by huge pages:
1. mmap() a file created on hugetlbfs
2. Use shm which creates a file on an internal mount which essentially
maps it MAP_SHARED
The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlb_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_LARGEPAGE mmap() flag.
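As a rough user-space model of the check this patch introduces (not kernel code): the two boolean parameters below are hypothetical stand-ins for capable(CAP_IPC_LOCK) and in_group_p(sysctl_hugetlb_shm_group), which cannot be called outside the kernel, and the function name is made up for illustration.

```c
#include <stdbool.h>

/* Flag value taken from the enum this patch adds to include/linux/hugetlb.h. */
enum { HUGETLB_SHMFS_INODE = 0x01 };

/*
 * Model of the patched can_do_hugetlb_shm(). has_ipc_lock and in_shm_group
 * are hypothetical inputs standing in for capable(CAP_IPC_LOCK) and
 * in_group_p(sysctl_hugetlb_shm_group).
 */
int can_do_hugetlb_shm_model(int creat_flags, bool has_ipc_lock,
                             bool in_shm_group)
{
	/* Files not created for shm never qualify for the privileged path. */
	if (!(creat_flags & HUGETLB_SHMFS_INODE))
		return 0;
	if (has_ipc_lock)
		return 1;
	if (in_shm_group)
		return 1;
	return 0;
}
```

In this model a call with creat_flags == 0 returns 0 regardless of privilege, so hugetlb_file_setup() would always take the user_shm_lock() branch for such files.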
Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
---
fs/hugetlbfs/inode.c | 22 ++++++++++++++++++----
include/linux/hugetlb.h | 10 +++++++++-
ipc/shm.c | 3 ++-
3 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 941c842..361f536 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -506,6 +506,13 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, uid_t uid,
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
INIT_LIST_HEAD(&inode->i_mapping->private_list);
info = HUGETLBFS_I(inode);
+ /*
+ * The policy is initialized here even if we are creating a
+ * private inode because initialization simply creates
+ * an empty rb tree and calls spin_lock_init(), later when we
+ * call mpol_free_shared_policy() it will just return because
+ * the rb tree will still be empty.
+ */
mpol_shared_policy_init(&info->policy, NULL);
switch (mode & S_IFMT) {
default:
@@ -930,12 +937,19 @@ static struct file_system_type hugetlbfs_fs_type = {
static struct vfsmount *hugetlbfs_vfsmount;
-static int can_do_hugetlb_shm(void)
+static int can_do_hugetlb_shm(int creat_flags)
{
- return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
+ if (!(creat_flags & HUGETLB_SHMFS_INODE))
+ return 0;
+ if (capable(CAP_IPC_LOCK))
+ return 1;
+ if (in_group_p(sysctl_hugetlb_shm_group))
+ return 1;
+ return 0;
}
-struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag)
+struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
+ int creat_flags)
{
int error = -ENOMEM;
int unlock_shm = 0;
@@ -948,7 +962,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag)
if (!hugetlbfs_vfsmount)
return ERR_PTR(-ENOENT);
- if (!can_do_hugetlb_shm()) {
+ if (!can_do_hugetlb_shm(creat_flags)) {
if (user_shm_lock(size, user)) {
unlock_shm = 1;
WARN_ONCE(1,
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2723513..78b6ddf 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -109,6 +109,14 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
#endif /* !CONFIG_HUGETLB_PAGE */
+enum {
+ /*
+ * The file will be used as an shm file so shmfs accounting rules
+ * apply
+ */
+ HUGETLB_SHMFS_INODE = 0x01,
+};
+
#ifdef CONFIG_HUGETLBFS
struct hugetlbfs_config {
uid_t uid;
@@ -146,7 +154,7 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
extern const struct file_operations hugetlbfs_file_operations;
extern struct vm_operations_struct hugetlb_vm_ops;
-struct file *hugetlb_file_setup(const char *name, size_t, int);
+struct file *hugetlb_file_setup(const char *name, size_t, int, int);
int hugetlb_get_quota(struct address_space *mapping, long delta);
void hugetlb_put_quota(struct address_space *mapping, long delta);
diff --git a/ipc/shm.c b/ipc/shm.c
index 15dd238..801c68a 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -369,7 +369,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
/* hugetlb_file_setup applies strict accounting */
if (shmflg & SHM_NORESERVE)
acctflag = VM_NORESERVE;
- file = hugetlb_file_setup(name, size, acctflag);
+ file = hugetlb_file_setup(name, size, acctflag,
+ HUGETLB_SHMFS_INODE);
shp->mlock_user = current_user();
} else {
/*
--
1.6.3.2
* [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions
2009-08-11 22:13 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
@ 2009-08-11 22:13 ` Eric B Munson
2009-08-11 22:13 ` [PATCH 3/3] Add MAP_LARGEPAGE example to vm/hugetlbpage.txt Eric B Munson
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Eric B Munson @ 2009-08-11 22:13 UTC (permalink / raw)
To: linux-kernel, linux-mm; +Cc: linux-man, mtk.manpages, Eric B Munson
This patch adds a flag for mmap that will be used to request a huge
page region that will look like anonymous memory to user space. This
is accomplished by using a file on the internal vfsmount. MAP_LARGEPAGE
is a modifier of MAP_ANONYMOUS and so must be specified with it. The
region will behave the same as a MAP_ANONYMOUS region using small pages.
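The length handling described above can be sketched in user space as follows. This is only a model of the branch the patch adds to do_mmap_pgoff(): has_file and hpage_size are hypothetical stand-ins for the kernel's file pointer and huge_page_size(&default_hstate), and both function names are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Round len up to a multiple of align, which must be a power of two.
 * This is the same arithmetic as the kernel's ALIGN() macro.
 */
size_t align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/*
 * Model of the MAP_LARGEPAGE branch: a file-backed request is rejected
 * with -EINVAL, otherwise the length is rounded up to the huge page size
 * and stored in *out_len.
 */
int largepage_prepare(int has_file, size_t len, size_t hpage_size,
		      size_t *out_len)
{
	if (has_file)
		return -EINVAL;
	*out_len = align_up(len, hpage_size);
	return 0;
}
```

Assuming a 2 MB huge page size, a 3 MB request would be rounded up to 4 MB, while a request that is already a multiple of 2 MB is left unchanged.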
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
---
include/asm-generic/mman-common.h | 1 +
include/linux/hugetlb.h | 7 +++++++
mm/mmap.c | 16 ++++++++++++++++
3 files changed, 24 insertions(+), 0 deletions(-)
diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 3b69ad3..60b6be7 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,7 @@
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
+#define MAP_LARGEPAGE 0x40 /* create a large page mapping */
#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 78b6ddf..b84361c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -109,12 +109,19 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
#endif /* !CONFIG_HUGETLB_PAGE */
+#define HUGETLB_ANON_FILE "anon_hugepage"
+
enum {
/*
* The file will be used as an shm file so shmfs accounting rules
* apply
*/
HUGETLB_SHMFS_INODE = 0x01,
+ /*
+ * The file is being created on the internal vfs mount and shmfs
+ * accounting rules do not apply
+ */
+ HUGETLB_ANONHUGE_INODE = 0x02,
};
#ifdef CONFIG_HUGETLBFS
diff --git a/mm/mmap.c b/mm/mmap.c
index 34579b2..c2c729a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -29,6 +29,7 @@
#include <linux/rmap.h>
#include <linux/mmu_notifier.h>
#include <linux/perf_counter.h>
+#include <linux/hugetlb.h>
#include <asm/uaccess.h>
#include <asm/cacheflush.h>
@@ -954,6 +955,21 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
if (mm->map_count > sysctl_max_map_count)
return -ENOMEM;
+ if (flags & MAP_LARGEPAGE) {
+ if (file)
+ return -EINVAL;
+
+ /*
+ * VM_NORESERVE is used because the reservations will be
+ * taken when vm_ops->mmap() is called
+ */
+ len = ALIGN(len, huge_page_size(&default_hstate));
+ file = hugetlb_file_setup(HUGETLB_ANON_FILE, len, VM_NORESERVE,
+ HUGETLB_ANONHUGE_INODE);
+ if (IS_ERR(file))
+ return -ENOMEM;
+ }
+
/* Obtain the address to map to. we verify (or select) it and ensure
* that it represents a valid section of the address space.
*/
--
1.6.3.2
* [PATCH 3/3] Add MAP_LARGEPAGE example to vm/hugetlbpage.txt
2009-08-11 22:13 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Eric B Munson
@ 2009-08-11 22:13 ` Eric B Munson
2009-08-12 5:07 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Michael Kerrisk
2009-08-12 5:45 ` Pekka Enberg
2 siblings, 0 replies; 12+ messages in thread
From: Eric B Munson @ 2009-08-11 22:13 UTC (permalink / raw)
To: linux-kernel, linux-mm; +Cc: linux-man, mtk.manpages, Eric B Munson
This patch adds an example of how to use the MAP_LARGEPAGE flag to
the vm documentation.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
---
Documentation/vm/hugetlbpage.txt | 80 ++++++++++++++++++++++++++++++++++++++
1 files changed, 80 insertions(+), 0 deletions(-)
diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt
index ea8714f..fec7fc1 100644
--- a/Documentation/vm/hugetlbpage.txt
+++ b/Documentation/vm/hugetlbpage.txt
@@ -337,3 +337,83 @@ int main(void)
return 0;
}
+
+*******************************************************************
+
+/*
+ * Example of using hugepage memory in a user application using the mmap
+ * system call with MAP_LARGEPAGE flag. Before running this program make
+ * sure the administrator has allocated enough default sized huge pages
+ * to cover the 256 MB allocation.
+ *
+ * For ia64 architecture, Linux kernel reserves Region number 4 for hugepages.
+ * That means the addresses starting with 0x800000... will need to be
+ * specified. Specifying a fixed address is not required on ppc64, i386
+ * or x86_64.
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+
+#define LENGTH (256UL*1024*1024)
+#define PROTECTION (PROT_READ | PROT_WRITE)
+
+#ifndef MAP_LARGEPAGE
+#define MAP_LARGEPAGE 0x40
+#endif
+
+/* Only ia64 requires this */
+#ifdef __ia64__
+#define ADDR (void *)(0x8000000000000000UL)
+#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_LARGEPAGE | MAP_FIXED)
+#else
+#define ADDR (void *)(0x0UL)
+#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_LARGEPAGE)
+#endif
+
+void check_bytes(char *addr)
+{
+ printf("First hex is %x\n", *((unsigned int *)addr));
+}
+
+void write_bytes(char *addr)
+{
+ unsigned long i;
+
+ for (i = 0; i < LENGTH; i++)
+ *(addr + i) = (char)i;
+}
+
+void read_bytes(char *addr)
+{
+ unsigned long i;
+
+ check_bytes(addr);
+ for (i = 0; i < LENGTH; i++)
+ if (*(addr + i) != (char)i) {
+ printf("Mismatch at %lu\n", i);
+ break;
+ }
+}
+
+int main(void)
+{
+ void *addr;
+
+ addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
+ if (addr == MAP_FAILED) {
+ perror("mmap");
+ exit(1);
+ }
+
+ printf("Returned address is %p\n", addr);
+ check_bytes(addr);
+ write_bytes(addr);
+ read_bytes(addr);
+
+ munmap(addr, LENGTH);
+
+ return 0;
+}
--
1.6.3.2
* Re: [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions
2009-08-11 22:13 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Eric B Munson
2009-08-11 22:13 ` [PATCH 3/3] Add MAP_LARGEPAGE example to vm/hugetlbpage.txt Eric B Munson
@ 2009-08-12 5:07 ` Michael Kerrisk
2009-08-12 9:08 ` Eric B Munson
2009-08-12 5:45 ` Pekka Enberg
2 siblings, 1 reply; 12+ messages in thread
From: Michael Kerrisk @ 2009-08-12 5:07 UTC (permalink / raw)
To: Eric B Munson; +Cc: linux-kernel, linux-mm, linux-man
Eric,
On Wed, Aug 12, 2009 at 12:13 AM, Eric B Munson<ebmunson@us.ibm.com> wrote:
> This patch adds a flag for mmap that will be used to request a huge
> page region that will look like anonymous memory to user space. This
> is accomplished by using a file on the internal vfsmount. MAP_LARGEPAGE
> is a modifier of MAP_ANONYMOUS and so must be specified with it. The
> region will behave the same as a MAP_ANONYMOUS region using small pages.
Does this flag provide functionality analogous to shmget(SHM_HUGETLB)?
If so, would it not make sense to name it similarly (i.e.,
MAP_HUGETLB)?
Cheers,
Michael
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Watch my Linux system programming book progress to publication!
http://blog.man7.org/
* Re: [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions
2009-08-12 5:07 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Michael Kerrisk
@ 2009-08-12 9:08 ` Eric B Munson
0 siblings, 0 replies; 12+ messages in thread
From: Eric B Munson @ 2009-08-12 9:08 UTC (permalink / raw)
To: mtk.manpages; +Cc: linux-kernel, linux-mm, linux-man
On Wed, 12 Aug 2009, Michael Kerrisk wrote:
> Eric,
>
> On Wed, Aug 12, 2009 at 12:13 AM, Eric B Munson<ebmunson@us.ibm.com> wrote:
> > This patch adds a flag for mmap that will be used to request a huge
> > page region that will look like anonymous memory to user space. This
> > is accomplished by using a file on the internal vfsmount. MAP_LARGEPAGE
> > is a modifier of MAP_ANONYMOUS and so must be specified with it. The
> > region will behave the same as a MAP_ANONYMOUS region using small pages.
>
> Does this flag provide functionality analogous to shmget(SHM_HUGETLB)?
> If so, would it not make sense to name it similarly (i.e.,
> MAP_HUGETLB)?
>
> Cheers,
>
> Michael
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Watch my Linux system programming book progress to publication!
> http://blog.man7.org/
>
I have no particular attachment to MAP_LARGEPAGE; I will make this change for V2.
--
Eric B Munson
IBM Linux Technology Center
ebmunson@us.ibm.com
* Re: [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions
2009-08-11 22:13 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Eric B Munson
2009-08-11 22:13 ` [PATCH 3/3] Add MAP_LARGEPAGE example to vm/hugetlbpage.txt Eric B Munson
2009-08-12 5:07 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Michael Kerrisk
@ 2009-08-12 5:45 ` Pekka Enberg
2 siblings, 0 replies; 12+ messages in thread
From: Pekka Enberg @ 2009-08-12 5:45 UTC (permalink / raw)
To: Eric B Munson
Cc: linux-kernel, linux-mm, linux-man, mtk.manpages, Andrew Morton
Hi Eric,
On Wed, Aug 12, 2009 at 1:13 AM, Eric B Munson<ebmunson@us.ibm.com> wrote:
> This patch adds a flag for mmap that will be used to request a huge
> page region that will look like anonymous memory to user space. This
> is accomplished by using a file on the internal vfsmount. MAP_LARGEPAGE
> is a modifier of MAP_ANONYMOUS and so must be specified with it. The
> region will behave the same as a MAP_ANONYMOUS region using small pages.
>
> Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
I would love to see something like this in the kernel. Huge pages are
useful for garbage collection and JIT text in the userspace but
unfortunately obtaining them is a real PITA at the moment.
Is there any way to drop the
CAP_IPC_LOCK/in_group_p(hugetlbfs_shm_group) requirement, btw? That
would make huge pages even more accessible to user-space virtual
machines.
Pekka
> ---
> include/asm-generic/mman-common.h | 1 +
> include/linux/hugetlb.h | 7 +++++++
> mm/mmap.c | 16 ++++++++++++++++
> 3 files changed, 24 insertions(+), 0 deletions(-)
>
> diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
> index 3b69ad3..60b6be7 100644
> --- a/include/asm-generic/mman-common.h
> +++ b/include/asm-generic/mman-common.h
> @@ -19,6 +19,7 @@
> #define MAP_TYPE 0x0f /* Mask for type of mapping */
> #define MAP_FIXED 0x10 /* Interpret addr exactly */
> #define MAP_ANONYMOUS 0x20 /* don't use a file */
> +#define MAP_LARGEPAGE 0x40 /* create a large page mapping */
>
> #define MS_ASYNC 1 /* sync memory asynchronously */
> #define MS_INVALIDATE 2 /* invalidate the caches */
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 78b6ddf..b84361c 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -109,12 +109,19 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
>
> #endif /* !CONFIG_HUGETLB_PAGE */
>
> +#define HUGETLB_ANON_FILE "anon_hugepage"
> +
> enum {
> /*
> * The file will be used as an shm file so shmfs accounting rules
> * apply
> */
> HUGETLB_SHMFS_INODE = 0x01,
> + /*
> + * The file is being created on the internal vfs mount and shmfs
> + * accounting rules do not apply
> + */
> + HUGETLB_ANONHUGE_INODE = 0x02,
> };
>
> #ifdef CONFIG_HUGETLBFS
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 34579b2..c2c729a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -29,6 +29,7 @@
> #include <linux/rmap.h>
> #include <linux/mmu_notifier.h>
> #include <linux/perf_counter.h>
> +#include <linux/hugetlb.h>
>
> #include <asm/uaccess.h>
> #include <asm/cacheflush.h>
> @@ -954,6 +955,21 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
> if (mm->map_count > sysctl_max_map_count)
> return -ENOMEM;
>
> + if (flags & MAP_LARGEPAGE) {
> + if (file)
> + return -EINVAL;
> +
> + /*
> + * VM_NORESERVE is used because the reservations will be
> + * taken when vm_ops->mmap() is called
> + */
> + len = ALIGN(len, huge_page_size(&default_hstate));
> + file = hugetlb_file_setup(HUGETLB_ANON_FILE, len, VM_NORESERVE,
> + HUGETLB_ANONHUGE_INODE);
> + if (IS_ERR(file))
> + return -ENOMEM;
> + }
> +
> /* Obtain the address to map to. we verify (or select) it and ensure
> * that it represents a valid section of the address space.
> */
> --
> 1.6.3.2
>
* [PATCH 0/3] Add pseudo-anonymous huge page mappings V4
@ 2009-08-25 11:14 Eric B Munson
2009-08-25 11:14 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-25 11:14 UTC (permalink / raw)
To: linux-kernel, linux-mm, akpm
Cc: linux-man, mtk.manpages, randy.dunlap, Eric B Munson
This patch set adds a flag to mmap that allows the user to request
a mapping to be backed with huge pages. This mapping will borrow
functionality from the huge page shm code to create a file on the
kernel internal mount and use it to approximate an anonymous mapping.
The MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS and will not work
unless both flags are present.
A new flag is necessary because there is no other way to hook into
huge pages without creating a file on a hugetlbfs mount which
wouldn't be MAP_ANONYMOUS.
To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.
This patch set is meant to simplify the programming model. Presently
there is a large chunk of boilerplate code, contained in libhugetlbfs,
required to create private, hugepage-backed mappings. This patch set
would allow use of hugepages without linking to libhugetlbfs or having
hugetlbfs mounted.
Unification of the VM code would provide these same benefits, but it
has been resisted each time that it has been suggested for several
reasons: it would break PAGE_SIZE assumptions across the kernel, it
makes page-table abstractions really expensive, and it incurs fast
path penalties on architectures that do not support huge pages
without providing any benefit there.
Eric B Munson (3):
hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on
the vfs internal mount
Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions
Add MAP_HUGETLB example
Documentation/vm/00-INDEX | 2 +
Documentation/vm/hugetlbpage.txt | 14 ++++---
Documentation/vm/map_hugetlb.c | 77 +++++++++++++++++++++++++++++++++++++
fs/hugetlbfs/inode.c | 21 ++++++++--
include/asm-generic/mman-common.h | 1 +
include/linux/hugetlb.h | 19 ++++++++-
ipc/shm.c | 2 +-
mm/mmap.c | 19 +++++++++
8 files changed, 142 insertions(+), 13 deletions(-)
create mode 100644 Documentation/vm/map_hugetlb.c
* [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-25 11:14 [PATCH 0/3] Add pseudo-anonymous huge page mappings V4 Eric B Munson
@ 2009-08-25 11:14 ` Eric B Munson
2009-08-26 19:34 ` David Rientjes
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-25 11:14 UTC (permalink / raw)
To: linux-kernel, linux-mm, akpm
Cc: linux-man, mtk.manpages, randy.dunlap, Eric B Munson
There are two means of creating mappings backed by huge pages:
1. mmap() a file created on hugetlbfs
2. Use shm which creates a file on an internal mount which essentially
maps it MAP_SHARED
The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlb_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
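The first posting of this patch tested the flag with a bit mask (creat_flags & HUGETLB_SHMFS_INODE), while this version compares for equality. A small model of the difference, assuming the second flag keeps the value 0x02 that HUGETLB_ANONHUGE_INODE had in the first posting; the function names are hypothetical.

```c
enum {
	HUGETLB_SHMFS_INODE = 1,	/* value from this patch */
	HUGETLB_ANONHUGE_INODE = 2,	/* assumed: value from the first posting */
};

/* Gate as written in the first posting: passes when the shm bit is set. */
int shm_gate_mask(int creat_flags)
{
	return (creat_flags & HUGETLB_SHMFS_INODE) != 0;
}

/* Gate as written in this version: passes only for exactly the shm value. */
int shm_gate_equal(int creat_flags)
{
	return creat_flags == HUGETLB_SHMFS_INODE;
}
```

Under these assumptions the two gates agree for each flag passed on its own and differ only if both values were ever combined in one argument, where the equality form rejects the request.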
Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
---
fs/hugetlbfs/inode.c | 21 +++++++++++++++++----
include/linux/hugetlb.h | 12 ++++++++++--
ipc/shm.c | 2 +-
3 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index cb88dac..5584d55 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -506,6 +506,13 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, uid_t uid,
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
INIT_LIST_HEAD(&inode->i_mapping->private_list);
info = HUGETLBFS_I(inode);
+ /*
+ * The policy is initialized here even if we are creating a
+ * private inode because initialization simply creates
+ * an empty rb tree and calls spin_lock_init(), later when we
+ * call mpol_free_shared_policy() it will just return because
+ * the rb tree will still be empty.
+ */
mpol_shared_policy_init(&info->policy, NULL);
switch (mode & S_IFMT) {
default:
@@ -930,13 +937,19 @@ static struct file_system_type hugetlbfs_fs_type = {
static struct vfsmount *hugetlbfs_vfsmount;
-static int can_do_hugetlb_shm(void)
+static int can_do_hugetlb_shm(int creat_flags)
{
- return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
+ if (creat_flags != HUGETLB_SHMFS_INODE)
+ return 0;
+ if (capable(CAP_IPC_LOCK))
+ return 1;
+ if (in_group_p(sysctl_hugetlb_shm_group))
+ return 1;
+ return 0;
}
struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
- struct user_struct **user)
+ struct user_struct **user, int creat_flags)
{
int error = -ENOMEM;
struct file *file;
@@ -948,7 +961,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
if (!hugetlbfs_vfsmount)
return ERR_PTR(-ENOENT);
- if (!can_do_hugetlb_shm()) {
+ if (!can_do_hugetlb_shm(creat_flags)) {
*user = current_user();
if (user_shm_lock(size, *user)) {
WARN_ONCE(1,
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5cbc620..38bb552 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -110,6 +110,14 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
#endif /* !CONFIG_HUGETLB_PAGE */
+enum {
+ /*
+ * The file will be used as an shm file so shmfs accounting rules
+ * apply
+ */
+ HUGETLB_SHMFS_INODE = 1,
+};
+
#ifdef CONFIG_HUGETLBFS
struct hugetlbfs_config {
uid_t uid;
@@ -148,7 +156,7 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
extern const struct file_operations hugetlbfs_file_operations;
extern struct vm_operations_struct hugetlb_vm_ops;
struct file *hugetlb_file_setup(const char *name, size_t size, int acct,
- struct user_struct **user);
+ struct user_struct **user, int creat_flags);
int hugetlb_get_quota(struct address_space *mapping, long delta);
void hugetlb_put_quota(struct address_space *mapping, long delta);
@@ -170,7 +178,7 @@ static inline void set_file_hugepages(struct file *file)
#define is_file_hugepages(file) 0
#define set_file_hugepages(file) BUG()
-#define hugetlb_file_setup(name,size,acct,user) ERR_PTR(-ENOSYS)
+#define hugetlb_file_setup(name,size,acct,user,creat) ERR_PTR(-ENOSYS)
#endif /* !CONFIG_HUGETLBFS */
diff --git a/ipc/shm.c b/ipc/shm.c
index 1bc4701..5ba4962 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -370,7 +370,7 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
if (shmflg & SHM_NORESERVE)
acctflag = VM_NORESERVE;
file = hugetlb_file_setup(name, size, acctflag,
- &shp->mlock_user);
+ &shp->mlock_user, HUGETLB_SHMFS_INODE);
} else {
/*
* Do not allow no accounting for OVERCOMMIT_NEVER, even
--
1.6.3.2
* Re: [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-25 11:14 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
@ 2009-08-26 19:34 ` David Rientjes
0 siblings, 0 replies; 12+ messages in thread
From: David Rientjes @ 2009-08-26 19:34 UTC (permalink / raw)
To: Eric B Munson
Cc: linux-kernel, linux-mm, akpm, linux-man, mtk.manpages, randy.dunlap
On Tue, 25 Aug 2009, Eric B Munson wrote:
> There are two means of creating mappings backed by huge pages:
>
> 1. mmap() a file created on hugetlbfs
> 2. Use shm which creates a file on an internal mount which essentially
> maps it MAP_SHARED
>
> The internal mount is only used for shared mappings but there is very
> little that stops it being used for private mappings. This patch extends
> hugetlbfs_file_setup() to deal with the creation of files that will be
> mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
> used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
>
> Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
* [PATCH 0/3] Add pseudo-anonymous huge page mappings V4
@ 2009-08-26 10:44 Eric B Munson
2009-08-26 10:44 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-26 10:44 UTC (permalink / raw)
To: linux-kernel, linux-mm, akpm
Cc: linux-man, mtk.manpages, randy.dunlap, Eric B Munson
This patch set adds a flag to mmap that allows the user to request
a mapping to be backed with huge pages. This mapping will borrow
functionality from the huge page shm code to create a file on the
kernel internal mount and use it to approximate an anonymous mapping.
The MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS and will not work
unless both flags are present.
A new flag is necessary because there is no other way to hook into
huge pages without creating a file on a hugetlbfs mount which
wouldn't be MAP_ANONYMOUS.
To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.
This patch set is meant to simplify the programming model. Presently
there is a large chunk of boilerplate code, contained in libhugetlbfs,
required to create private, hugepage-backed mappings. This patch set
would allow use of hugepages without linking to libhugetlbfs or having
hugetlbfs mounted.
Unification of the VM code would provide these same benefits, but it
has been resisted each time that it has been suggested for several
reasons: it would break PAGE_SIZE assumptions across the kernel, it
makes page-table abstractions really expensive, and it incurs fast
path penalties on architectures that do not support huge pages
without providing any benefit there.
Eric B Munson (3):
hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on
the vfs internal mount
Add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions
Add MAP_HUGETLB example
Documentation/vm/00-INDEX | 2 +
Documentation/vm/hugetlbpage.txt | 14 ++++---
Documentation/vm/map_hugetlb.c | 77 +++++++++++++++++++++++++++++++++++++
fs/hugetlbfs/inode.c | 21 ++++++++--
include/asm-generic/mman-common.h | 1 +
include/linux/hugetlb.h | 19 ++++++++-
ipc/shm.c | 2 +-
mm/mmap.c | 19 +++++++++
8 files changed, 142 insertions(+), 13 deletions(-)
create mode 100644 Documentation/vm/map_hugetlb.c
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-26 10:44 [PATCH 0/3] Add pseudo-anonymous huge page mappings V4 Eric B Munson
@ 2009-08-26 10:44 ` Eric B Munson
2009-08-27 14:18 ` Mel Gorman
0 siblings, 1 reply; 12+ messages in thread
From: Eric B Munson @ 2009-08-26 10:44 UTC (permalink / raw)
To: linux-kernel, linux-mm, akpm
Cc: linux-man, mtk.manpages, randy.dunlap, Eric B Munson
There are two means of creating mappings backed by huge pages:
1. mmap() a file created on hugetlbfs
2. Use shm which creates a file on an internal mount which essentially
maps it MAP_SHARED
The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlb_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
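For readers unfamiliar with the second method listed above, a minimal
userspace sketch follows. The helper name shm_attach_huge() is
hypothetical, and the fallback value for SHM_HUGETLB is an assumption
for headers that do not define it. The segment is marked for removal
right after attaching so that it disappears on detach.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000 /* assumption: Linux value, for old headers only */
#endif

/*
 * Hypothetical helper: create and attach a private SysV shm segment,
 * asking for huge page backing first and falling back to normal pages
 * if the kernel refuses. *used_huge is 1 only if the attached segment
 * is the huge page one. Returns NULL on failure.
 */
static void *shm_attach_huge(size_t len, int *used_huge)
{
	void *p;
	int id;

	assert(used_huge != NULL);

	id = shmget(IPC_PRIVATE, len, IPC_CREAT | SHM_HUGETLB | 0600);
	*used_huge = (id >= 0);
	if (id < 0)
		id = shmget(IPC_PRIVATE, len, IPC_CREAT | 0600);
	if (id < 0)
		return NULL;

	p = shmat(id, NULL, 0);
	/* Mark for destruction now; the segment goes away on last detach. */
	shmctl(id, IPC_RMID, NULL);

	if (p == (void *)-1) {
		*used_huge = 0;
		return NULL;
	}
	return p;
}
```

The region is released with shmdt(). Note that this path always yields
a shared mapping, which is the gap the MAP_PRIVATE support in this
patch is meant to close.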
Signed-off-by: Eric Munson <ebmunson@us.ibm.com>
---
fs/hugetlbfs/inode.c | 21 +++++++++++++++++----
include/linux/hugetlb.h | 12 ++++++++++--
ipc/shm.c | 2 +-
3 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index cb88dac..5584d55 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -506,6 +506,13 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb, uid_t uid,
inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
INIT_LIST_HEAD(&inode->i_mapping->private_list);
info = HUGETLBFS_I(inode);
+ /*
+ * The policy is initialized here even if we are creating a
+ * private inode because initialization simply creates an
+ * empty rb tree and calls spin_lock_init(). Later, when we
+ * call mpol_free_shared_policy(), it will just return because
+ * the rb tree will still be empty.
+ */
mpol_shared_policy_init(&info->policy, NULL);
switch (mode & S_IFMT) {
default:
@@ -930,13 +937,19 @@ static struct file_system_type hugetlbfs_fs_type = {
static struct vfsmount *hugetlbfs_vfsmount;
-static int can_do_hugetlb_shm(void)
+static int can_do_hugetlb_shm(int creat_flags)
{
- return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
+ if (creat_flags != HUGETLB_SHMFS_INODE)
+ return 0;
+ if (capable(CAP_IPC_LOCK))
+ return 1;
+ if (in_group_p(sysctl_hugetlb_shm_group))
+ return 1;
+ return 0;
}
struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
- struct user_struct **user)
+ struct user_struct **user, int creat_flags)
{
int error = -ENOMEM;
struct file *file;
@@ -948,7 +961,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
if (!hugetlbfs_vfsmount)
return ERR_PTR(-ENOENT);
- if (!can_do_hugetlb_shm()) {
+ if (!can_do_hugetlb_shm(creat_flags)) {
*user = current_user();
if (user_shm_lock(size, *user)) {
WARN_ONCE(1,
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5cbc620..38bb552 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -110,6 +110,14 @@ static inline void hugetlb_report_meminfo(struct seq_file *m)
#endif /* !CONFIG_HUGETLB_PAGE */
+enum {
+ /*
+ * The file will be used as an shm file so shmfs accounting rules
+ * apply
+ */
+ HUGETLB_SHMFS_INODE = 1,
+};
+
#ifdef CONFIG_HUGETLBFS
struct hugetlbfs_config {
uid_t uid;
@@ -148,7 +156,7 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
extern const struct file_operations hugetlbfs_file_operations;
extern struct vm_operations_struct hugetlb_vm_ops;
struct file *hugetlb_file_setup(const char *name, size_t size, int acct,
- struct user_struct **user);
+ struct user_struct **user, int creat_flags);
int hugetlb_get_quota(struct address_space *mapping, long delta);
void hugetlb_put_quota(struct address_space *mapping, long delta);
@@ -170,7 +178,7 @@ static inline void set_file_hugepages(struct file *file)
#define is_file_hugepages(file) 0
#define set_file_hugepages(file) BUG()
-#define hugetlb_file_setup(name,size,acct,user) ERR_PTR(-ENOSYS)
+#define hugetlb_file_setup(name,size,acct,user,creat) ERR_PTR(-ENOSYS)
#endif /* !CONFIG_HUGETLBFS */
diff --git a/ipc/shm.c b/ipc/shm.c
index 1bc4701..5ba4962 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -370,7 +370,7 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
if (shmflg & SHM_NORESERVE)
acctflag = VM_NORESERVE;
file = hugetlb_file_setup(name, size, acctflag,
- &shp->mlock_user);
+ &shp->mlock_user, HUGETLB_SHMFS_INODE);
} else {
/*
* Do not allow no accounting for OVERCOMMIT_NEVER, even
--
1.6.3.2
* Re: [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-26 10:44 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
@ 2009-08-27 14:18 ` Mel Gorman
2009-08-27 15:11 ` Eric B Munson
0 siblings, 1 reply; 12+ messages in thread
From: Mel Gorman @ 2009-08-27 14:18 UTC (permalink / raw)
To: Eric B Munson
Cc: linux-kernel, linux-mm, akpm, linux-man, mtk.manpages, randy.dunlap
On Wed, Aug 26, 2009 at 11:44:51AM +0100, Eric B Munson wrote:
> There are two means of creating mappings backed by huge pages:
>
> 1. mmap() a file created on hugetlbfs
> 2. Use shm which creates a file on an internal mount which essentially
> maps it MAP_SHARED
>
> The internal mount is only used for shared mappings but there is very
> little that stops it being used for private mappings. This patch extends
> hugetlbfs_file_setup() to deal with the creation of files that will be
> mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
> used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
>
Hi Eric,
I ran these patches through a series of small tests and I have just one
concern with the changes made to can_do_hugetlb_shm(). If that returns false
because of MAP_HUGETLB, we then proceed to call user_shm_lock(). I think your
intention might have been something like the following patch on top of yours?
For what it's worth, once this was applied, I didn't spot any other
problems, run-time or otherwise.
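The difference between the two checks can be modelled in plain C as
below. This is only a userspace sketch of the condition guarding the
user_shm_lock() block, not kernel code: the "privileged" argument
stands in for capable(CAP_IPC_LOCK) || in_group_p(...), and
MODEL_MMAP_INODE is a hypothetical name for whatever non-shm value the
mmap() path passes as creat_flags.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	HUGETLB_SHMFS_INODE = 1,	/* from the patch */
	MODEL_MMAP_INODE = 2,		/* hypothetical: file created for mmap() */
};

/* Original check: !can_do_hugetlb_shm(creat_flags) */
static bool enters_shm_lock_orig(int creat_flags, bool privileged)
{
	bool can_do = (creat_flags == HUGETLB_SHMFS_INODE) && privileged;

	return !can_do;
}

/* Fixed check: creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm() */
static bool enters_shm_lock_fixed(int creat_flags, bool privileged)
{
	return creat_flags == HUGETLB_SHMFS_INODE && !privileged;
}

/* Walk the cases: the two checks differ only for non-shm inodes. */
static void check_model(void)
{
	/* shm files behave the same under both checks */
	assert(enters_shm_lock_orig(HUGETLB_SHMFS_INODE, false));
	assert(enters_shm_lock_fixed(HUGETLB_SHMFS_INODE, false));
	assert(!enters_shm_lock_orig(HUGETLB_SHMFS_INODE, true));
	assert(!enters_shm_lock_fixed(HUGETLB_SHMFS_INODE, true));

	/* mmap() files: the original always enters the block, the fix never does */
	assert(enters_shm_lock_orig(MODEL_MMAP_INODE, true));
	assert(enters_shm_lock_orig(MODEL_MMAP_INODE, false));
	assert(!enters_shm_lock_fixed(MODEL_MMAP_INODE, true));
	assert(!enters_shm_lock_fixed(MODEL_MMAP_INODE, false));
}
```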
=====
hugetlbfs: Do not call user_shm_lock() for MAP_HUGETLB
The patch
hugetlbfs-allow-the-creation-of-files-suitable-for-map_private-on-the-vfs-internal-mount.patch
alters can_do_hugetlb_shm() to check if a file is being created for shared
memory or mmap(). If this returns false, we then unconditionally call
user_shm_lock() triggering a warning. This block should never be entered
for MAP_HUGETLB. This patch partially reverts the problematic change and fixes the check.
This patch should be considered a fix to
hugetlbfs-allow-the-creation-of-files-suitable-for-map_private-on-the-vfs-internal-mount.patch.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
fs/hugetlbfs/inode.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 49d2bf9..c944cc1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -910,15 +910,9 @@ static struct file_system_type hugetlbfs_fs_type = {
static struct vfsmount *hugetlbfs_vfsmount;
-static int can_do_hugetlb_shm(int creat_flags)
+static int can_do_hugetlb_shm(void)
{
- if (creat_flags != HUGETLB_SHMFS_INODE)
- return 0;
- if (capable(CAP_IPC_LOCK))
- return 1;
- if (in_group_p(sysctl_hugetlb_shm_group))
- return 1;
- return 0;
+ return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
}
struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
@@ -934,7 +928,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
if (!hugetlbfs_vfsmount)
return ERR_PTR(-ENOENT);
- if (!can_do_hugetlb_shm(creat_flags)) {
+ if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
*user = current_user();
if (user_shm_lock(size, *user)) {
WARN_ONCE(1,
* Re: [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
2009-08-27 14:18 ` Mel Gorman
@ 2009-08-27 15:11 ` Eric B Munson
0 siblings, 0 replies; 12+ messages in thread
From: Eric B Munson @ 2009-08-27 15:11 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-kernel, linux-mm, akpm, linux-man, mtk.manpages, randy.dunlap
On Thu, 27 Aug 2009, Mel Gorman wrote:
> On Wed, Aug 26, 2009 at 11:44:51AM +0100, Eric B Munson wrote:
> > There are two means of creating mappings backed by huge pages:
> >
> > 1. mmap() a file created on hugetlbfs
> > 2. Use shm which creates a file on an internal mount which essentially
> > maps it MAP_SHARED
> >
> > The internal mount is only used for shared mappings but there is very
> > little that stops it being used for private mappings. This patch extends
> > hugetlbfs_file_setup() to deal with the creation of files that will be
> > mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
> > used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
> >
>
> Hi Eric,
>
> I ran these patches through a series of small tests and I have just one
> concern with the changes made to can_do_hugetlb_shm(). If that returns false
> because of MAP_HUGETLB, we then proceed to call user_shm_lock(). I think your
> intention might have been something like the following patch on top of yours?
>
> For what it's worth, once this was applied, I didn't spot any other
> problems, run-time or otherwise.
>
I am seeing the same thing: the terminal reports a segfault with no memory,
and dmesg complains about SHM. Your patch fixes the issue. Thanks.
> =====
> hugetlbfs: Do not call user_shm_lock() for MAP_HUGETLB
>
> The patch
> hugetlbfs-allow-the-creation-of-files-suitable-for-map_private-on-the-vfs-internal-mount.patch
> alters can_do_hugetlb_shm() to check if a file is being created for shared
> memory or mmap(). If this returns false, we then unconditionally call
> user_shm_lock() triggering a warning. This block should never be entered
> for MAP_HUGETLB. This patch partially reverts the problem and fixes the check.
>
> This patch should be considered a fix to
> hugetlbfs-allow-the-creation-of-files-suitable-for-map_private-on-the-vfs-internal-mount.patch.
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> ---
> fs/hugetlbfs/inode.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 49d2bf9..c944cc1 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -910,15 +910,9 @@ static struct file_system_type hugetlbfs_fs_type = {
>
> static struct vfsmount *hugetlbfs_vfsmount;
>
> -static int can_do_hugetlb_shm(int creat_flags)
> +static int can_do_hugetlb_shm(void)
> {
> - if (creat_flags != HUGETLB_SHMFS_INODE)
> - return 0;
> - if (capable(CAP_IPC_LOCK))
> - return 1;
> - if (in_group_p(sysctl_hugetlb_shm_group))
> - return 1;
> - return 0;
> + return capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group);
> }
>
> struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
> @@ -934,7 +928,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size, int acctflag,
> if (!hugetlbfs_vfsmount)
> return ERR_PTR(-ENOENT);
>
> - if (!can_do_hugetlb_shm(creat_flags)) {
> + if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
> *user = current_user();
> if (user_shm_lock(size, *user)) {
> WARN_ONCE(1,
>
>
--
Eric B Munson
IBM Linux Technology Center
ebmunson@us.ibm.com
Thread overview: 12+ messages
2009-08-11 22:13 [PATCH 0/3] Add pseudo-anonymous huge page mappings Eric B Munson
2009-08-11 22:13 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
2009-08-11 22:13 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Eric B Munson
2009-08-11 22:13 ` [PATCH 3/3] Add MAP_LARGEPAGE example to vm/hugetlbpage.txt Eric B Munson
2009-08-12 5:07 ` [PATCH 2/3] Add MAP_LARGEPAGE for mmaping pseudo-anonymous huge page regions Michael Kerrisk
2009-08-12 9:08 ` Eric B Munson
2009-08-12 5:45 ` Pekka Enberg
2009-08-25 11:14 [PATCH 0/3] Add pseudo-anonymous huge page mappings V4 Eric B Munson
2009-08-25 11:14 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
2009-08-26 19:34 ` David Rientjes
2009-08-26 10:44 [PATCH 0/3] Add pseudo-anonymous huge page mappings V4 Eric B Munson
2009-08-26 10:44 ` [PATCH 1/3] hugetlbfs: Allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount Eric B Munson
2009-08-27 14:18 ` Mel Gorman
2009-08-27 15:11 ` Eric B Munson