* [Patch 0/7] Mlock: doc, patch grouping and error return cleanups
From: Lee Schermerhorn @ 2008-08-22 21:10 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
The seven patches introduced by this message are against:
2.6.27-rc3-mmotm-080821-0003
These patches replace the series of 6 patches that I posted at:
http://marc.info/?l=linux-mm&m=121917996115763&w=4
Those patches themselves replaced the series of 5 RFC patches
posted by Kosaki Motohiro at:
http://marc.info/?l=linux-mm&m=121843816412096&w=4
Patch 1/7 is a rework of KM's cleanup of the __mlock_vma_pages_range()
comment block. I tried to follow kerneldoc format. Randy will tell me if
I made a mistake :)
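For reference, the general shape kernel-doc expects [a generic template,
not the actual comment from the patch]:

/**
 * function_name() - Short one-line description.
 * @arg1: description of the first parameter
 * @arg2: description of the second parameter
 *
 * Longer description, including locking and return-value
 * conventions, goes after a blank comment line.
 */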
Patch 2/7 is a rework of KM's patch to remove the locked_vm
adjustments for "special vmas" during mmap() processing. Kosaki-san
wanted to "kill" this adjustment. After discussion, he requested that
it be resubmitted as a separate patch. This is the first step in providing
the separate patch [even tho' I consider this part of correctly "handling
mlocked pages during mmap()..."].
Patch 3/7 resubmits the locked_vm adjustment during mmap(MAP_LOCKED) to
match the explicit mlock() behavior.
Patch 4/7 is KM's patch to change the error return for mlock
when, after downgrading the mmap semaphore to read during population of
the vma and switching back to write lock as our callers expect, the
vma that we just locked no longer covers the range we expected. See
the description.
Patch 5/7 is a new patch to ensure that locked_vm is updated correctly
when munmap()ing an mlocked region.
Patch 6/7 backs out a mainline patch to make_pages_present() that adjusted
the error return to match the Posix specification for mlock error
returns. make_pages_present() is used by callers other than mlock(), so this
isn't really the appropriate place to make the change, even tho'
apparently only mlock() looks at the return value from make_pages_present().
Patch 7/7 fixes the mlock error return to be Posixly Correct in the
appropriate [IMO] paths in mlock.c. Reworked in this version to
hide pte population errors [get_user_pages()] during mlock from mmap()
and related callers.
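For reference, the translation that patch 7/7 adds [its
__mlock_posix_error_return() helper, reproduced here] maps get_user_pages()
return values onto the Posix-specified mlock() errors:

/*
 * Posix wants ENOMEM where part of the range was not mapped
 * [get_user_pages() returns -EFAULT], and EAGAIN where memory
 * could not be locked [get_user_pages() returns -ENOMEM].
 */
static int __mlock_posix_error_return(long retval)
{
	if (retval == -EFAULT)
		retval = -ENOMEM;
	else if (retval == -ENOMEM)
		retval = -EAGAIN;
	return retval;
}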
* [PATCH 1/7] Mlock: fix __mlock_vma_pages_range comment block
From: Lee Schermerhorn @ 2008-08-22 21:10 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
Against: 2.6.27-rc3-mmotm-080821-0003
Fix to: mmap-handle-mlocked-pages-during-map-remap-unmap.patch
The __mlock_vma_pages_range() comment block needs updating:
- it fails to mention the mlock parameter
- the function no longer requires that mmap_sem be held for write.
The following patch fixes it.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mlock.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
Index: linux-2.6.27-rc3-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc3-mmotm.orig/mm/mlock.c 2008-08-18 11:41:19.000000000 -0400
+++ linux-2.6.27-rc3-mmotm/mm/mlock.c 2008-08-18 11:48:13.000000000 -0400
@@ -112,12 +112,20 @@ static void munlock_vma_page(struct page
}
}
-/*
- * mlock a range of pages in the vma.
+/**
+ * __mlock_vma_pages_range() - mlock/munlock a range of pages in the vma.
+ * @vma: target vma
+ * @start: start address
+ * @end: end address
+ * @mlock: 0 indicates munlock, otherwise mlock.
+ *
+ * If @mlock == 0, unlock an mlocked range;
+ * else mlock the range of pages. This takes care of making the pages present,
+ * too.
*
- * This takes care of making the pages present too.
+ * return 0 on success, negative error code on error.
*
- * vma->vm_mm->mmap_sem must be held for write.
+ * vma->vm_mm->mmap_sem must be held for at least read.
*/
static int __mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
* [PATCH 2/7] Mlock: backout locked_vm adjustment during mmap()
From: Lee Schermerhorn @ 2008-08-22 21:10 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
Against: 2.6.27-rc3-mmotm-080821-0003
can be folded into: mmap-handle-mlocked-pages-during-map-remap-unmap.patch
Back out the mmap()-path locked_vm accounting adjustment from the "handle
mlocked pages during map/remap/unmap" patch. It will be resubmitted as a
separate patch with its own description.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mlock.c | 19 ++++++-------------
mm/mmap.c | 19 ++++++++-----------
2 files changed, 14 insertions(+), 24 deletions(-)
Index: linux-2.6.27-rc3-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc3-mmotm.orig/mm/mlock.c 2008-08-18 12:38:20.000000000 -0400
+++ linux-2.6.27-rc3-mmotm/mm/mlock.c 2008-08-18 12:44:16.000000000 -0400
@@ -246,7 +246,7 @@ int mlock_vma_pages_range(struct vm_area
unsigned long start, unsigned long end)
{
struct mm_struct *mm = vma->vm_mm;
- int nr_pages = (end - start) / PAGE_SIZE;
+ int error = 0;
BUG_ON(!(vma->vm_flags & VM_LOCKED));
/*
@@ -259,8 +259,7 @@ int mlock_vma_pages_range(struct vm_area
is_vm_hugetlb_page(vma) ||
vma == get_gate_vma(current))) {
downgrade_write(&mm->mmap_sem);
- nr_pages = __mlock_vma_pages_range(vma, start, end, 1);
-
+ error = __mlock_vma_pages_range(vma, start, end, 1);
up_read(&mm->mmap_sem);
/* vma can change or disappear */
down_write(&mm->mmap_sem);
@@ -268,22 +267,20 @@ int mlock_vma_pages_range(struct vm_area
/* non-NULL vma must contain @start, but need to check @end */
if (!vma || end > vma->vm_end)
return -EAGAIN;
- return nr_pages;
+ return error;
}
/*
* User mapped kernel pages or huge pages:
* make these pages present to populate the ptes, but
- * fall thru' to reset VM_LOCKED--no need to unlock, and
- * return nr_pages so these don't get counted against task's
- * locked limit. huge pages are already counted against
- * locked vm limit.
+ * fall thru' to reset VM_LOCKED so we don't try to munlock
+ * this vma during munmap()/munlock().
*/
make_pages_present(start, end);
no_mlock:
vma->vm_flags &= ~VM_LOCKED; /* and don't come back! */
- return nr_pages; /* pages NOT mlocked */
+ return error;
}
@@ -372,10 +369,6 @@ success:
downgrade_write(&mm->mmap_sem);
ret = __mlock_vma_pages_range(vma, start, end, 1);
- if (ret > 0) {
- mm->locked_vm -= ret;
- ret = 0;
- }
/*
* Need to reacquire mmap sem in write mode, as our callers
* expect this. We have no support for atomically upgrading
Index: linux-2.6.27-rc3-mmotm/mm/mmap.c
===================================================================
--- linux-2.6.27-rc3-mmotm.orig/mm/mmap.c 2008-08-18 12:38:20.000000000 -0400
+++ linux-2.6.27-rc3-mmotm/mm/mmap.c 2008-08-18 12:52:21.000000000 -0400
@@ -1224,10 +1224,10 @@ out:
/*
* makes pages present; downgrades, drops, reacquires mmap_sem
*/
- int nr_pages = mlock_vma_pages_range(vma, addr, addr + len);
- if (nr_pages < 0)
- return nr_pages; /* vma gone! */
- mm->locked_vm += (len >> PAGE_SHIFT) - nr_pages;
+ int error = mlock_vma_pages_range(vma, addr, addr + len);
+ if (error < 0)
+ return error; /* vma gone! */
+ mm->locked_vm += (len >> PAGE_SHIFT);
} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
make_pages_present(addr, addr + len);
return addr;
@@ -1702,8 +1702,7 @@ find_extend_vma(struct mm_struct *mm, un
if (!prev || expand_stack(prev, addr))
return NULL;
if (prev->vm_flags & VM_LOCKED) {
- int nr_pages = mlock_vma_pages_range(prev, addr, prev->vm_end);
- if (nr_pages < 0)
+ if (mlock_vma_pages_range(prev, addr, prev->vm_end) < 0)
return NULL; /* vma gone! */
}
return prev;
@@ -1732,8 +1731,7 @@ find_extend_vma(struct mm_struct * mm, u
if (expand_stack(vma, addr))
return NULL;
if (vma->vm_flags & VM_LOCKED) {
- int nr_pages = mlock_vma_pages_range(vma, addr, start);
- if (nr_pages < 0)
+ if (mlock_vma_pages_range(vma, addr, start) < 0)
return NULL; /* vma gone! */
}
return vma;
@@ -2068,9 +2066,8 @@ unsigned long do_brk(unsigned long addr,
out:
mm->total_vm += len >> PAGE_SHIFT;
if (flags & VM_LOCKED) {
- int nr_pages = mlock_vma_pages_range(vma, addr, addr + len);
- if (nr_pages >= 0)
- mm->locked_vm += (len >> PAGE_SHIFT) - nr_pages;
+ if (mlock_vma_pages_range(vma, addr, addr + len) >= 0)
+ mm->locked_vm += (len >> PAGE_SHIFT);
}
return addr;
}
* [PATCH 3/7] Mlock: resubmit locked_vm adjustment as separate patch
From: Lee Schermerhorn @ 2008-08-22 21:10 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
Applies atop:
mmap-handle-mlocked-pages-during-map-remap-unmap.patch
with the locked_vm adjustment backout patch.
Adjust mm->locked_vm in the mmap(MAP_LOCKED) path to match mlock()
behavior and VM_LOCKED flag setting.
Broken out as separate patch.
During mlock*(), mlock_fixup() adjusts locked_vm as appropriate,
based on the type of vma. For the "special" vmas--those whose
pages we don't actually mark as PageMlocked()--VM_LOCKED is not
set, so that we don't attempt to munlock the pages during munmap
or munlock, and so we don't need to duplicate the vma type filtering
there. These vmas are not included in locked_vm by mlock_fixup().
During mmap() and vma extension, locked_vm is adjusted outside of the
mlock functions. This patch enhances those paths to match the behavior
of mlock for the special vmas: mlock_vma_pages_range() now returns the
number of pages NOT mlocked [0 or positive], sharing the return value
with the possible error code [negative]. The caller adjusts locked_vm
by the non-negative return value. For "special" vmas, this will include
all pages mapped by the vma; for "normal" [anon, file-backed] vmas, the
adjustment should always be 0. See the caller-side sketch below.
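As a caller-side sketch [mirroring the mmap() hunk in this patch], the
convention looks like:

	long nr_pages = mlock_vma_pages_range(vma, addr, addr + len);
	if (nr_pages < 0)
		return nr_pages;	/* vma disappeared while mmap_sem was dropped */
	/* "special" vmas return their full page count; "normal" vmas return 0 */
	mm->locked_vm += (len >> PAGE_SHIFT) - nr_pages;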
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/internal.h | 2 +-
mm/mlock.c | 42 +++++++++++++++++++++++++++++++++---------
mm/mmap.c | 10 +++++-----
3 files changed, 39 insertions(+), 15 deletions(-)
Index: linux-2.6.27-rc4-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mlock.c 2008-08-21 10:52:24.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mlock.c 2008-08-21 11:37:45.000000000 -0400
@@ -127,7 +127,7 @@ static void munlock_vma_page(struct page
*
* vma->vm_mm->mmap_sem must be held for at least read.
*/
-static int __mlock_vma_pages_range(struct vm_area_struct *vma,
+static long __mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
int mlock)
{
@@ -229,7 +229,7 @@ static int __mlock_vma_pages_range(struc
/*
* Just make pages present if VM_LOCKED. No-op if unlocking.
*/
-static int __mlock_vma_pages_range(struct vm_area_struct *vma,
+static long __mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
int mlock)
{
@@ -240,13 +240,27 @@ static int __mlock_vma_pages_range(struc
#endif /* CONFIG_UNEVICTABLE_LRU */
/*
- * mlock all pages in this vma range. For mmap()/mremap()/...
+/**
+ * mlock_vma_pages_range() - mlock pages in specified vma range.
+ * @vma - the vma containing the specified address range
+ * @start - starting address in @vma to mlock
+ * @end - end address [+1] in @vma to mlock
+ *
+ * For mmap()/mremap()/expansion of mlocked vma.
+ *
+ * return 0 on success for "normal" vmas.
+ *
+ * return number of pages [> 0] to be removed from locked_vm on success
+ * of "special" vmas.
+ *
+ * return negative error if vma spanning @start-@end disappears while
+ * mmap semaphore is dropped. Unlikely?
*/
-int mlock_vma_pages_range(struct vm_area_struct *vma,
+long mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
struct mm_struct *mm = vma->vm_mm;
- int error = 0;
+ int nr_pages = (end - start) / PAGE_SIZE;
BUG_ON(!(vma->vm_flags & VM_LOCKED));
/*
@@ -258,8 +272,11 @@ int mlock_vma_pages_range(struct vm_area
if (!((vma->vm_flags & (VM_DONTEXPAND | VM_RESERVED)) ||
is_vm_hugetlb_page(vma) ||
vma == get_gate_vma(current))) {
+ long error;
downgrade_write(&mm->mmap_sem);
+
error = __mlock_vma_pages_range(vma, start, end, 1);
+
up_read(&mm->mmap_sem);
/* vma can change or disappear */
down_write(&mm->mmap_sem);
@@ -267,20 +284,23 @@ int mlock_vma_pages_range(struct vm_area
/* non-NULL vma must contain @start, but need to check @end */
if (!vma || end > vma->vm_end)
return -EAGAIN;
- return error;
+
+ return 0; /* hide other errors from mmap(), et al */
}
/*
* User mapped kernel pages or huge pages:
* make these pages present to populate the ptes, but
- * fall thru' to reset VM_LOCKED so we don't try to munlock
- * this vma during munmap()/munlock().
+ * fall thru' to reset VM_LOCKED--no need to unlock, and
+ * return nr_pages so these don't get counted against task's
+ * locked limit. huge pages are already counted against
+ * locked vm limit.
*/
make_pages_present(start, end);
no_mlock:
vma->vm_flags &= ~VM_LOCKED; /* and don't come back! */
- return error;
+ return nr_pages; /* error or pages NOT mlocked */
}
@@ -369,6 +389,10 @@ success:
downgrade_write(&mm->mmap_sem);
ret = __mlock_vma_pages_range(vma, start, end, 1);
+ if (ret > 0) {
+ mm->locked_vm -= ret;
+ ret = 0;
+ }
/*
* Need to reacquire mmap sem in write mode, as our callers
* expect this. We have no support for atomically upgrading
Index: linux-2.6.27-rc4-mmotm/mm/mmap.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mmap.c 2008-08-21 10:52:24.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mmap.c 2008-08-21 11:45:44.000000000 -0400
@@ -1224,10 +1224,10 @@ out:
/*
* makes pages present; downgrades, drops, reacquires mmap_sem
*/
- int error = mlock_vma_pages_range(vma, addr, addr + len);
- if (error < 0)
- return error; /* vma gone! */
- mm->locked_vm += (len >> PAGE_SHIFT);
+ long nr_pages = mlock_vma_pages_range(vma, addr, addr + len);
+ if (nr_pages < 0)
+ return nr_pages; /* vma gone! */
+ mm->locked_vm += (len >> PAGE_SHIFT) - nr_pages;
} else if ((flags & MAP_POPULATE) && !(flags & MAP_NONBLOCK))
make_pages_present(addr, addr + len);
return addr;
@@ -2066,7 +2066,7 @@ unsigned long do_brk(unsigned long addr,
out:
mm->total_vm += len >> PAGE_SHIFT;
if (flags & VM_LOCKED) {
- if (mlock_vma_pages_range(vma, addr, addr + len) >= 0)
+ if (!mlock_vma_pages_range(vma, addr, addr + len))
mm->locked_vm += (len >> PAGE_SHIFT);
}
return addr;
Index: linux-2.6.27-rc4-mmotm/mm/internal.h
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/internal.h 2008-08-21 10:52:00.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/internal.h 2008-08-21 11:10:50.000000000 -0400
@@ -61,7 +61,7 @@ static inline unsigned long page_order(s
return page_private(page);
}
-extern int mlock_vma_pages_range(struct vm_area_struct *vma,
+extern long mlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
extern void munlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
* [PATCH 4/7] Mlock: fix return value for munmap/mlock vma race
From: Lee Schermerhorn @ 2008-08-22 21:10 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
atop patch:
mmap-handle-mlocked-pages-during-map-remap-unmap.patch
with locked_vm adjustment backout patch.
We now call downgrade_write(&mm->mmap_sem) at the beginning of mlock,
which increases mlock scalability.
But if mlock and munmap race, we can find the vma gone when we
reacquire the semaphore. In that case the kernel should return ENOMEM,
because calling mlock after munmap returns ENOMEM.
(In addition, EAGAIN means "please try again", but calling mlock()
again would just produce the error again.)
This is a theoretical issue: I can't reproduce the vma disappearing
on my box, but fixing it is better. An illustrative sketch of the
race follows below.
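A minimal userspace sketch of the race [not from the patch; whether the
unmap wins depends entirely on scheduling]:

#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define LEN	(1024UL * 1024)

static void *region;

static void *unmapper(void *arg)
{
	munmap(region, LEN);		/* races with the mlock() below */
	return NULL;
}

int main(void)
{
	pthread_t t;

	region = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (region == MAP_FAILED)
		return 1;
	pthread_create(&t, NULL, unmapper, NULL);
	if (mlock(region, LEN) < 0)	/* expect ENOMEM if the unmap won */
		printf("mlock: %s\n", strerror(errno));
	pthread_join(t, NULL);
	return 0;
}

Build with: gcc -o race race.c -lpthread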
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
mm/mlock.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6.27-rc4-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mlock.c 2008-08-21 11:37:45.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mlock.c 2008-08-21 11:58:05.000000000 -0400
@@ -283,7 +283,7 @@ long mlock_vma_pages_range(struct vm_are
vma = find_vma(mm, start);
/* non-NULL vma must contain @start, but need to check @end */
if (!vma || end > vma->vm_end)
- return -EAGAIN;
+ return -ENOMEM;
return 0; /* hide other errors from mmap(), et al */
}
@@ -405,7 +405,7 @@ success:
*prev = find_vma(mm, start);
/* non-NULL *prev must contain @start, but need to check @end */
if (!(*prev) || end > (*prev)->vm_end)
- ret = -EAGAIN;
+ ret = -ENOMEM;
} else {
/*
* TODO: for unlocking, pages will already be resident, so
* [PATCH 5/7] Mlock: update locked_vm on munmap() of mlocked() region.
From: Lee Schermerhorn @ 2008-08-22 21:11 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
against patch: mmap-handle-mlocked-pages-during-map-remap-unmap.patch
munlock_vma_pages_range() clears VM_LOCKED so that munlock_vma_page(),
et al, can work. This causes remove_vma_list(), called from do_munmap(),
to skip updating locked_vm.
We don't want to restore the VM_LOCKED in munlock_vma_pages_range()
because the pages are still on the lru. If vmscan attempts to reclaim
any of these pages before we get a chance to unmap them,
try_to_un{lock|map}() may mlock them again. This will result in freeing
an mlocked page.
Add a comment block to munlock_vma_pages_range() to explain
this to future would-be callers.
Move the accounting of locked_vm from remove_vma_list() to the munlock
loop in do_munmap(). This is where the pages are munlocked and VM_LOCKED
is cleared. Note that remove_vma_list() is a helper function for
do_munmap(), called only from there.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/mlock.c | 17 ++++++++++++++++-
mm/mmap.c | 6 +++---
2 files changed, 19 insertions(+), 4 deletions(-)
Index: linux-2.6.27-rc4-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mlock.c 2008-08-21 12:04:06.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mlock.c 2008-08-22 15:43:39.000000000 -0400
@@ -305,7 +305,22 @@ no_mlock:
/*
- * munlock all pages in the vma range. For mremap(), munmap() and exit().
+ * munlock_vma_pages_range() - munlock all pages in the vma range.
+ * @vma - vma containing range to be munlock()ed.
+ * @start - start address in @vma of the range
+ * @end - end of range in @vma.
+ *
+ * For mremap(), munmap() and exit().
+ *
+ * Called with @vma VM_LOCKED.
+ *
+ * Returns with VM_LOCKED cleared. Callers must be prepared to
+ * deal with this.
+ *
+ * We don't save and restore VM_LOCKED here because pages are
+ * still on lru. In unmap path, pages might be scanned by reclaim
+ * and re-mlocked by try_to_{munlock|unmap} before we unmap and
+ * free them. This will result in freeing mlocked pages.
*/
void munlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
Index: linux-2.6.27-rc4-mmotm/mm/mmap.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mmap.c 2008-08-22 09:19:10.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mmap.c 2008-08-22 15:21:40.000000000 -0400
@@ -1752,8 +1752,6 @@ static void remove_vma_list(struct mm_st
long nrpages = vma_pages(vma);
mm->total_vm -= nrpages;
- if (vma->vm_flags & VM_LOCKED)
- mm->locked_vm -= nrpages;
vm_stat_account(mm, vma->vm_flags, vma->vm_file, -nrpages);
vma = remove_vma(vma);
} while (vma);
@@ -1924,8 +1922,10 @@ int do_munmap(struct mm_struct *mm, unsi
if (mm->locked_vm) {
struct vm_area_struct *tmp = vma;
while (tmp && tmp->vm_start < end) {
- if (tmp->vm_flags & VM_LOCKED)
+ if (tmp->vm_flags & VM_LOCKED) {
+ mm->locked_vm -= vma_pages(tmp);
munlock_vma_pages_all(tmp);
+ }
tmp = tmp->vm_next;
}
}
* [PATCH 6/7] Mlock: revert mainline handling of mlock error return
From: Lee Schermerhorn @ 2008-08-22 21:11 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
Revert the change to the make_pages_present() error return.
That change was intended to make mlock() error returns correct, but
make_pages_present() is a lower-level function used by more than
mlock(). Subsequent patch[es] will add this error-return fixup
in an mlock-specific path.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/memory.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
Index: linux-2.6.27-rc3-mmotm/mm/memory.c
===================================================================
--- linux-2.6.27-rc3-mmotm.orig/mm/memory.c 2008-08-18 14:50:36.000000000 -0400
+++ linux-2.6.27-rc3-mmotm/mm/memory.c 2008-08-18 14:53:15.000000000 -0400
@@ -2819,19 +2819,9 @@ int make_pages_present(unsigned long add
len = DIV_ROUND_UP(end, PAGE_SIZE) - addr/PAGE_SIZE;
ret = get_user_pages(current, current->mm, addr,
len, write, 0, NULL, NULL);
- if (ret < 0) {
- /*
- SUS require strange return value to mlock
- - invalid addr generate to ENOMEM.
- - out of memory should generate EAGAIN.
- */
- if (ret == -EFAULT)
- ret = -ENOMEM;
- else if (ret == -ENOMEM)
- ret = -EAGAIN;
+ if (ret < 0)
return ret;
- }
- return ret == len ? 0 : -ENOMEM;
+ return ret == len ? 0 : -1;
}
#if !defined(__HAVE_ARCH_GATE_AREA)
* [PATCH 7/7] Mlock: make mlock error return Posixly Correct
From: Lee Schermerhorn @ 2008-08-22 21:11 UTC
To: akpm; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
Rework Posix error return for mlock().
Posix requires error codes for the mlock*() system calls that differ,
for some conditions, from what kernel low-level functions such as
get_user_pages() return for those conditions. For more info, see:
http://marc.info/?l=linux-kernel&m=121750892930775&w=2
This patch provides the same translation of get_user_pages()
error codes to the Posix-specified error codes, in the context
of the mlock rework for the unevictable lru.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/memory.c | 2 +-
mm/mlock.c | 27 +++++++++++++++++++++------
2 files changed, 22 insertions(+), 7 deletions(-)
Index: linux-2.6.27-rc4-mmotm/mm/mlock.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/mlock.c 2008-08-21 12:06:04.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/mlock.c 2008-08-21 12:06:08.000000000 -0400
@@ -143,6 +143,18 @@ static void munlock_vma_page(struct page
}
}
+/*
+ * convert get_user_pages() return value to posix mlock() error
+ */
+static int __mlock_posix_error_return(long retval)
+{
+ if (retval == -EFAULT)
+ retval = -ENOMEM;
+ else if (retval == -ENOMEM)
+ retval = -EAGAIN;
+ return retval;
+}
+
/**
* __mlock_vma_pages_range() - mlock/munlock a range of pages in the vma.
* @vma: target vma
@@ -248,11 +260,12 @@ static long __mlock_vma_pages_range(stru
addr += PAGE_SIZE; /* for next get_user_pages() */
nr_pages--;
}
+ ret = 0;
}
lru_add_drain_all(); /* to update stats */
- return 0; /* count entire vma as locked_vm */
+ return ret; /* count entire vma as locked_vm */
}
#else /* CONFIG_UNEVICTABLE_LRU */
@@ -265,7 +278,7 @@ static long __mlock_vma_pages_range(stru
int mlock)
{
if (mlock && (vma->vm_flags & VM_LOCKED))
- make_pages_present(start, end);
+ return make_pages_present(start, end);
return 0;
}
#endif /* CONFIG_UNEVICTABLE_LRU */
@@ -423,10 +436,7 @@ success:
downgrade_write(&mm->mmap_sem);
ret = __mlock_vma_pages_range(vma, start, end, 1);
- if (ret > 0) {
- mm->locked_vm -= ret;
- ret = 0;
- }
+
/*
* Need to reacquire mmap sem in write mode, as our callers
* expect this. We have no support for atomically upgrading
@@ -440,6 +450,11 @@ success:
/* non-NULL *prev must contain @start, but need to check @end */
if (!(*prev) || end > (*prev)->vm_end)
ret = -ENOMEM;
+ else if (ret > 0) {
+ mm->locked_vm -= ret;
+ ret = 0;
+ } else
+ ret = __mlock_posix_error_return(ret); /* translate if needed */
} else {
/*
* TODO: for unlocking, pages will already be resident, so
Index: linux-2.6.27-rc4-mmotm/mm/memory.c
===================================================================
--- linux-2.6.27-rc4-mmotm.orig/mm/memory.c 2008-08-21 12:06:04.000000000 -0400
+++ linux-2.6.27-rc4-mmotm/mm/memory.c 2008-08-21 12:06:08.000000000 -0400
@@ -2821,7 +2821,7 @@ int make_pages_present(unsigned long add
len, write, 0, NULL, NULL);
if (ret < 0)
return ret;
- return ret == len ? 0 : -1;
+ return ret == len ? 0 : -EFAULT;
}
#if !defined(__HAVE_ARCH_GATE_AREA)
* Re: [PATCH 3/7] Mlock: resubmit locked_vm adjustment as separate patch
From: Andrew Morton @ 2008-08-22 23:11 UTC
To: Lee Schermerhorn; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
On Fri, 22 Aug 2008 17:10:47 -0400
Lee Schermerhorn <lee.schermerhorn@hp.com> wrote:
> @@ -240,13 +240,27 @@ static int __mlock_vma_pages_range(struc
> #endif /* CONFIG_UNEVICTABLE_LRU */
>
> /*
> - * mlock all pages in this vma range. For mmap()/mremap()/...
> +/**
mm/mlock.c:243:1: warning: "/*" within comment
what's happening over there?
* Re: [PATCH 3/7] Mlock: resubmit locked_vm adjustment as separate patch
From: Lee Schermerhorn @ 2008-08-25 13:01 UTC
To: Andrew Morton; +Cc: riel, linux-mm, kosaki.motohiro, Eric.Whitney
On Fri, 2008-08-22 at 16:11 -0700, Andrew Morton wrote:
> On Fri, 22 Aug 2008 17:10:47 -0400
> Lee Schermerhorn <lee.schermerhorn@hp.com> wrote:
>
> > @@ -240,13 +240,27 @@ static int __mlock_vma_pages_range(struc
> > #endif /* CONFIG_UNEVICTABLE_LRU */
> >
> > /*
> > - * mlock all pages in this vma range. For mmap()/mremap()/...
> > +/**
>
> mm/mlock.c:243:1: warning: "/*" within comment
>
> what's happening over there?
Indeed. Lost the warning in the make log. checkpatch let it slide :(
Lee