* [PATCH 0/2] mm: Clean up and document fault and RETRY mmap_sem
@ 2011-06-28 16:47 Steven Rostedt
2011-06-28 16:47 ` [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set Steven Rostedt
2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt
0 siblings, 2 replies; 10+ messages in thread
From: Steven Rostedt @ 2011-06-28 16:47 UTC (permalink / raw)
To: linux-kernel, linux-mm
Cc: Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner,
Peter Zijlstra
On IRC, Russell King noticed that the code in arch/x86/mm/fault.c
looked buggy with the retry loop and retaking the mmap_sem. But
with further investigation it seems to be correct, but now
handle_mm_fault() has subtle locking with the mmap_sem depending
on what flags are set.
Unfortunately, there's no good comments about what is going on in the
code. Doing various git blame, git show, I dug up the history
and cleaned up the RETRY_NOWAIT and added documentation to the
handle_mm_fault().
Steven Rostedt (2):
mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set
mm: Document handle_mm_fault()
----
include/linux/mm.h | 4 ++--
mm/filemap.c | 14 +++++++-------
mm/memory.c | 24 +++++++++++++++++++++---
3 files changed, 30 insertions(+), 12 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread* [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set 2011-06-28 16:47 [PATCH 0/2] mm: Clean up and document fault and RETRY mmap_sem Steven Rostedt @ 2011-06-28 16:47 ` Steven Rostedt 2011-06-29 9:38 ` Michel Lespinasse 2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt 1 sibling, 1 reply; 10+ messages in thread From: Steven Rostedt @ 2011-06-28 16:47 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti [-- Attachment #1: 0001-mm-Remove-use-of-ALLOW_RETRY-when-RETRY_NOWAIT-is-se.patch --] [-- Type: text/plain, Size: 3546 bytes --] From: Steven Rostedt <srostedt@redhat.com> The only user of FAULT_FLAG_RETRY_NOWAIT also sets the FAULT_FLAG_ALLOW_RETRY flag. This makes the check in the __lock_page_or_retry redundant as it checks the RETRY_NOWAIT just after checking ALLOW_RETRY and then returns if it is set. The FAULT_FLAG_ALLOW_RETRY does not make any other difference in this path. Setting both and then ignoring one is quite confusing, especially since this code has very subtle locking issues when it comes to the mmap_sem. Only set the RETRY_WAIT flag and have that do the necessary work instead of confusing reviewers of this code by setting ALLOW_RETRY and not releasing the mmap_sem. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- include/linux/mm.h | 4 ++-- mm/filemap.c | 14 +++++++------- mm/memory.c | 2 +- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 9670f71..2ec71d9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -151,8 +151,8 @@ extern pgprot_t protection_map[16]; #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */ #define FAULT_FLAG_NONLINEAR 0x02 /* Fault was via a nonlinear mapping */ #define FAULT_FLAG_MKWRITE 0x04 /* Fault was mkwrite of existing pte */ -#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking */ -#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Don't drop mmap_sem and wait when retrying */ +#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking (drops mmap_sem) */ +#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Wait when retrying (don't drop mmap_sem) */ #define FAULT_FLAG_KILLABLE 0x20 /* The fault task is in SIGKILL killable region */ /* diff --git a/mm/filemap.c b/mm/filemap.c index a8251a8..bc9978f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -665,14 +665,14 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); int __lock_page_or_retry(struct page *page, struct mm_struct *mm, unsigned int flags) { - if (flags & FAULT_FLAG_ALLOW_RETRY) { - /* - * CAUTION! In this case, mmap_sem is not released - * even though return 0. - */ - if (flags & FAULT_FLAG_RETRY_NOWAIT) - return 0; + /* + * Don't drop mmap_sem if FAULT_FLAG_RETRY_NOWAIT is + * set, even if FAULT_FLAG_ALLOW_RETRY is set. + */ + if (flags & FAULT_FLAG_RETRY_NOWAIT) + return 0; + if (flags & FAULT_FLAG_ALLOW_RETRY) { up_read(&mm->mmap_sem); if (flags & FAULT_FLAG_KILLABLE) wait_on_page_locked_killable(page); diff --git a/mm/memory.c b/mm/memory.c index 40b7531..5371b5e 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1742,7 +1742,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, if (nonblocking) fault_flags |= FAULT_FLAG_ALLOW_RETRY; if (foll_flags & FOLL_NOWAIT) - fault_flags |= (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT); + fault_flags |= FAULT_FLAG_RETRY_NOWAIT; ret = handle_mm_fault(mm, vma, start, fault_flags); -- 1.7.5.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set 2011-06-28 16:47 ` [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set Steven Rostedt @ 2011-06-29 9:38 ` Michel Lespinasse 2011-06-29 12:43 ` Steven Rostedt 0 siblings, 1 reply; 10+ messages in thread From: Michel Lespinasse @ 2011-06-29 9:38 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, linux-mm, Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Hugh Dickins, Rik van Riel, Avi Kivity, Marcelo Tosatti On Tue, Jun 28, 2011 at 9:47 AM, Steven Rostedt <rostedt@goodmis.org> wrote: > From: Steven Rostedt <srostedt@redhat.com> > > The only user of FAULT_FLAG_RETRY_NOWAIT also sets the > FAULT_FLAG_ALLOW_RETRY flag. This makes the check in the > __lock_page_or_retry redundant as it checks the RETRY_NOWAIT > just after checking ALLOW_RETRY and then returns if it is > set. The FAULT_FLAG_ALLOW_RETRY does not make any other > difference in this path. > > Setting both and then ignoring one is quite confusing, > especially since this code has very subtle locking issues > when it comes to the mmap_sem. > > Only set the RETRY_WAIT flag and have that do the necessary > work instead of confusing reviewers of this code by setting > ALLOW_RETRY and not releasing the mmap_sem. > > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -151,8 +151,8 @@ extern pgprot_t protection_map[16]; > #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */ > #define FAULT_FLAG_NONLINEAR 0x02 /* Fault was via a nonlinear mapping */ > #define FAULT_FLAG_MKWRITE 0x04 /* Fault was mkwrite of existing pte */ > -#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking */ > -#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Don't drop mmap_sem and wait when retrying */ > +#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking (drops mmap_sem) */ > +#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Wait when retrying (don't drop mmap_sem) */ You want to say "DONT wait when retrying" here... Also - you argued higher up that having both flags set at once is confusing, but I find it equally confusing to pass a flag to specify you don't want to wait on retry if the flag that allows retry is not set. I think the confusion comes from the way the nowait semantics got bolted on the retry code for virtualization, even though (if I understand the virtualization use case correctly) they dont actually want to retry there, they just want to give up without blocking. Would the following proposal make more sense to you ? FAULT_FLAG_ALLOW_ASYNC: allow returning a VM_FAULT_ASYNC error code if the page can't be obtained immediately (major fault). FAULT_FLAG_ASYNC_WAIT: before returning VM_FAULT_ASYNC, drop the mmap_sem and wait for major fault to complete. existing uses of FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT become FAULT_FLAG_ASYNC existing uses of FAULT_FLAG_ALLOW_RETRY alone become FAULT_FLAG_ASYNC | FAULT_FLAG_ASYNC_WAIT existing uses of VM_FAULT_RETRY become VM_FAULT_ASYNC This may also help your documentation proposal since the flags would now work together rather than having one be an exception to the other. -- Michel "Walken" Lespinasse A program is never fully debugged until the last user dies. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set 2011-06-29 9:38 ` Michel Lespinasse @ 2011-06-29 12:43 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2011-06-29 12:43 UTC (permalink / raw) To: Michel Lespinasse Cc: linux-kernel, linux-mm, Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Hugh Dickins, Rik van Riel, Avi Kivity, Marcelo Tosatti On Wed, 2011-06-29 at 02:38 -0700, Michel Lespinasse wrote: > On Tue, Jun 28, 2011 at 9:47 AM, Steven Rostedt <rostedt@goodmis.org> wrote: > > From: Steven Rostedt <srostedt@redhat.com> > > > > The only user of FAULT_FLAG_RETRY_NOWAIT also sets the > > FAULT_FLAG_ALLOW_RETRY flag. This makes the check in the > > __lock_page_or_retry redundant as it checks the RETRY_NOWAIT > > just after checking ALLOW_RETRY and then returns if it is > > set. The FAULT_FLAG_ALLOW_RETRY does not make any other > > difference in this path. > > > > Setting both and then ignoring one is quite confusing, > > especially since this code has very subtle locking issues > > when it comes to the mmap_sem. > > > > Only set the RETRY_WAIT flag and have that do the necessary > > work instead of confusing reviewers of this code by setting > > ALLOW_RETRY and not releasing the mmap_sem. > > > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -151,8 +151,8 @@ extern pgprot_t protection_map[16]; > > #define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */ > > #define FAULT_FLAG_NONLINEAR 0x02 /* Fault was via a nonlinear mapping */ > > #define FAULT_FLAG_MKWRITE 0x04 /* Fault was mkwrite of existing pte */ > > -#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking */ > > -#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Don't drop mmap_sem and wait when retrying */ > > +#define FAULT_FLAG_ALLOW_RETRY 0x08 /* Retry fault if blocking (drops mmap_sem) */ > > +#define FAULT_FLAG_RETRY_NOWAIT 0x10 /* Wait when retrying (don't drop mmap_sem) */ > > You want to say "DONT wait when retrying" here... Yeah, I thought that was weird in the comment. Heh, rereading the original comment I see it meant "don't drop nor wait" where I thought it meant "Don't drop mmap_sem and then wait when retrying". That needs to be fixed :) > > Also - you argued higher up that having both flags set at once is > confusing, but I find it equally confusing to pass a flag to specify > you don't want to wait on retry if the flag that allows retry is not > set. Hmm, I understand. The main issue I have with all these flags is the locking. I really don't care about the semantics for retry and such. What I care about is this dropping of the mmap_sem. It is very confusing to understand when it is dropped or not. > I think the confusion comes from the way the nowait semantics got > bolted on the retry code for virtualization, even though (if I > understand the virtualization use case correctly) they dont actually > want to retry there, they just want to give up without blocking. Right. > > > Would the following proposal make more sense to you ? > > FAULT_FLAG_ALLOW_ASYNC: allow returning a VM_FAULT_ASYNC error code if > the page can't be obtained immediately (major fault). > FAULT_FLAG_ASYNC_WAIT: before returning VM_FAULT_ASYNC, drop the > mmap_sem and wait for major fault to complete. Ug, I think that's worse. ASYNC to me is something that you enable and it will get back to you when done. Think AIO. How about... FAULT_FLAG_ALLOW_DROP_MMAP_SEM ? and it returns VM_FAULT_DROPPED_MMAP_SEM if it drops it. We could also have another flag: FAULT_FLAG_NOWAIT_ON_PAGE and it can return VM_FAULT_NOWAIT_PAGE if it failed to get the page lock and returned. This to me documents exactly what is happening, and doesn't confuse kernel developers when they see handle_mm_fault() called, and then the code not releasing the mmap_sem. If I saw in arch/x86/mm/fault.c: if (fault & VM_FAULT_DROPPED_MMAP_SEM) { /* Clear FAULT_FLAG_ALLOW_DROPPED_MMAP_SEM to avoid any risk * of starvation. */ flags &= ~FAULT_FLAG_ALLOW_DROPPED_MMAP_SEM; goto retry; } There would have never been this confusion about why we are doing: retry: down_read(&mm->mmap_sem); Without ever doing a up_read(&mm->mmap_sem); > > existing uses of FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT > become FAULT_FLAG_ASYNC > existing uses of FAULT_FLAG_ALLOW_RETRY alone become FAULT_FLAG_ASYNC > | FAULT_FLAG_ASYNC_WAIT > existing uses of VM_FAULT_RETRY become VM_FAULT_ASYNC > > This may also help your documentation proposal since the flags would > now work together rather than having one be an exception to the other. I'm thinking they should still be separate. Just because they do things differently with the mmap_sem. -- Steve -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 16:47 [PATCH 0/2] mm: Clean up and document fault and RETRY mmap_sem Steven Rostedt 2011-06-28 16:47 ` [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set Steven Rostedt @ 2011-06-28 16:47 ` Steven Rostedt 2011-06-28 16:59 ` Steven Rostedt ` (2 more replies) 1 sibling, 3 replies; 10+ messages in thread From: Steven Rostedt @ 2011-06-28 16:47 UTC (permalink / raw) To: linux-kernel, linux-mm Cc: Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Gleb Natapov, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti [-- Attachment #1: 0002-mm-Document-handle_mm_fault.patch --] [-- Type: text/plain, Size: 3697 bytes --] From: Steven Rostedt <srostedt@redhat.com> The function handle_mm_fault() is long overdue for comments. Adding a kernel doc header for the function and explaining the subtle use of the flags with respect to mmap_sem will prove useful in the future when others work with this code. Russell King noticed that the code in arch/x86/mm/fault.c looked buggy as the do_page_fault() code would grab the mmap_sem multiple times without letting it go. But it only did this when the handle_mm_fault() would return VM_FAULT_RETRY. Examining the code and reading the git change logs, it was discovered that commit d065bd810b6deb67d4897a14bfe21f8eb526ba99 mm: retry page fault when blocking on disk transfer added code to remove contention with the mmap_sem when the page_lock was being held for IO. As waiting on IO holding the mmap_sem can cause lots of contention between threads. The flag FAULT_FLAG_ALLOW_RETRY was added to let handle_mm_fault() know that it can safely release the mmap_sem. Adding to the confusion here with handle_mm_fault(), another commit 318b275fbca1ab9ec0862de71420e0e92c3d1aa7 mm: allow GUP to fail instead of waiting on a page was added that would not release the mmap_sem, even if FAULT_FLAG_ALLOW_RETRY was set and the page_lock was not taken and VM_FAULT_RETRY was returned, if FAULT_FLAGS_RETRY_NOWAIT was set. All of this is poorly documented and makes using or modifying handle_mm_fault() fragile. Documenting all of these subtle changes at the head of handle_mm_fault() should help future developers understand what is happening. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Gleb Natapov <gleb@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- mm/memory.c | 22 ++++++++++++++++++++-- 1 files changed, 20 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 5371b5e..3cf30f6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3378,8 +3378,26 @@ unlock: return 0; } -/* - * By the time we get here, we already hold the mm semaphore +/** + * handle_mm_fault - main routine for handling page faults + * @mm: the mm_struct of the target address space + * @vma: vm_area_struct holding the applicable pages + * @address: the address that took the fault + * @flags: flags modifying lookup behaviour + * + * Must have @mm->mmap_sem held. + * + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem + * may be released if it failed to arquire the page_lock. If the + * mmap_sem is released then it will return VM_FAULT_RETRY set. + * This is to keep the time mmap_sem is held when the page_lock + * is taken for IO. + * Exception: If FAULT_FLAG_RETRY_NOWAIT is set, then it will + * not release the mmap_sem, but will still return VM_FAULT_RETRY + * if it failed to acquire the page_lock. + * This is for helping virtualization. See get_user_page_nowait(). + * + * Returns status flags based on the VM_FAULT_* flags in <linux/mm.h> */ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long address, unsigned int flags) -- 1.7.5.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt @ 2011-06-28 16:59 ` Steven Rostedt 2011-06-28 17:02 ` Linus Torvalds 2011-06-28 17:09 ` Gleb Natapov 2 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2011-06-28 16:59 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Gleb Natapov, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti On Tue, 2011-06-28 at 12:47 -0400, Steven Rostedt wrote: > - > mm/memory.c | 22 ++++++++++++++++++++-- > 1 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 5371b5e..3cf30f6 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3378,8 +3378,26 @@ unlock: > return 0; > } > > -/* > - * By the time we get here, we already hold the mm semaphore > +/** > + * handle_mm_fault - main routine for handling page faults > + * @mm: the mm_struct of the target address space > + * @vma: vm_area_struct holding the applicable pages > + * @address: the address that took the fault > + * @flags: flags modifying lookup behaviour > + * > + * Must have @mm->mmap_sem held. > + * > + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem > + * may be released if it failed to arquire the page_lock. If the s/arquire/acquire/ Hmm, I thought I fixed that. I better test the first patch again, in case it has issues in it that I thought I fixed. -- Steve > + * mmap_sem is released then it will return VM_FAULT_RETRY set. > + * This is to keep the time mmap_sem is held when the page_lock > + * is taken for IO. > + * Exception: If FAULT_FLAG_RETRY_NOWAIT is set, then it will > + * not release the mmap_sem, but will still return VM_FAULT_RETRY > + * if it failed to acquire the page_lock. > + * This is for helping virtualization. See get_user_page_nowait(). > + * > + * Returns status flags based on the VM_FAULT_* flags in <linux/mm.h> > */ > int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long address, unsigned int flags) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt 2011-06-28 16:59 ` Steven Rostedt @ 2011-06-28 17:02 ` Linus Torvalds 2011-06-28 17:22 ` Steven Rostedt 2011-06-28 17:09 ` Gleb Natapov 2 siblings, 1 reply; 10+ messages in thread From: Linus Torvalds @ 2011-06-28 17:02 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, linux-mm, Andrew Morton, Russell King, Thomas Gleixner, Peter Zijlstra, Gleb Natapov, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti On Tue, Jun 28, 2011 at 9:47 AM, Steven Rostedt <rostedt@goodmis.org> wrote: > + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem > + * may be released if it failed to arquire the page_lock. If the > + * mmap_sem is released then it will return VM_FAULT_RETRY set. > + * This is to keep the time mmap_sem is held when the page_lock > + * is taken for IO. So I know what that flag does, but I knew it without the comment. WITH the comment, I'm just confused. "This is to keep the time mmap_sem is held when the page_lock is taken for IO." Sounds like a google translation from swahili. "keep the time" what? Maybe "keep" -> "minimize"? Or just "This is to avoid holding mmap_sem while waiting for IO" Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 17:02 ` Linus Torvalds @ 2011-06-28 17:22 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2011-06-28 17:22 UTC (permalink / raw) To: Linus Torvalds Cc: linux-kernel, linux-mm, Andrew Morton, Russell King, Thomas Gleixner, Peter Zijlstra, Gleb Natapov, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti On Tue, 2011-06-28 at 10:02 -0700, Linus Torvalds wrote: > On Tue, Jun 28, 2011 at 9:47 AM, Steven Rostedt <rostedt@goodmis.org> wrote: > > + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem > > + * may be released if it failed to arquire the page_lock. If the > > + * mmap_sem is released then it will return VM_FAULT_RETRY set. > > + * This is to keep the time mmap_sem is held when the page_lock > > + * is taken for IO. > > So I know what that flag does, but I knew it without the comment. > > WITH the comment, I'm just confused. "This is to keep the time > mmap_sem is held when the page_lock is taken for IO." > > Sounds like a google translation from swahili. "keep the time" what? "When people ask me what language is my mother tongue, I simply reply C" Google translate from C -> english is worse than swahily -> english :p > > Maybe "keep" -> "minimize"? Or just "This is to avoid holding mmap_sem > while waiting for IO" OK, that sounds better. Thanks. I'll go to cut a v2, but I'll wait a day or so for others to comment. -- Steve -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt 2011-06-28 16:59 ` Steven Rostedt 2011-06-28 17:02 ` Linus Torvalds @ 2011-06-28 17:09 ` Gleb Natapov 2011-06-28 17:16 ` Steven Rostedt 2 siblings, 1 reply; 10+ messages in thread From: Gleb Natapov @ 2011-06-28 17:09 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, linux-mm, Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti On Tue, Jun 28, 2011 at 12:47:52PM -0400, Steven Rostedt wrote: > From: Steven Rostedt <srostedt@redhat.com> > > The function handle_mm_fault() is long overdue for comments. > Adding a kernel doc header for the function and explaining the subtle > use of the flags with respect to mmap_sem will prove useful in the > future when others work with this code. > > Russell King noticed that the code in arch/x86/mm/fault.c looked > buggy as the do_page_fault() code would grab the mmap_sem multiple > times without letting it go. But it only did this when the > handle_mm_fault() would return VM_FAULT_RETRY. > > Examining the code and reading the git change logs, it was discovered > that commit d065bd810b6deb67d4897a14bfe21f8eb526ba99 > mm: retry page fault when blocking on disk transfer > added code to remove contention with the mmap_sem when the page_lock > was being held for IO. As waiting on IO holding the mmap_sem can > cause lots of contention between threads. The flag > FAULT_FLAG_ALLOW_RETRY was added to let handle_mm_fault() know > that it can safely release the mmap_sem. > > Adding to the confusion here with handle_mm_fault(), another > commit 318b275fbca1ab9ec0862de71420e0e92c3d1aa7 > mm: allow GUP to fail instead of waiting on a page > was added that would not release the mmap_sem, even if > FAULT_FLAG_ALLOW_RETRY was set and the page_lock was not taken > and VM_FAULT_RETRY was returned, if FAULT_FLAGS_RETRY_NOWAIT was > set. > > All of this is poorly documented and makes using or modifying > handle_mm_fault() fragile. Documenting all of these subtle changes > at the head of handle_mm_fault() should help future developers > understand what is happening. > > Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> > Cc: Gleb Natapov <gleb@redhat.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Hugh Dickins <hughd@google.com> > Cc: Rik van Riel <riel@redhat.com> > Cc: Michel Lespinasse <walken@google.com> > Cc: Avi Kivity <avi@redhat.com> > Cc: Marcelo Tosatti <mtosatti@redhat.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org> > --- > mm/memory.c | 22 ++++++++++++++++++++-- > 1 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 5371b5e..3cf30f6 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3378,8 +3378,26 @@ unlock: > return 0; > } > > -/* > - * By the time we get here, we already hold the mm semaphore > +/** > + * handle_mm_fault - main routine for handling page faults > + * @mm: the mm_struct of the target address space > + * @vma: vm_area_struct holding the applicable pages > + * @address: the address that took the fault > + * @flags: flags modifying lookup behaviour > + * > + * Must have @mm->mmap_sem held. > + * > + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem > + * may be released if it failed to arquire the page_lock. If the > + * mmap_sem is released then it will return VM_FAULT_RETRY set. > + * This is to keep the time mmap_sem is held when the page_lock > + * is taken for IO. > + * Exception: If FAULT_FLAG_RETRY_NOWAIT is set, then it will > + * not release the mmap_sem, but will still return VM_FAULT_RETRY > + * if it failed to acquire the page_lock. I wouldn't describe it like that. It tells handle_mm_fault() to start IO if needed, but do not wait for its completion. > + * This is for helping virtualization. See get_user_page_nowait(). The virtialization is the only user right now, but I wouldn't describe this flag as virtialization specific. Why should we put this in the comment? The comment will become outdated when other users arise and meanwhile a simple grep will reveal the above information anyway. > + * > + * Returns status flags based on the VM_FAULT_* flags in <linux/mm.h> > */ > int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long address, unsigned int flags) > -- > 1.7.5.4 > -- Gleb. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm: Document handle_mm_fault() 2011-06-28 17:09 ` Gleb Natapov @ 2011-06-28 17:16 ` Steven Rostedt 0 siblings, 0 replies; 10+ messages in thread From: Steven Rostedt @ 2011-06-28 17:16 UTC (permalink / raw) To: Gleb Natapov Cc: linux-kernel, linux-mm, Andrew Morton, Linus Torvalds, Russell King, Thomas Gleixner, Peter Zijlstra, Hugh Dickins, Rik van Riel, Michel Lespinasse, Avi Kivity, Marcelo Tosatti On Tue, 2011-06-28 at 20:09 +0300, Gleb Natapov wrote: > > -/* > > - * By the time we get here, we already hold the mm semaphore > > +/** > > + * handle_mm_fault - main routine for handling page faults > > + * @mm: the mm_struct of the target address space > > + * @vma: vm_area_struct holding the applicable pages > > + * @address: the address that took the fault > > + * @flags: flags modifying lookup behaviour > > + * > > + * Must have @mm->mmap_sem held. > > + * > > + * Note: if @flags has FAULT_FLAG_ALLOW_RETRY set then the mmap_sem > > + * may be released if it failed to arquire the page_lock. If the > > + * mmap_sem is released then it will return VM_FAULT_RETRY set. > > + * This is to keep the time mmap_sem is held when the page_lock > > + * is taken for IO. > > + * Exception: If FAULT_FLAG_RETRY_NOWAIT is set, then it will > > + * not release the mmap_sem, but will still return VM_FAULT_RETRY > > + * if it failed to acquire the page_lock. > I wouldn't describe it like that. It tells handle_mm_fault() to start > IO if needed, but do not wait for its completion. This isn't a description of the flag itself. It's a description of the mmap_sem behavior for the flag. Again, this came about because of the subtle locking that handle_mm_fault() does with mmap_sem. When it comes to locking, we need things documented very well, as locking is usually what most developers get wrong even when subtle locking does not exist. > > > + * This is for helping virtualization. See get_user_page_nowait(). > The virtialization is the only user right now, but I wouldn't describe > this flag as virtialization specific. Why should we put this in the > comment? The comment will become outdated when other users arise > and meanwhile a simple grep will reveal the above information anyway. I've been going in the direction of adding comments about how things help. OK, so it's the only user now. I could change this to, "This is for helping things like virtualization." When reading code, it is nice to know why something is done that is out of the ordinary. If it changes, then we should change the comments. The best place for keeping code documented and up to date, is right at the code. ie. comments. -- Steve -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-06-29 12:44 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-06-28 16:47 [PATCH 0/2] mm: Clean up and document fault and RETRY mmap_sem Steven Rostedt 2011-06-28 16:47 ` [PATCH 1/2] mm: Remove use of ALLOW_RETRY when RETRY_NOWAIT is set Steven Rostedt 2011-06-29 9:38 ` Michel Lespinasse 2011-06-29 12:43 ` Steven Rostedt 2011-06-28 16:47 ` [PATCH 2/2] mm: Document handle_mm_fault() Steven Rostedt 2011-06-28 16:59 ` Steven Rostedt 2011-06-28 17:02 ` Linus Torvalds 2011-06-28 17:22 ` Steven Rostedt 2011-06-28 17:09 ` Gleb Natapov 2011-06-28 17:16 ` Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox