linux-mm.kvack.org archive mirror
* [PATCH] mm: hugetlb: break COW earlier for resv owner
@ 2012-02-18  6:19 Hillf Danton
  2012-02-22  8:17 ` Hugh Dickins
  0 siblings, 1 reply; 2+ messages in thread
From: Hillf Danton @ 2012-02-18  6:19 UTC (permalink / raw)
  To: Linux-MM
  Cc: LKML, Michal Hocko, KAMEZAWA Hiroyuki, Hugh Dickins,
	Andrew Morton, Hillf Danton

When a process owning a MAP_PRIVATE mapping fails to COW, due to references
held by a child and an insufficient huge page pool, the page is unmapped from
the child process to guarantee the original mapper's reliability, and the child
may get SIGKILLed if it later faults.

With that guarantee, COW is broken earlier on behalf of owners, so that they
take fewer page faults.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
---

--- a/mm/hugetlb.c	Tue Feb 14 20:10:46 2012
+++ b/mm/hugetlb.c	Sat Feb 18 13:29:58 2012
@@ -2145,10 +2145,12 @@ int copy_hugetlb_page_range(struct mm_st
 	struct page *ptepage;
 	unsigned long addr;
 	int cow;
+	int owner;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long sz = huge_page_size(h);

 	cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
+	owner = is_vma_resv_set(vma, HPAGE_RESV_OWNER);

 	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
 		src_pte = huge_pte_offset(src, addr);
@@ -2164,10 +2166,19 @@ int copy_hugetlb_page_range(struct mm_st

 		spin_lock(&dst->page_table_lock);
 		spin_lock_nested(&src->page_table_lock, SINGLE_DEPTH_NESTING);
-		if (!huge_pte_none(huge_ptep_get(src_pte))) {
+		entry = huge_ptep_get(src_pte);
+		if (!huge_pte_none(entry)) {
 			if (cow)
-				huge_ptep_set_wrprotect(src, addr, src_pte);
-			entry = huge_ptep_get(src_pte);
+				if (owner) {
+					/*
+					 * Break COW for resv owner to go less
+					 * page faults later
+					 */
+					entry = huge_pte_wrprotect(entry);
+				} else {
+					huge_ptep_set_wrprotect(src, addr, src_pte);
+					entry = huge_ptep_get(src_pte);
+				}
 			ptepage = pte_page(entry);
 			get_page(ptepage);
 			page_dup_rmap(ptepage);
--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] mm: hugetlb: break COW earlier for resv owner
  2012-02-18  6:19 [PATCH] mm: hugetlb: break COW earlier for resv owner Hillf Danton
@ 2012-02-22  8:17 ` Hugh Dickins
  0 siblings, 0 replies; 2+ messages in thread
From: Hugh Dickins @ 2012-02-22  8:17 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Linux-MM, LKML, Michal Hocko, KAMEZAWA Hiroyuki, Andrew Morton

On Sat, 18 Feb 2012, Hillf Danton wrote:
> When a process owning a MAP_PRIVATE mapping fails to COW, due to references
> held by a child and an insufficient huge page pool, the page is unmapped from
> the child process to guarantee the original mapper's reliability, and the child
> may get SIGKILLed if it later faults.

I think I understand you there.

> 
> With that guarantee, COW is broken earlier on behalf of owners, so that they
> take fewer page faults.

As usual, I have to guess that here you're describing what (you think)
happens after your patch.

But I don't understand, or it doesn't seem to describe what happens in
your patch.  "COW is broken" only in the sense that you are breaking
the way COW is supposed to behave, and I believe your patch is wrong.

> 
> Signed-off-by: Hillf Danton <dhillf@gmail.com>
> ---
> 
> --- a/mm/hugetlb.c	Tue Feb 14 20:10:46 2012
> +++ b/mm/hugetlb.c	Sat Feb 18 13:29:58 2012
> @@ -2145,10 +2145,12 @@ int copy_hugetlb_page_range(struct mm_st
>  	struct page *ptepage;
>  	unsigned long addr;
>  	int cow;
> +	int owner;
>  	struct hstate *h = hstate_vma(vma);
>  	unsigned long sz = huge_page_size(h);
> 
>  	cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
> +	owner = is_vma_resv_set(vma, HPAGE_RESV_OWNER);
> 
>  	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
>  		src_pte = huge_pte_offset(src, addr);
> @@ -2164,10 +2166,19 @@ int copy_hugetlb_page_range(struct mm_st
> 
>  		spin_lock(&dst->page_table_lock);
>  		spin_lock_nested(&src->page_table_lock, SINGLE_DEPTH_NESTING);
> -		if (!huge_pte_none(huge_ptep_get(src_pte))) {
> +		entry = huge_ptep_get(src_pte);
> +		if (!huge_pte_none(entry)) {
>  			if (cow)
> -				huge_ptep_set_wrprotect(src, addr, src_pte);
> -			entry = huge_ptep_get(src_pte);
> +				if (owner) {
> +					/*
> +					 * Break COW for resv owner to go less
> +					 * page faults later
> +					 */
> +					entry = huge_pte_wrprotect(entry);

So, the change you are making is that if the vma being copied (and I had
to check with dup_mmap() to see that vma here is indeed the src vma) is
the original "owner", then its pte is left (perhaps) writable, and only
the child's is write-protected.
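
The difference between the two write-protect calls can be sketched with a
toy userspace model. This is only an illustration under stated assumptions:
toy_pte_t, TOY_PTE_WRITE, toy_pte_wrprotect(), toy_ptep_set_wrprotect() and
toy_copy_pte() are hypothetical stand-ins, not kernel interfaces, and the
model assumes the child's pte is filled from the local "entry" value. It
shows that write-protecting only the local copy leaves the source slot
writable, while write-protecting through the pointer affects both sides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a huge pte; only the write bit matters here. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_PRESENT ((toy_pte_t)1 << 0)
#define TOY_PTE_WRITE   ((toy_pte_t)1 << 1)

/* Models huge_pte_wrprotect(): returns a read-only copy, slot untouched. */
static toy_pte_t toy_pte_wrprotect(toy_pte_t pte)
{
	return pte & ~TOY_PTE_WRITE;
}

/* Models huge_ptep_set_wrprotect(): clears the write bit in the slot itself. */
static void toy_ptep_set_wrprotect(toy_pte_t *ptep)
{
	*ptep &= ~TOY_PTE_WRITE;
}

struct toy_fork_result {
	bool parent_writable;
	bool child_writable;
};

/*
 * Models the copy of one present pte at fork time.
 * owner_shortcut selects the patched path taken for the resv owner.
 */
static struct toy_fork_result toy_copy_pte(toy_pte_t *src_slot,
					   toy_pte_t *dst_slot,
					   bool owner_shortcut)
{
	toy_pte_t entry = *src_slot;
	struct toy_fork_result res;

	if (owner_shortcut) {
		/* Patched path: only the local copy loses the write bit. */
		entry = toy_pte_wrprotect(entry);
	} else {
		/* Original path: the source slot itself is write-protected. */
		toy_ptep_set_wrprotect(src_slot);
		entry = *src_slot;
	}
	*dst_slot = entry;	/* assumption: the child gets the local entry */

	res.parent_writable = (*src_slot & TOY_PTE_WRITE) != 0;
	res.child_writable = (*dst_slot & TOY_PTE_WRITE) != 0;
	return res;
}
```

Starting from a writable source slot, the owner path in this model leaves
the parent writable and the child read-only, whereas the other path leaves
both read-only.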

But that means that modifications made to this page by the parent after
the fork will still be visible to the child, until such time as the child
writes to this area, if ever.

That is not how COW protection behaves for a normal page: it's symmetric,
neither parent nor child sees modifications made by the other after fork.
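
For reference, the symmetric behaviour described above can be checked from
userspace on an ordinary (non-hugetlb) private anonymous mapping. This is a
minimal sketch, not a hugetlb test: the function name
parent_write_hidden_from_child() is made up for illustration, and a pipe is
used only to make sure the child reads after the parent has written.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a write made by the parent after fork() is invisible to
 * the child, 0 if the child sees it, and -1 on any setup failure.
 */
static int parent_write_hidden_from_child(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int go[2], status;
	pid_t pid;
	char *p;

	if (page <= 0)
		return -1;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	if (pipe(go) < 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	p[0] = 'A';		/* fault the page in before fork() */

	pid = fork();
	if (pid < 0) {
		close(go[0]);
		close(go[1]);
		munmap(p, (size_t)page);
		return -1;
	}
	if (pid == 0) {
		char c;

		close(go[1]);
		/* Block until the parent has done its post-fork write. */
		if (read(go[0], &c, 1) != 1)
			_exit(2);
		_exit(p[0] == 'A' ? 0 : 1);
	}

	close(go[0]);
	p[0] = 'B';		/* parent write after fork() */
	if (write(go[1], "x", 1) != 1)
		status = -1;
	close(go[1]);

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
		munmap(p, (size_t)page);
		return -1;
	}
	munmap(p, (size_t)page);

	if (WEXITSTATUS(status) == 0)
		return 1;	/* child still saw 'A' */
	if (WEXITSTATUS(status) == 1)
		return 0;	/* child saw the parent's 'B' */
	return -1;
}
```

Calling the function from a small main() and checking for a return value
of 1 exercises the normal-page case; repeating the experiment on a private
hugetlb mapping would additionally need a configured huge page pool.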

Now, hugetlb pages are not normal in all kinds of ways, but here you
appear to be changing the semantics of private hugetlb mappings, in
a way that could break applications, and has security implications.

Or am I misunderstanding?

Hugh

> +				} else {
> +					huge_ptep_set_wrprotect(src, addr, src_pte);
> +					entry = huge_ptep_get(src_pte);
> +				}
>  			ptepage = pte_page(entry);
>  			get_page(ptepage);
>  			page_dup_rmap(ptepage);
> --

