* [PATCH V5 1/4] mm: Refactor do_wp_page, extract the reuse case
2015-02-22 13:42 [PATCH V5 0/4] Refactor do_wp_page, no functional change Shachar Raindel
@ 2015-02-22 13:42 ` Shachar Raindel
2015-02-24 13:39 ` Michal Hocko
2015-02-22 13:42 ` [PATCH V5 2/4] mm: Refactor do_wp_page - rewrite the unlock flow Shachar Raindel
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Shachar Raindel @ 2015-02-22 13:42 UTC (permalink / raw)
To: linux-mm
Cc: kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, raindel, Dave Hansen
When do_wp_page is ending, in several cases it needs to reuse the
existing page. This is achieved by making the page table entry writable,
and possibly updating the page-cache state.
Currently, this logic is "called" via a goto jump. This makes the
control flow of the function harder to follow. It also goes against the
coding style guidelines on the use of goto.
As the code can easily be refactored into a specialized function,
refactor it out and simplify the code flow in do_wp_page.
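For illustration only (this is not kernel code and not part of the patch;
every name below is made up), the shape of the change is that a block which
several paths used to reach through a shared goto label becomes a small
helper that each path calls and returns through. A minimal standalone C
sketch:

#include <stdio.h>

/* Hypothetical status values standing in for VM_FAULT_WRITE and 0. */
#define FAULT_WRITE	1

struct fault_ctx {
	int shared;	/* writable shared mapping */
	int last_user;	/* we hold the only reference to the page */
};

/*
 * Extracted helper: roughly what used to live under the shared "reuse:"
 * label.  It does the reuse work and reports the result explicitly.
 */
static int page_reuse(struct fault_ctx *ctx)
{
	printf("marking page writable (shared=%d)\n", ctx->shared);
	return FAULT_WRITE;
}

/* Callers return through the helper instead of jumping to a label. */
static int wp_fault(struct fault_ctx *ctx)
{
	if (ctx->shared)
		return page_reuse(ctx);
	if (ctx->last_user)
		return page_reuse(ctx);
	return 0;	/* would fall through to the copy path */
}

int main(void)
{
	struct fault_ctx ctx = { .shared = 1, .last_user = 0 };

	printf("fault result: %d\n", wp_fault(&ctx));
	return 0;
}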
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Haggai Eran <haggaie@mellanox.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
mm/memory.c | 117 +++++++++++++++++++++++++++++++++++-------------------------
1 file changed, 68 insertions(+), 49 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 8068893..7a04414 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1983,6 +1983,65 @@ static int do_page_mkwrite(struct vm_area_struct *vma, struct page *page,
}
/*
+ * Handle write page faults for pages that can be reused in the current vma
+ *
+ * This can happen either due to the mapping being with the VM_SHARED flag,
+ * or due to us being the last reference standing to the page. In either
+ * case, all we need to do here is to mark the page as writable and update
+ * any related book-keeping.
+ */
+static int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long address, pte_t *page_table,
+ spinlock_t *ptl, pte_t orig_pte,
+ struct page *page, int page_mkwrite,
+ int dirty_shared)
+ __releases(ptl)
+{
+ pte_t entry;
+ /*
+ * Clear the pages cpupid information as the existing
+ * information potentially belongs to a now completely
+ * unrelated process.
+ */
+ if (page)
+ page_cpupid_xchg_last(page, (1 << LAST_CPUPID_SHIFT) - 1);
+
+ flush_cache_page(vma, address, pte_pfn(orig_pte));
+ entry = pte_mkyoung(orig_pte);
+ entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+ if (ptep_set_access_flags(vma, address, page_table, entry, 1))
+ update_mmu_cache(vma, address, page_table);
+ pte_unmap_unlock(page_table, ptl);
+
+ if (dirty_shared) {
+ struct address_space *mapping;
+ int dirtied;
+
+ if (!page_mkwrite)
+ lock_page(page);
+
+ dirtied = set_page_dirty(page);
+ VM_BUG_ON_PAGE(PageAnon(page), page);
+ mapping = page->mapping;
+ unlock_page(page);
+ page_cache_release(page);
+
+ if ((dirtied || page_mkwrite) && mapping) {
+ /*
+ * Some device drivers do not set page.mapping
+ * but still dirty their pages
+ */
+ balance_dirty_pages_ratelimited(mapping);
+ }
+
+ if (!page_mkwrite)
+ file_update_time(vma->vm_file);
+ }
+
+ return VM_FAULT_WRITE;
+}
+
+/*
* This routine handles present pages, when users try to write
* to a shared page. It is done by copying the page to a new address
* and decrementing the shared-page counter for the old page.
@@ -2008,8 +2067,6 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
struct page *old_page, *new_page = NULL;
pte_t entry;
int ret = 0;
- int page_mkwrite = 0;
- bool dirty_shared = false;
unsigned long mmun_start = 0; /* For mmu_notifiers */
unsigned long mmun_end = 0; /* For mmu_notifiers */
struct mem_cgroup *memcg;
@@ -2026,7 +2083,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
*/
if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
(VM_WRITE|VM_SHARED))
- goto reuse;
+ return wp_page_reuse(mm, vma, address, page_table, ptl,
+ orig_pte, old_page, 0, 0);
goto gotten;
}
@@ -2055,12 +2113,16 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
*/
page_move_anon_rmap(old_page, vma, address);
unlock_page(old_page);
- goto reuse;
+ return wp_page_reuse(mm, vma, address, page_table, ptl,
+ orig_pte, old_page, 0, 0);
}
unlock_page(old_page);
} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
(VM_WRITE|VM_SHARED))) {
+ int page_mkwrite = 0;
+
page_cache_get(old_page);
+
/*
* Only catch write-faults on shared writable pages,
* read-only shared pages can get COWed by
@@ -2091,51 +2153,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
page_mkwrite = 1;
}
- dirty_shared = true;
-
-reuse:
- /*
- * Clear the pages cpupid information as the existing
- * information potentially belongs to a now completely
- * unrelated process.
- */
- if (old_page)
- page_cpupid_xchg_last(old_page, (1 << LAST_CPUPID_SHIFT) - 1);
-
- flush_cache_page(vma, address, pte_pfn(orig_pte));
- entry = pte_mkyoung(orig_pte);
- entry = maybe_mkwrite(pte_mkdirty(entry), vma);
- if (ptep_set_access_flags(vma, address, page_table, entry,1))
- update_mmu_cache(vma, address, page_table);
- pte_unmap_unlock(page_table, ptl);
- ret |= VM_FAULT_WRITE;
-
- if (dirty_shared) {
- struct address_space *mapping;
- int dirtied;
-
- if (!page_mkwrite)
- lock_page(old_page);
-
- dirtied = set_page_dirty(old_page);
- VM_BUG_ON_PAGE(PageAnon(old_page), old_page);
- mapping = old_page->mapping;
- unlock_page(old_page);
- page_cache_release(old_page);
-
- if ((dirtied || page_mkwrite) && mapping) {
- /*
- * Some device drivers do not set page.mapping
- * but still dirty their pages
- */
- balance_dirty_pages_ratelimited(mapping);
- }
-
- if (!page_mkwrite)
- file_update_time(vma->vm_file);
- }
-
- return ret;
+ return wp_page_reuse(mm, vma, address, page_table, ptl,
+ orig_pte, old_page, page_mkwrite, 1);
}
/*
--
1.7.11.2
* Re: [PATCH V5 1/4] mm: Refactor do_wp_page, extract the reuse case
2015-02-22 13:42 ` [PATCH V5 1/4] mm: Refactor do_wp_page, extract the reuse case Shachar Raindel
@ 2015-02-24 13:39 ` Michal Hocko
0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2015-02-24 13:39 UTC (permalink / raw)
To: Shachar Raindel
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, Dave Hansen
On Sun 22-02-15 15:42:15, Shachar Raindel wrote:
> When do_wp_page is ending, in several cases it needs to reuse the
> existing page. This is achieved by making the page table entry writable,
> and possibly updating the page-cache state.
>
> Currently, this logic is "called" via a goto jump. This makes the
> control flow of the function harder to follow. It also goes against the
> coding style guidelines on the use of goto.
>
> As the code can easily be refactored into a specialized function,
> refactor it out and simplify the code flow in do_wp_page.
>
> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> Acked-by: Haggai Eran <haggaie@mellanox.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Feiner <pfeiner@google.com>
> Cc: Michel Lespinasse <walken@google.com>
Makes sense to me
Reviewed-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memory.c | 117 +++++++++++++++++++++++++++++++++++-------------------------
> 1 file changed, 68 insertions(+), 49 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 8068893..7a04414 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1983,6 +1983,65 @@ static int do_page_mkwrite(struct vm_area_struct *vma, struct page *page,
> }
>
> /*
> + * Handle write page faults for pages that can be reused in the current vma
> + *
> + * This can happen either due to the mapping being with the VM_SHARED flag,
> + * or due to us being the last reference standing to the page. In either
> + * case, all we need to do here is to mark the page as writable and update
> + * any related book-keeping.
> + */
> +static int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
> + unsigned long address, pte_t *page_table,
> + spinlock_t *ptl, pte_t orig_pte,
> + struct page *page, int page_mkwrite,
> + int dirty_shared)
> + __releases(ptl)
> +{
> + pte_t entry;
> + /*
> + * Clear the pages cpupid information as the existing
> + * information potentially belongs to a now completely
> + * unrelated process.
> + */
> + if (page)
> + page_cpupid_xchg_last(page, (1 << LAST_CPUPID_SHIFT) - 1);
> +
> + flush_cache_page(vma, address, pte_pfn(orig_pte));
> + entry = pte_mkyoung(orig_pte);
> + entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> + if (ptep_set_access_flags(vma, address, page_table, entry, 1))
> + update_mmu_cache(vma, address, page_table);
> + pte_unmap_unlock(page_table, ptl);
> +
> + if (dirty_shared) {
> + struct address_space *mapping;
> + int dirtied;
> +
> + if (!page_mkwrite)
> + lock_page(page);
> +
> + dirtied = set_page_dirty(page);
> + VM_BUG_ON_PAGE(PageAnon(page), page);
> + mapping = page->mapping;
> + unlock_page(page);
> + page_cache_release(page);
> +
> + if ((dirtied || page_mkwrite) && mapping) {
> + /*
> + * Some device drivers do not set page.mapping
> + * but still dirty their pages
> + */
> + balance_dirty_pages_ratelimited(mapping);
> + }
> +
> + if (!page_mkwrite)
> + file_update_time(vma->vm_file);
> + }
> +
> + return VM_FAULT_WRITE;
> +}
> +
> +/*
> * This routine handles present pages, when users try to write
> * to a shared page. It is done by copying the page to a new address
> * and decrementing the shared-page counter for the old page.
> @@ -2008,8 +2067,6 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> struct page *old_page, *new_page = NULL;
> pte_t entry;
> int ret = 0;
> - int page_mkwrite = 0;
> - bool dirty_shared = false;
> unsigned long mmun_start = 0; /* For mmu_notifiers */
> unsigned long mmun_end = 0; /* For mmu_notifiers */
> struct mem_cgroup *memcg;
> @@ -2026,7 +2083,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> */
> if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> (VM_WRITE|VM_SHARED))
> - goto reuse;
> + return wp_page_reuse(mm, vma, address, page_table, ptl,
> + orig_pte, old_page, 0, 0);
> goto gotten;
> }
>
> @@ -2055,12 +2113,16 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> */
> page_move_anon_rmap(old_page, vma, address);
> unlock_page(old_page);
> - goto reuse;
> + return wp_page_reuse(mm, vma, address, page_table, ptl,
> + orig_pte, old_page, 0, 0);
> }
> unlock_page(old_page);
> } else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> (VM_WRITE|VM_SHARED))) {
> + int page_mkwrite = 0;
> +
> page_cache_get(old_page);
> +
> /*
> * Only catch write-faults on shared writable pages,
> * read-only shared pages can get COWed by
> @@ -2091,51 +2153,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> page_mkwrite = 1;
> }
>
> - dirty_shared = true;
> -
> -reuse:
> - /*
> - * Clear the pages cpupid information as the existing
> - * information potentially belongs to a now completely
> - * unrelated process.
> - */
> - if (old_page)
> - page_cpupid_xchg_last(old_page, (1 << LAST_CPUPID_SHIFT) - 1);
> -
> - flush_cache_page(vma, address, pte_pfn(orig_pte));
> - entry = pte_mkyoung(orig_pte);
> - entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> - if (ptep_set_access_flags(vma, address, page_table, entry,1))
> - update_mmu_cache(vma, address, page_table);
> - pte_unmap_unlock(page_table, ptl);
> - ret |= VM_FAULT_WRITE;
> -
> - if (dirty_shared) {
> - struct address_space *mapping;
> - int dirtied;
> -
> - if (!page_mkwrite)
> - lock_page(old_page);
> -
> - dirtied = set_page_dirty(old_page);
> - VM_BUG_ON_PAGE(PageAnon(old_page), old_page);
> - mapping = old_page->mapping;
> - unlock_page(old_page);
> - page_cache_release(old_page);
> -
> - if ((dirtied || page_mkwrite) && mapping) {
> - /*
> - * Some device drivers do not set page.mapping
> - * but still dirty their pages
> - */
> - balance_dirty_pages_ratelimited(mapping);
> - }
> -
> - if (!page_mkwrite)
> - file_update_time(vma->vm_file);
> - }
> -
> - return ret;
> + return wp_page_reuse(mm, vma, address, page_table, ptl,
> + orig_pte, old_page, page_mkwrite, 1);
> }
>
> /*
> --
> 1.7.11.2
>
--
Michal Hocko
SUSE Labs
* [PATCH V5 2/4] mm: Refactor do_wp_page - rewrite the unlock flow
2015-02-22 13:42 [PATCH V5 0/4] Refactor do_wp_page, no functional change Shachar Raindel
2015-02-22 13:42 ` [PATCH V5 1/4] mm: Refactor do_wp_page, extract the reuse case Shachar Raindel
@ 2015-02-22 13:42 ` Shachar Raindel
2015-02-24 13:45 ` Michal Hocko
2015-02-22 13:42 ` [PATCH V5 3/4] mm: refactor do_wp_page, extract the page copy flow Shachar Raindel
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Shachar Raindel @ 2015-02-22 13:42 UTC (permalink / raw)
To: linux-mm
Cc: kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, raindel, Dave Hansen
When do_wp_page is ending, in several cases it needs to unlock the
pages and ptls it was accessing.
Currently, this logic is "called" via a goto jump. This makes the
control flow of the function harder to follow. Readability is further
hampered by the unlock case containing a large amount of logic needed in
only one of the three cases.
Using goto for cleanup is generally allowed. However, moving the
trivial unlocking flows to the relevant call sites allows deeper
refactoring in the next patch.
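To illustrate the pattern (a hypothetical, standalone C sketch, not the
kernel code), the shared "unlock:" label that every early exit jumped to is
replaced by each exit performing its own short cleanup and returning
immediately:

#include <stdio.h>

/* Hypothetical stand-ins for the pte lock and the page reference. */
static void unlock_table(void)
{
	printf("pte unmapped and unlocked\n");
}

static void release_page(void)
{
	printf("page reference dropped\n");
}

/*
 * Previously both exits jumped to one "unlock:" label that had to cope with
 * every caller's situation.  Now the trivial cleanup sits at each call site.
 */
static int handle_fault(int pte_changed)
{
	if (pte_changed) {
		unlock_table();
		release_page();
		return 0;	/* lost the race, nothing left to do */
	}

	printf("write fault handled\n");
	unlock_table();
	return 1;		/* stands in for VM_FAULT_WRITE */
}

int main(void)
{
	printf("result: %d\n", handle_fault(0));
	return 0;
}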
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Haggai Eran <haggaie@mellanox.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
mm/memory.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 7a04414..3afd9ce 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2066,7 +2066,7 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
{
struct page *old_page, *new_page = NULL;
pte_t entry;
- int ret = 0;
+ int page_copied = 0;
unsigned long mmun_start = 0; /* For mmu_notifiers */
unsigned long mmun_end = 0; /* For mmu_notifiers */
struct mem_cgroup *memcg;
@@ -2101,7 +2101,9 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
&ptl);
if (!pte_same(*page_table, orig_pte)) {
unlock_page(old_page);
- goto unlock;
+ pte_unmap_unlock(page_table, ptl);
+ page_cache_release(old_page);
+ return 0;
}
page_cache_release(old_page);
}
@@ -2148,7 +2150,9 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
&ptl);
if (!pte_same(*page_table, orig_pte)) {
unlock_page(old_page);
- goto unlock;
+ pte_unmap_unlock(page_table, ptl);
+ page_cache_release(old_page);
+ return 0;
}
page_mkwrite = 1;
}
@@ -2246,29 +2250,28 @@ gotten:
/* Free the old page.. */
new_page = old_page;
- ret |= VM_FAULT_WRITE;
+ page_copied = 1;
} else
mem_cgroup_cancel_charge(new_page, memcg);
if (new_page)
page_cache_release(new_page);
-unlock:
+
pte_unmap_unlock(page_table, ptl);
- if (mmun_end > mmun_start)
- mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+ mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
if (old_page) {
/*
* Don't let another task, with possibly unlocked vma,
* keep the mlocked page.
*/
- if ((ret & VM_FAULT_WRITE) && (vma->vm_flags & VM_LOCKED)) {
+ if (page_copied && (vma->vm_flags & VM_LOCKED)) {
lock_page(old_page); /* LRU manipulation */
munlock_vma_page(old_page);
unlock_page(old_page);
}
page_cache_release(old_page);
}
- return ret;
+ return page_copied ? VM_FAULT_WRITE : 0;
oom_free_new:
page_cache_release(new_page);
oom:
--
1.7.11.2
* Re: [PATCH V5 2/4] mm: Refactor do_wp_page - rewrite the unlock flow
2015-02-22 13:42 ` [PATCH V5 2/4] mm: Refactor do_wp_page - rewrite the unlock flow Shachar Raindel
@ 2015-02-24 13:45 ` Michal Hocko
0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2015-02-24 13:45 UTC (permalink / raw)
To: Shachar Raindel
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, Dave Hansen
On Sun 22-02-15 15:42:16, Shachar Raindel wrote:
> When do_wp_page is ending, in several cases it needs to unlock the
> pages and ptls it was accessing.
>
> Currently, this logic is "called" via a goto jump. This makes the
> control flow of the function harder to follow. Readability is further
> hampered by the unlock case containing a large amount of logic needed
> in only one of the three cases.
>
> Using goto for cleanup is generally allowed. However, moving the
> trivial unlocking flows to the relevant call sites allows deeper
> refactoring in the next patch.
>
> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> Acked-by: Haggai Eran <haggaie@mellanox.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Feiner <pfeiner@google.com>
> Cc: Michel Lespinasse <walken@google.com>
Neat!
Reviewed-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memory.c | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 7a04414..3afd9ce 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2066,7 +2066,7 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> {
> struct page *old_page, *new_page = NULL;
> pte_t entry;
> - int ret = 0;
> + int page_copied = 0;
> unsigned long mmun_start = 0; /* For mmu_notifiers */
> unsigned long mmun_end = 0; /* For mmu_notifiers */
> struct mem_cgroup *memcg;
> @@ -2101,7 +2101,9 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> &ptl);
> if (!pte_same(*page_table, orig_pte)) {
> unlock_page(old_page);
> - goto unlock;
> + pte_unmap_unlock(page_table, ptl);
> + page_cache_release(old_page);
> + return 0;
> }
> page_cache_release(old_page);
> }
> @@ -2148,7 +2150,9 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> &ptl);
> if (!pte_same(*page_table, orig_pte)) {
> unlock_page(old_page);
> - goto unlock;
> + pte_unmap_unlock(page_table, ptl);
> + page_cache_release(old_page);
> + return 0;
> }
> page_mkwrite = 1;
> }
> @@ -2246,29 +2250,28 @@ gotten:
>
> /* Free the old page.. */
> new_page = old_page;
> - ret |= VM_FAULT_WRITE;
> + page_copied = 1;
> } else
> mem_cgroup_cancel_charge(new_page, memcg);
>
> if (new_page)
> page_cache_release(new_page);
> -unlock:
> +
> pte_unmap_unlock(page_table, ptl);
> - if (mmun_end > mmun_start)
> - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
> + mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
> if (old_page) {
> /*
> * Don't let another task, with possibly unlocked vma,
> * keep the mlocked page.
> */
> - if ((ret & VM_FAULT_WRITE) && (vma->vm_flags & VM_LOCKED)) {
> + if (page_copied && (vma->vm_flags & VM_LOCKED)) {
> lock_page(old_page); /* LRU manipulation */
> munlock_vma_page(old_page);
> unlock_page(old_page);
> }
> page_cache_release(old_page);
> }
> - return ret;
> + return page_copied ? VM_FAULT_WRITE : 0;
> oom_free_new:
> page_cache_release(new_page);
> oom:
> --
> 1.7.11.2
>
--
Michal Hocko
SUSE Labs
* [PATCH V5 3/4] mm: refactor do_wp_page, extract the page copy flow
2015-02-22 13:42 [PATCH V5 0/4] Refactor do_wp_page, no functional change Shachar Raindel
2015-02-22 13:42 ` [PATCH V5 1/4] mm: Refactor do_wp_page, extract the reuse case Shachar Raindel
2015-02-22 13:42 ` [PATCH V5 2/4] mm: Refactor do_wp_page - rewrite the unlock flow Shachar Raindel
@ 2015-02-22 13:42 ` Shachar Raindel
2015-02-24 13:56 ` Michal Hocko
2015-02-22 13:42 ` [PATCH V5 4/4] mm: Refactor do_wp_page handling of shared vma into a function Shachar Raindel
2015-02-24 0:20 ` [PATCH V5 0/4] Refactor do_wp_page, no functional change Andrew Morton
4 siblings, 1 reply; 11+ messages in thread
From: Shachar Raindel @ 2015-02-22 13:42 UTC (permalink / raw)
To: linux-mm
Cc: kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, raindel, Dave Hansen
In some cases, do_wp_page needs to copy the page suffering a write fault
to a new location. When the function logic decides to do this, it is done
by jumping with a "goto" to the relevant code block. This makes the code
really hard to understand, and it goes against the kernel coding style
guidelines.
This patch extracts the page copy and page table update logic to a
separate function. It also cleans up the naming, from "gotten" to
"wp_page_copy", and adds a few comments.
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Haggai Eran <haggaie@mellanox.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
mm/memory.c | 265 +++++++++++++++++++++++++++++++++---------------------------
1 file changed, 147 insertions(+), 118 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 3afd9ce..a06e705 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2042,6 +2042,146 @@ static int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
}
/*
+ * Handle the case of a page which we actually need to copy to a new page.
+ *
+ * Called with mmap_sem locked and the old page referenced, but
+ * without the ptl held.
+ *
+ * High level logic flow:
+ *
+ * - Allocate a page, copy the content of the old page to the new one.
+ * - Handle book keeping and accounting - cgroups, mmu-notifiers, etc.
+ * - Take the PTL. If the pte changed, bail out and release the allocated page
+ * - If the pte is still the way we remember it, update the page table and all
+ * relevant references. This includes dropping the reference the page-table
+ * held to the old page, as well as updating the rmap.
+ * - In any case, unlock the PTL and drop the reference we took to the old page.
+ */
+static int wp_page_copy(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long address, pte_t *page_table, pmd_t *pmd,
+ pte_t orig_pte, struct page *old_page)
+{
+ struct page *new_page = NULL;
+ spinlock_t *ptl = NULL;
+ pte_t entry;
+ int page_copied = 0;
+ const unsigned long mmun_start = address & PAGE_MASK; /* For mmu_notifiers */
+ const unsigned long mmun_end = mmun_start + PAGE_SIZE; /* For mmu_notifiers */
+ struct mem_cgroup *memcg;
+
+ if (unlikely(anon_vma_prepare(vma)))
+ goto oom;
+
+ if (is_zero_pfn(pte_pfn(orig_pte))) {
+ new_page = alloc_zeroed_user_highpage_movable(vma, address);
+ if (!new_page)
+ goto oom;
+ } else {
+ new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
+ if (!new_page)
+ goto oom;
+ cow_user_page(new_page, old_page, address, vma);
+ }
+ __SetPageUptodate(new_page);
+
+ if (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg))
+ goto oom_free_new;
+
+ mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+
+ /*
+ * Re-check the pte - we dropped the lock
+ */
+ page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
+ if (likely(pte_same(*page_table, orig_pte))) {
+ if (old_page) {
+ if (!PageAnon(old_page)) {
+ dec_mm_counter_fast(mm, MM_FILEPAGES);
+ inc_mm_counter_fast(mm, MM_ANONPAGES);
+ }
+ } else {
+ inc_mm_counter_fast(mm, MM_ANONPAGES);
+ }
+ flush_cache_page(vma, address, pte_pfn(orig_pte));
+ entry = mk_pte(new_page, vma->vm_page_prot);
+ entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+ /*
+ * Clear the pte entry and flush it first, before updating the
+ * pte with the new entry. This will avoid a race condition
+ * seen in the presence of one thread doing SMC and another
+ * thread doing COW.
+ */
+ ptep_clear_flush_notify(vma, address, page_table);
+ page_add_new_anon_rmap(new_page, vma, address);
+ mem_cgroup_commit_charge(new_page, memcg, false);
+ lru_cache_add_active_or_unevictable(new_page, vma);
+ /*
+ * We call the notify macro here because, when using secondary
+ * mmu page tables (such as kvm shadow page tables), we want the
+ * new page to be mapped directly into the secondary page table.
+ */
+ set_pte_at_notify(mm, address, page_table, entry);
+ update_mmu_cache(vma, address, page_table);
+ if (old_page) {
+ /*
+ * Only after switching the pte to the new page may
+ * we remove the mapcount here. Otherwise another
+ * process may come and find the rmap count decremented
+ * before the pte is switched to the new page, and
+ * "reuse" the old page writing into it while our pte
+ * here still points into it and can be read by other
+ * threads.
+ *
+ * The critical issue is to order this
+ * page_remove_rmap with the ptp_clear_flush above.
+ * Those stores are ordered by (if nothing else,)
+ * the barrier present in the atomic_add_negative
+ * in page_remove_rmap.
+ *
+ * Then the TLB flush in ptep_clear_flush ensures that
+ * no process can access the old page before the
+ * decremented mapcount is visible. And the old page
+ * cannot be reused until after the decremented
+ * mapcount is visible. So transitively, TLBs to
+ * old page will be flushed before it can be reused.
+ */
+ page_remove_rmap(old_page);
+ }
+
+ /* Free the old page.. */
+ new_page = old_page;
+ page_copied = 1;
+ } else {
+ mem_cgroup_cancel_charge(new_page, memcg);
+ }
+
+ if (new_page)
+ page_cache_release(new_page);
+
+ pte_unmap_unlock(page_table, ptl);
+ mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+ if (old_page) {
+ /*
+ * Don't let another task, with possibly unlocked vma,
+ * keep the mlocked page.
+ */
+ if (page_copied && (vma->vm_flags & VM_LOCKED)) {
+ lock_page(old_page); /* LRU manipulation */
+ munlock_vma_page(old_page);
+ unlock_page(old_page);
+ }
+ page_cache_release(old_page);
+ }
+ return page_copied ? VM_FAULT_WRITE : 0;
+oom_free_new:
+ page_cache_release(new_page);
+oom:
+ if (old_page)
+ page_cache_release(old_page);
+ return VM_FAULT_OOM;
+}
+
+/*
* This routine handles present pages, when users try to write
* to a shared page. It is done by copying the page to a new address
* and decrementing the shared-page counter for the old page.
@@ -2064,12 +2204,7 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
spinlock_t *ptl, pte_t orig_pte)
__releases(ptl)
{
- struct page *old_page, *new_page = NULL;
- pte_t entry;
- int page_copied = 0;
- unsigned long mmun_start = 0; /* For mmu_notifiers */
- unsigned long mmun_end = 0; /* For mmu_notifiers */
- struct mem_cgroup *memcg;
+ struct page *old_page;
old_page = vm_normal_page(vma, address, orig_pte);
if (!old_page) {
@@ -2085,7 +2220,10 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
(VM_WRITE|VM_SHARED))
return wp_page_reuse(mm, vma, address, page_table, ptl,
orig_pte, old_page, 0, 0);
- goto gotten;
+
+ pte_unmap_unlock(page_table, ptl);
+ return wp_page_copy(mm, vma, address, page_table, pmd,
+ orig_pte, old_page);
}
/*
@@ -2165,119 +2303,10 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
* Ok, we need to copy. Oh, well..
*/
page_cache_get(old_page);
-gotten:
- pte_unmap_unlock(page_table, ptl);
-
- if (unlikely(anon_vma_prepare(vma)))
- goto oom;
-
- if (is_zero_pfn(pte_pfn(orig_pte))) {
- new_page = alloc_zeroed_user_highpage_movable(vma, address);
- if (!new_page)
- goto oom;
- } else {
- new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
- if (!new_page)
- goto oom;
- cow_user_page(new_page, old_page, address, vma);
- }
- __SetPageUptodate(new_page);
-
- if (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg))
- goto oom_free_new;
-
- mmun_start = address & PAGE_MASK;
- mmun_end = mmun_start + PAGE_SIZE;
- mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
-
- /*
- * Re-check the pte - we dropped the lock
- */
- page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
- if (likely(pte_same(*page_table, orig_pte))) {
- if (old_page) {
- if (!PageAnon(old_page)) {
- dec_mm_counter_fast(mm, MM_FILEPAGES);
- inc_mm_counter_fast(mm, MM_ANONPAGES);
- }
- } else
- inc_mm_counter_fast(mm, MM_ANONPAGES);
- flush_cache_page(vma, address, pte_pfn(orig_pte));
- entry = mk_pte(new_page, vma->vm_page_prot);
- entry = maybe_mkwrite(pte_mkdirty(entry), vma);
- /*
- * Clear the pte entry and flush it first, before updating the
- * pte with the new entry. This will avoid a race condition
- * seen in the presence of one thread doing SMC and another
- * thread doing COW.
- */
- ptep_clear_flush_notify(vma, address, page_table);
- page_add_new_anon_rmap(new_page, vma, address);
- mem_cgroup_commit_charge(new_page, memcg, false);
- lru_cache_add_active_or_unevictable(new_page, vma);
- /*
- * We call the notify macro here because, when using secondary
- * mmu page tables (such as kvm shadow page tables), we want the
- * new page to be mapped directly into the secondary page table.
- */
- set_pte_at_notify(mm, address, page_table, entry);
- update_mmu_cache(vma, address, page_table);
- if (old_page) {
- /*
- * Only after switching the pte to the new page may
- * we remove the mapcount here. Otherwise another
- * process may come and find the rmap count decremented
- * before the pte is switched to the new page, and
- * "reuse" the old page writing into it while our pte
- * here still points into it and can be read by other
- * threads.
- *
- * The critical issue is to order this
- * page_remove_rmap with the ptp_clear_flush above.
- * Those stores are ordered by (if nothing else,)
- * the barrier present in the atomic_add_negative
- * in page_remove_rmap.
- *
- * Then the TLB flush in ptep_clear_flush ensures that
- * no process can access the old page before the
- * decremented mapcount is visible. And the old page
- * cannot be reused until after the decremented
- * mapcount is visible. So transitively, TLBs to
- * old page will be flushed before it can be reused.
- */
- page_remove_rmap(old_page);
- }
-
- /* Free the old page.. */
- new_page = old_page;
- page_copied = 1;
- } else
- mem_cgroup_cancel_charge(new_page, memcg);
-
- if (new_page)
- page_cache_release(new_page);
pte_unmap_unlock(page_table, ptl);
- mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
- if (old_page) {
- /*
- * Don't let another task, with possibly unlocked vma,
- * keep the mlocked page.
- */
- if (page_copied && (vma->vm_flags & VM_LOCKED)) {
- lock_page(old_page); /* LRU manipulation */
- munlock_vma_page(old_page);
- unlock_page(old_page);
- }
- page_cache_release(old_page);
- }
- return page_copied ? VM_FAULT_WRITE : 0;
-oom_free_new:
- page_cache_release(new_page);
-oom:
- if (old_page)
- page_cache_release(old_page);
- return VM_FAULT_OOM;
+ return wp_page_copy(mm, vma, address, page_table, pmd,
+ orig_pte, old_page);
}
static void unmap_mapping_range_vma(struct vm_area_struct *vma,
--
1.7.11.2
* Re: [PATCH V5 3/4] mm: refactor do_wp_page, extract the page copy flow
2015-02-22 13:42 ` [PATCH V5 3/4] mm: refactor do_wp_page, extract the page copy flow Shachar Raindel
@ 2015-02-24 13:56 ` Michal Hocko
0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2015-02-24 13:56 UTC (permalink / raw)
To: Shachar Raindel
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, Dave Hansen
On Sun 22-02-15 15:42:17, Shachar Raindel wrote:
> In some cases, do_wp_page needs to copy the page suffering a write
> fault to a new location. When the function logic decides to do this, it
> is done by jumping with a "goto" to the relevant code block. This makes
> the code really hard to understand, and it goes against the kernel
> coding style guidelines.
>
> This patch extracts the page copy and page table update logic to a
> separate function. It also cleans up the naming, from "gotten" to
> "wp_page_copy", and adds a few comments.
>
> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> Acked-by: Haggai Eran <haggaie@mellanox.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Feiner <pfeiner@google.com>
> Cc: Michel Lespinasse <walken@google.com>
Nice cleanup as well!
Reviewed-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memory.c | 265 +++++++++++++++++++++++++++++++++---------------------------
> 1 file changed, 147 insertions(+), 118 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 3afd9ce..a06e705 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2042,6 +2042,146 @@ static int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
> }
>
> /*
> + * Handle the case of a page which we actually need to copy to a new page.
> + *
> + * Called with mmap_sem locked and the old page referenced, but
> + * without the ptl held.
> + *
> + * High level logic flow:
> + *
> + * - Allocate a page, copy the content of the old page to the new one.
> + * - Handle book keeping and accounting - cgroups, mmu-notifiers, etc.
> + * - Take the PTL. If the pte changed, bail out and release the allocated page
> + * - If the pte is still the way we remember it, update the page table and all
> + * relevant references. This includes dropping the reference the page-table
> + * held to the old page, as well as updating the rmap.
> + * - In any case, unlock the PTL and drop the reference we took to the old page.
> + */
> +static int wp_page_copy(struct mm_struct *mm, struct vm_area_struct *vma,
> + unsigned long address, pte_t *page_table, pmd_t *pmd,
> + pte_t orig_pte, struct page *old_page)
> +{
> + struct page *new_page = NULL;
> + spinlock_t *ptl = NULL;
> + pte_t entry;
> + int page_copied = 0;
> + const unsigned long mmun_start = address & PAGE_MASK; /* For mmu_notifiers */
> + const unsigned long mmun_end = mmun_start + PAGE_SIZE; /* For mmu_notifiers */
> + struct mem_cgroup *memcg;
> +
> + if (unlikely(anon_vma_prepare(vma)))
> + goto oom;
> +
> + if (is_zero_pfn(pte_pfn(orig_pte))) {
> + new_page = alloc_zeroed_user_highpage_movable(vma, address);
> + if (!new_page)
> + goto oom;
> + } else {
> + new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
> + if (!new_page)
> + goto oom;
> + cow_user_page(new_page, old_page, address, vma);
> + }
> + __SetPageUptodate(new_page);
> +
> + if (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg))
> + goto oom_free_new;
> +
> + mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
> +
> + /*
> + * Re-check the pte - we dropped the lock
> + */
> + page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
> + if (likely(pte_same(*page_table, orig_pte))) {
> + if (old_page) {
> + if (!PageAnon(old_page)) {
> + dec_mm_counter_fast(mm, MM_FILEPAGES);
> + inc_mm_counter_fast(mm, MM_ANONPAGES);
> + }
> + } else {
> + inc_mm_counter_fast(mm, MM_ANONPAGES);
> + }
> + flush_cache_page(vma, address, pte_pfn(orig_pte));
> + entry = mk_pte(new_page, vma->vm_page_prot);
> + entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> + /*
> + * Clear the pte entry and flush it first, before updating the
> + * pte with the new entry. This will avoid a race condition
> + * seen in the presence of one thread doing SMC and another
> + * thread doing COW.
> + */
> + ptep_clear_flush_notify(vma, address, page_table);
> + page_add_new_anon_rmap(new_page, vma, address);
> + mem_cgroup_commit_charge(new_page, memcg, false);
> + lru_cache_add_active_or_unevictable(new_page, vma);
> + /*
> + * We call the notify macro here because, when using secondary
> + * mmu page tables (such as kvm shadow page tables), we want the
> + * new page to be mapped directly into the secondary page table.
> + */
> + set_pte_at_notify(mm, address, page_table, entry);
> + update_mmu_cache(vma, address, page_table);
> + if (old_page) {
> + /*
> + * Only after switching the pte to the new page may
> + * we remove the mapcount here. Otherwise another
> + * process may come and find the rmap count decremented
> + * before the pte is switched to the new page, and
> + * "reuse" the old page writing into it while our pte
> + * here still points into it and can be read by other
> + * threads.
> + *
> + * The critical issue is to order this
> + * page_remove_rmap with the ptp_clear_flush above.
> + * Those stores are ordered by (if nothing else,)
> + * the barrier present in the atomic_add_negative
> + * in page_remove_rmap.
> + *
> + * Then the TLB flush in ptep_clear_flush ensures that
> + * no process can access the old page before the
> + * decremented mapcount is visible. And the old page
> + * cannot be reused until after the decremented
> + * mapcount is visible. So transitively, TLBs to
> + * old page will be flushed before it can be reused.
> + */
> + page_remove_rmap(old_page);
> + }
> +
> + /* Free the old page.. */
> + new_page = old_page;
> + page_copied = 1;
> + } else {
> + mem_cgroup_cancel_charge(new_page, memcg);
> + }
> +
> + if (new_page)
> + page_cache_release(new_page);
> +
> + pte_unmap_unlock(page_table, ptl);
> + mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
> + if (old_page) {
> + /*
> + * Don't let another task, with possibly unlocked vma,
> + * keep the mlocked page.
> + */
> + if (page_copied && (vma->vm_flags & VM_LOCKED)) {
> + lock_page(old_page); /* LRU manipulation */
> + munlock_vma_page(old_page);
> + unlock_page(old_page);
> + }
> + page_cache_release(old_page);
> + }
> + return page_copied ? VM_FAULT_WRITE : 0;
> +oom_free_new:
> + page_cache_release(new_page);
> +oom:
> + if (old_page)
> + page_cache_release(old_page);
> + return VM_FAULT_OOM;
> +}
> +
> +/*
> * This routine handles present pages, when users try to write
> * to a shared page. It is done by copying the page to a new address
> * and decrementing the shared-page counter for the old page.
> @@ -2064,12 +2204,7 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> spinlock_t *ptl, pte_t orig_pte)
> __releases(ptl)
> {
> - struct page *old_page, *new_page = NULL;
> - pte_t entry;
> - int page_copied = 0;
> - unsigned long mmun_start = 0; /* For mmu_notifiers */
> - unsigned long mmun_end = 0; /* For mmu_notifiers */
> - struct mem_cgroup *memcg;
> + struct page *old_page;
>
> old_page = vm_normal_page(vma, address, orig_pte);
> if (!old_page) {
> @@ -2085,7 +2220,10 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> (VM_WRITE|VM_SHARED))
> return wp_page_reuse(mm, vma, address, page_table, ptl,
> orig_pte, old_page, 0, 0);
> - goto gotten;
> +
> + pte_unmap_unlock(page_table, ptl);
> + return wp_page_copy(mm, vma, address, page_table, pmd,
> + orig_pte, old_page);
> }
>
> /*
> @@ -2165,119 +2303,10 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> * Ok, we need to copy. Oh, well..
> */
> page_cache_get(old_page);
> -gotten:
> - pte_unmap_unlock(page_table, ptl);
> -
> - if (unlikely(anon_vma_prepare(vma)))
> - goto oom;
> -
> - if (is_zero_pfn(pte_pfn(orig_pte))) {
> - new_page = alloc_zeroed_user_highpage_movable(vma, address);
> - if (!new_page)
> - goto oom;
> - } else {
> - new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
> - if (!new_page)
> - goto oom;
> - cow_user_page(new_page, old_page, address, vma);
> - }
> - __SetPageUptodate(new_page);
> -
> - if (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg))
> - goto oom_free_new;
> -
> - mmun_start = address & PAGE_MASK;
> - mmun_end = mmun_start + PAGE_SIZE;
> - mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
> -
> - /*
> - * Re-check the pte - we dropped the lock
> - */
> - page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
> - if (likely(pte_same(*page_table, orig_pte))) {
> - if (old_page) {
> - if (!PageAnon(old_page)) {
> - dec_mm_counter_fast(mm, MM_FILEPAGES);
> - inc_mm_counter_fast(mm, MM_ANONPAGES);
> - }
> - } else
> - inc_mm_counter_fast(mm, MM_ANONPAGES);
> - flush_cache_page(vma, address, pte_pfn(orig_pte));
> - entry = mk_pte(new_page, vma->vm_page_prot);
> - entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> - /*
> - * Clear the pte entry and flush it first, before updating the
> - * pte with the new entry. This will avoid a race condition
> - * seen in the presence of one thread doing SMC and another
> - * thread doing COW.
> - */
> - ptep_clear_flush_notify(vma, address, page_table);
> - page_add_new_anon_rmap(new_page, vma, address);
> - mem_cgroup_commit_charge(new_page, memcg, false);
> - lru_cache_add_active_or_unevictable(new_page, vma);
> - /*
> - * We call the notify macro here because, when using secondary
> - * mmu page tables (such as kvm shadow page tables), we want the
> - * new page to be mapped directly into the secondary page table.
> - */
> - set_pte_at_notify(mm, address, page_table, entry);
> - update_mmu_cache(vma, address, page_table);
> - if (old_page) {
> - /*
> - * Only after switching the pte to the new page may
> - * we remove the mapcount here. Otherwise another
> - * process may come and find the rmap count decremented
> - * before the pte is switched to the new page, and
> - * "reuse" the old page writing into it while our pte
> - * here still points into it and can be read by other
> - * threads.
> - *
> - * The critical issue is to order this
> - * page_remove_rmap with the ptp_clear_flush above.
> - * Those stores are ordered by (if nothing else,)
> - * the barrier present in the atomic_add_negative
> - * in page_remove_rmap.
> - *
> - * Then the TLB flush in ptep_clear_flush ensures that
> - * no process can access the old page before the
> - * decremented mapcount is visible. And the old page
> - * cannot be reused until after the decremented
> - * mapcount is visible. So transitively, TLBs to
> - * old page will be flushed before it can be reused.
> - */
> - page_remove_rmap(old_page);
> - }
> -
> - /* Free the old page.. */
> - new_page = old_page;
> - page_copied = 1;
> - } else
> - mem_cgroup_cancel_charge(new_page, memcg);
> -
> - if (new_page)
> - page_cache_release(new_page);
>
> pte_unmap_unlock(page_table, ptl);
> - mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
> - if (old_page) {
> - /*
> - * Don't let another task, with possibly unlocked vma,
> - * keep the mlocked page.
> - */
> - if (page_copied && (vma->vm_flags & VM_LOCKED)) {
> - lock_page(old_page); /* LRU manipulation */
> - munlock_vma_page(old_page);
> - unlock_page(old_page);
> - }
> - page_cache_release(old_page);
> - }
> - return page_copied ? VM_FAULT_WRITE : 0;
> -oom_free_new:
> - page_cache_release(new_page);
> -oom:
> - if (old_page)
> - page_cache_release(old_page);
> - return VM_FAULT_OOM;
> + return wp_page_copy(mm, vma, address, page_table, pmd,
> + orig_pte, old_page);
> }
>
> static void unmap_mapping_range_vma(struct vm_area_struct *vma,
> --
> 1.7.11.2
>
--
Michal Hocko
SUSE Labs
* [PATCH V5 4/4] mm: Refactor do_wp_page handling of shared vma into a function
2015-02-22 13:42 [PATCH V5 0/4] Refactor do_wp_page, no functional change Shachar Raindel
` (2 preceding siblings ...)
2015-02-22 13:42 ` [PATCH V5 3/4] mm: refactor do_wp_page, extract the page copy flow Shachar Raindel
@ 2015-02-22 13:42 ` Shachar Raindel
2015-02-24 13:59 ` Michal Hocko
2015-02-24 0:20 ` [PATCH V5 0/4] Refactor do_wp_page, no functional change Andrew Morton
4 siblings, 1 reply; 11+ messages in thread
From: Shachar Raindel @ 2015-02-22 13:42 UTC (permalink / raw)
To: linux-mm
Cc: kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, raindel, Dave Hansen
The do_wp_page function is extremely long. Extract the logic for
handling a page belonging to a shared vma into a function of its own.
This helps the readability of the code, without making any functional
change.
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Haggai Eran <haggaie@mellanox.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
mm/memory.c | 86 ++++++++++++++++++++++++++++++++++---------------------------
1 file changed, 48 insertions(+), 38 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index a06e705..b246d22 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2181,6 +2181,52 @@ oom:
return VM_FAULT_OOM;
}
+static int wp_page_shared(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long address, pte_t *page_table,
+ pmd_t *pmd, spinlock_t *ptl, pte_t orig_pte,
+ struct page *old_page)
+ __releases(ptl)
+{
+ int page_mkwrite = 0;
+
+ page_cache_get(old_page);
+
+ /*
+ * Only catch write-faults on shared writable pages,
+ * read-only shared pages can get COWed by
+ * get_user_pages(.write=1, .force=1).
+ */
+ if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
+ int tmp;
+
+ pte_unmap_unlock(page_table, ptl);
+ tmp = do_page_mkwrite(vma, old_page, address);
+ if (unlikely(!tmp || (tmp &
+ (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
+ page_cache_release(old_page);
+ return tmp;
+ }
+ /*
+ * Since we dropped the lock we need to revalidate
+ * the PTE as someone else may have changed it. If
+ * they did, we just return, as we can count on the
+ * MMU to tell us if they didn't also make it writable.
+ */
+ page_table = pte_offset_map_lock(mm, pmd, address,
+ &ptl);
+ if (!pte_same(*page_table, orig_pte)) {
+ unlock_page(old_page);
+ pte_unmap_unlock(page_table, ptl);
+ page_cache_release(old_page);
+ return 0;
+ }
+ page_mkwrite = 1;
+ }
+
+ return wp_page_reuse(mm, vma, address, page_table, ptl,
+ orig_pte, old_page, page_mkwrite, 1);
+}
+
/*
* This routine handles present pages, when users try to write
* to a shared page. It is done by copying the page to a new address
@@ -2259,44 +2305,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
unlock_page(old_page);
} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
(VM_WRITE|VM_SHARED))) {
- int page_mkwrite = 0;
-
- page_cache_get(old_page);
-
- /*
- * Only catch write-faults on shared writable pages,
- * read-only shared pages can get COWed by
- * get_user_pages(.write=1, .force=1).
- */
- if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
- int tmp;
-
- pte_unmap_unlock(page_table, ptl);
- tmp = do_page_mkwrite(vma, old_page, address);
- if (unlikely(!tmp || (tmp &
- (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
- page_cache_release(old_page);
- return tmp;
- }
- /*
- * Since we dropped the lock we need to revalidate
- * the PTE as someone else may have changed it. If
- * they did, we just return, as we can count on the
- * MMU to tell us if they didn't also make it writable.
- */
- page_table = pte_offset_map_lock(mm, pmd, address,
- &ptl);
- if (!pte_same(*page_table, orig_pte)) {
- unlock_page(old_page);
- pte_unmap_unlock(page_table, ptl);
- page_cache_release(old_page);
- return 0;
- }
- page_mkwrite = 1;
- }
-
- return wp_page_reuse(mm, vma, address, page_table, ptl,
- orig_pte, old_page, page_mkwrite, 1);
+ return wp_page_shared(mm, vma, address, page_table, pmd,
+ ptl, orig_pte, old_page);
}
/*
--
1.7.11.2
* Re: [PATCH V5 4/4] mm: Refactor do_wp_page handling of shared vma into a function
2015-02-22 13:42 ` [PATCH V5 4/4] mm: Refactor do_wp_page handling of shared vma into a function Shachar Raindel
@ 2015-02-24 13:59 ` Michal Hocko
0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2015-02-24 13:59 UTC (permalink / raw)
To: Shachar Raindel
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, akpm, torvalds, haggaie, aarcange,
pfeiner, hannes, sagig, walken, Dave Hansen
On Sun 22-02-15 15:42:18, Shachar Raindel wrote:
> The do_wp_page function is extremely long. Extract the logic for
> handling a page belonging to a shared vma into a function of its own.
>
> This helps the readability of the code, without making any functional
> change.
>
> Signed-off-by: Shachar Raindel <raindel@mellanox.com>
> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Andi Kleen <ak@linux.intel.com>
> Acked-by: Haggai Eran <haggaie@mellanox.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Feiner <pfeiner@google.com>
> Cc: Michel Lespinasse <walken@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
> ---
> mm/memory.c | 86 ++++++++++++++++++++++++++++++++++---------------------------
> 1 file changed, 48 insertions(+), 38 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index a06e705..b246d22 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2181,6 +2181,52 @@ oom:
> return VM_FAULT_OOM;
> }
>
> +static int wp_page_shared(struct mm_struct *mm, struct vm_area_struct *vma,
> + unsigned long address, pte_t *page_table,
> + pmd_t *pmd, spinlock_t *ptl, pte_t orig_pte,
> + struct page *old_page)
> + __releases(ptl)
> +{
> + int page_mkwrite = 0;
> +
> + page_cache_get(old_page);
> +
> + /*
> + * Only catch write-faults on shared writable pages,
> + * read-only shared pages can get COWed by
> + * get_user_pages(.write=1, .force=1).
> + */
> + if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
> + int tmp;
> +
> + pte_unmap_unlock(page_table, ptl);
> + tmp = do_page_mkwrite(vma, old_page, address);
> + if (unlikely(!tmp || (tmp &
> + (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
> + page_cache_release(old_page);
> + return tmp;
> + }
> + /*
> + * Since we dropped the lock we need to revalidate
> + * the PTE as someone else may have changed it. If
> + * they did, we just return, as we can count on the
> + * MMU to tell us if they didn't also make it writable.
> + */
> + page_table = pte_offset_map_lock(mm, pmd, address,
> + &ptl);
> + if (!pte_same(*page_table, orig_pte)) {
> + unlock_page(old_page);
> + pte_unmap_unlock(page_table, ptl);
> + page_cache_release(old_page);
> + return 0;
> + }
> + page_mkwrite = 1;
> + }
> +
> + return wp_page_reuse(mm, vma, address, page_table, ptl,
> + orig_pte, old_page, page_mkwrite, 1);
> +}
> +
> /*
> * This routine handles present pages, when users try to write
> * to a shared page. It is done by copying the page to a new address
> @@ -2259,44 +2305,8 @@ static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
> unlock_page(old_page);
> } else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> (VM_WRITE|VM_SHARED))) {
> - int page_mkwrite = 0;
> -
> - page_cache_get(old_page);
> -
> - /*
> - * Only catch write-faults on shared writable pages,
> - * read-only shared pages can get COWed by
> - * get_user_pages(.write=1, .force=1).
> - */
> - if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
> - int tmp;
> -
> - pte_unmap_unlock(page_table, ptl);
> - tmp = do_page_mkwrite(vma, old_page, address);
> - if (unlikely(!tmp || (tmp &
> - (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))) {
> - page_cache_release(old_page);
> - return tmp;
> - }
> - /*
> - * Since we dropped the lock we need to revalidate
> - * the PTE as someone else may have changed it. If
> - * they did, we just return, as we can count on the
> - * MMU to tell us if they didn't also make it writable.
> - */
> - page_table = pte_offset_map_lock(mm, pmd, address,
> - &ptl);
> - if (!pte_same(*page_table, orig_pte)) {
> - unlock_page(old_page);
> - pte_unmap_unlock(page_table, ptl);
> - page_cache_release(old_page);
> - return 0;
> - }
> - page_mkwrite = 1;
> - }
> -
> - return wp_page_reuse(mm, vma, address, page_table, ptl,
> - orig_pte, old_page, page_mkwrite, 1);
> + return wp_page_shared(mm, vma, address, page_table, pmd,
> + ptl, orig_pte, old_page);
> }
>
> /*
> --
> 1.7.11.2
>
--
Michal Hocko
SUSE Labs
* Re: [PATCH V5 0/4] Refactor do_wp_page, no functional change
2015-02-22 13:42 [PATCH V5 0/4] Refactor do_wp_page, no functional change Shachar Raindel
` (3 preceding siblings ...)
2015-02-22 13:42 ` [PATCH V5 4/4] mm: Refactor do_wp_page handling of shared vma into a function Shachar Raindel
@ 2015-02-24 0:20 ` Andrew Morton
2015-02-24 8:36 ` Shachar Raindel
4 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2015-02-24 0:20 UTC (permalink / raw)
To: Shachar Raindel
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, torvalds, haggaie, aarcange, pfeiner,
hannes, sagig, walken
On Sun, 22 Feb 2015 15:42:14 +0200 Shachar Raindel <raindel@mellanox.com> wrote:
> Currently do_wp_page contains 265 code lines. It also contains 9 goto
> statements, of which 5 target labels that are not cleanup
> related. This makes the function extremely difficult to
> understand. The following patches are an attempt at breaking the
> function into its basic components and making it easier to understand.
>
> The patches are straightforward function extractions from
> do_wp_page. As we extract functions, we remove unneeded parameters and
> simplify the code as much as possible. However, the functionality is
> supposed to remain completely unchanged. The patches also attempt to
> document the functionality of each extracted function. In patch 2, we
> split the unlock logic so that each use case contains only the logic
> relevant to its specific needs, instead of having a huge number of
> conditional decisions in a single unlock flow.
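To make the described pattern concrete, here is a minimal, purely
illustrative sketch (placeholder names, not the actual do_wp_page code):
a block that several paths reach via goto becomes a static helper whose
return value the callers simply pass back.

/* Before: write-fault paths converge on a shared label. */
static int handle_write_fault_old(int shared, int last_reference)
{
        if (shared)
                goto reuse;
        if (last_reference)
                goto reuse;
        return 0;                       /* fall through to the copy path */
reuse:
        /* mark the PTE writable, update any book-keeping */
        return 1;
}

/* After: the label body is a function of its own. */
static int page_reuse(void)
{
        /* mark the PTE writable, update any book-keeping */
        return 1;
}

static int handle_write_fault_new(int shared, int last_reference)
{
        if (shared)
                return page_reuse();
        if (last_reference)
                return page_reuse();
        return 0;                       /* copy path */
}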
gcc-4.4.4:
text data bss dec hex filename
40898 186 13344 54428 d49c mm/memory.o-before
41422 186 13456 55064 d718 mm/memory.o-after
gcc-4.8.2:
text data bss dec hex filename
35261 12118 13904 61283 ef63 mm/memory.o
35646 12278 14032 61956 f204 mm/memory.o
The more recent compiler is more interesting but either way, that's a
somewhat disappointing increase in code size for refactoring of a
single function.
I had a brief poke around and couldn't find any obvious improvements
to make.
* RE: [PATCH V5 0/4] Refactor do_wp_page, no functional change
2015-02-24 0:20 ` [PATCH V5 0/4] Refactor do_wp_page, no functional change Andrew Morton
@ 2015-02-24 8:36 ` Shachar Raindel
0 siblings, 0 replies; 11+ messages in thread
From: Shachar Raindel @ 2015-02-24 8:36 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, kirill.shutemov, mgorman, riel, ak, matthew.r.wilcox,
dave.hansen, n-horiguchi, torvalds, Haggai Eran, aarcange,
pfeiner, hannes, Sagi Grimberg, walken
> -----Original Message-----
> From: Andrew Morton [mailto:akpm@linux-foundation.org]
> Sent: Tuesday, February 24, 2015 2:20 AM
> To: Shachar Raindel
> Cc: linux-mm@kvack.org; kirill.shutemov@linux.intel.com;
> mgorman@suse.de; riel@redhat.com; ak@linux.intel.com;
> matthew.r.wilcox@intel.com; dave.hansen@linux.intel.com; n-
> horiguchi@ah.jp.nec.com; torvalds@linux-foundation.org; Haggai Eran;
> aarcange@redhat.com; pfeiner@google.com; hannes@cmpxchg.org; Sagi
> Grimberg; walken@google.com
> Subject: Re: [PATCH V5 0/4] Refactor do_wp_page, no functional change
>
> On Sun, 22 Feb 2015 15:42:14 +0200 Shachar Raindel
> <raindel@mellanox.com> wrote:
>
> > Currently do_wp_page contains 265 code lines. It also contains 9 goto
> > statements, of which 5 target labels that are not cleanup
> > related. This makes the function extremely difficult to
> > understand. The following patches are an attempt at breaking the
> > function into its basic components and making it easier to understand.
> >
> > The patches are straightforward function extractions from
> > do_wp_page. As we extract functions, we remove unneeded parameters and
> > simplify the code as much as possible. However, the functionality is
> > supposed to remain completely unchanged. The patches also attempt to
> > document the functionality of each extracted function. In patch 2, we
> > split the unlock logic so that each use case contains only the logic
> > relevant to its specific needs, instead of having a huge number of
> > conditional decisions in a single unlock flow.
>
> gcc-4.4.4:
>
> text data bss dec hex filename
> 40898 186 13344 54428 d49c mm/memory.o-before
> 41422 186 13456 55064 d718 mm/memory.o-after
>
> gcc-4.8.2:
>
> text data bss dec hex filename
> 35261 12118 13904 61283 ef63 mm/memory.o
> 35646 12278 14032 61956 f204 mm/memory.o
>
My results (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16)) differ:
text data bss dec hex filename
29957 70 32 30059 756b memory.o.next-20150219
30061 70 32 30163 75d3 memory.o.next-20150219+1
30093 70 32 30195 75f3 memory.o.next-20150219+2
30165 70 32 30267 763b memory.o.next-20150219+3
30165 70 32 30267 763b memory.o.next-20150219+4
> The more recent compiler is more interesting but either way, that's a
> somewhat disappointing increase in code size for refactoring of a
> single function.
>
It seems that the majority of the size impact (104 bytes out of 208) originates
from the first patch - "mm: Refactor do_wp_page, extract the reuse case".
This is probably due to changing 3 gotos into returns of a function call.
As gcc cannot do tail-call optimization there, it is forced to emit 3 new
call sites. Adding an inline to wp_page_reuse reduced the total size to:
text data bss dec hex filename
30109 70 32 30211 7603 memory.o
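To illustrate the call-site point with a hypothetical sketch (the names
below are placeholders, not the real wp_page_reuse signature): every
return of a helper call that replaces a goto is an actual call
instruction unless the compiler is allowed to inline the helper, so
marking the extracted function static inline lets gcc fold its body back
into the callers and recover most of the lost text size.

/*
 * Hypothetical stand-in for wp_page_reuse; "inline" allows gcc to fold
 * the body into each caller instead of emitting real calls.
 */
static inline int reuse_entry(int page_mkwrite, int dirty_shared)
{
        /* make the PTE writable, balance dirty accounting, ... */
        return 0;
}

static int write_fault(int shared, int last_reference)
{
        if (shared)
                return reuse_entry(1, 1);       /* call site 1 */
        if (last_reference)
                return reuse_entry(0, 0);       /* call site 2 */
        return -1;                              /* copy-on-write path */
}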
> I had a brief poke around and couldn't find any obvious improvements
> to make.
IMHO, 152 bytes in code size is a small price to pay for code that is
not spaghetti.
The patch adding the strategic inline, which saves 56 bytes with my GCC:
diff --git a/mm/memory.c b/mm/memory.c
index b246d22..9025285 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1990,7 +1990,7 @@ static int do_page_mkwrite(struct vm_area_struct *vma, struct page *page,
* case, all we need to do here is to mark the page as writable and update
* any related book-keeping.
*/
-static int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
+static inline int wp_page_reuse(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, pte_t *page_table,
spinlock_t *ptl, pte_t orig_pte,
struct page *page, int page_mkwrite,
Thanks,
--Shachar