linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shachar Raindel <raindel@mellanox.com>
To: Rik van Riel <riel@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: "kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"mgorman@suse.de" <mgorman@suse.de>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"matthew.r.wilcox@intel.com" <matthew.r.wilcox@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"n-horiguchi@ah.jp.nec.com" <n-horiguchi@ah.jp.nec.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
	Haggai Eran <haggaie@mellanox.com>,
	"aarcange@redhat.com" <aarcange@redhat.com>,
	"pfeiner@google.com" <pfeiner@google.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	Sagi Grimberg <sagig@mellanox.com>,
	"walken@google.com" <walken@google.com>,
	Jerome Glisse <j.glisse@gmail.com>
Subject: RE: [PATCH v2 3/4] mm: refactor do_wp_page, extract the page copy flow
Date: Tue, 2 Dec 2014 09:09:04 +0000	[thread overview]
Message-ID: <b33f68fb290142379f1189efbd8ea557@AM3PR05MB0935.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <547D29A3.7090108@redhat.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 5085 bytes --]



> -----Original Message-----
> From: owner-linux-mm@kvack.org [mailto:owner-linux-mm@kvack.org] On
> Behalf Of Rik van Riel
> Sent: Tuesday, December 02, 2014 4:53 AM
> To: Shachar Raindel; linux-mm@kvack.org
> Cc: kirill.shutemov@linux.intel.com; mgorman@suse.de;
> ak@linux.intel.com; matthew.r.wilcox@intel.com;
> dave.hansen@linux.intel.com; n-horiguchi@ah.jp.nec.com; akpm@linux-
> foundation.org; torvalds@linux-foundation.org; Haggai Eran;
> aarcange@redhat.com; pfeiner@google.com; hannes@cmpxchg.org; Sagi
> Grimberg; walken@google.com
> Subject: Re: [PATCH v2 3/4] mm: refactor do_wp_page, extract the page
> copy flow
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 12/01/2014 03:58 PM, Shachar Raindel wrote:
> > In some cases, do_wp_page had to copy the page suffering a write
> > fault to a new location. If the function logic decided that to do
> > this, it was done by jumping with a "goto" operation to the
> > relevant code block. This made the code really hard to understand.
> > It is also against the kernel coding style guidelines.
> >
> > This patch extracts the page copy and page table update logic to a
> > separate function. It also clean up the naming, from "gotten" to
> > "wp_page_copy", and adds few comments.
> >
> > Signed-off-by: Shachar Raindel <raindel@mellanox.com> ---
> > mm/memory.c | 265
> > +++++++++++++++++++++++++++++++++--------------------------- 1 file
> > changed, 147 insertions(+), 118 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c index b42bec0..c7c0df2
> > 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2088,6 +2088,146 @@
> > static int wp_page_reuse(struct mm_struct *mm, struct
> > vm_area_struct *vma, }
> >
> > /* + * Handle the case of a page which we actually need to copy to
> > a new page. + * + * Called with mmap_sem locked and the old page
> > referenced, but + * without the ptl held. + * + * High level logic
> > flow: + * + * - Allocate a page, copy the content of the old page
> > to the new one. + * - Handle book keeping and accounting - cgroups,
> > mmu-notifiers, etc. + * - Take the PTL. If the pte changed, bail
> > out and release the allocated page + * - If the pte is still the
> > way we remember it, update the page table and all + *   relevant
> > references. This includes dropping the reference the page-table + *
> > held to the old page, as well as updating the rmap. + * - In any
> > case, unlock the PTL and drop the reference we took to the old
> > page. + */ +static int wp_page_copy(struct mm_struct *mm, struct
> > vm_area_struct *vma, +			unsigned long address, pte_t
> *page_table,
> > pmd_t *pmd, +			pte_t orig_pte, struct page *old_page) +{ +
> 	struct
> > page *new_page = NULL; +	spinlock_t *ptl = NULL; +	pte_t entry; +
> > int page_copied = 0; +	const unsigned long mmun_start = address &
> > PAGE_MASK;	/* For mmu_notifiers */ +	const unsigned long mmun_end
> =
> > mmun_start + PAGE_SIZE;	/* For mmu_notifiers */ +	struct mem_cgroup
> > *memcg; + +	if (unlikely(anon_vma_prepare(vma))) +		goto
> oom; + +
> > if (is_zero_pfn(pte_pfn(orig_pte))) { +		new_page =
> > alloc_zeroed_user_highpage_movable(vma, address); +		if
> > (!new_page) +			goto oom; +	} else { +		new_page =
> > alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address); +		if
> > (!new_page) +			goto oom; +		cow_user_page(new_page,
> old_page,
> > address, vma); +	} +	__SetPageUptodate(new_page); + +	if
> > (mem_cgroup_try_charge(new_page, mm, GFP_KERNEL, &memcg)) +
> 	goto
> > oom_free_new; + +	mmu_notifier_invalidate_range_start(mm,
> > mmun_start, mmun_end);
> 
> I believe the mmu_notifier_invalidate_range_start & _end
> functions can be moved inside the pte_same(*page_table, orig_pte)
> branch. There is no reason to call those functions if we do not
> modify the page table entry.
> 

There is a critical reason for this - moving the MMU notifiers there will
make them unsleepable. This will prevent hardware devices that keep secondary
PTEs from waiting for an interrupt to signal that the invalidation was completed. 
This is required for example by the ODP patch set 
(http://www.spinics.net/lists/linux-rdma/msg22044.html ) and
by the HMM patch set (http://comments.gmane.org/gmane.linux.kernel.mm/116584 ).

> This is not something introduced by your patch, but you might as
> well fix it while you're touching the code :)

This happened to be introduced by Haggai Eran 
(http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6bdb913f0a ),
who kindly reviewed this patchset before v0 was sent.

As mentioned above, I would prefer not to break the sleepability of the MMU notifiers in
this patchset which is targeting coding style cleanup. If you want to further discuss the
issue, can we split it to a different thread, not blocking the ack on this patch?

> 
> Other than that:
> 
> Acked-by: Rik van Riel <riel@redhat.com>
> 
Thanks.

--Shachar
N‹§²æìr¸›zǧu©ž²Æ {\b­†éì¹»\x1c®&Þ–)îÆi¢žØ^n‡r¶‰šŽŠÝ¢j$½§$¢¸\x05¢¹¨­è§~Š'.)îÄÃ,yèm¶ŸÿÃ\f%Š{±šj+ƒðèž×¦j)Z†·Ÿ

  reply	other threads:[~2014-12-02  9:09 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-01 20:58 [PATCH v2 0/4] Refactor do_wp_page, no functional change Shachar Raindel
2014-12-01 20:58 ` [PATCH v2 1/4] mm: Refactor do_wp_page, extract the reuse case Shachar Raindel
2014-12-01 22:23   ` Rik van Riel
2014-12-01 20:58 ` [PATCH v2 2/4] mm: Refactor do_wp_page - rewrite the unlock flow Shachar Raindel
2014-12-01 22:46   ` Rik van Riel
2014-12-01 20:58 ` [PATCH v2 3/4] mm: refactor do_wp_page, extract the page copy flow Shachar Raindel
2014-12-02  2:53   ` Rik van Riel
2014-12-02  9:09     ` Shachar Raindel [this message]
2014-12-02 14:56       ` Rik van Riel
2014-12-01 20:58 ` [PATCH v2 4/4] mm: Refactor do_wp_page handling of shared vma into a function Shachar Raindel
2014-12-02  2:57   ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b33f68fb290142379f1189efbd8ea557@AM3PR05MB0935.eurprd05.prod.outlook.com \
    --to=raindel@mellanox.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=haggaie@mellanox.com \
    --cc=hannes@cmpxchg.org \
    --cc=j.glisse@gmail.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=mgorman@suse.de \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=pfeiner@google.com \
    --cc=riel@redhat.com \
    --cc=sagig@mellanox.com \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox