From: Hugh Dickins <hughd@google.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Christoph Lameter <cl@linux.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] hugetlb: fix copy_hugetlb_page_range() to handle migration/hwpoisoned entry
Date: Tue, 17 Jun 2014 11:34:31 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.1406171049150.2862@eggly.anvils>
In-Reply-To: <1403012995-538-1-git-send-email-n-horiguchi@ah.jp.nec.com>
On Tue, 17 Jun 2014, Naoya Horiguchi wrote:
> There's a race between fork() and hugepage migration; as a result we try to
> "dereference" a swap entry as a normal pte, causing a kernel panic.
> The cause of the problem is that copy_hugetlb_page_range() can't handle the
> "swap entry" family (migration entries and hwpoisoned entries), so let's fix it.
>
> ChangeLog v2:
> - stop applying is_cow_mapping() in copy_hugetlb_page_range()
> - use set_huge_pte_at() in hugepage code
> - fix stable version
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: stable@vger.kernel.org # v2.6.37+
Acked-by: Hugh Dickins <hughd@google.com>
But I do hope that you have made an unmissable note somewhere, that
the s390 and sparc set_huge_pte_at() will probably need fixing to handle
this non-present case, before hugepage migration can be allowed on them.
There's probably an easy patch for s390, but not so obvious for sparc
(which did not pretend to support hugepage migration before anyway).
And a followup not-for-stable cleanup to this patch would be nice:
testing is_hugetlb_entry_migration() and is_hugetlb_entry_hwpoisoned()
separately, when they're doing almost the same thing, seems a bit daft.
Hmm, but maybe that should be part of a larger job: the handling of
non-swap swap-entries is tiresome in lots of places, I wonder if
there's a more convenient way of handling them everywhere.
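
As a rough illustration of the cleanup suggested above, the two predicates
could be folded into one check. This is only a userspace sketch, not kernel
code: the toy_* types and helpers, the bit layout, the type numbering, and the
name is_hugetlb_entry_migration_or_hwpoisoned() are all made up here to mimic
the shape of pte_to_swp_entry(), is_migration_entry() and is_hwpoison_entry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for pte_t and swp_entry_t; the bit layout is invented. */
typedef struct { uint64_t val; } toy_pte_t;
typedef struct { uint64_t val; } toy_swp_entry_t;

#define TOY_PTE_PRESENT   (1ULL << 0)
#define TOY_OFFSET_SHIFT  1
#define TOY_TYPE_SHIFT    58
#define TOY_TYPE_MASK     0x1fULL

/* Types below TOY_MAX_SWAPFILES are real swap; the rest are "non-swap". */
enum {
	TOY_MAX_SWAPFILES       = 29,
	TOY_SWP_HWPOISON        = 29,
	TOY_SWP_MIGRATION_READ  = 30,
	TOY_SWP_MIGRATION_WRITE = 31,
};

/* Build a non-present pte that encodes a swap-style entry. */
toy_pte_t toy_mk_swp_pte(unsigned int type, uint64_t offset)
{
	toy_pte_t pte;

	assert(type <= TOY_TYPE_MASK);
	assert(offset < (1ULL << (TOY_TYPE_SHIFT - TOY_OFFSET_SHIFT)));
	pte.val = ((uint64_t)type << TOY_TYPE_SHIFT) | (offset << TOY_OFFSET_SHIFT);
	return pte;
}

bool toy_pte_none(toy_pte_t pte)    { return pte.val == 0; }
bool toy_pte_present(toy_pte_t pte) { return pte.val & TOY_PTE_PRESENT; }

toy_swp_entry_t toy_pte_to_swp_entry(toy_pte_t pte)
{
	toy_swp_entry_t swp = { pte.val };

	return swp;
}

unsigned int toy_swp_type(toy_swp_entry_t swp)
{
	return (unsigned int)((swp.val >> TOY_TYPE_SHIFT) & TOY_TYPE_MASK);
}

bool toy_is_migration_entry(toy_swp_entry_t swp)
{
	unsigned int type = toy_swp_type(swp);

	return type == TOY_SWP_MIGRATION_READ || type == TOY_SWP_MIGRATION_WRITE;
}

bool toy_is_hwpoison_entry(toy_swp_entry_t swp)
{
	return toy_swp_type(swp) == TOY_SWP_HWPOISON;
}

/*
 * One test instead of two: decode the entry once, then ask whether it is
 * any of the special non-present kinds that fork must copy verbatim.
 */
bool is_hugetlb_entry_migration_or_hwpoisoned(toy_pte_t pte)
{
	toy_swp_entry_t swp;

	if (toy_pte_none(pte) || toy_pte_present(pte))
		return false;
	swp = toy_pte_to_swp_entry(pte);
	return toy_is_migration_entry(swp) || toy_is_hwpoison_entry(swp);
}
```

The point of the sketch is just that the none/present test and the
pte-to-entry decode happen once; whether a real cleanup should take this
form, or be part of the larger rework of non-swap entry handling, is left open.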
Hugh
> ---
> mm/hugetlb.c | 70 ++++++++++++++++++++++++++++++++++++------------------------
> 1 file changed, 42 insertions(+), 28 deletions(-)
>
> diff --git v3.16-rc1.orig/mm/hugetlb.c v3.16-rc1/mm/hugetlb.c
> index 226910cb7c9b..a3f6349ab5b5 100644
> --- v3.16-rc1.orig/mm/hugetlb.c
> +++ v3.16-rc1/mm/hugetlb.c
> @@ -2520,6 +2520,31 @@ static void set_huge_ptep_writable(struct vm_area_struct *vma,
>  	update_mmu_cache(vma, address, ptep);
>  }
> 
> +static int is_hugetlb_entry_migration(pte_t pte)
> +{
> +	swp_entry_t swp;
> +
> +	if (huge_pte_none(pte) || pte_present(pte))
> +		return 0;
> +	swp = pte_to_swp_entry(pte);
> +	if (non_swap_entry(swp) && is_migration_entry(swp))
> +		return 1;
> +	else
> +		return 0;
> +}
> +
> +static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> +{
> +	swp_entry_t swp;
> +
> +	if (huge_pte_none(pte) || pte_present(pte))
> +		return 0;
> +	swp = pte_to_swp_entry(pte);
> +	if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> +		return 1;
> +	else
> +		return 0;
> +}
> 
>  int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>  			    struct vm_area_struct *vma)
> @@ -2559,10 +2584,25 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>  		dst_ptl = huge_pte_lock(h, dst, dst_pte);
>  		src_ptl = huge_pte_lockptr(h, src, src_pte);
>  		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> -		if (!huge_pte_none(huge_ptep_get(src_pte))) {
> +		entry = huge_ptep_get(src_pte);
> +		if (huge_pte_none(entry)) { /* skip none entry */
> +			;
> +		} else if (unlikely(is_hugetlb_entry_migration(entry) ||
> +				    is_hugetlb_entry_hwpoisoned(entry))) {
> +			swp_entry_t swp_entry = pte_to_swp_entry(entry);
> +			if (is_write_migration_entry(swp_entry) && cow) {
> +				/*
> +				 * COW mappings require pages in both
> +				 * parent and child to be set to read.
> +				 */
> +				make_migration_entry_read(&swp_entry);
> +				entry = swp_entry_to_pte(swp_entry);
> +				set_huge_pte_at(src, addr, src_pte, entry);
> +			}
> +			set_huge_pte_at(dst, addr, dst_pte, entry);
> +		} else {
>  			if (cow)
>  				huge_ptep_set_wrprotect(src, addr, src_pte);
> -			entry = huge_ptep_get(src_pte);
>  			ptepage = pte_page(entry);
>  			get_page(ptepage);
>  			page_dup_rmap(ptepage);
> @@ -2578,32 +2618,6 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>  	return ret;
>  }
> 
> -static int is_hugetlb_entry_migration(pte_t pte)
> -{
> -	swp_entry_t swp;
> -
> -	if (huge_pte_none(pte) || pte_present(pte))
> -		return 0;
> -	swp = pte_to_swp_entry(pte);
> -	if (non_swap_entry(swp) && is_migration_entry(swp))
> -		return 1;
> -	else
> -		return 0;
> -}
> -
> -static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> -{
> -	swp_entry_t swp;
> -
> -	if (huge_pte_none(pte) || pte_present(pte))
> -		return 0;
> -	swp = pte_to_swp_entry(pte);
> -	if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> -		return 1;
> -	else
> -		return 0;
> -}
> -
>  void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			    unsigned long start, unsigned long end,
>  			    struct page *ref_page)
> --
> 1.9.3