From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	David Rientjes <rientjes@google.com>,
	Rik van Riel <riel@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [RFC][PATCH] mm: hugetlb: add stub-like do_hugetlb_numa()
Date: Mon, 30 Mar 2015 09:40:54 +0000
Message-ID: <1427708426-31610-1-git-send-email-n-horiguchi@ah.jp.nec.com>

hugetlb does not support NUMA balancing yet, but that does not mean the
hugetlb code can ignore PROTNONE entries. In the current kernel, when a
process accesses a hugetlb range protected with PROTNONE, the fault triggers
unexpected COWs, which eventually leave the hugetlb subsystem in a broken,
uncontrollable state: for example, h->resv_huge_pages is decremented too many
times and wraps around to a very large number, after which the free hugepage
pool can no longer be maintained.

This patch simply clears PROTNONE when the fault handler finds such an entry.
Real NUMA balancing for hugetlb is not implemented yet (it is not clear how
much it is worth doing).

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
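For reviewers unfamiliar with the failure mode described above, here is a
minimal user-space sketch of the counter wrap-around. It is not kernel code:
struct resv_sketch, consume_reservation() and wraparound_demo() are
hypothetical names invented for this illustration, and the only thing borrowed
from the kernel is the assumption that the reservation counter is an
unsigned long that gets decremented without a check.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the reservation counter kept in struct hstate. */
struct resv_sketch {
	unsigned long resv_huge_pages;
};

/*
 * Unchecked decrement: models a fault path consuming a reservation
 * without first checking that one is actually outstanding.
 */
static unsigned long consume_reservation(struct resv_sketch *h)
{
	h->resv_huge_pages--;
	return h->resv_huge_pages;
}

/* One legitimate consumption followed by one spurious one. */
static unsigned long wraparound_demo(void)
{
	struct resv_sketch h = { .resv_huge_pages = 1 };
	unsigned long after_first = consume_reservation(&h);	/* 1 -> 0 */

	assert(after_first == 0);
	return consume_reservation(&h);		/* 0 -> ULONG_MAX */
}
```

Calling wraparound_demo() returns ULONG_MAX, because unsigned arithmetic in C
wraps modulo 2^N instead of going negative. That is the "very large number"
mentioned in the description.
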
 include/asm-generic/hugetlb.h | 13 +++++++++++++
 mm/hugetlb.c                  | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git v4.0-rc4.orig/include/asm-generic/hugetlb.h v4.0-rc4/include/asm-generic/hugetlb.h
index 99b490b4d05a..7e73cc9e57b1 100644
--- v4.0-rc4.orig/include/asm-generic/hugetlb.h
+++ v4.0-rc4/include/asm-generic/hugetlb.h
@@ -37,4 +37,17 @@ static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 	pte_clear(mm, addr, ptep);
 }
 
+#ifdef CONFIG_NUMA_BALANCING
+static inline int huge_pte_protnone(pte_t pte)
+{
+	return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+		== _PAGE_PROTNONE;
+}
+#else
+static inline int huge_pte_protnone(pte_t pte)
+{
+	return 0;
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
 #endif /* _ASM_GENERIC_HUGETLB_H */
diff --git v4.0-rc4.orig/mm/hugetlb.c v4.0-rc4/mm/hugetlb.c
index cbb0bbc6662a..18c169674ee4 100644
--- v4.0-rc4.orig/mm/hugetlb.c
+++ v4.0-rc4/mm/hugetlb.c
@@ -3090,6 +3090,28 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	goto out;
 }
 
+#ifdef CONFIG_NUMA_BALANCING
+/*
+ * NUMA balancing for hugetlb is not implemented yet. For now we just clear
+ * PROTNONE to avoid instability of the hugetlb subsystem.
+ */
+static int do_hugetlb_numa(struct mm_struct *mm, struct vm_area_struct *vma,
+				unsigned long address, pte_t *ptep, pte_t pte)
+{
+	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), mm, ptep);
+
+	spin_lock(ptl);
+	if (unlikely(!pte_same(*ptep, pte)))
+		goto unlock;
+	pte = pte_mkhuge(huge_pte_modify(pte, vma->vm_page_prot));
+	pte = pte_mkyoung(pte);
+	set_huge_pte_at(mm, address, ptep, pte);
+unlock:
+	spin_unlock(ptl);
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_SMP
 static u32 fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
 			    struct vm_area_struct *vma,
@@ -3144,6 +3166,8 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	ptep = huge_pte_offset(mm, address);
 	if (ptep) {
 		entry = huge_ptep_get(ptep);
+		if (huge_pte_protnone(entry))
+			return do_hugetlb_numa(mm, vma, address, ptep, entry);
 		if (unlikely(is_hugetlb_entry_migration(entry))) {
 			migration_entry_wait_huge(vma, mm, ptep);
 			return 0;
-- 
1.9.3
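
As a reading aid for the diff above, the mask-and-compare in
huge_pte_protnone() and the effect of the stub can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions: the SK_*
bit positions and the sk_* helpers are hypothetical, and sk_clear_protnone()
reduces the huge_pte_modify() call to "drop the hint, mark present", which is
a simplification of what applying vma->vm_page_prot does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SK_PRESENT	(UINT64_C(1) << 0)
#define SK_PROTNONE	(UINT64_C(1) << 8)

/*
 * True only when PROTNONE is set and PRESENT is clear, mirroring the
 * mask-and-compare used by huge_pte_protnone() in the patch.
 */
static bool sk_protnone(uint64_t flags)
{
	return (flags & (SK_PROTNONE | SK_PRESENT)) == SK_PROTNONE;
}

/*
 * Simplified model of the stub: drop the PROTNONE hint and mark the
 * entry present again so the next access does not fault.
 */
static uint64_t sk_clear_protnone(uint64_t flags)
{
	assert(sk_protnone(flags));
	return (flags & ~SK_PROTNONE) | SK_PRESENT;
}
```

An entry with both bits set, or with neither, is not treated as a NUMA hinting
entry, and an entry passed through sk_clear_protnone() no longer matches the
predicate, so the model takes the special path at most once per entry.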

