linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Punit Agrawal <punit.agrawal@arm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Punit Agrawal <punit.agrawal@arm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Steve Capper <steve.capper@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: [PATCH v2] mm/hugetlb.c: make huge_pte_offset() consistent and document behaviour
Date: Fri, 18 Aug 2017 15:54:15 +0100	[thread overview]
Message-ID: <20170818145415.7588-1-punit.agrawal@arm.com> (raw)
In-Reply-To: <20170725154114.24131-2-punit.agrawal@arm.com>

When walking the page tables to resolve an address that points to
!p*d_present() entry, huge_pte_offset() returns inconsistent values
depending on the level of page table (PUD or PMD).

It returns NULL in the case of a PUD entry while in the case of a PMD
entry, it returns a pointer to the page table entry.

A similar inconsitency exists when handling swap entries - returns NULL
for a PUD entry while a pointer to the pte_t is retured for the PMD entry.

Update huge_pte_offset() to make the behaviour consistent - return a
pointer to the pte_t for hugepage or swap entries. Only return NULL in
instances where we have a p*d_none() entry and the size parameter
doesn't match the hugepage size at this level of the page table.

Document the behaviour to clarify the expected behaviour of this function.
This is to set clear semantics for architecture specific implementations
of huge_pte_offset().

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
---

Hi Andrew,

>From discussions on the arm64 implementation of huge_pte_offset()[0]
we realised that there is benefit from returning a pte_t* in the case
of p*d_none().

The fault handling code in hugetlb_fault() can handle p*d_none()
entries and saves an extra round trip to huge_pte_alloc(). Other
callers of huge_pte_offset() should be ok as well.

Apologies for sending a late update but I thought if we are defining
the semantics, it's worth getting them right.

Could you please pick this version please?

Thanks,
Punit

[0] http://www.spinics.net/lists/linux-mm/msg133699.html

v2: 

 mm/hugetlb.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 31e207cb399b..1d54a131bdd5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4600,6 +4600,15 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 	return pte;
 }
 
+/*
+ * huge_pte_offset() - Walk the page table to resolve the hugepage
+ * entry at address @addr
+ *
+ * Return: Pointer to page table or swap entry (PUD or PMD) for
+ * address @addr, or NULL if a p*d_none() entry is encountered and the
+ * size @sz doesn't match the hugepage size at this level of the page
+ * table.
+ */
 pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz)
 {
@@ -4614,13 +4623,22 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 	p4d = p4d_offset(pgd, addr);
 	if (!p4d_present(*p4d))
 		return NULL;
+
 	pud = pud_offset(p4d, addr);
-	if (!pud_present(*pud))
+	if (sz != PUD_SIZE && pud_none(*pud))
 		return NULL;
-	if (pud_huge(*pud))
+	/* hugepage or swap? */
+	if (pud_huge(*pud) || !pud_present(*pud))
 		return (pte_t *)pud;
+
 	pmd = pmd_offset(pud, addr);
-	return (pte_t *) pmd;
+	if (sz != PMD_SIZE && pmd_none(*pmd))
+		return NULL;
+	/* hugepage or swap? */
+	if (pmd_huge(*pmd) || !pmd_present(*pmd))
+		return (pte_t *)pmd;
+
+	return NULL;
 }
 
 #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
-- 
2.13.2

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2017-08-18 14:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-25 15:41 [PATCH 0/1] Clarify huge_pte_offset() semantics Punit Agrawal
2017-07-25 15:41 ` [PATCH 1/1] mm/hugetlb: Make huge_pte_offset() consistent and document behaviour Punit Agrawal
2017-07-26  8:39   ` Catalin Marinas
2017-07-26  8:50   ` Michal Hocko
2017-07-26  8:53     ` Michal Hocko
2017-07-26 12:11       ` Punit Agrawal
2017-07-26 12:33         ` Michal Hocko
2017-07-26 12:47           ` Michal Hocko
2017-07-26 13:34             ` Punit Agrawal
2017-07-27  3:16               ` Mike Kravetz
2017-07-27 12:58                 ` Punit Agrawal
2017-08-18 14:54   ` Punit Agrawal [this message]
2017-08-18 21:29     ` [PATCH v2] mm/hugetlb.c: make " Mike Kravetz
2017-08-21 18:07       ` Catalin Marinas
2017-08-21 21:30         ` Mike Kravetz
2017-08-22 15:32           ` Punit Agrawal
2017-08-22 10:11     ` Catalin Marinas
2017-08-30  7:49     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170818145415.7588-1-punit.agrawal@arm.com \
    --to=punit.agrawal@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=steve.capper@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox