From: Nadav Amit <nadav.amit@gmail.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Mike Rapoport <rppt@linux.ibm.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Nadav Amit <namit@vmware.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Andy Lutomirski <luto@kernel.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
Peter Xu <peterx@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
Nick Piggin <npiggin@gmail.com>
Subject: [RFC PATCH 14/14] mm: conditional check of pfn in pte_flush_type
Date: Mon, 18 Jul 2022 05:02:12 -0700 [thread overview]
Message-ID: <20220718120212.3180-15-namit@vmware.com> (raw)
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
From: Nadav Amit <namit@vmware.com>
Checking whether PFNs in two PTEs are the same takes surprisingly large
number of instructions. Yet in fact, in most cases the caller to
pte_flush_type() already knows if the PFN was changed. For instance,
mprotect() does not change the PFN, but only modifies the protection
flags.
Add argument to pte_flush_type() to indicate whether the PFN should be
checked. Keep checking it in mm-debug to see if some caller was wrong to
assume the PFN is the same.
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
---
arch/x86/include/asm/tlbflush.h | 14 ++++++++++----
include/asm-generic/tlb.h | 6 ++++--
mm/huge_memory.c | 2 +-
mm/mprotect.c | 2 +-
mm/rmap.c | 2 +-
5 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 58c95e36b098..50349861fdc9 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -340,14 +340,17 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
* whether a strict or relaxed TLB flush is need. It should only be used on
* userspace PTEs.
*/
-static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
+static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte,
+ bool check_pfn)
{
/* !PRESENT -> * ; no need for flush */
if (!(pte_flags(oldpte) & _PAGE_PRESENT))
return PTE_FLUSH_NONE;
/* PFN changed ; needs flush */
- if (pte_pfn(oldpte) != pte_pfn(newpte))
+ if (!check_pfn)
+ VM_BUG_ON(pte_pfn(oldpte) != pte_pfn(newpte));
+ else if (pte_pfn(oldpte) != pte_pfn(newpte))
return PTE_FLUSH_STRICT;
/*
@@ -363,14 +366,17 @@ static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
* huge_pmd_flush_type() checks whether permissions were demoted and require a
* flush. It should only be used for userspace huge PMDs.
*/
-static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
+static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd,
+ bool check_pfn)
{
/* !PRESENT -> * ; no need for flush */
if (!(pmd_flags(oldpmd) & _PAGE_PRESENT))
return PTE_FLUSH_NONE;
/* PFN changed ; needs flush */
- if (pmd_pfn(oldpmd) != pmd_pfn(newpmd))
+ if (!check_pfn)
+ VM_BUG_ON(pmd_pfn(oldpmd) != pmd_pfn(newpmd));
+ else if (pmd_pfn(oldpmd) != pmd_pfn(newpmd))
return PTE_FLUSH_STRICT;
/*
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 07b3eb8caf63..aee9da6cc5d5 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -677,14 +677,16 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
#endif
#ifndef pte_flush_type
-static inline struct pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
+static inline struct pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte,
+ bool check_pfn)
{
return PTE_FLUSH_STRICT;
}
#endif
#ifndef huge_pmd_flush_type
-static inline bool huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
+static inline bool huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd,
+ bool check_pfn)
{
return PTE_FLUSH_STRICT;
}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b32b7da0f6f7..92a7b3ca317f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1818,7 +1818,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
flush_type = PTE_FLUSH_STRICT;
if (!tlb->strict)
- flush_type = huge_pmd_flush_type(oldpmd, entry);
+ flush_type = huge_pmd_flush_type(oldpmd, entry, false);
if (flush_type != PTE_FLUSH_NONE)
tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE,
flush_type == PTE_FLUSH_STRICT);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index cf775f6c8c08..78081d7f4edf 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -204,7 +204,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
flush_type = PTE_FLUSH_STRICT;
if (!tlb->strict)
- flush_type = pte_flush_type(oldpte, ptent);
+ flush_type = pte_flush_type(oldpte, ptent, false);
if (flush_type != PTE_FLUSH_NONE)
tlb_flush_pte_range(tlb, addr, PAGE_SIZE,
flush_type == PTE_FLUSH_STRICT);
diff --git a/mm/rmap.c b/mm/rmap.c
index 62f4b2a4f067..63261619b607 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -974,7 +974,7 @@ static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
entry = pte_wrprotect(oldpte);
entry = pte_mkclean(entry);
- if (pte_flush_type(oldpte, entry) != PTE_FLUSH_NONE ||
+ if (pte_flush_type(oldpte, entry, false) != PTE_FLUSH_NONE ||
mm_tlb_flush_pending(vma->vm_mm))
flush_tlb_page(vma, address);
--
2.25.1
prev parent reply other threads:[~2022-07-18 19:37 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20220718120212.3180-1-namit@vmware.com>
2022-07-18 12:01 ` [RFC PATCH 01/14] userfaultfd: set dirty and young on writeprotect Nadav Amit
2022-07-19 20:47 ` Peter Xu
2022-07-20 9:39 ` David Hildenbrand
2022-07-20 13:10 ` Peter Xu
2022-07-20 15:10 ` David Hildenbrand
2022-07-20 19:15 ` Peter Xu
2022-07-20 19:33 ` David Hildenbrand
2022-07-20 19:48 ` Peter Xu
2022-07-20 19:55 ` David Hildenbrand
2022-07-20 20:22 ` Nadav Amit
2022-07-20 20:38 ` David Hildenbrand
2022-07-20 20:56 ` Nadav Amit
2022-07-21 7:52 ` David Hildenbrand
2022-07-21 14:10 ` David Hildenbrand
2022-07-20 9:42 ` David Hildenbrand
2022-07-20 17:36 ` Nadav Amit
2022-07-20 18:00 ` David Hildenbrand
2022-07-20 18:09 ` Nadav Amit
2022-07-20 18:11 ` David Hildenbrand
2022-07-18 12:02 ` [RFC PATCH 02/14] userfaultfd: try to map write-unprotected pages Nadav Amit
2022-07-19 20:49 ` Peter Xu
2022-07-18 12:02 ` [RFC PATCH 03/14] mm/mprotect: allow exclusive anon pages to be writable Nadav Amit
2022-07-20 15:19 ` David Hildenbrand
2022-07-20 17:25 ` Nadav Amit
2022-07-21 7:45 ` David Hildenbrand
2022-07-18 12:02 ` [RFC PATCH 04/14] mm/mprotect: preserve write with MM_CP_TRY_CHANGE_WRITABLE Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 06/14] mm/rmap: avoid flushing on page_vma_mkclean_one() when possible Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 07/14] mm: do fix spurious page-faults for instruction faults Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 08/14] x86/mm: introduce flush_tlb_fix_spurious_fault Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 10/14] x86/mm: introduce relaxed TLB flushes Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 11/14] x86/mm: use relaxed TLB flushes when protection is removed Nadav Amit
2022-07-18 12:02 ` [RFC PATCH 12/14] x86/tlb: no flush on PTE change from RW->RO when PTE is clean Nadav Amit
2022-07-18 12:02 ` Nadav Amit [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220718120212.3180-15-namit@vmware.com \
--to=nadav.amit@gmail.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=andrew.cooper3@citrix.com \
--cc=axelrasmussen@google.com \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=namit@vmware.com \
--cc=npiggin@gmail.com \
--cc=peterx@redhat.com \
--cc=peterz@infradead.org \
--cc=rppt@linux.ibm.com \
--cc=tglx@linutronix.de \
--cc=will@kernel.org \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox