linux-mm.kvack.org archive mirror
From: Michel Lespinasse <michel@lespinasse.org>
To: Linux-MM <linux-mm@kvack.org>
Cc: Laurent Dufour <ldufour@linux.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Matthew Wilcox <willy@infradead.org>,
	Rik van Riel <riel@surriel.com>,
	Paul McKenney <paulmck@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Rom Lemarchand <romlem@google.com>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	Michel Lespinasse <michel@lespinasse.org>
Subject: [RFC PATCH 14/37] mm: add pte_map_lock() and pte_spinlock()
Date: Tue,  6 Apr 2021 18:44:39 -0700
Message-ID: <20210407014502.24091-15-michel@lespinasse.org>
In-Reply-To: <20210407014502.24091-1-michel@lespinasse.org>

pte_map_lock() and pte_spinlock() are used by fault handlers to ensure
the pte is mapped and locked before they commit the faulted page to the
mm's address space at the end of the fault.

The functions differ in their preconditions; pte_map_lock() expects
the pte to be unmapped prior to the call, while pte_spinlock() expects
it to be already mapped.

In the speculative fault case, the functions verify, after locking the pte,
that the mmap sequence count has not changed since the start of the fault,
and thus that no mmap lock writers have been running concurrently with
the fault. After that point the page table lock serializes any further
races with concurrent mmap lock writers.
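
The check / trylock / re-check ordering described above can be modelled
in userspace. The following is only a sketch under stated assumptions:
struct toy_mm and every toy_* function are hypothetical stand-ins that do
not exist in the kernel, the "page table lock" is a C11 atomic_flag, and a
"writer" is reduced to a single counter bump. It does not model IRQ
disabling, pte mapping, or the pmd check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct toy_mm {
	atomic_ulong seq;	/* bumped by every modelled mmap lock writer */
	atomic_flag ptl;	/* models the page table spinlock */
};

static void toy_mm_init(struct toy_mm *mm)
{
	atomic_init(&mm->seq, 0);
	atomic_flag_clear(&mm->ptl);
}

/* Sample the counter at the start of a speculative fault. */
static unsigned long toy_seq_read_start(struct toy_mm *mm)
{
	return atomic_load(&mm->seq);
}

/* True if no modelled writer has run since 'seq' was sampled. */
static bool toy_seq_read_check(struct toy_mm *mm, unsigned long seq)
{
	return atomic_load(&mm->seq) == seq;
}

/* Models an mmap lock writer changing the address space. */
static void toy_writer(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->seq, 1);
}

static bool toy_trylock(struct toy_mm *mm)
{
	return !atomic_flag_test_and_set(&mm->ptl);
}

static void toy_unlock(struct toy_mm *mm)
{
	atomic_flag_clear(&mm->ptl);
}

/*
 * Returns true with the lock held, or false with the lock not held.
 * The lock is only tried once: spinning here could starve or deadlock.
 */
static bool toy_lock_speculative(struct toy_mm *mm, unsigned long seq)
{
	if (!toy_seq_read_check(mm, seq))
		return false;
	if (!toy_trylock(mm))
		return false;
	/* Re-check under the lock; from here on the lock serializes writers. */
	if (!toy_seq_read_check(mm, seq)) {
		toy_unlock(mm);
		return false;
	}
	return true;
}
```

In this model, a fault that sampled the counter before toy_writer() ran
fails in toy_lock_speculative() and leaves the lock free, so that
whoever holds or wants the lock is never blocked by the failed attempt.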

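If the mmap sequence count check fails, both functions return false
and leave the pte unmapped and unlocked. In the speculative case the
same happens if the pmd value changed (with CONFIG_TRANSPARENT_HUGEPAGE)
or if the page table trylock fails.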
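
From a caller's point of view, a false return means there is nothing to
undo on the pte side. The sketch below illustrates that contract with a
userspace toy, assuming the caller simply reports "retry" on failure.
struct toy_vmf, the toy_* functions, the TOY_FAULT_* values and the
seq_changed knob are all hypothetical and only loosely mirror struct
vm_fault and pte_map_lock(); how real fault handlers react to a failure
is defined elsewhere, not here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for the toy fault handler. */
enum toy_fault_result { TOY_FAULT_DONE, TOY_FAULT_RETRY };

/* Hypothetical, heavily reduced stand-in for struct vm_fault. */
struct toy_vmf {
	int *pte;		/* NULL while the pte is "unmapped" */
	bool locked;		/* models holding the page table lock */
	bool seq_changed;	/* test knob: pretend an mmap writer ran */
};

/* The single "page table entry" of this toy model. */
static int toy_pte_slot;

static bool toy_pte_map_lock(struct toy_vmf *vmf)
{
	assert(!vmf->pte);	/* precondition: pte not mapped yet */
	if (vmf->seq_changed)
		return false;	/* pte left unmapped and unlocked */
	vmf->pte = &toy_pte_slot;
	vmf->locked = true;
	return true;
}

static enum toy_fault_result toy_commit_page(struct toy_vmf *vmf, int value)
{
	if (!toy_pte_map_lock(vmf))
		return TOY_FAULT_RETRY;	/* nothing to unmap or unlock */

	*vmf->pte = value;	/* commit while mapped and locked */

	/* Models unmapping the pte and dropping the lock afterwards. */
	vmf->locked = false;
	vmf->pte = NULL;
	return TOY_FAULT_DONE;
}
```

The point of the sketch is the failure path: toy_commit_page() returns
early without touching vmf->pte or the lock state, which matches the
"unmapped and unlocked" guarantee stated above.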

Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
 include/linux/mm.h | 34 ++++++++++++++++++++++
 mm/memory.c        | 71 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dee8a4833779..f26490aff514 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3183,5 +3183,39 @@ extern int sysctl_nr_trim_pages;
 
 void mem_dump_obj(void *object);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+
+bool __pte_map_lock(struct vm_fault *vmf);
+
+static inline bool pte_map_lock(struct vm_fault *vmf)
+{
+	VM_BUG_ON(vmf->pte);
+	return __pte_map_lock(vmf);
+}
+
+static inline bool pte_spinlock(struct vm_fault *vmf)
+{
+	VM_BUG_ON(!vmf->pte);
+	return __pte_map_lock(vmf);
+}
+
+#else	/* !CONFIG_SPECULATIVE_PAGE_FAULT */
+
+static inline bool pte_map_lock(struct vm_fault *vmf)
+{
+	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd, vmf->address,
+				       &vmf->ptl);
+	return true;
+}
+
+static inline bool pte_spinlock(struct vm_fault *vmf)
+{
+	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	spin_lock(vmf->ptl);
+	return true;
+}
+
+#endif	/* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index a17704aac019..3e192d5f89a6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2566,6 +2566,77 @@ int apply_to_existing_page_range(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL_GPL(apply_to_existing_page_range);
 
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+
+bool __pte_map_lock(struct vm_fault *vmf)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pmd_t pmdval;
+#endif
+	pte_t *pte = vmf->pte;
+	spinlock_t *ptl;
+
+	if (!(vmf->flags & FAULT_FLAG_SPECULATIVE)) {
+		vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+		if (!pte)
+			vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
+		spin_lock(vmf->ptl);
+		return true;
+	}
+
+	local_irq_disable();
+	if (!mmap_seq_read_check(vmf->vma->vm_mm, vmf->seq))
+		goto fail;
+	/*
+	 * The mmap sequence count check guarantees that the page
+	 * tables are still valid at that point, and having IRQs
+	 * disabled ensures that they stay around (see Fast GUP
+	 * comment in mm/gup.c).
+	 */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	/*
+	 * We check if the pmd value is still the same to ensure that there
+	 * is not a huge collapse operation in progress behind our back.
+	 */
+	pmdval = READ_ONCE(*vmf->pmd);
+	if (!pmd_same(pmdval, vmf->orig_pmd))
+		goto fail;
+#endif
+	ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	if (!pte)
+		pte = pte_offset_map(vmf->pmd, vmf->address);
+	/*
+	 * Try locking the page table.
+	 *
+	 * Note that we might race against zap_pte_range() which
+	 * invalidates TLBs while holding the page table lock.
+	 * We still have local IRQs disabled here to prevent the
+	 * page table from being reclaimed, and zap_pte_range() could
+	 * thus deadlock with us if we tried using spin_lock() here.
+	 *
+	 * We also don't want to retry until spin_trylock() succeeds,
+	 * because of the starvation potential against a stream of lockers.
+	 */
+	if (unlikely(!spin_trylock(ptl)))
+		goto fail;
+	if (!mmap_seq_read_check(vmf->vma->vm_mm, vmf->seq))
+		goto unlock_fail;
+	local_irq_enable();
+	vmf->pte = pte;
+	vmf->ptl = ptl;
+	return true;
+
+unlock_fail:
+	spin_unlock(ptl);
+fail:
+	if (pte)
+		pte_unmap(pte);
+	local_irq_enable();
+	return false;
+}
+
+#endif	/* CONFIG_SPECULATIVE_PAGE_FAULT */
+
 /*
  * handle_pte_fault chooses page fault handler according to an entry which was
  * read non-atomically.  Before making any commitment, on those architectures
-- 
2.20.1



