From: Michel Lespinasse <walken@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Laurent Dufour <ldufour@linux.ibm.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Jerome Glisse <jglisse@redhat.com>,
Davidlohr Bueso <dave@stgolabs.net>,
David Rientjes <rientjes@google.com>
Cc: linux-mm <linux-mm@kvack.org>, Michel Lespinasse <walken@google.com>
Subject: [RFC PATCH 17/24] x86 fault handler: implement range locking
Date: Mon, 24 Feb 2020 12:30:50 -0800
Message-ID: <20200224203057.162467-18-walken@google.com>
In-Reply-To: <20200224203057.162467-1-walken@google.com>
Change the x86 fault handler to implement range locking.

Initially we try to lock a PMD-sized range around the faulting address,
which is appropriate for anon vmas. After finding the vma covering the
faulting address, we verify that it is anonymous, and fall back to a
coarse-grained lock if it is not. If the fine-grained lock is workable,
we copy the vma of record into a pseudo-vma and release the mm_vma_lock
before handling the fault.
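
For reviewers, the resulting fault path consolidates roughly as follows.
This is a simplified sketch, not the literal code: the retry label, the
mm_read_range_lock() call and the error paths come from surrounding
context that the diff below does not quote.

	struct mm_lock_range pmd_range, *range;
	struct vm_area_struct pvma, *vma;

	/* Optimistically take a fine-grained, PMD-sized range lock. */
	mm_init_lock_range(&pmd_range, address & PMD_MASK,
			   (address & PMD_MASK) + PMD_SIZE);
	range = &pmd_range;
retry:
	mm_read_range_lock(mm, range);
	if (!mm_range_is_coarse(range))
		mm_vma_lock(mm);  /* protects find_vma() and expand_stack() */
	vma = find_vma(mm, address);
	/* ... NULL vma / access / expand_stack() checks elided ... */
	if (!mm_range_is_coarse(range)) {
		fault = prepare_mm_fault(vma, flags); /* may allocate anon_vma */
		pvma = *vma;	/* vma of record valid only until unlock */
		vma = &pvma;
		mm_vma_unlock(mm);
		if (fault)
			goto got_fault;
		if (!vma_is_anonymous(vma)) {
			/* file vma: fall back to locking the entire mm */
			mm_read_range_unlock(mm, range);
			range = mm_coarse_lock_range();
			goto retry;
		}
	}
	/* handle_mm_fault() etc. then proceed on the pseudo-vma */

Once the pseudo-vma snapshot is taken, nothing below depends on the vma
tree anymore, which is what makes it safe to drop the mm_vma_lock before
the (possibly sleeping) fault handling.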
Signed-off-by: Michel Lespinasse <walken@google.com>
---
arch/x86/mm/fault.c | 40 ++++++++++++++++++++++++++++++++--------
1 file changed, 32 insertions(+), 8 deletions(-)
diff --git arch/x86/mm/fault.c arch/x86/mm/fault.c
index 52333272e14e..1e37284d373c 100644
--- arch/x86/mm/fault.c
+++ arch/x86/mm/fault.c
@@ -941,6 +941,7 @@ bad_area(struct pt_regs *regs, unsigned long error_code,
unsigned long address, struct vm_area_struct *vma,
struct mm_lock_range *range)
{
+ struct mm_struct *mm;
u32 pkey = 0;
int si_code = SEGV_MAPERR;
@@ -984,7 +985,10 @@ bad_area(struct pt_regs *regs, unsigned long error_code,
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
- mm_read_range_unlock(current->mm, range);
+ mm = current->mm;
+ if (!mm_range_is_coarse(range))
+ mm_vma_unlock(mm);
+ mm_read_range_unlock(mm, range);
__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
}
@@ -1278,7 +1282,7 @@ void do_user_addr_fault(struct pt_regs *regs,
unsigned long hw_error_code,
unsigned long address)
{
- struct mm_lock_range *range;
+ struct mm_lock_range pmd_range, *range;
struct vm_area_struct pvma, *vma;
struct task_struct *tsk;
struct mm_struct *mm;
@@ -1363,7 +1367,10 @@ void do_user_addr_fault(struct pt_regs *regs,
}
#endif
- range = mm_coarse_lock_range();
+ mm_init_lock_range(&pmd_range,
+ address & PMD_MASK,
+ (address & PMD_MASK) + PMD_SIZE);
+ range = &pmd_range;
/*
* Kernel-mode access to the user address space should only occur
@@ -1397,6 +1404,8 @@ void do_user_addr_fault(struct pt_regs *regs,
might_sleep();
}
+ if (!mm_range_is_coarse(range))
+ mm_vma_lock(mm);
vma = find_vma(mm, address);
if (unlikely(!vma)) {
bad_area(regs, hw_error_code, address, NULL, range);
@@ -1408,6 +1417,10 @@ void do_user_addr_fault(struct pt_regs *regs,
bad_area(regs, hw_error_code, address, NULL, range);
return;
}
+ /*
+ * Note that if range is fine grained, we can still safely call
+ * expand_stack as we are protected by the mm_vma_lock().
+ */
if (unlikely(expand_stack(vma, address))) {
bad_area(regs, hw_error_code, address, NULL, range);
return;
@@ -1423,23 +1436,34 @@ void do_user_addr_fault(struct pt_regs *regs,
return;
}
- if (vma_is_anonymous(vma)) {
+ if (!mm_range_is_coarse(range)) {
/*
* Allocate anon_vma if needed.
* This needs to operate on the vma of record.
*/
fault = prepare_mm_fault(vma, flags);
- if (fault)
- goto got_fault;
/*
* Copy vma attributes into a pseudo-vma.
- * This will be required when using fine grained locks.
+ * The vma of record is only valid until mm_vma_unlock().
*/
pvma = *vma;
vma = &pvma;
- }
+ mm_vma_unlock(mm);
+ if (fault)
+ goto got_fault;
+
+ /*
+ * Fall back to locking the entire MM
+ * when operating on file vma.
+ */
+ if (!vma_is_anonymous(vma)) {
+ mm_read_range_unlock(mm, range);
+ range = mm_coarse_lock_range();
+ goto retry;
+ }
+ }
/*
* If for any reason at all we couldn't handle the fault,
* make sure we exit gracefully rather than endlessly redo
--
2.25.0.341.g760bfbb309-goog