* [PATCH v2 1/1] userfaultfd: fix move_pages_pte() splitting folio under RCU read lock
From: Suren Baghdasaryan @ 2024-01-02 23:32 UTC
To: akpm
Cc: viro, brauner, shuah, aarcange, lokeshgidra, peterx, david,
ryan.roberts, hughd, mhocko, axelrasmussen, rppt, willy,
Liam.Howlett, jannh, zhangpeng362, bgeffon, kaleshsingh,
ngeoffray, jdduke, surenb, linux-mm, linux-fsdevel, linux-kernel,
linux-kselftest, kernel-team
While testing the split PMD path with lockdep enabled I got an
"Invalid wait context" error caused by split_huge_page_to_list() trying
to lock anon_vma->rwsem while inside an RCU read section. The issue is
that move_pages_pte() calls split_folio() under the RCU read lock. Fix
this by unmapping the PTEs and exiting the RCU read section before
splitting the folio, then retrying. The same retry pattern is already
used in this function when locking the folio or the anon_vma. After
splitting the large folio, unlock and release it, because after the
split the old folio might no longer be the one that contains src_addr.
Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
Changes from v1 [1]:
1. Reset src_folio and src_folio_pte after folio is split, per Peter Xu
[1] https://lore.kernel.org/all/20231230025607.2476912-1-surenb@google.com/
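For readers tracing the lockdep report: the underlying problem is a
sleeping lock taken inside an RCU read-side critical section. A minimal
illustrative sketch of that shape (not the actual call chain; in
move_pages_pte() the RCU read section is entered implicitly by
pte_offset_map_nolock(), and the rwsem is taken deep inside
split_huge_page_to_list()):

	rcu_read_lock();		/* entered via pte_offset_map_nolock() */
	anon_vma_lock_write(anon_vma);	/* sleeping rwsem: "Invalid wait context" */
	/* ... */
	anon_vma_unlock_write(anon_vma);
	rcu_read_unlock();

pte_unmap() drops the implicit rcu_read_lock() taken at map time, which
is why the fix below unmaps both PTEs before calling split_folio().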
mm/userfaultfd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e718014e671..216ab4c8621f 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1078,9 +1078,18 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 
 		/* at this point we have src_folio locked */
 		if (folio_test_large(src_folio)) {
+			/* split_folio() can block */
+			pte_unmap(&orig_src_pte);
+			pte_unmap(&orig_dst_pte);
+			src_pte = dst_pte = NULL;
 			err = split_folio(src_folio);
 			if (err)
 				goto out;
+			/* have to reacquire the folio after it got split */
+			folio_unlock(src_folio);
+			folio_put(src_folio);
+			src_folio = NULL;
+			goto retry;
 		}
 
 		if (!src_anon_vma) {
--
2.43.0.472.g3155946c3a-goog
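A note on why the folio is dropped and the operation retried rather
than continuing with the old pointer: after split_folio(), the large
folio has been split into order-0 folios, and src_addr may now map one
of the new sibling folios rather than the folio src_folio points at. A
sketch of what the retry path re-derives once the PTE is mapped again
(illustrative only; the exact lookup helper used by move_pages_pte() is
an assumption here, vm_normal_folio() being the usual way to go from a
PTE back to its folio):

	/* back at retry: re-derive the folio for src_addr from the PTE */
	src_folio = vm_normal_folio(src_vma, src_addr, orig_src_pte);
	folio_get(src_folio);
	/* ... then re-take the folio lock before proceeding */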
* Re: [PATCH v2 1/1] userfaultfd: fix move_pages_pte() splitting folio under RCU read lock
From: Peter Xu @ 2024-01-03 1:54 UTC
To: Suren Baghdasaryan
Cc: akpm, viro, brauner, shuah, aarcange, lokeshgidra, david,
ryan.roberts, hughd, mhocko, axelrasmussen, rppt, willy,
Liam.Howlett, jannh, zhangpeng362, bgeffon, kaleshsingh,
ngeoffray, jdduke, linux-mm, linux-fsdevel, linux-kernel,
linux-kselftest, kernel-team
On Tue, Jan 02, 2024 at 03:32:56PM -0800, Suren Baghdasaryan wrote:
> While testing the split PMD path with lockdep enabled I got an
> "Invalid wait context" error caused by split_huge_page_to_list() trying
> to lock anon_vma->rwsem while inside an RCU read section. The issue is
> that move_pages_pte() calls split_folio() under the RCU read lock. Fix
> this by unmapping the PTEs and exiting the RCU read section before
> splitting the folio, then retrying. The same retry pattern is already
> used in this function when locking the folio or the anon_vma. After
> splitting the large folio, unlock and release it, because after the
> split the old folio might no longer be the one that contains src_addr.
>
> Fixes: 94b01c885131 ("userfaultfd: UFFDIO_MOVE uABI")
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Thanks,
--
Peter Xu