From: Donet Tom <donettom@linux.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Ritesh Harjani <ritesh.list@gmail.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
Zi Yan <ziy@nvidia.com>, David Hildenbrand <david@redhat.com>,
shuah Khan <shuah@kernel.org>, Dev Jain <dev.jain@arm.com>
Subject: [PATCH] mm: migration :shared anonymous migration test is failing
Date: Thu, 19 Dec 2024 06:47:17 -0600 [thread overview]
Message-ID: <20241219124717.4907-1-donettom@linux.ibm.com> (raw)
The migration selftest is currently failing for shared anonymous
mappings due to a race condition.
During migration, the source folio's PTE is unmapped by nuking the
PTE, flushing the TLB,and then marking the page for migration
(by creating the swap entries). The issue arises when, immediately
after the PTE is nuked and the TLB is flushed, but before the page
is marked for migration, another thread accesses the page. This
triggers a page fault, and the page fault handler invokes
do_pte_missing() instead of do_swap_page(), as the page is not yet
marked for migration.
In the fault handling path, do_pte_missing() calls __do_fault()
->shmem_fault() -> shmem_get_folio_gfp() -> filemap_get_entry().
This eventually calls folio_try_get(), incrementing the reference
count of the folio undergoing migration. The thread then blocks
on folio_lock(), as the migration path holds the lock. This
results in the migration failing in __migrate_folio(), which expects
the folio's reference count to be 2. However, the reference count is
incremented by the fault handler, leading to the failure.
The issue arises because, after nuking the PTE and before marking the
page for migration, the page is accessed. To address this, we have
updated the logic to first nuke the PTE, then mark the page for
migration, and only then flush the TLB. With this patch, If the page is
accessed immediately after nuking the PTE, the TLB entry is still
valid, so no fault occurs. After marking the page for migration,
flushing the TLB ensures that the next page fault correctly triggers
do_swap_page() and waits for the migration to complete.
Test Result without this patch
==============================
# ./tools/testing/selftests/mm/migration
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN migration.private_anon ...
# OK migration.private_anon
ok 1 migration.private_anon
# RUN migration.shared_anon ...
Didn't migrate 1 pages
# migration.c:175:shared_anon:Expected migrate(ptr,
self->n1, self->n2) (-2) == 0 (0)
# shared_anon: Test terminated by assertion
# FAIL migration.shared_anon
not ok 2 migration.shared_anon
# RUN migration.private_anon_thp ...
# OK migration.private_anon_thp
ok 3 migration.private_anon_thp
# FAILED: 2 / 3 tests passed.
# Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0
Test result with this patch
===========================
# ./tools/testing/selftests/mm/migration
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN migration.private_anon ...
# OK migration.private_anon
ok 1 migration.private_anon
# RUN migration.shared_anon ...
# OK migration.shared_anon
ok 2 migration.shared_anon
# RUN migration.private_anon_thp ...
# OK migration.private_anon_thp
ok 3 migration.private_anon_thp
# PASSED: 3 / 3 tests passed.
# Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
mm/rmap.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4ea29a7..920ae46e977f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2154,7 +2154,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
hugetlb_vma_unlock_write(vma);
}
/* Nuke the hugetlb page table entry */
- pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
+ pteval = huge_ptep_get_and_clear(mm, address, pvmw.pte);
} else {
flush_cache_page(vma, address, pfn);
/* Nuke the page table entry. */
@@ -2171,7 +2171,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
set_tlb_ubc_flush_pending(mm, pteval, address);
} else {
- pteval = ptep_clear_flush(vma, address, pvmw.pte);
+ pteval = ptep_get_and_clear(mm, address, pvmw.pte);
}
}
@@ -2320,6 +2320,14 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
folio_remove_rmap_pte(folio, subpage, vma);
if (vma->vm_flags & VM_LOCKED)
mlock_drain_local();
+
+ if (!should_defer_flush(mm, flags)) {
+ if (folio_test_hugetlb(folio))
+ flush_hugetlb_page(vma, address);
+ else
+ flush_tlb_page(vma, address);
+ }
+
folio_put(folio);
}
--
2.37.2
next reply other threads:[~2024-12-19 12:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-19 12:47 Donet Tom [this message]
2024-12-19 12:55 ` David Hildenbrand
2024-12-20 2:16 ` Donet Tom
2024-12-19 12:58 ` David Hildenbrand
2024-12-20 2:55 ` Donet Tom
2024-12-20 10:11 ` David Hildenbrand
2024-12-23 12:08 ` Donet Tom
2024-12-20 2:31 ` Baolin Wang
2024-12-20 3:12 ` Donet Tom
2024-12-20 3:32 ` Baolin Wang
2024-12-20 4:30 ` Donet Tom
2024-12-20 4:37 ` Dev Jain
2024-12-23 12:02 ` Donet Tom
2024-12-20 10:05 ` kernel test robot
2024-12-20 10:17 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241219124717.4907-1-donettom@linux.ibm.com \
--to=donettom@linux.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ritesh.list@gmail.com \
--cc=shuah@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox