From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, mike.kravetz@oracle.com
Cc: almasrymina@google.com, songmuchun@bytedance.com,
baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/3] mm: rmap: Move the cache flushing to the correct place for hugetlb PMD sharing
Date: Wed, 27 Apr 2022 18:52:06 +0800 [thread overview]
Message-ID: <4f7ae6dfdc838ab71e1655188b657c032ff1f28f.1651056365.git.baolin.wang@linux.alibaba.com> (raw)
In-Reply-To: <cover.1651056365.git.baolin.wang@linux.alibaba.com>
The cache level flush will always be first when changing an existing
virtual->physical mapping to a new value, since this allows us to
properly handle systems whose caches are strict and require a
virtual->physical translation to exist for a virtual address. So we
should move the cache flushing before huge_pmd_unshare().
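
For reference, a minimal sketch of the ordering rule from
Documentation/core-api/cachetlb.rst that this patch follows. The
identifiers are the real kernel APIs; the surrounding code is
illustrative only and is not part of this patch:

	/*
	 * Canonical sequence when changing a valid user virtual address
	 * mapping (sketch, not buildable on its own):
	 */
	flush_cache_range(vma, start, end);	/* 1) flush the caches while the
						 *    old virtual->physical
						 *    translation still exists */
	/* 2) modify the page tables, e.g. clear or rewrite the PTEs */
	flush_tlb_range(vma, start, end);	/* 3) flush the now-stale TLB
						 *    entries after the change */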
As Muchun pointed out[1], the architectures that currently support
hugetlb PMD sharing have no cache flush issues in practice. But I think
we should still follow the cache/TLB flushing rules when changing a
valid virtual address mapping, to guard against potential issues in the
future.
[1] https://lore.kernel.org/all/YmT%2F%2FhuUbFX+KHcy@FVFYT0MHHV2J.usts.net/
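
For context on why this is not an issue on those architectures
(illustrative, not part of this patch): with physically indexed caches
they end up with the asm-generic stub, where flush_cache_range()
compiles away to nothing. A paraphrased sketch of the fallback in
include/asm-generic/cacheflush.h:

	#ifndef flush_cache_range
	static inline void flush_cache_range(struct vm_area_struct *vma,
					     unsigned long start,
					     unsigned long end)
	{
	}
	#endif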
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/rmap.c | 40 ++++++++++++++++++++++------------------
1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 61e63db..4f0d115 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1535,15 +1535,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+ /*
+ * huge_pmd_unshare may unmap an entire PMD page.
+ * There is no way of knowing exactly which PMDs may
+ * be cached for this mm, so we must flush them all.
+ * start/end were already adjusted above to cover this
+ * range.
+ */
+ flush_cache_range(vma, range.start, range.end);
+
if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
- /*
- * huge_pmd_unshare unmapped an entire PMD
- * page. There is no way of knowing exactly
- * which PMDs may be cached for this mm, so
- * we must flush them all. start/end were
- * already adjusted above to cover this range.
- */
- flush_cache_range(vma, range.start, range.end);
flush_tlb_range(vma, range.start, range.end);
mmu_notifier_invalidate_range(mm, range.start,
range.end);
@@ -1560,13 +1561,14 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
page_vma_mapped_walk_done(&pvmw);
break;
}
+ } else {
+ flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
}
/*
* Nuke the page table entry. When having to clear
* PageAnonExclusive(), we always have to flush.
*/
- flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
if (should_defer_flush(mm, flags) && !anon_exclusive) {
/*
* We clear the PTE but do not flush so potentially
@@ -1890,15 +1892,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+ /*
+ * huge_pmd_unshare may unmap an entire PMD page.
+ * There is no way of knowing exactly which PMDs may
+ * be cached for this mm, so we must flush them all.
+ * start/end were already adjusted above to cover this
+ * range.
+ */
+ flush_cache_range(vma, range.start, range.end);
+
if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
- /*
- * huge_pmd_unshare unmapped an entire PMD
- * page. There is no way of knowing exactly
- * which PMDs may be cached for this mm, so
- * we must flush them all. start/end were
- * already adjusted above to cover this range.
- */
- flush_cache_range(vma, range.start, range.end);
flush_tlb_range(vma, range.start, range.end);
mmu_notifier_invalidate_range(mm, range.start,
range.end);
@@ -1915,10 +1918,11 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
page_vma_mapped_walk_done(&pvmw);
break;
}
+ } else {
+ flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
}
/* Nuke the page table entry. */
- flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
pteval = ptep_clear_flush(vma, address, pvmw.pte);
/* Set the dirty flag on the folio now the pte is gone. */
--
1.8.3.1
Thread overview: 11+ messages
2022-04-27 10:52 [PATCH v2 0/3] Fix cache flush issues considering " Baolin Wang
2022-04-27 10:52 ` [PATCH v2 1/3] mm: hugetlb: Considering PMD sharing when flushing cache/TLBs Baolin Wang
2022-04-28 2:55 ` Muchun Song
2022-04-27 10:52 ` Baolin Wang [this message]
2022-04-28 5:55 ` [PATCH v2 2/3] mm: rmap: Move the cache flushing to the correct place for hugetlb PMD sharing Muchun Song
2022-04-28 7:06 ` Baolin Wang
2022-05-03 18:42 ` Mike Kravetz
2022-05-04 2:50 ` Baolin Wang
2022-04-27 10:52 ` [PATCH v2 3/3] mm: rmap: Use flush_cache_range() to flush cache for hugetlb pages Baolin Wang
2022-05-03 20:17 ` Mike Kravetz
2022-05-04 2:49 ` Baolin Wang