From: Qi Zheng <zhengqi.arch@bytedance.com>
To: david@redhat.com, hughd@google.com, willy@infradead.org,
	muchun.song@linux.dev, vbabka@kernel.org,
	akpm@linux-foundation.org, rppt@kernel.org,
	vishal.moola@gmail.com, peterx@redhat.com, ryan.roberts@arm.com,
	christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH v3 07/14] mm: khugepaged: collapse_pte_mapped_thp() use pte_offset_map_rw_nolock()
Date: Wed,  4 Sep 2024 16:40:15 +0800
Message-ID: <20240904084022.32728-8-zhengqi.arch@bytedance.com>
In-Reply-To: <20240904084022.32728-1-zhengqi.arch@bytedance.com>

In collapse_pte_mapped_thp(), we may modify the pte and pmd entries after
acquiring the ptl, so convert it to use pte_offset_map_rw_nolock(). Since
no pte_same() check is performed after the ptl is taken here, we should
fetch pgt_pmd and do a pmd_same() check once the ptl is held.

For the case where the ptl is released first and then the pml is acquired,
the PTE page may have been freed in between, so we must do a pmd_same()
check before reacquiring the ptl.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
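Illustration only, not part of the change: the two rechecks described
above can be modelled in userspace roughly as follows. This is a minimal
sketch under stated assumptions. Every model_* name is hypothetical, the
spinlocks are plain atomic flags, the "PMD entry" is a bare integer, and
the ptl == pml aliasing case is ignored. It only shows the ordering of
"snapshot, lock, compare", not the kernel implementation.

```c
/*
 * Userspace model only. Every model_* name is a hypothetical stand-in
 * invented for this sketch; none of them is a kernel API.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_pmd_t;		/* stand-in for pmd_t */

struct model_pmd_slot {
	_Atomic model_pmd_t val;	/* stand-in for *pmd */
	atomic_flag ptl;		/* stand-in for the PTE table lock */
	atomic_flag pml;		/* stand-in for the PMD lock */
};

static void model_slot_init(struct model_pmd_slot *slot, model_pmd_t val)
{
	atomic_init(&slot->val, val);
	atomic_flag_clear(&slot->ptl);
	atomic_flag_clear(&slot->pml);
}

static void model_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the previous holder clears the flag */
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Lockless read of the entry, playing the role of the pgt_pmd snapshot. */
static model_pmd_t model_snapshot(struct model_pmd_slot *slot)
{
	return atomic_load(&slot->val);
}

/*
 * First recheck: take the ptl, then compare the snapshot with the
 * current entry. Returns true with the ptl held, or false with the
 * ptl dropped again (a simplification of the abort path).
 */
static bool model_lock_ptl_checked(struct model_pmd_slot *slot,
				   model_pmd_t snap)
{
	assert(slot != NULL);
	model_lock(&slot->ptl);
	if (model_snapshot(slot) != snap) {
		model_unlock(&slot->ptl);
		return false;
	}
	return true;
}

/*
 * Second recheck: the ptl was dropped earlier, so take the pml and
 * compare again before touching the ptl. Returns true with both locks
 * held, or false with neither held.
 */
static bool model_relock_checked(struct model_pmd_slot *slot,
				 model_pmd_t snap)
{
	assert(slot != NULL);
	model_lock(&slot->pml);
	if (model_snapshot(slot) != snap) {
		model_unlock(&slot->pml);
		return false;
	}
	model_lock(&slot->ptl);
	return true;
}
```

A caller would take the snapshot once with model_snapshot(), then treat
a false return from either helper as "the entry changed underneath us"
and bail out without touching the page table, which is the role the
abort and pmd_change labels play in the diff below.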
 mm/khugepaged.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6498721d4783a..a117d35f33aee 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1605,7 +1605,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
 		pml = pmd_lock(mm, pmd);
 
-	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl);
 	if (!start_pte)		/* mmap_lock + page lock should prevent this */
 		goto abort;
 	if (!pml)
@@ -1613,6 +1613,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	else if (ptl != pml)
 		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 
+	if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
+		goto abort;
+
 	/* step 2: clear page table and adjust rmap */
 	for (i = 0, addr = haddr, pte = start_pte;
 	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
@@ -1658,6 +1661,16 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	/* step 4: remove empty page table */
 	if (!pml) {
 		pml = pmd_lock(mm, pmd);
+		/*
+		 * We called pte_unmap() and released the ptl before acquiring
+		 * the pml, which means we left the RCU critical section, so the
+		 * PTE page may have been freed. Do a pmd_same() check before
+		 * reacquiring the ptl.
+		 */
+		if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
+			spin_unlock(pml);
+			goto pmd_change;
+		}
 		if (ptl != pml)
 			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
 	}
@@ -1689,6 +1702,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		pte_unmap_unlock(start_pte, ptl);
 	if (pml && pml != ptl)
 		spin_unlock(pml);
+pmd_change:
 	if (notified)
 		mmu_notifier_invalidate_range_end(&range);
 drop_folio:
-- 
2.20.1



