From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, linux-s390@vger.kernel.org,
kvm@vger.kernel.org, David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Yonghua Huang <yonghua.huang@intel.com>,
Fei Li <fei1.li@intel.com>, Christoph Hellwig <hch@lst.de>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Ingo Molnar <mingo@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH v1 3/3] mm: follow_pte() improvements
Date: Wed, 10 Apr 2024 17:55:27 +0200
Message-ID: <20240410155527.474777-4-david@redhat.com>
In-Reply-To: <20240410155527.474777-1-david@redhat.com>
follow_pte() is now our main function to look up PTEs in VM_PFNMAP/VM_IO
VMAs. Let's perform some more sanity checks (assert that the mmap lock is
held and that the address lies within the VMA) to make this exported
function harder to abuse.
Further, extend the doc a bit: it still focuses on the KVM use case with
MMU notifiers. Drop the KVM+follow_pfn() comment: follow_pfn() is no more,
and we have other users nowadays.
Also extend the doc regarding refcounted pages and the interaction with MMU
notifiers.
KVM is one example that uses MMU notifiers and can deal with refcounted
pages properly. VFIO is one example that doesn't use MMU notifiers, and
to prevent use-after-free, rejects refcounted pages:
pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)). Protection changes are
less of a concern for users like VFIO: the behavior is similar to
longterm-pinning a page, and getting the PTE protection changed afterwards.
The primary concern with refcounted pages is use-after-free, which
callers should be aware of.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
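The VFIO-style rejection described in the commit message can be sketched
as a small userspace model. This is not kernel code: struct pfn_model and
both helper names are hypothetical, and the two booleans merely stand in
for the results of pfn_valid(pfn) and PageReserved(pfn_to_page(pfn)).

```c
#include <stdbool.h>

/* Hypothetical model of the two properties the check looks at. */
struct pfn_model {
	bool valid;    /* stands in for pfn_valid(pfn) */
	bool reserved; /* stands in for PageReserved(pfn_to_page(pfn)) */
};

/* Mirrors: pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)) */
static bool pfn_is_refcounted_model(const struct pfn_model *p)
{
	return p->valid && !p->reserved;
}

/*
 * A user without MMU notifiers refuses refcounted pages, because it
 * could not learn about a later invalidation (use-after-free risk).
 * Returns 0 if the PFN may be used, -1 if it is rejected.
 */
static int accept_pfn_without_notifiers_model(const struct pfn_model *p)
{
	return pfn_is_refcounted_model(p) ? -1 : 0;
}
```

In this model, a PFN without a struct page (valid == false) and a
reserved page are both accepted; only a valid, non-reserved page is
rejected.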
mm/memory.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index ab01fb69dc72..535ef2686f95 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5935,15 +5935,21 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
  *
  * On a successful return, the pointer to the PTE is stored in @ptepp;
  * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
+ *
+ * The contents of the PTE are only stable until @ptlp is released using
+ * pte_unmap_unlock(). This function will fail if the PTE is non-present.
+ * Present PTEs may include PTEs that map refcounted pages, such as
+ * anonymous folios in COW mappings.
+ *
+ * Callers must be careful when relying on PTE content after
+ * pte_unmap_unlock(). Especially if the PTE maps a refcounted page,
+ * callers must protect against invalidation with MMU notifiers; otherwise
+ * access to the PFN at a later point in time can trigger use-after-free.
  *
  * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
  * should be taken for read.
  *
- * KVM uses this function. While it is arguably less bad than the historic
- * ``follow_pfn``, it is not a good general-purpose API.
+ * This function must not be used to modify PTE content.
  *
  * Return: zero on success, -ve otherwise.
  */
@@ -5957,6 +5963,10 @@ int follow_pte(struct vm_area_struct *vma, unsigned long address,
 	pmd_t *pmd;
 	pte_t *ptep;
 
+	mmap_assert_locked(mm);
+	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
+		goto out;
+
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		goto out;
 
--
2.44.0
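
The entry checks in the second hunk can be modelled in userspace to show
which addresses and VMAs are rejected before any page-table walk. This is
a sketch, not the kernel implementation: struct vma_model, the MODEL_*
flag bits and follow_pte_precheck_model() are hypothetical, the
mmap_assert_locked() assertion is left out, and returning -EINVAL is an
assumption (the doc comment only promises a negative value).

```c
#include <errno.h>

/* Hypothetical flag bits for this model; not the kernel's definitions. */
#define MODEL_VM_IO     (1UL << 0)
#define MODEL_VM_PFNMAP (1UL << 1)

/* Hypothetical stand-in for the vm_area_struct fields the checks read. */
struct vma_model {
	unsigned long vm_start; /* first address covered (inclusive) */
	unsigned long vm_end;   /* first address past the VMA (exclusive) */
	unsigned long vm_flags;
};

/*
 * Returns 0 if the lookup would proceed to the page-table walk,
 * a negative errno value otherwise.
 */
static int follow_pte_precheck_model(const struct vma_model *vma,
				     unsigned long address)
{
	/* New check: the address must lie inside [vm_start, vm_end). */
	if (address < vma->vm_start || address >= vma->vm_end)
		return -EINVAL;

	/* Existing check: only IO and raw PFN mappings are allowed. */
	if (!(vma->vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP)))
		return -EINVAL;

	return 0;
}
```

Note that vm_end is exclusive, so an address equal to vm_end is rejected
while vm_end - 1 is accepted.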
Thread overview: 6+ messages
2024-04-10 15:55 [PATCH v1 0/3] mm: follow_pte() improvements and acrn follow_pte() fixes David Hildenbrand
2024-04-10 15:55 ` [PATCH v1 1/3] drivers/virt/acrn: fix PFNMAP PTE checks in acrn_vm_ram_map() David Hildenbrand
2024-04-10 20:12 ` Andrew Morton
2024-04-10 15:55 ` [PATCH v1 2/3] mm: pass VMA instead of MM to follow_pte() David Hildenbrand
2024-04-10 18:08 ` Sean Christopherson
2024-04-10 15:55 ` David Hildenbrand [this message]