linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	peterx@redhat.com, Alex Williamson <alex.williamson@redhat.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	x86@kernel.org, Yan Zhao <yan.y.zhao@intel.com>,
	Kevin Tian <kevin.tian@intel.com>, Pei Li <peili.dev@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	David Wang <00107082@163.com>, Bert Karwatzki <spasswolf@web.de>,
	Sergey Senozhatsky <senozhatsky@chromium.org>
Subject: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region
Date: Fri, 12 Jul 2024 10:42:44 -0400	[thread overview]
Message-ID: <20240712144244.3090089-1-peterx@redhat.com> (raw)

This patch is one patch of an old series [1] that got reposted standalone
here, with the hope to fix some reported untrack_pfn() issues reported
recently [2,3], where there used to be other fix [4] but unfortunately
which looks like to cause other issues.  The hope is this patch can fix it
the right way.

X86 uses pfn tracking to do pfnmaps.  AFAICT, the tracking should normally
start at mmap() of device drivers, then untracked when munmap().  However
in the current code the untrack is done in unmap_single_vma().  This might
be problematic.

For example, unmap_single_vma() can be used nowadays even for zapping a
single page rather than the whole vmas.  It's very confusing to do whole
vma untracking in this function even if a caller would like to zap one
page.  It could simply be wrong.

Such issue won't be exposed by things like MADV_DONTNEED won't ever work
for pfnmaps and it'll fail the madvise() already before reaching here.
However looks like it can be triggered like what was reported where invoked
from an unmap request from a file vma.

There's also work [5] on VFIO (merged now) to allow tearing down MMIO
pgtables before an munmap(), in which case we may not want to untrack the
pfns if we're only tearing down the pgtables.  IOW, we may want to keep the
pfn tracking information as those pfn mappings can be restored later with
the same vma object.  Currently it's not an immediate problem for VFIO, as
VFIO uses UC- by default, but it looks like there's plan to extend that in
the near future.

IIUC, this was overlooked when zap_page_range_single() was introduced,
while in the past it was only used in the munmap() path which wants to
always unmap the region completely.  E.g., commit f5cc4eef9987 ("VM: make
zap_page_range() callers that act on a single VMA use separate helper") is
the initial commit that introduced unmap_single_vma(), in which the chunk
of untrack_pfn() was moved over from unmap_vmas().

Recover that behavior to untrack pfnmap only when unmap regions.

[1] https://lore.kernel.org/r/20240523223745.395337-1-peterx@redhat.com
[2] https://groups.google.com/g/syzkaller-bugs/c/FeQZvSbqWbQ/m/tHFmoZthAAAJ
[3] https://lore.kernel.org/r/20240712131931.20207-1-00107082@163.com
[4] https://lore.kernel.org/all/20240710-bug12-v1-1-0e5440f9b8d3@gmail.com/
[5] https://lore.kernel.org/r/20240523195629.218043-1-alex.williamson@redhat.com

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: x86@kernel.org
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Pei Li <peili.dev@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Wang <00107082@163.com>
Cc: Bert Karwatzki <spasswolf@web.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---

NOTE: I massaged the commit message comparing to the rfc post [1], the
patch itself is untouched.  Also removed rfc tag, and added more people
into the loop. Please kindly help test this patch if you have a reproducer,
as I can't reproduce it myself even with the syzbot reproducer on top of
mm-unstable.  Instead of further check on the reproducer, I decided to send
this out first as we have a bunch of reproducers on the list now..
---
 mm/memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4bcd79619574..f57cc304b318 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1827,9 +1827,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);
 
-	if (unlikely(vma->vm_flags & VM_PFNMAP))
-		untrack_pfn(vma, 0, 0, mm_wr_locked);
-
 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
 			/*
@@ -1894,6 +1891,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 		unsigned long start = start_addr;
 		unsigned long end = end_addr;
 		hugetlb_zap_begin(vma, &start, &end);
+		if (unlikely(vma->vm_flags & VM_PFNMAP))
+			untrack_pfn(vma, 0, 0, mm_wr_locked);
 		unmap_single_vma(tlb, vma, start, end, &details,
 				 mm_wr_locked);
 		hugetlb_zap_end(vma, &details);
-- 
2.45.0



             reply	other threads:[~2024-07-12 14:42 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-12 14:42 Peter Xu [this message]
2024-07-13  1:18 ` David Hildenbrand
2024-07-13  3:36 ` David Wang
2024-07-14 10:59 ` David Wang
2024-07-14 18:27   ` [PATCH] " David Hildenbrand
2024-07-15 15:03     ` Peter Xu
2024-07-17 14:14       ` David Hildenbrand
2024-07-17 16:27         ` Peter Xu
2024-07-15  7:08 ` Yan Zhao
2024-07-15 14:29   ` Peter Xu
2024-07-16  9:13     ` Yan Zhao
2024-07-16 19:01       ` Peter Xu
2024-07-17  1:38         ` Yan Zhao
2024-07-17 14:15           ` Peter Xu
2024-07-18  1:50             ` Yan Zhao
2024-07-18 14:03               ` Peter Xu
2024-07-18 23:18                 ` Yan Zhao
2024-07-19  8:28                   ` David Hildenbrand
2024-07-19 14:13                     ` Peter Xu
2024-07-22  6:49                       ` Yan Zhao
2024-07-22 13:52                         ` Peter Xu
2024-07-22  6:43                     ` Yan Zhao
2024-07-22  9:17                       ` David Hildenbrand
2024-07-23 20:27                         ` Peter Xu
2024-07-23 21:36                           ` David Hildenbrand
2024-07-23 21:44                             ` Jason Gunthorpe
2024-07-24  8:53                               ` David Hildenbrand
2024-07-17 14:17         ` David Hildenbrand
2024-07-17 16:30           ` Peter Xu
2024-07-17 16:31             ` Jason Gunthorpe
2024-07-17 18:10               ` Peter Xu
2024-07-17 16:32             ` David Hildenbrand
2024-07-17 18:12               ` Peter Xu
2024-07-20  2:18 ` Liam R. Howlett
2024-07-22 15:15   ` Peter Xu
2024-07-22 20:22     ` Liam R. Howlett
2024-07-22 21:17       ` Peter Xu
2024-07-23 10:12         ` David Hildenbrand
2024-07-23 17:58           ` Liam R. Howlett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240712144244.3090089-1-peterx@redhat.com \
    --to=peterx@redhat.com \
    --cc=00107082@163.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peili.dev@gmail.com \
    --cc=peterz@infradead.org \
    --cc=senozhatsky@chromium.org \
    --cc=spasswolf@web.de \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=x86@kernel.org \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox