From: Yan Zhao <yan.y.zhao@intel.com>
To: Peter Xu <peterx@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
Gavin Shan <gshan@redhat.com>,
Catalin Marinas <catalin.marinas@arm.com>, <x86@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Alistair Popple" <apopple@nvidia.com>, <kvm@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>,
Sean Christopherson <seanjc@google.com>,
Oscar Salvador <osalvador@suse.de>,
Jason Gunthorpe <jgg@nvidia.com>, Borislav Petkov <bp@alien8.de>,
Zi Yan <ziy@nvidia.com>,
Axel Rasmussen <axelrasmussen@google.com>,
David Hildenbrand <david@redhat.com>,
Will Deacon <will@kernel.org>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [PATCH v2 07/19] mm/fork: Accept huge pfnmap entries
Date: Mon, 2 Sep 2024 15:58:38 +0800
Message-ID: <ZtVwLntpS0eJubFq@yzhao56-desk.sh.intel.com>
In-Reply-To: <20240826204353.2228736-8-peterx@redhat.com>

On Mon, Aug 26, 2024 at 04:43:41PM -0400, Peter Xu wrote:
> Teach the fork code to properly copy pfnmaps for pmd/pud levels. Pud is
> much easier, the write bit needs to be persisted though for writable and
> shared pud mappings like PFNMAP ones, otherwise a follow up write in either
> parent or child process will trigger a write fault.
>
> Do the same for pmd level.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> mm/huge_memory.c | 29 ++++++++++++++++++++++++++---
> 1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e2c314f631f3..15418ffdd377 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1559,6 +1559,24 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  	pgtable_t pgtable = NULL;
>  	int ret = -ENOMEM;
>
> +	pmd = pmdp_get_lockless(src_pmd);
> +	if (unlikely(pmd_special(pmd))) {
> +		dst_ptl = pmd_lock(dst_mm, dst_pmd);
> +		src_ptl = pmd_lockptr(src_mm, src_pmd);
> +		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> +		/*
> +		 * No need to recheck the pmd, it can't change with write
> +		 * mmap lock held here.
> +		 *
> +		 * Meanwhile, making sure it's not a CoW VMA with writable
> +		 * mapping, otherwise it means either the anon page wrongly
> +		 * applied special bit, or we made the PRIVATE mapping be
> +		 * able to wrongly write to the backend MMIO.
> +		 */
> +		VM_WARN_ON_ONCE(is_cow_mapping(src_vma->vm_flags) && pmd_write(pmd));
> +		goto set_pmd;
> +	}
> +
>  	/* Skip if can be re-fill on fault */
>  	if (!vma_is_anonymous(dst_vma))
>  		return 0;
> @@ -1640,7 +1658,9 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  	pmdp_set_wrprotect(src_mm, addr, src_pmd);
>  	if (!userfaultfd_wp(dst_vma))
>  		pmd = pmd_clear_uffd_wp(pmd);
> -	pmd = pmd_mkold(pmd_wrprotect(pmd));
> +	pmd = pmd_wrprotect(pmd);
> +set_pmd:
> +	pmd = pmd_mkold(pmd);
>  	set_pmd_at(dst_mm, addr, dst_pmd, pmd);
>
>  	ret = 0;
> @@ -1686,8 +1706,11 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  	 * TODO: once we support anonymous pages, use
>  	 * folio_try_dup_anon_rmap_*() and split if duplicating fails.
>  	 */
> -	pudp_set_wrprotect(src_mm, addr, src_pud);
> -	pud = pud_mkold(pud_wrprotect(pud));
> +	if (is_cow_mapping(vma->vm_flags) && pud_write(pud)) {
> +		pudp_set_wrprotect(src_mm, addr, src_pud);
> +		pud = pud_wrprotect(pud);
> +	}

Do we also need to clear the dirty bit in the child here, as
__copy_present_ptes() does for shared mappings? (The same question
applies to the pmd case.)

e.g.
	if (vma->vm_flags & VM_SHARED)
		pud = pud_mkclean(pud);

> +	pud = pud_mkold(pud);
>  	set_pud_at(dst_mm, addr, dst_pud, pud);
>
>  	ret = 0;
> --
> 2.45.0
>
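
To make the question above concrete, here is a minimal user-space sketch of
the decision being discussed: write-protect only writable CoW mappings, mark
shared mappings clean in the child, and always mark the child entry old. The
TOY_* bits and toy_* helpers are hypothetical names for illustration only;
they are not kernel API and do not reflect real page table bit layouts. Only
the is_cow_mapping() test is modeled on the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry bits for a toy huge-entry model (not real PTE bits). */
#define TOY_WRITE (1u << 0)
#define TOY_DIRTY (1u << 1)
#define TOY_YOUNG (1u << 2)

/* Hypothetical stand-ins for VM_SHARED and VM_MAYWRITE. */
#define TOY_VM_SHARED   (1u << 0)
#define TOY_VM_MAYWRITE (1u << 1)

/* Same shape as is_cow_mapping(): private, and may become writable. */
bool toy_is_cow_mapping(uint32_t vm_flags)
{
	return (vm_flags & (TOY_VM_SHARED | TOY_VM_MAYWRITE)) == TOY_VM_MAYWRITE;
}

/*
 * Return the child's entry derived from the parent's entry at fork time.
 * *src is updated in place when the parent must be write-protected as well.
 */
uint32_t toy_copy_huge_entry(uint32_t *src, uint32_t vm_flags)
{
	uint32_t entry = *src;

	/* As in the patch: write-protect only writable CoW mappings. */
	if (toy_is_cow_mapping(vm_flags) && (entry & TOY_WRITE)) {
		*src &= ~TOY_WRITE;
		entry &= ~TOY_WRITE;
	}

	/* The step asked about in this reply: shared mappings start clean. */
	if (vm_flags & TOY_VM_SHARED)
		entry &= ~TOY_DIRTY;

	/* The child entry always starts old (accessed bit cleared). */
	entry &= ~TOY_YOUNG;
	return entry;
}

/* Small self-check covering the shared and the private (CoW) case. */
void toy_check_examples(void)
{
	uint32_t shared_src = TOY_WRITE | TOY_DIRTY | TOY_YOUNG;
	uint32_t cow_src = TOY_WRITE | TOY_DIRTY | TOY_YOUNG;

	/* Shared writable mapping: child keeps write, loses dirty and young. */
	assert(toy_copy_huge_entry(&shared_src,
				   TOY_VM_SHARED | TOY_VM_MAYWRITE) == TOY_WRITE);
	/* Private writable mapping: both sides lose write, dirty is kept. */
	assert(toy_copy_huge_entry(&cow_src, TOY_VM_MAYWRITE) == TOY_DIRTY);
	assert(cow_src == (TOY_DIRTY | TOY_YOUNG));
}
```

In this model the shared case leaves the parent entry untouched, which is
what keeps a follow-up write in either process from faulting.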