From: Peter Xu <peterx@redhat.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Mike Kravetz <mike.kravetz@oracle.com>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Matthew Wilcox <willy@infradead.org>,
John Hubbard <jhubbard@nvidia.com>,
Mike Rapoport <rppt@kernel.org>, Hugh Dickins <hughd@google.com>,
David Hildenbrand <david@redhat.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Rik van Riel <riel@surriel.com>,
James Houghton <jthoughton@google.com>,
Yang Shi <shy828301@gmail.com>, Jason Gunthorpe <jgg@nvidia.com>,
Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH RFC 04/12] mm: Introduce vma_pgtable_walk_{begin|end}()
Date: Fri, 24 Nov 2023 10:34:07 -0500
Message-ID: <ZWDCb9oKV_kQg2qV@x1n>
In-Reply-To: <874jhb94u2.fsf@kernel.org>
On Fri, Nov 24, 2023 at 09:32:13AM +0530, Aneesh Kumar K.V wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > Introduce per-vma begin()/end() helpers for pgtable walks. This is a
> > preparation work to merge hugetlb pgtable walkers with generic mm.
> >
> > The helpers need to be called before and after a pgtable walk, and will
> > become necessary once the generic pgtable walker code supports hugetlb
> > pages. It's a hook point for any type of VMA, but for now only hugetlb
> > uses it to stabilize the pgtable pages against going away (due to
> > possible pmd unsharing).
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > include/linux/mm.h | 3 +++
> > mm/memory.c | 12 ++++++++++++
> > 2 files changed, 15 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 64cd1ee4aacc..349232dd20fb 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4154,4 +4154,7 @@ static inline bool pfn_is_unaccepted_memory(unsigned long pfn)
> > return range_contains_unaccepted_memory(paddr, paddr + PAGE_SIZE);
> > }
> >
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma);
> > +void vma_pgtable_walk_end(struct vm_area_struct *vma);
> > +
> > #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e27e2e5beb3f..3a6434b40d87 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -6180,3 +6180,15 @@ void ptlock_free(struct ptdesc *ptdesc)
> > kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
> > }
> > #endif
> > +
> > +void vma_pgtable_walk_begin(struct vm_area_struct *vma)
> > +{
> > + if (is_vm_hugetlb_page(vma))
> > + hugetlb_vma_lock_read(vma);
> > +}
> >
>
> That is required only if we support pmd sharing?
Correct.
Note that for this specific gup code path we're not changing the lock
behavior: hugetlb_follow_page_mask() used to call hugetlb_vma_lock_read()
in the same way, also unconditionally.
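To illustrate what I mean (a rough sketch only; the wrapper below is
hypothetical and not code from this series), the idea is simply that the
generic walker brackets the walk with the new helpers, which for hugetlb
boil down to the same vma read lock we already take today:

	/* Hypothetical caller, only to sketch the intended pairing */
	static void walk_one_vma(struct vm_area_struct *vma)
	{
		/* For hugetlb this takes hugetlb_vma_lock_read(); otherwise a no-op */
		vma_pgtable_walk_begin(vma);

		/* ... walk pgd/p4d/pud/pmd/pte as usual, hugetlb or not ... */

		/* For hugetlb this drops the vma lock; pmds stay stable in between */
		vma_pgtable_walk_end(vma);
	}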
It makes things even more complicated when we consider the recent private
mapping change that Rik introduced in bf4916922c. I think it means we'll
also take that lock if the private lock is allocated, but I'm not really
sure whether that's necessary for all pgtable walks: the hugetlb vma lock
is currently taken in almost all walk paths, and only a few special paths
take the i_mmap rwsem instead of the vma lock.
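In other words, after that change the read-lock path does roughly the
following (paraphrasing from memory, so the exact helper names may differ
slightly from mm/hugetlb.c):

	void hugetlb_vma_lock_read(struct vm_area_struct *vma)
	{
		if (__vma_shareable_lock(vma)) {
			/* shared mapping: per-vma lock guarding pmd sharing */
			struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;

			down_read(&vma_lock->rw_sem);
		} else if (__vma_private_lock(vma)) {
			/* private mapping: the lock lives in the reserve map */
			struct resv_map *resv_map = vma_resv_map(vma);

			down_read(&resv_map->rw_sema);
		}
	}

So vma_pgtable_walk_begin() would take the private-mapping lock too
whenever it is allocated, even though that lock exists for a different
reason than pmd unsharing.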
Per my current understanding, the private lock is only there to avoid a
race between truncate & zapping. I have a feeling there may be a better
way to do this than overloading the same lock (or lock API) with
different purposes.
In summary, the hugetlb vma lock is still complicated and may be prone to
further refactoring, but all of that needs more investigation. This
series can hopefully be seen as completely separate from that work so far.
Thanks,
--
Peter Xu