From: Peter Xu <peterx@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	James Houghton <jthoughton@google.com>,
	Jann Horn <jannh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@surriel.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Muchun Song <songmuchun@bytedance.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH 08/10] mm/hugetlb: Make walk_hugetlb_range() safe to pmd unshare
Date: Tue, 6 Dec 2022 16:51:03 -0500
Message-ID: <Y4+5R+nh0W0RUX9R@x1n>
In-Reply-To: <97e3a8f2-df75-306e-2edf-85976c168955@nvidia.com>

On Tue, Dec 06, 2022 at 01:03:45PM -0800, John Hubbard wrote:
> On 12/6/22 08:45, Peter Xu wrote:
> > I've got a fixup attached.  John, since this got your attention please also
> > have a look too in case there's further issues.
> > 
> 
> Well, one question: Normally, the pattern of "release_lock(A); call f();
> acquire_lock(A);" is tricky, because one must revalidate that the state
> protected by A has not changed while the lock was released. However, in
> this case, it's letting page fault handling proceed, which already
> assumes that pages might be gone, so generally that seems OK.

Yes, it's tricky, though not as tricky in this case.

I hope the documentation I added (in the fixup patch) covers that:

+ * @hugetlb_entry:     if set, called for each hugetlb entry.  Note that
+ *                     currently the hook function is protected by the
+ *                     hugetlb vma lock, to make sure the pte_t* and the
+ *                     spinlock are valid to access.  If the hook function
+ *                     needs to yield the thread or retake the vma lock
+ *                     for some reason, it needs to release the vma lock
+ *                     manually, and retake it before the function returns.

The vma lock here makes sure the pte_t and the pgtable spinlock stay
stable.  Without the lock, they can be freed in parallel (e.g. by a
concurrent pmd unshare).
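
As an example, a hook that needs to yield would look roughly like the
below.  This is only a sketch: my_hugetlb_entry() is a made-up hook,
while hugetlb_vma_lock_read()/hugetlb_vma_unlock_read() are the lock
helpers from Mike's series:

  static int my_hugetlb_entry(pte_t *pte, unsigned long hmask,
                              unsigned long addr, unsigned long next,
                              struct mm_walk *walk)
  {
          /*
           * The pte_t* and the pgtable spinlock are stable here,
           * because the caller holds the hugetlb vma lock.
           */
          pte_t entry = huge_ptep_get(pte);

          if (pte_present(entry) && need_resched()) {
                  /* Drop the vma lock before yielding the thread... */
                  hugetlb_vma_unlock_read(walk->vma);
                  cond_resched();
                  /* ...and retake it before returning to the walker */
                  hugetlb_vma_lock_read(walk->vma);
          }
          return 0;
  }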

> 
> However, I'm lagging behind on understanding what the vma lock actually
> protects. It seems to be a hugetlb-specific protection for concurrent
> freeing of the page tables?

Not exactly freeing, but unsharing.  Mike probably has more to say.  The
series is here:

https://lore.kernel.org/all/20220914221810.95771-1-mike.kravetz@oracle.com/#t

> If so, then running a page fault handler seems safe. If there's something
> else it protects, then we might need to revalidate that after
> re-acquiring the vma lock.

Nothing to validate here.  The only reason to retake the vma lock is to
match the caller, which assumes the lock is still held on return; after
that, the lock will either be released very soon, or be used for the next
hugetlb pgtable walk (huge_pte_offset()).
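
For reference, after the fixup the caller side looks roughly like this
(simplified from walk_hugetlb_range(), with the empty-pte handling
trimmed):

  hugetlb_vma_lock_read(vma);
  do {
          next = hugetlb_entry_end(h, addr, end);
          pte = huge_pte_offset(walk->mm, addr & hmask, sz);
          if (pte)
                  err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
          if (err)
                  break;
  } while (addr = next, addr != end);
  hugetlb_vma_unlock_read(vma);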

> 
> Also, scattering hugetlb-specific locks throughout mm seems like an
> unfortunate thing. I wonder if there is a longer term plan to Not Do
> That?

So far HMM is really the only one - normally the hugetlb_entry() hook is
pretty light, so it's not really throughout the whole mm yet.  It's not
even urgently needed for the other two places calling cond_resched(); I
added it there mostly for completeness, and with the slight hope that
maybe we can yield earlier for some pmd unsharers.

But yes, it's unfortunate; I just didn't come up with a good solution.
Suggestions are always welcome.

Thanks,

-- 
Peter Xu


