linux-mm.kvack.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Rik van Riel <riel@surriel.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	James Houghton <jthoughton@google.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH RFC v2 02/12] mm/hugetlb: Move swap entry handling into vma lock for fault
Date: Thu, 17 Nov 2022 20:35:04 -0500
Message-ID: <Y3bhSEmhfULy+Vxo@x1n>
In-Reply-To: <20221118011025.2178986-3-peterx@redhat.com>

On Thu, Nov 17, 2022 at 08:10:15PM -0500, Peter Xu wrote:
> In hugetlb_fault(), there used to be a special path that handled swap
> entries at the entrance using huge_pte_offset().  That's unsafe because
> huge_pte_offset() for a pmd sharable range can access freed pgtables
> when neither the walker lock nor the vma lock is held.
> 
> The simplest way to make it safe is to move the swap entry handling to
> after the vma lock is taken.  We may now need to take the fault mutex on
> either migration or hwpoison entries (and also the vma lock, but that's
> really needed); however, neither of them is a hot path, so it should be
> fine.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/hugetlb.c | 24 +++++++-----------------
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c3aab6d5b7aa..62ff3fc51d4e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5824,22 +5824,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	int need_wait_lock = 0;
>  	unsigned long haddr = address & huge_page_mask(h);
>  
> -	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
> -	if (ptep) {
> -		/*
> -		 * Since we hold no locks, ptep could be stale.  That is
> -		 * OK as we are only making decisions based on content and
> -		 * not actually modifying content here.
> -		 */
> -		entry = huge_ptep_get(ptep);
> -		if (unlikely(is_hugetlb_entry_migration(entry))) {
> -			migration_entry_wait_huge(vma, ptep);
> -			return 0;
> -		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
> -			return VM_FAULT_HWPOISON_LARGE |
> -				VM_FAULT_SET_HINDEX(hstate_index(h));
> -	}
> -
>  	/*
>  	 * Serialize hugepage allocation and instantiation, so that we don't
>  	 * get spurious allocation failures if two CPUs race to instantiate
> @@ -5886,8 +5870,14 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
>  	 * properly handle it.
>  	 */
> -	if (!pte_present(entry))
> +	if (!pte_present(entry)) {
> +		if (unlikely(is_hugetlb_entry_migration(entry)))
> +			migration_entry_wait_huge(vma, ptep);

Hmm, no: this needs to release the vma lock and the fault mutex.  Now I
remember why I had a note that I need to rework the migration wait code.

I'll try that in the next version.  It would be a callback that just
releases the proper locks in migration_entry_wait_huge() right after the
pgtable lock is released, e.g. in migration_entry_wait_on_locked().
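
One way to picture the callback idea above is the small userspace model
below.  It is only a sketch under loud assumptions: struct fault_locks,
unlock_cb_t, release_fault_locks() and wait_on_migration_entry() are
made-up names, not kernel APIs, and plain booleans stand in for the
pgtable lock, the vma lock and the fault mutex.  It shows the ordering
only: drop the pgtable lock, run the caller's callback, then wait.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the locks held on the fault path. */
struct fault_locks {
	bool pgtable_locked;
	bool vma_locked;
	bool fault_mutex_locked;
};

/* Hypothetical callback type: drops whatever the caller still holds. */
typedef void (*unlock_cb_t)(void *arg);

/* Example callback: release the vma lock and the fault mutex. */
static void release_fault_locks(void *arg)
{
	struct fault_locks *l = arg;

	l->vma_locked = false;
	l->fault_mutex_locked = false;
}

/*
 * Model of the wait helper: drop the pgtable lock first, then run the
 * caller's callback, and only then reach the point where real code
 * would sleep.  Returns true if no modelled lock is still held there.
 */
static bool wait_on_migration_entry(struct fault_locks *l,
				    unlock_cb_t cb, void *arg)
{
	assert(l != NULL);

	l->pgtable_locked = false;	/* models dropping the pgtable lock */
	if (cb)
		cb(arg);		/* caller-specific locks go next */

	/* A real implementation would sleep here until migration ends. */
	return !l->pgtable_locked && !l->vma_locked &&
	       !l->fault_mutex_locked;
}
```

In this model, a fault-path caller would pass release_fault_locks with its
own lock state as the argument, while a caller that holds nothing beyond
the pgtable lock would pass a NULL callback.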

> +		else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
> +			ret = VM_FAULT_HWPOISON_LARGE |
> +			    VM_FAULT_SET_HINDEX(hstate_index(h));
>  		goto out_mutex;
> +	}
>  
>  	/*
>  	 * If we are going to COW/unshare the mapping later, we examine the
> -- 
> 2.37.3
> 

-- 
Peter Xu


