linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
@ 2024-05-21  7:34 Oscar Salvador
  2024-05-21  9:56 ` David Hildenbrand
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Oscar Salvador @ 2024-05-21  7:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Muchun Song, Vishal Moola, Oscar Salvador

hugetlb_wp calls vmf_anon_prepare() after having allocated a page, which
means that we might need to call restore_reserve_on_error() upon error.
vmf_anon_prepare() releases the vma lock before returning, but
restore_reserve_on_error() expects the vma lock to be held by the caller.

Fix it by calling vmf_anon_prepare() before allocating the page.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
---
I did not hit this bug, I just spotted this because I was looking at hugetlb_wp
for some other reason. And I did not want to get creative to see if I could
trigger this so I could get a backtrace.
My assumption is that we could trigger this if 1) this was a shared mapping,
so no anon_vma and 2) we call in GUP code with FOLL_WRITE, which would cause
the FLAG_UNSHARE to be passed, so we will end up in hugetlb_wp().

 mm/hugetlb.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6be78e7d4f6e..eb0d8a45505e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6005,6 +6005,15 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
 	 * be acquired again before returning to the caller, as expected.
 	 */
 	spin_unlock(vmf->ptl);
+
+	/*
+	 * When the original hugepage is shared one, it does not have
+	 * anon_vma prepared.
+	 */
+	ret = vmf_anon_prepare(vmf);
+	if (unlikely(ret))
+		goto out_release_old;
+
 	new_folio = alloc_hugetlb_folio(vma, vmf->address, outside_reserve);
 
 	if (IS_ERR(new_folio)) {
@@ -6058,14 +6067,6 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
 		goto out_release_old;
 	}
 
-	/*
-	 * When the original hugepage is shared one, it does not have
-	 * anon_vma prepared.
-	 */
-	ret = vmf_anon_prepare(vmf);
-	if (unlikely(ret))
-		goto out_release_all;
-
 	if (copy_user_large_folio(new_folio, old_folio, vmf->real_address, vma)) {
 		ret = VM_FAULT_HWPOISON_LARGE | VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_release_all;
-- 
2.45.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-21  7:34 [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp Oscar Salvador
@ 2024-05-21  9:56 ` David Hildenbrand
  2024-05-21 10:23   ` Oscar Salvador
  2024-05-27  8:53 ` Oscar Salvador
  2024-06-12 20:27 ` Oscar Salvador
  2 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2024-05-21  9:56 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: linux-kernel, linux-mm, Muchun Song, Vishal Moola

On 21.05.24 09:34, Oscar Salvador wrote:
> hugetlb_wp calls vmf_anon_prepare() after having allocated a page, which
> means that we might need to call restore_reserve_on_error() upon error.
> vmf_anon_prepare() releases the vma lock before returning, but
> restore_reserve_on_error() expects the vma lock to be held by the caller.
> 
> Fix it by calling vmf_anon_prepare() before allocating the page.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
> ---
> I did not hit this bug, I just spotted this because I was looking at hugetlb_wp
> for some other reason. And I did not want to get creative to see if I could
> trigger this so I could get a backtrace.
> My assumption is that we could trigger this if 1) this was a shared mapping,
> so no anon_vma and 2) we call in GUP code with FOLL_WRITE, which would cause
> the FLAG_UNSHARE to be passed, so we will end up in hugetlb_wp().

FOLL_WRITE should never result in FLAG_UNSHARE.

> 
>   mm/hugetlb.c | 17 +++++++++--------
>   1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6be78e7d4f6e..eb0d8a45505e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6005,6 +6005,15 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
>   	 * be acquired again before returning to the caller, as expected.
>   	 */
>   	spin_unlock(vmf->ptl);
> +
> +	/*
> +	 * When the original hugepage is shared one, it does not have
> +	 * anon_vma prepared.
> +	 */
> +	ret = vmf_anon_prepare(vmf);
> +	if (unlikely(ret))
> +		goto out_release_old;
> +
>   	new_folio = alloc_hugetlb_folio(vma, vmf->address, outside_reserve);
>   
>   	if (IS_ERR(new_folio)) {
> @@ -6058,14 +6067,6 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
>   		goto out_release_old;
>   	}
>   
> -	/*
> -	 * When the original hugepage is shared one, it does not have
> -	 * anon_vma prepared.
> -	 */
> -	ret = vmf_anon_prepare(vmf);
> -	if (unlikely(ret))
> -		goto out_release_all;
> -
>   	if (copy_user_large_folio(new_folio, old_folio, vmf->real_address, vma)) {
>   		ret = VM_FAULT_HWPOISON_LARGE | VM_FAULT_SET_HINDEX(hstate_index(h));
>   		goto out_release_all;

The joy of hugetlb reservation code.

LGTM

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-21  9:56 ` David Hildenbrand
@ 2024-05-21 10:23   ` Oscar Salvador
  0 siblings, 0 replies; 7+ messages in thread
From: Oscar Salvador @ 2024-05-21 10:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, linux-kernel, linux-mm, Muchun Song, Vishal Moola

On Tue, May 21, 2024 at 11:56:54AM +0200, David Hildenbrand wrote:
> On 21.05.24 09:34, Oscar Salvador wrote:
> > hugetlb_wp calls vmf_anon_prepare() after having allocated a page, which
> > means that we might need to call restore_reserve_on_error() upon error.
> > vmf_anon_prepare() releases the vma lock before returning, but
> > restore_reserve_on_error() expects the vma lock to be held by the caller.
> > 
> > Fix it by calling vmf_anon_prepare() before allocating the page.
> > 
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
> > ---
> > I did not hit this bug, I just spotted this because I was looking at hugetlb_wp
> > for some other reason. And I did not want to get creative to see if I could
> > trigger this so I could get a backtrace.
> > My assumption is that we could trigger this if 1) this was a shared mapping,
> > so no anon_vma and 2) we call in GUP code with FOLL_WRITE, which would cause
> > the FLAG_UNSHARE to be passed, so we will end up in hugetlb_wp().
> 
> FOLL_WRITE should never result in FLAG_UNSHARE.

You are right.
It was quite early when I looked at this and I managed to confuse myself when
reading hugetlb_follow_page_mask().
 
> The joy of hugetlb reservation code.
> 
> LGTM

thanks David!
 

-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-21  7:34 [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp Oscar Salvador
  2024-05-21  9:56 ` David Hildenbrand
@ 2024-05-27  8:53 ` Oscar Salvador
  2024-05-27 13:17   ` David Hildenbrand
  2024-06-12 20:27 ` Oscar Salvador
  2 siblings, 1 reply; 7+ messages in thread
From: Oscar Salvador @ 2024-05-27  8:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, Muchun Song, Vishal Moola

On Tue, May 21, 2024 at 09:34:46AM +0200, Oscar Salvador wrote:
> I did not hit this bug, I just spotted this because I was looking at hugetlb_wp
> for some other reason. And I did not want to get creative to see if I could
> trigger this so I could get a backtrace.
> My assumption is that we could trigger this if 1) this was a shared mapping,
> so no anon_vma and 2) we call in GUP code with FOLL_WRITE, which would cause
> the FLAG_UNSHARE to be passed, so we will end up in hugetlb_wp().

So I checked this again and I have to confess I am bit confused.

hugetlb_wp() can be called from either hugetlb_fault() or hugetlb_no_page().

hugetlb_fault()->hugetlb_wp() upon FAULT_FLAG_{WRITE,UNSHARE}
hugetlb_no_page->hugetlb_wp()-> upon FAULT_FLAG_WRITE && !VM_SHARED

hugetlb_no_page()->vmf_anon_prepare() upon !VM_SHARED, which means that VM_SHARED
mappings do not have vma->anon_vma, while others do.

hugetlb_wp() will call set_huge_ptep_writable() right away and return if it sees
that the mapping is shared.
So the only other we have to end up in hugetlb_wp() is via FAULT_FLAG_UNSHARE.
For that to happen gup_must_unshare() must return true, which means the following
assumptions must hold.

- For Anonymous pages:
  1) !PageAnonExclusive
- For Filebacked pages:
  2) We do not have a vma
  3) It is a COW mapping

1) If gup_must_unshare() returns true for Anonymous pages because the page is not
   exclusive and must be unshared, hugetlb_wp() will already see the
   vma->anon_prepare being initialized because of the previous
   hugetlb_no_page()->vmf_anon_prepare.

2) I do not quite understand this case.
3) !VMSHARED mappings already had its anon_vma initialized in
   hugetlb_no_page()->vmf_anon_prepare.

Probably I am missing some bits here, but I cannot see how we would even
need vmf_anon_prepare() in hugetlb_wp() in the first place.


-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-27  8:53 ` Oscar Salvador
@ 2024-05-27 13:17   ` David Hildenbrand
  2024-05-27 13:54     ` Oscar Salvador
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2024-05-27 13:17 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: linux-kernel, linux-mm, Muchun Song, Vishal Moola



Am 27.05.24 um 10:53 schrieb Oscar Salvador:
> On Tue, May 21, 2024 at 09:34:46AM +0200, Oscar Salvador wrote:
>> I did not hit this bug, I just spotted this because I was looking at hugetlb_wp
>> for some other reason. And I did not want to get creative to see if I could
>> trigger this so I could get a backtrace.
>> My assumption is that we could trigger this if 1) this was a shared mapping,
>> so no anon_vma and 2) we call in GUP code with FOLL_WRITE, which would cause
>> the FLAG_UNSHARE to be passed, so we will end up in hugetlb_wp().
> 
> So I checked this again and I have to confess I am bit confused.
> 
> hugetlb_wp() can be called from either hugetlb_fault() or hugetlb_no_page().
> 
> hugetlb_fault()->hugetlb_wp() upon FAULT_FLAG_{WRITE,UNSHARE}
> hugetlb_no_page->hugetlb_wp()-> upon FAULT_FLAG_WRITE && !VM_SHARED
> 
> hugetlb_no_page()->vmf_anon_prepare() upon !VM_SHARED, which means that VM_SHARED
> mappings do not have vma->anon_vma, while others do.
> 
> hugetlb_wp() will call set_huge_ptep_writable() right away and return if it sees
> that the mapping is shared.
> So the only other we have to end up in hugetlb_wp() is via FAULT_FLAG_UNSHARE.
> For that to happen gup_must_unshare() must return true, which means the following
> assumptions must hold.
> 
> - For Anonymous pages:
>    1) !PageAnonExclusive
> - For Filebacked pages:
>    2) We do not have a vma
>    3) It is a COW mapping
> 
> 1) If gup_must_unshare() returns true for Anonymous pages because the page is not
>     exclusive and must be unshared, hugetlb_wp() will already see the
>     vma->anon_prepare being initialized because of the previous
>     hugetlb_no_page()->vmf_anon_prepare.
> 
> 2) I do not quite understand this case.

gup_must_unshare() without a VMA is only used for GUP-fast. Before triggering a 
page fault we always fallback to GUP-slow first, where we have a VMA.

IMHO we better not make assumptions that hugetlb_wp() will always already have 
an anon VMA.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-27 13:17   ` David Hildenbrand
@ 2024-05-27 13:54     ` Oscar Salvador
  0 siblings, 0 replies; 7+ messages in thread
From: Oscar Salvador @ 2024-05-27 13:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, linux-kernel, linux-mm, Muchun Song, Vishal Moola

On Mon, May 27, 2024 at 03:17:01PM +0200, David Hildenbrand wrote:
> gup_must_unshare() without a VMA is only used for GUP-fast. Before
> triggering a page fault we always fallback to GUP-slow first, where we have
> a VMA.

I see, thanks for explaining!

> IMHO we better not make assumptions that hugetlb_wp() will always already
> have an anon VMA.

Fine by me, I just wanted to have the picture more clear, because I could not
see how we could end up in such situation, but better be safe and than sorry.

@Andrew: This fix is to avoid an issue like [1], but for hugetlb_wp().

[1] https://lore.kernel.org/linux-mm/000000000000daf1e10615e64dcb@google.com/T/
 

-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp
  2024-05-21  7:34 [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp Oscar Salvador
  2024-05-21  9:56 ` David Hildenbrand
  2024-05-27  8:53 ` Oscar Salvador
@ 2024-06-12 20:27 ` Oscar Salvador
  2 siblings, 0 replies; 7+ messages in thread
From: Oscar Salvador @ 2024-06-12 20:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, Muchun Song, Vishal Moola

On Tue, May 21, 2024 at 09:34:46AM +0200, Oscar Salvador wrote:
> hugetlb_wp calls vmf_anon_prepare() after having allocated a page, which
> means that we might need to call restore_reserve_on_error() upon error.
> vmf_anon_prepare() releases the vma lock before returning, but
> restore_reserve_on_error() expects the vma lock to be held by the caller.
> 
> Fix it by calling vmf_anon_prepare() before allocating the page.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")

Let me just revamp this and run away
 

-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-06-12 20:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-21  7:34 [PATCH] mm/hugetlb: Move vmf_anon_prepare upfront in hugetlb_wp Oscar Salvador
2024-05-21  9:56 ` David Hildenbrand
2024-05-21 10:23   ` Oscar Salvador
2024-05-27  8:53 ` Oscar Salvador
2024-05-27 13:17   ` David Hildenbrand
2024-05-27 13:54     ` Oscar Salvador
2024-06-12 20:27 ` Oscar Salvador

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox