linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Wei Yang <richard.weiyang@gmail.com>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>,
	akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, linux-mm@kvack.org,
	hanchuanhua@oppo.com, v-songbaohua@oppo.com
Subject: Re: [PATCH 1/2] mm: bail out do_swap_page() when no PTE table exist
Date: Sat, 20 Dec 2025 03:24:07 +0000	[thread overview]
Message-ID: <20251220032407.xeszwxus664jp7tq@master> (raw)
In-Reply-To: <4c2027d3-2665-42e0-8cfa-712b1d8f8870@kernel.org>

On Fri, Dec 19, 2025 at 09:42:16AM +0100, David Hildenbrand (Red Hat) wrote:
>On 12/16/25 08:59, Wei Yang wrote:
>> The alloc_swap_folio() function scans the PTE table to determine the
>> potential size (order) of the folio content to be swapped in.
>> 
>> Currently, if the call to pte_offset_map_lock() returns NULL, it
>> indicates that no PTE table exists for that range. Despite this, the
>> code proceeds to allocate an order-0 folio and continues the swap-in
>> process. This is unnecessary if the required table is absent.
>> 
>> This commit modifies the logic to immediately bail out of the swap-in
>> process when the PTE table is missing (i.e., pte_offset_map_lock()
>> returns NULL). This ensures we do not attempt to continue swapping when
>> the page table structure is incomplete or changed, preventing
>> unnecessary work.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Cc: Chuanhua Han <hanchuanhua@oppo.com>
>> Cc: Barry Song <v-songbaohua@oppo.com>
>> ---
>>   mm/memory.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 2a55edc48a65..1b8ef4f0ea60 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4566,7 +4566,7 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
>>   	pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>>   				  vmf->address & PMD_MASK, &ptl);
>>   	if (unlikely(!pte))
>> -		goto fallback;
>> +		return ERR_PTR(-EAGAIN);
>>   	/*
>>   	 * For do_swap_page, find the highest order where the aligned range is
>> @@ -4709,6 +4709,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>>   		    __swap_count(entry) == 1) {
>>   			/* skip swapcache */
>>   			folio = alloc_swap_folio(vmf);
>> +			if (IS_ERR(folio))
>> +				goto out;
>>   			if (folio) {
>>   				__folio_set_locked(folio);
>>   				__folio_set_swapbacked(folio);
>
>How would we be able to even trigger this?

To be honest, I haven't thought about it. Thanks for the question.

>
>Trigger a swap fault with concurrent MADV_DONTNEED and concurrent page table
>reclaim.
>

Let me try to catch up with you.

  swap fault is triggered because user is access this range.
  MADV_DONTNEED is also triggered by user and means this range is not necessary.

So, we don't expect user will do these two contrary behavior at the same time.
This is your point, right?

>Is that really something we should be worrying about?
>

Now I question myself why alloc_anon_folio() need to bail out like this.

page fault vs MADV_DONTNEED is not in concern. So page fault vs khugepaged
is the case? But I see khugepaged would mmap_write_lock(), which prevent page
fault in my mind, before collapsing pmd.

I am curious alloc_anon_folio() tries to handle which case here.

>-- 
>Cheers
>
>David

-- 
Wei Yang
Help you, Help me


  reply	other threads:[~2025-12-20  3:24 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-16  7:59 [PATCH 0/3] mm/memory: align alloc_swap_folio() logic with alloc_anon_folio() Wei Yang
2025-12-16  7:59 ` [PATCH 1/2] mm: bail out do_swap_page() when no PTE table exist Wei Yang
2025-12-19  8:42   ` David Hildenbrand (Red Hat)
2025-12-20  3:24     ` Wei Yang [this message]
2025-12-21  9:40       ` David Hildenbrand (Red Hat)
2025-12-16  7:59 ` [PATCH 2/2] mm: avoid unnecessary PTE table lock during initial swap folio scan Wei Yang
2025-12-19  8:47   ` David Hildenbrand (Red Hat)
2025-12-20  3:36     ` Wei Yang
2025-12-21  9:47       ` David Hildenbrand (Red Hat)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251220032407.xeszwxus664jp7tq@master \
    --to=richard.weiyang@gmail.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@kernel.org \
    --cc=hanchuanhua@oppo.com \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=rppt@kernel.org \
    --cc=surenb@google.com \
    --cc=v-songbaohua@oppo.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox