linux-mm.kvack.org archive mirror
From: Pu Lehui <pulehui@huaweicloud.com>
To: David Hildenbrand <david@redhat.com>,
	lorenzo.stoakes@oracle.com, oleg@redhat.com
Cc: mhiramat@kernel.org, peterz@infradead.org,
	Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	vbabka@suse.cz, jannh@google.com, pfalcato@suse.de,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	pulehui@huawei.com
Subject: Re: [RFC PATCH] mm/mmap: Fix uprobe anon page be overwritten when expanding vma during mremap
Date: Mon, 26 May 2025 22:52:36 +0800
Message-ID: <4cbc1e43-ea46-44de-9e2b-1c62dcd2b6d5@huaweicloud.com>
In-Reply-To: <13c5fe73-9e11-4465-b401-fc96a22dc5d1@redhat.com>


On 2025/5/22 23:14, David Hildenbrand wrote:
> On 22.05.25 16:37, Pu Lehui wrote:
>>
>>
>> On 2025/5/21 18:25, David Hildenbrand wrote:
>>> On 21.05.25 11:25, Pu Lehui wrote:
>>>> From: Pu Lehui <pulehui@huawei.com>
>>>>
>>>> We encountered a BUG alert triggered by Syzkaller as follows:
>>>>      BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1
>>>>
>>>> And we can reproduce it with the following steps:
>>>> 1. register uprobe on file at zero offset
>>>> 2. mmap the file at zero offset:
>>>>      addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0);
>>>
>>> So, here we will install a uprobe.
>>>
>>>> 3. mremap part of vma1 to new vma2:
>>>>      addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE);
>>>
>>> Okay, so we'll essentially move the uprobe as we mremap.
>>>
>>>
>>>> 4. mremap back to orig addr1:
>>>>      mremap(addr2, 4096, 4096, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
>>>
>>> And here, we would expect to move the uprobe again.
>>>
>>>>
>>>> In step 3, the vma1 range [addr1, addr1 + 4096] is remapped to the new
>>>> vma2 with range [addr2, addr2 + 8192], the uprobe anon page is remapped
>>>> from vma1 to vma2, and then the vma1 range [addr1, addr1 + 4096] is
>>>> unmapped.
>>>> In step 4, the vma2 range [addr2, addr2 + 4096] is remapped back to the
>>>> addr range [addr1, addr1 + 4096]. Since the addr range [addr1 + 4096,
>>>> addr1 + 8192] still maps the file, vma_merge_new_range is used to merge
>>>> these two addr ranges, and uprobe_mmap is then called in vma_complete.
>>>> Since the merged vma pgoff is also zero, a uprobe anon page is
>>>> installed in the merged vma.
>>>
>>> Oh, so we're installing the uprobe into the extended VMA before moving
>>> the page tables.
>> Yep!
>>>
>>> Gah.
>>>
>>>> However, the
>>>> upcoming move_page_tables step, which uses set_pte_at to remap the vma2
>>>> uprobe anon page to the merged vma, will overwrite the mapping of the
>>>> old uprobe anon page in the merged vma and leave that page orphaned.
>>>
>>> Right, when moving page tables we don't expect there to already be
>>> something from the uprobe code.
>>>
>>>>
>>>> Since the uprobe anon page will be remapped to the merged vma, we can
>>>> remove the unnecessary uprobe_mmap on the merged vma, that is, do not
>>>> perform uprobe_mmap when there is no vma in the addr range to be
>>>> expanded.
>>>
>>> Hmmm, I'll have to think about other corner cases ....
>>>
>> looking forward to it
> 
> I think, the rule is that we must not install a uprobe for the range 
> that we will be actually moving the page tables for.
> 
> So, for the range we're effectively moving (not the one we're extending).
> 
> Because logically, the uprobe will be already handled by the existing 
> page tables that we're moving.
> 
> For the range we're extending, we must call uprobe handling code ...
> 
> 
> Alternatively, maybe we could call uprobe handling code after moving the 
> page tables. We'd probably find that the uprobe is already installed and 
> do nothing (so the theory :) ). ... if that would simplify anything.
> 

Hi David, Lorenzo, Oleg,

My apologies for the delay. Thanks for your reply.
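
For reference, steps 2-4 of the quoted reproducer can be sketched as a
small userspace helper. This is only a sketch under assumptions: the
function name replay_mremap_sequence() is made up for illustration, the
page size is taken from sysconf() instead of the hard-coded 4096, and
step 1 (registering the uprobe at file offset 0) is assumed to have been
done separately before calling it. Without a registered uprobe the
sequence should simply succeed and do nothing interesting.

```c
#define _GNU_SOURCE		/* for mremap() and the MREMAP_* flags */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Replays steps 2-4 of the quoted reproducer on an already-open file.
 * Hypothetical helper, not part of any patch in this thread.
 * Returns 0 if every call succeeds, -1 otherwise.
 */
static int replay_mremap_sequence(int fd)
{
	long psz = sysconf(_SC_PAGESIZE);
	void *addr1, *addr2, *back;

	/* Step 2: map two pages of the file at offset 0 (vma1). */
	addr1 = mmap(NULL, 2 * psz, PROT_NONE, MAP_PRIVATE, fd, 0);
	if (addr1 == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	/* Step 3: move the first page of vma1 into a new two-page vma2. */
	addr2 = mremap(addr1, psz, 2 * psz, MREMAP_MAYMOVE);
	if (addr2 == MAP_FAILED) {
		perror("mremap (step 3)");
		return -1;
	}

	/*
	 * Step 4: move the first page of vma2 back to addr1, next to the
	 * page of vma1 that still maps the file at addr1 + psz.
	 */
	back = mremap(addr2, psz, psz, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
	if (back == MAP_FAILED) {
		perror("mremap (step 4)");
		return -1;
	}

	/* Clean up: the range at addr1 and the rest of vma2. */
	munmap(addr1, 2 * psz);
	munmap((char *)addr2 + psz, psz);
	return 0;
}
```

The caller is expected to pass a descriptor for a file that is at least
two pages long.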

To make things simpler, perhaps we could try post-processing, that is:

diff --git a/mm/mremap.c b/mm/mremap.c
index 83e359754961..46a757fd26dc 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -240,6 +240,11 @@ static int move_ptes(struct pagetable_move_control *pmc,
                 if (pte_none(ptep_get(old_pte)))
                         continue;

+               /* skip move pte when expanded range has uprobe */
+               if (unlikely(pte_present(*new_pte) &&
+                            vma_has_uprobes(pmc->new, new_addr, new_addr + PAGE_SIZE)))
+                       continue;
+
                 pte = ptep_get_and_clear(mm, old_addr, old_pte);
                 /*
                  * If we are remapping a valid PTE, make sure
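
To illustrate the accounting only, here is a toy userspace model of the
sequence. Everything in it is hypothetical (struct toy_pte, struct
toy_mm and the toy_* helpers are stand-ins, not kernel types or
functions), and it assumes that a skipped source entry is released later
when the old range is unmapped. It is not meant to show that the check
above is correct in the kernel, only why an overwritten destination
entry leaves the anon counter at 1.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_PTES 2

/* Toy stand-ins; none of these are kernel types. */
struct toy_pte { bool present; int page; };	/* page: id of the anon page */
struct toy_mm  { int anon_rss; };		/* models MM_ANONPAGES */

/* Models uprobe code installing an anon page: one more page accounted. */
static void toy_install_uprobe(struct toy_mm *mm, struct toy_pte *pte, int page)
{
	pte->present = true;
	pte->page = page;
	mm->anon_rss++;
}

/*
 * Models moving page table entries from src to dst. With skip_present
 * set, a destination entry that is already populated is left alone and
 * the source entry stays where it is.
 */
static void toy_move_ptes(struct toy_pte *src, struct toy_pte *dst,
			  size_t n, bool skip_present)
{
	for (size_t i = 0; i < n; i++) {
		if (!src[i].present)
			continue;
		if (skip_present && dst[i].present)
			continue;
		dst[i] = src[i];	/* silently overwrites, no accounting */
		src[i].present = false;
	}
}

/* Models unmapping a range: every present entry drops one anon page. */
static void toy_unmap(struct toy_mm *mm, struct toy_pte *ptes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!ptes[i].present)
			continue;
		ptes[i].present = false;
		mm->anon_rss--;
	}
}

/* Runs the whole sequence and returns the anon counter left at the end. */
static int toy_leftover_rss(bool skip_present)
{
	struct toy_mm mm = { 0 };
	struct toy_pte vma2[NR_PTES] = { 0 }, merged[NR_PTES] = { 0 };

	toy_install_uprobe(&mm, &vma2[0], 1);	/* page carried over in step 3 */
	toy_install_uprobe(&mm, &merged[0], 2);	/* page installed at merge time */
	toy_move_ptes(vma2, merged, NR_PTES, skip_present);
	toy_unmap(&mm, vma2, NR_PTES);
	toy_unmap(&mm, merged, NR_PTES);
	return mm.anon_rss;
}
```

In this model toy_leftover_rss(false) leaves one page accounted but no
longer mapped anywhere, while toy_leftover_rss(true) ends at zero.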

What do you think?

Thanks,
Lehui


