linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Jann Horn <jannh@google.com>, "Uschakow, Stanislav" <suschako@amazon.de>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"trix@redhat.com" <trix@redhat.com>,
	"ndesaulniers@google.com" <ndesaulniers@google.com>,
	"nathan@kernel.org" <nathan@kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"muchun.song@linux.dev" <muchun.song@linux.dev>,
	"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>,
	"lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
	"liam.howlett@oracle.com" <liam.howlett@oracle.com>,
	"osalvador@suse.de" <osalvador@suse.de>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: Bug: Performance regression in 1013af4f585f: mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
Date: Thu, 9 Oct 2025 09:40:34 +0200
Message-ID: <c7fc5bd8-a738-4ad4-9c79-57e88e080b93@redhat.com>
In-Reply-To: <CAG48ez2yrEtEUnG15nbK+hern0gL9W-9hTy3fVY+rdz8QBkSNA@mail.gmail.com>

On 01.09.25 12:58, Jann Horn wrote:
> Hi!
> 
> On Fri, Aug 29, 2025 at 4:30 PM Uschakow, Stanislav <suschako@amazon.de> wrote:
>> We have observed a huge latency increase using `fork()` after ingesting the CVE-2025-38085 fix, which leads to the commit `1013af4f585f: mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race`. On large machines with 1.5TB of memory and 196 cores, we found that when a process mmaps 1.2TB of shared memory and forks itself dozens or hundreds of times, execution time increases by a factor of 4. The reproducer is at the end of the email.
> 
> Yeah, every 1G virtual address range you unshare on unmap will do an
> extra synchronous IPI broadcast to all CPU cores, so it's not very
> surprising that doing this would be a bit slow on a machine with 196
> cores.
> 
>> My observation/assumption is:
>>
>> each child touches 100 random pages and despawns
>> on each despawn `huge_pmd_unshare()` is called
>> each call to `huge_pmd_unshare()` synchronizes all threads using `tlb_remove_table_sync_one()` leading to the regression
> 
> Yeah, makes sense that that'd be slow.
> 
> There are probably several ways this could be optimized - like maybe
> changing tlb_remove_table_sync_one() to rely on the MM's cpumask
> (though that would require thinking about whether this interacts with
> remote MM access somehow), or batching the refcount drops for hugetlb
> shared page tables through something like struct mmu_gather, or doing
> something special for the unmap path, or changing the semantics of
> hugetlb page tables such that they can never turn into normal page
> tables again. However, I'm not planning to work on optimizing this.

I'm currently looking at the fix, and what sticks out is "Fix it with an
explicit broadcast IPI through tlb_remove_table_sync_one()".

(I don't understand how the page table can be used for "normal,
non-hugetlb". I can only see how it keeps being used by the remaining
hugetlb users, but that's a different question.)

How does the fix work when an architecture does not issue IPIs for TLB 
shootdown? To handle gup-fast on these architectures, we use RCU.

So I'm wondering whether we use RCU somehow.

But note that in gup_fast_pte_range(), we are validating whether the PMD 
changed:

if (unlikely(pmd_val(pmd) != pmd_val(*pmdp)) ||
     unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) {
	gup_put_folio(folio, 1, flags);
	goto pte_unmap;
}


So in case the page table got reused in the meantime, we should just 
back off and be fine, right?

-- 
Cheers

David / dhildenb



