Re: [PATCH RESEND v3 1/2] mm/tlb: skip redundant IPI when TLB flush already synchronized

From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Lance Yang <lance.yang@linux.dev>
Cc: dave.hansen@intel.com, dave.hansen@linux.intel.com,
	will@kernel.org, aneesh.kumar@kernel.org, npiggin@gmail.com,
	peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, x86@kernel.org, hpa@zytor.com, arnd@arndb.de,
	akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
	ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, shy828301@gmail.com,
	riel@surriel.com, jannh@google.com, linux-arch@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	ioworker0@gmail.com
Subject: Re: [PATCH RESEND v3 1/2] mm/tlb: skip redundant IPI when TLB flush already synchronized
Date: Fri, 9 Jan 2026 15:11:40 +0100
Message-ID: <253140b0-c9b9-4ef0-8b36-af307296519b@kernel.org>
In-Reply-To: <9b1cb571-99df-44f8-8c0e-8e9bc3f6e8d5@linux.dev>

On 1/7/26 07:37, Lance Yang wrote:
> Hi David,
> 
> On 2026/1/7 00:10, Lance Yang wrote:
> [..]
>>> What could work is tracking "tlb_table_flush_sent_ipi" really when we
>>> are flushing the TLB for removed/unshared tables, and maybe resetting
>>> it ... I don't know when, off the top of my head.
>>
> 
> Seems like we could fix the broken flag lifetime (the case where the
> MMU gather gets reused) by splitting the flush from the reset. This
> ensures the flag stays valid between flush and sync.
> 
> Now tlb_flush_unshared_tables() does:
>     1) __tlb_flush_mmu_tlbonly() - flush only, keeps flags alive
>     2) tlb_gather_remove_table_sync_one() - can check the flag
>     3) __tlb_reset_range() - reset everything after sync
> 
> Something like this:
> 
> ---8<---
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 3975f7d11553..a95b054dfcca 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -415,6 +415,7 @@ static inline void __tlb_reset_range(struct mmu_gather *tlb)
>    	tlb->cleared_puds = 0;
>    	tlb->cleared_p4ds = 0;
>    	tlb->unshared_tables = 0;
> +	tlb->tlb_flush_sent_ipi = 0;


As raised, "tlb_flush_sent_ipi" is confusing when we send IPIs to 
different CPUs depending on whether we are removing page tables or not.

I think you would really want to track that explicitly, as 
"tlb_table_flush_sent_ipi"?
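
To illustrate the naming concern, here is a minimal userspace sketch, not
kernel code. Everything in it is hypothetical: struct toy_gather, the
bitmask arguments, and both helpers are made up for the example. It
assumes that a flush which removes page tables sends IPIs to every CPU in
the mask, while a plain flush may skip "lazy" CPUs. Under that
assumption, only the table flush may be recorded as having done the sync
work already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct mmu_gather; NOT the kernel structure. */
struct toy_gather {
	bool freed_tables;          /* page tables were removed/unshared */
	bool table_flush_sent_ipi;  /* hypothetical explicit flag */
};

/*
 * Model a TLB flush. Returns the bitmask of CPUs that got an IPI.
 * Assumption: lazy CPUs are skipped unless page tables were freed.
 */
static uint32_t toy_flush(struct toy_gather *tlb, uint32_t mm_cpus,
			  uint32_t lazy_cpus)
{
	uint32_t targets;

	/* The model treats lazy CPUs as a subset of the mm's CPUs. */
	assert((lazy_cpus & ~mm_cpus) == 0);

	targets = tlb->freed_tables ? mm_cpus : (mm_cpus & ~lazy_cpus);

	/* Only a table flush that actually sent IPIs sets the flag. */
	tlb->table_flush_sent_ipi = tlb->freed_tables && targets != 0;
	return targets;
}

/* A later table sync still needs its own IPI unless the flag is set. */
static bool toy_needs_sync_ipi(const struct toy_gather *tlb)
{
	return !tlb->table_flush_sent_ipi;
}
```

With mm_cpus = 0xF and lazy_cpus = 0x4, a table flush reaches all four
CPUs and sets the flag. A plain flush reaches only 0xB and leaves the
flag clear, so the sync IPI would still be required.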

>    	/*
>    	 * Do not reset mmu_gather::vma_* fields here, we do not
>    	 * call into tlb_start_vma() again to set them if there is an
> @@ -492,7 +493,7 @@ tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
>    	tlb->vma_pfn |= !!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP));
>    }
> 
> -static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
> +static inline void __tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
>    {
>    	/*
>    	 * Anything calling __tlb_adjust_range() also sets at least one of
> @@ -503,6 +504,11 @@ static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
>    		return;
> 
>    	tlb_flush(tlb);
> +}
> +
> +static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
> +{
> +	__tlb_flush_mmu_tlbonly(tlb);
>    	__tlb_reset_range(tlb);
>    }
> 
> @@ -824,7 +830,7 @@ static inline void tlb_flush_unshared_tables(struct mmu_gather *tlb)
>    	 * flush the TLB for the unsharer now.
>    	 */
>    	if (tlb->unshared_tables)
> -		tlb_flush_mmu_tlbonly(tlb);
> +		__tlb_flush_mmu_tlbonly(tlb);
> 
>    	/*
>    	 * Similarly, we must make sure that concurrent GUP-fast will not
> @@ -834,14 +840,16 @@ static inline void tlb_flush_unshared_tables(struct mmu_gather *tlb)
>    	 * We only perform this when we are the last sharer of a page table,
>    	 * as the IPI will reach all CPUs: any GUP-fast.
>    	 *
> -	 * Note that on configs where tlb_remove_table_sync_one() is a NOP,
> -	 * the expectation is that the tlb_flush_mmu_tlbonly() would have issued
> -	 * required IPIs already for us.
> +	 * Use tlb_gather_remove_table_sync_one() instead of
> +	 * tlb_remove_table_sync_one() to skip the redundant IPI if the
> +	 * TLB flush above already sent one.
>    	 */
>    	if (tlb->fully_unshared_tables) {
> -		tlb_remove_table_sync_one();
> +		tlb_gather_remove_table_sync_one(tlb);
>    		tlb->fully_unshared_tables = false;
>    	}
> +
> +	__tlb_reset_range(tlb);
>    }
>    #endif /* CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING */
> ---
> 
> For khugepaged, it should be fine - it uses a local mmu_gather that
> doesn't get reused. The lifetime is simply:
> 
>     tlb_gather_mmu() → flush → sync → tlb_finish_mmu()
> 
> Let me know if this addresses your concern :)

I'll probably have to see the full picture. But this lifetime stuff in 
core-mm ends up getting more complicated than v2 without a clear benefit 
to me (except maybe handling some x86 oddities better ;) )
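
For reference, the flush, sync, reset ordering proposed above can be
modelled in userspace roughly as follows. This is only a sketch under
stated assumptions, not the kernel implementation: struct toy_gather and
all toy_* helpers are hypothetical, and the model assumes every flush is
done with an IPI. It contrasts the proposed ordering with one that resets
the flag before the sync gets to look at it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mmu_gather; every name here is hypothetical. */
struct toy_gather {
	bool need_flush;             /* some range was recorded for flushing */
	bool unshared_tables;
	bool fully_unshared_tables;
	bool flush_sent_ipi;         /* set when the flush used an IPI */
	unsigned int ipis;           /* IPI rounds issued so far */
};

/* Step 1: flush only, leaving all flags in place. */
static void toy_flush_only(struct toy_gather *tlb)
{
	if (!tlb->need_flush)
		return;
	tlb->ipis++;                 /* assumption: the flush is IPI-based */
	tlb->flush_sent_ipi = true;
}

/* Step 2: sync, skipping the IPI when the flush already sent one. */
static void toy_sync_one(struct toy_gather *tlb)
{
	if (!tlb->flush_sent_ipi)
		tlb->ipis++;
}

/* Step 3: reset everything, including the flag. */
static void toy_reset_range(struct toy_gather *tlb)
{
	tlb->need_flush = false;
	tlb->unshared_tables = false;
	tlb->flush_sent_ipi = false;
}

/* Proposed ordering: the flag stays valid between flush and sync. */
static void toy_flush_unshared_tables(struct toy_gather *tlb)
{
	assert(tlb != NULL);
	if (tlb->unshared_tables)
		toy_flush_only(tlb);
	if (tlb->fully_unshared_tables) {
		toy_sync_one(tlb);
		tlb->fully_unshared_tables = false;
	}
	toy_reset_range(tlb);
}

/* Contrast: resetting right after the flush loses the flag. */
static void toy_flush_unshared_tables_reset_early(struct toy_gather *tlb)
{
	assert(tlb != NULL);
	if (tlb->unshared_tables) {
		toy_flush_only(tlb);
		toy_reset_range(tlb);
	}
	if (tlb->fully_unshared_tables) {
		toy_sync_one(tlb);
		tlb->fully_unshared_tables = false;
	}
}
```

In this model the proposed ordering issues one IPI round for a gather
that both flushes and fully unshares, and the early-reset variant issues
two. Because the flag is cleared on every reset, a reused gather that
only needs the sync sends its own IPI instead of trusting a stale flag.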

-- 
Cheers

David

