From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Lance Yang <lance.yang@linux.dev>, dave.hansen@intel.com
Cc: dave.hansen@linux.intel.com, will@kernel.org,
aneesh.kumar@kernel.org, npiggin@gmail.com, peterz@infradead.org,
tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
x86@kernel.org, hpa@zytor.com, arnd@arndb.de,
akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
ziy@nvidia.com, baolin.wang@linux.alibaba.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
dev.jain@arm.com, baohua@kernel.org, shy828301@gmail.com,
riel@surriel.com, jannh@google.com, linux-arch@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
ioworker0@gmail.com
Subject: Re: [PATCH RESEND v3 1/2] mm/tlb: skip redundant IPI when TLB flush already synchronized
Date: Fri, 9 Jan 2026 16:40:19 +0100
Message-ID: <bbfdf226-4660-4949-b17b-0d209ee4ef8c@kernel.org>
In-Reply-To: <f45a1760-7fa6-4e2c-ba5a-90e250a5792a@linux.dev>
On 1/9/26 16:30, Lance Yang wrote:
>
>
> On 2026/1/9 22:13, David Hildenbrand (Red Hat) wrote:
>>
>>>> What could work is tracking "tlb_table_flush_sent_ipi" really when we
>>>> are flushing the TLB for removed/unshared tables, and maybe resetting
>>>> it ... I don't know when from the top of my head.
>>>
>>> Not sure what's the best way forward here :(
>>>
>>>>
>>>> v2 was simpler IMHO.
>>>
>>> The main concern Dave raised was that with PV hypercalls or when
>>> INVLPGB is available, we can't tell from a static check whether IPIs
>>> were actually sent.
>>
>> Why can't we set the boolean at runtime when initializing the pv_ops
>> structure, when we are sure that it is allowed?
>
> Yes, thanks, that sounds like a reasonable trade-off :)
>
> As you mentioned:
>
> "this lifetime stuff in core-mm ends up getting more complicated than
> v2 without a clear benefit".
>
> I totally agree that v3 is too complicated :(
>
> But Dave's concern about v2 was that we can't accurately tell whether
> IPIs were actually sent in PV environments or with INVLPGB, which
> misses optimization opportunities. The INVLPGB+no_global_asid case
> also sends IPIs during TLB flush.
>
> Anyway, yeah, I'd rather start with a simple approach, even if it's
> not perfect. We can always improve it later ;)
>
> Any ideas on how to move forward?
I'd hope Dave can comment :)
In general, I saw the whole thing as a two step process:
1) Avoid IPIs completely when the TLB flush already sent them. We can
   achieve that through v2 or v3, one way or the other; I don't
   particularly care as long as it is clean and simple.
2) For other configs/archs, send IPIs only to CPUs that are actually in
   GUP-fast etc. That would resolve some RT headache with broadcast IPIs.
Regarding 2), it obviously only applies to setups where 1) does not
apply: like x86 with INVLPGB or arm64.
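To make the two steps concrete, here is a minimal userspace sketch of the decision, not kernel code. It assumes a boolean that is set once at init time (as suggested earlier in the thread for the pv_ops setup); tlb_backend_init(), pick_sync_action() and the enum are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* What to do before freeing/unsharing page tables (illustrative only). */
enum sync_action {
	SYNC_SKIP,          /* step 1: the TLB flush already sent IPIs */
	SYNC_TARGETED_IPI,  /* step 2: IPI only CPUs inside GUP-fast etc. */
	SYNC_BROADCAST_IPI, /* fallback: IPI everyone */
};

/* Hypothetical flags, set once when the flush backend is selected. */
static bool tlb_backend_initialized;
static bool tlb_flush_sends_ipis;

/* Hypothetical init hook: record whether flushes are IPI-based. */
static void tlb_backend_init(bool flush_uses_ipis)
{
	tlb_flush_sends_ipis = flush_uses_ipis;
	tlb_backend_initialized = true;
}

static enum sync_action pick_sync_action(bool can_track_gup_fast)
{
	/* The answer is only meaningful once the backend is known. */
	assert(tlb_backend_initialized);

	if (tlb_flush_sends_ipis)
		return SYNC_SKIP;
	if (can_track_gup_fast)
		return SYNC_TARGETED_IPI;
	return SYNC_BROADCAST_IPI;
}
```

A setup that flushes without IPIs (PV hypercalls, INVLPGB, arm64) would leave the boolean false and fall through to step 2.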
I once had the idea of letting CPUs that enter/exit GUP-fast (and
similar) indicate in a global cpumask (or per-CPU variables) that
they are in that context. Then, we can just collect these CPUs and limit
the IPIs to them (usually, not a lot ...).
The trick here is to not slow down GUP-fast too much. And one person
(Yair in RT context) who played with that was not able to reduce the
overhead sufficiently.
I guess the options are:
a) Per-MM CPU mask we have to update atomically when entering/leaving
   GUP-fast
b) Global mask we have to update atomically when entering/leaving GUP-fast
c) Per-CPU variable we have to update when entering/leaving GUP-fast.
Interrupts are disabled, so we don't have to worry about rescheduling etc.
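As a rough illustration of option c), here is a userspace sketch using C11 atomics rather than kernel per-CPU primitives. All names (gup_fast_enter(), gup_fast_exit(), gup_fast_collect_cpus(), the 64-CPU limit) are made up for the example, and the memory ordering a real implementation would need against the page table walk is not worked out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 64	/* illustrative; lets the result fit in one 64-bit mask */

/* One flag per CPU, aligned to 64 bytes so CPUs do not share a cache line. */
struct gup_fast_state {
	_Alignas(64) atomic_bool active;
};

static struct gup_fast_state gup_fast_state[NR_CPUS];

/* Called by a CPU right before it starts a lockless page table walk. */
static void gup_fast_enter(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	/* Each CPU only writes its own flag, so a plain store is enough. */
	atomic_store(&gup_fast_state[cpu].active, true);
}

/* Called by a CPU once it is done with the walk. */
static void gup_fast_exit(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	atomic_store(&gup_fast_state[cpu].active, false);
}

/* Called by the side freeing page tables: which CPUs need an IPI? */
static uint64_t gup_fast_collect_cpus(void)
{
	uint64_t mask = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (atomic_load(&gup_fast_state[cpu].active))
			mask |= UINT64_C(1) << cpu;
	}
	return mask;
}
```

The trade-off versus a) and b) is that entry/exit touches only a CPU-local cache line with no read-modify-write on shared data, while the collecting side has to scan all CPUs.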
Maybe someone reading along has other thoughts.
--
Cheers
David