* [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Piotr Jaroszynski @ 2025-03-04 8:51 UTC
To: Catalin Marinas, Will Deacon, linux-arm-kernel
Cc: Piotr Jaroszynski, Robin Murphy, Alistair Popple,
Raghavendra Rao Ananta, SeongJae Park, Jason Gunthorpe,
John Hubbard, Nicolin Chen, iommu, linux-mm, linux-kernel,
stable
Update the __flush_tlb_range_op macro not to modify its parameters as
these are unexpected semantics. In practice, this fixes the call to
mmu_notifier_arch_invalidate_secondary_tlbs() in
__flush_tlb_range_nosync() to use the correct range instead of an empty
range with start=end. The empty range was (un)lucky as it results in
taking the invalidate-all path that doesn't cause correctness issues,
but can certainly result in suboptimal perf.
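For illustration, a minimal standalone sketch of the failure mode
follows. FLUSH_RANGE, flush_and_notify() and notify_secondary_tlbs()
are hypothetical stand-ins for __flush_tlb_range_op(),
__flush_tlb_range_nosync() and
mmu_notifier_arch_invalidate_secondary_tlbs(); the real code differs,
but the argument-mutation hazard is the same:

  #include <stdio.h>

  #define PAGE_SIZE 4096UL

  /* Stand-in for mmu_notifier_arch_invalidate_secondary_tlbs(). */
  static void notify_secondary_tlbs(unsigned long start, unsigned long end)
  {
      if (start == end)
          printf("empty range -> invalidate-all path taken\n");
      else
          printf("invalidate [%#lx, %#lx)\n", start, end);
  }

  /* Stand-in for __flush_tlb_range_op(): note it advances 'start'. */
  #define FLUSH_RANGE(start, pages)                    \
      do {                                             \
          while ((pages) > 0) {                        \
              /* ...issue a TLBI for 'start'... */     \
              (start) += PAGE_SIZE;                    \
              (pages)--;                               \
          }                                            \
      } while (0)

  /* Stand-in for __flush_tlb_range_nosync(). */
  static void flush_and_notify(unsigned long start, unsigned long end)
  {
      unsigned long pages = (end - start) / PAGE_SIZE;

      FLUSH_RANGE(start, pages);  /* mutates the caller's 'start' */
      /* 'start' now equals 'end', so the notifier sees an empty range. */
      notify_secondary_tlbs(start, end);
  }

  int main(void)
  {
      flush_and_notify(0x10000UL, 0x14000UL);
      return 0;
  }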
This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
invalidate_range() when invalidating TLBs") when the call to the
notifiers was added to __flush_tlb_range(). It predates the addition of
the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
Refactor the core flush algorithm of __flush_tlb_range") that made the
bug hard to spot.
Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Raghavendra Rao Ananta <rananta@google.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Nicolin Chen <nicolinc@nvidia.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux.dev
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
---
arch/arm64/include/asm/tlbflush.h | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bc94e036a26b..8104aee4f9a0 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -396,33 +396,35 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
#define __flush_tlb_range_op(op, start, pages, stride, \
asid, tlb_level, tlbi_user, lpa2) \
do { \
+ typeof(start) __flush_start = start; \
+ typeof(pages) __flush_pages = pages; \
int num = 0; \
int scale = 3; \
int shift = lpa2 ? 16 : PAGE_SHIFT; \
unsigned long addr; \
\
- while (pages > 0) { \
+ while (__flush_pages > 0) { \
if (!system_supports_tlb_range() || \
- pages == 1 || \
- (lpa2 && start != ALIGN(start, SZ_64K))) { \
- addr = __TLBI_VADDR(start, asid); \
+ __flush_pages == 1 || \
+ (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
+ addr = __TLBI_VADDR(__flush_start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
__tlbi_user_level(op, addr, tlb_level); \
- start += stride; \
- pages -= stride >> PAGE_SHIFT; \
+ __flush_start += stride; \
+ __flush_pages -= stride >> PAGE_SHIFT; \
continue; \
} \
\
- num = __TLBI_RANGE_NUM(pages, scale); \
+ num = __TLBI_RANGE_NUM(__flush_pages, scale); \
if (num >= 0) { \
- addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
+ addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
scale, num, tlb_level); \
__tlbi(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
- start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
- pages -= __TLBI_RANGE_PAGES(num, scale); \
+ __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
+ __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
} \
scale--; \
} \
base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
--
2.22.1.7.gac84d6e93c.dirty
* Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Catalin Marinas @ 2025-03-05 18:48 UTC
To: Piotr Jaroszynski
Cc: Will Deacon, linux-arm-kernel, Robin Murphy, Alistair Popple,
Raghavendra Rao Ananta, SeongJae Park, Jason Gunthorpe,
John Hubbard, Nicolin Chen, iommu, linux-mm, linux-kernel,
stable
On Tue, Mar 04, 2025 at 12:51:27AM -0800, Piotr Jaroszynski wrote:
> Update the __flush_tlb_range_op macro not to modify its parameters as
> these are unexpected semantics. In practice, this fixes the call to
> mmu_notifier_arch_invalidate_secondary_tlbs() in
> __flush_tlb_range_nosync() to use the correct range instead of an empty
> range with start=end. The empty range was (un)lucky as it results in
> taking the invalidate-all path that doesn't cause correctness issues,
> but can certainly result in suboptimal perf.
>
> This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> invalidate_range() when invalidating TLBs") when the call to the
> notifiers was added to __flush_tlb_range(). It predates the addition of
> the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> Refactor the core flush algorithm of __flush_tlb_range") that made the
> bug hard to spot.
That's the problem with macros.
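(Illustrative aside: the defensive idiom the patch adopts is to
evaluate each macro argument once into a local shadow and mutate only
the shadows, leaving the caller's variables untouched. A minimal
sketch with simplified names, not the actual kernel macro:

  #define FLUSH_RANGE(start, pages)                        \
      do {                                                 \
          typeof(start) __flush_start = (start);           \
          typeof(pages) __flush_pages = (pages);           \
                                                           \
          while (__flush_pages > 0) {                      \
              /* ...issue a TLBI for '__flush_start'... */ \
              __flush_start += PAGE_SIZE;                  \
              __flush_pages--;                             \
          }                                                \
      } while (0)

Anything the caller computes from 'start' or 'pages' afterwards then
sees the original, unmodified range.)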
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Will, do you want to take this as a fix? It's only a performance
regression, though you never know how it might break the callers of the
macro at some point.
> Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
>
> Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Raghavendra Rao Ananta <rananta@google.com>
> Cc: SeongJae Park <sj@kernel.org>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Nicolin Chen <nicolinc@nvidia.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: iommu@lists.linux.dev
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org
> ---
> arch/arm64/include/asm/tlbflush.h | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index bc94e036a26b..8104aee4f9a0 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -396,33 +396,35 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> #define __flush_tlb_range_op(op, start, pages, stride, \
> asid, tlb_level, tlbi_user, lpa2) \
> do { \
> + typeof(start) __flush_start = start; \
> + typeof(pages) __flush_pages = pages; \
> int num = 0; \
> int scale = 3; \
> int shift = lpa2 ? 16 : PAGE_SHIFT; \
> unsigned long addr; \
> \
> - while (pages > 0) { \
> + while (__flush_pages > 0) { \
> if (!system_supports_tlb_range() || \
> - pages == 1 || \
> - (lpa2 && start != ALIGN(start, SZ_64K))) { \
> - addr = __TLBI_VADDR(start, asid); \
> + __flush_pages == 1 || \
> + (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
> + addr = __TLBI_VADDR(__flush_start, asid); \
> __tlbi_level(op, addr, tlb_level); \
> if (tlbi_user) \
> __tlbi_user_level(op, addr, tlb_level); \
> - start += stride; \
> - pages -= stride >> PAGE_SHIFT; \
> + __flush_start += stride; \
> + __flush_pages -= stride >> PAGE_SHIFT; \
> continue; \
> } \
> \
> - num = __TLBI_RANGE_NUM(pages, scale); \
> + num = __TLBI_RANGE_NUM(__flush_pages, scale); \
> if (num >= 0) { \
> - addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> + addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
> scale, num, tlb_level); \
> __tlbi(r##op, addr); \
> if (tlbi_user) \
> __tlbi_user(r##op, addr); \
> - start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> - pages -= __TLBI_RANGE_PAGES(num, scale); \
> + __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> + __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
> } \
> scale--; \
> } \
>
> base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
> --
> 2.22.1.7.gac84d6e93c.dirty
--
Catalin
* Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Will Deacon @ 2025-03-05 19:35 UTC
To: Catalin Marinas
Cc: Piotr Jaroszynski, linux-arm-kernel, Robin Murphy,
Alistair Popple, Raghavendra Rao Ananta, SeongJae Park,
Jason Gunthorpe, John Hubbard, Nicolin Chen, iommu, linux-mm,
linux-kernel, stable
On Wed, Mar 05, 2025 at 06:48:19PM +0000, Catalin Marinas wrote:
> On Tue, Mar 04, 2025 at 12:51:27AM -0800, Piotr Jaroszynski wrote:
> > Update the __flush_tlb_range_op macro not to modify its parameters as
> > these are unexpected semantics. In practice, this fixes the call to
> > mmu_notifier_arch_invalidate_secondary_tlbs() in
> > __flush_tlb_range_nosync() to use the correct range instead of an empty
> > range with start=end. The empty range was (un)lucky as it results in
> > taking the invalidate-all path that doesn't cause correctness issues,
> > but can certainly result in suboptimal perf.
> >
> > This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> > invalidate_range() when invalidating TLBs") when the call to the
> > notifiers was added to __flush_tlb_range(). It predates the addition of
> > the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> > Refactor the core flush algorithm of __flush_tlb_range") that made the
> > bug hard to spot.
>
> That's the problem with macros.
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>
> Will, do you want to take this as a fix? It's only a performance
> regression, though you never know how it might break the callers of the
> macro at some point.
Yeah, I'll pick it up but I'm travelling atm so it may have to wait until
next week.
Will
* Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Alistair Popple @ 2025-03-05 23:49 UTC
To: Catalin Marinas
Cc: Piotr Jaroszynski, Will Deacon, linux-arm-kernel, Robin Murphy,
Raghavendra Rao Ananta, SeongJae Park, Jason Gunthorpe,
John Hubbard, Nicolin Chen, iommu, linux-mm, linux-kernel,
stable
On Wed, Mar 05, 2025 at 06:48:19PM +0000, Catalin Marinas wrote:
> On Tue, Mar 04, 2025 at 12:51:27AM -0800, Piotr Jaroszynski wrote:
> > Update the __flush_tlb_range_op macro not to modify its parameters as
> > these are unexpected semantics. In practice, this fixes the call to
> > mmu_notifier_arch_invalidate_secondary_tlbs() in
> > __flush_tlb_range_nosync() to use the correct range instead of an empty
> > range with start=end. The empty range was (un)lucky as it results in
> > taking the invalidate-all path that doesn't cause correctness issues,
> > but can certainly result in suboptimal perf.
> >
> > This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> > invalidate_range() when invalidating TLBs") when the call to the
> > notifiers was added to __flush_tlb_range(). It predates the addition of
> > the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> > Refactor the core flush algorithm of __flush_tlb_range") that made the
> > bug hard to spot.
>
> That's the problem with macros.
Yep, that's why I missed it when adding the notifier call. Anyway:
Reviewed-by: Alistair Popple <apopple@nvidia.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>
> Will, do you want to take this as a fix? It's only a performance
> regression, though you never know how it might break the callers of the
> macro at some point.
>
> > Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
> >
> > Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Robin Murphy <robin.murphy@arm.com>
> > Cc: Alistair Popple <apopple@nvidia.com>
> > Cc: Raghavendra Rao Ananta <rananta@google.com>
> > Cc: SeongJae Park <sj@kernel.org>
> > Cc: Jason Gunthorpe <jgg@nvidia.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: Nicolin Chen <nicolinc@nvidia.com>
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: iommu@lists.linux.dev
> > Cc: linux-mm@kvack.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: stable@vger.kernel.org
> > ---
> > arch/arm64/include/asm/tlbflush.h | 22 ++++++++++++----------
> > 1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index bc94e036a26b..8104aee4f9a0 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -396,33 +396,35 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> > #define __flush_tlb_range_op(op, start, pages, stride, \
> > asid, tlb_level, tlbi_user, lpa2) \
> > do { \
> > + typeof(start) __flush_start = start; \
> > + typeof(pages) __flush_pages = pages; \
> > int num = 0; \
> > int scale = 3; \
> > int shift = lpa2 ? 16 : PAGE_SHIFT; \
> > unsigned long addr; \
> > \
> > - while (pages > 0) { \
> > + while (__flush_pages > 0) { \
> > if (!system_supports_tlb_range() || \
> > - pages == 1 || \
> > - (lpa2 && start != ALIGN(start, SZ_64K))) { \
> > - addr = __TLBI_VADDR(start, asid); \
> > + __flush_pages == 1 || \
> > + (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
> > + addr = __TLBI_VADDR(__flush_start, asid); \
> > __tlbi_level(op, addr, tlb_level); \
> > if (tlbi_user) \
> > __tlbi_user_level(op, addr, tlb_level); \
> > - start += stride; \
> > - pages -= stride >> PAGE_SHIFT; \
> > + __flush_start += stride; \
> > + __flush_pages -= stride >> PAGE_SHIFT; \
> > continue; \
> > } \
> > \
> > - num = __TLBI_RANGE_NUM(pages, scale); \
> > + num = __TLBI_RANGE_NUM(__flush_pages, scale); \
> > if (num >= 0) { \
> > - addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> > + addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
> > scale, num, tlb_level); \
> > __tlbi(r##op, addr); \
> > if (tlbi_user) \
> > __tlbi_user(r##op, addr); \
> > - start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> > - pages -= __TLBI_RANGE_PAGES(num, scale); \
> > + __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> > + __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
> > } \
> > scale--; \
> > } \
> >
> > base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
> > --
> > 2.22.1.7.gac84d6e93c.dirty
>
> --
> Catalin
* Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Will Deacon @ 2025-03-11 13:13 UTC
To: Catalin Marinas, linux-arm-kernel, Piotr Jaroszynski
Cc: kernel-team, Will Deacon, Robin Murphy, Alistair Popple,
Raghavendra Rao Ananta, SeongJae Park, John Hubbard,
Nicolin Chen, iommu, linux-mm, linux-kernel, stable,
Jason Gunthorpe
On Tue, 04 Mar 2025 00:51:27 -0800, Piotr Jaroszynski wrote:
> Update the __flush_tlb_range_op macro not to modify its parameters as
> these are unexpected semantics. In practice, this fixes the call to
> mmu_notifier_arch_invalidate_secondary_tlbs() in
> __flush_tlb_range_nosync() to use the correct range instead of an empty
> range with start=end. The empty range was (un)lucky as it results in
> taking the invalidate-all path that doesn't cause correctness issues,
> but can certainly result in suboptimal perf.
>
> [...]
Applied to arm64 (for-next/fixes), thanks!
[1/1] Fix mmu notifiers for range-based invalidates
https://git.kernel.org/arm64/c/f7edb07ad7c6
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
* Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Ivan T. Ivanov @ 2025-03-17 13:07 UTC
To: Piotr Jaroszynski
Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, Robin Murphy,
Alistair Popple, Raghavendra Rao Ananta, SeongJae Park,
Jason Gunthorpe, John Hubbard, Nicolin Chen, iommu, linux-mm,
linux-kernel, stable
Hi,
On 03-04 00:51, Piotr Jaroszynski wrote:
>
> Update the __flush_tlb_range_op macro not to modify its parameters as
> these are unexpected semantics. In practice, this fixes the call to
> mmu_notifier_arch_invalidate_secondary_tlbs() in
> __flush_tlb_range_nosync() to use the correct range instead of an empty
> range with start=end. The empty range was (un)lucky as it results in
> taking the invalidate-all path that doesn't cause correctness issues,
> but can certainly result in suboptimal perf.
>
> This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> invalidate_range() when invalidating TLBs") when the call to the
> notifiers was added to __flush_tlb_range(). It predates the addition of
> the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> Refactor the core flush algorithm of __flush_tlb_range") that made the
> bug hard to spot.
>
> Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
I think that strictly speaking this should be:
Fixes: 360839027a6e ("arm64: tlb: Refactor the core flush algorithm of __flush_tlb_range")
Regards,
Ivan