From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
David Hildenbrand <david@redhat.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Alexandre Ghiti <alexghiti@rivosinc.com>,
Kevin Brodsky <kevin.brodsky@arm.com>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel mappings
Date: Mon, 14 Apr 2025 18:38:19 +0100
Message-ID: <Z_1IC-_Fp-yGLRSc@arm.com>
In-Reply-To: <20250304150444.3788920-12-ryan.roberts@arm.com>
On Tue, Mar 04, 2025 at 03:04:41PM +0000, Ryan Roberts wrote:
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 1898c3069c43..149df945c1ab 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -40,6 +40,55 @@
> #include <linux/sched.h>
> #include <linux/page_table_check.h>
>
> +static inline void emit_pte_barriers(void)
> +{
> +	/*
> +	 * These barriers are emitted under certain conditions after a pte entry
> +	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
> +	 * visible to the table walker. The isb ensures that any previous
> +	 * speculative "invalid translation" marker that is in the CPU's
> +	 * pipeline gets cleared, so that any access to that address after
> +	 * setting the pte to valid won't cause a spurious fault. If the thread
> +	 * gets preempted after storing to the pgtable but before emitting these
> +	 * barriers, __switch_to() emits a dsb which ensure the walker gets to
> +	 * see the store. There is no guarrantee of an isb being issued though.
> +	 * This is safe because it will still get issued (albeit on a
> +	 * potentially different CPU) when the thread starts running again,
> +	 * before any access to the address.
> +	 */
> +	dsb(ishst);
> +	isb();
> +}
> +
> +static inline void queue_pte_barriers(void)
> +{
> +	if (test_thread_flag(TIF_LAZY_MMU))
> +		set_thread_flag(TIF_LAZY_MMU_PENDING);
As we can have lots of calls here, it might be slightly cheaper to test
TIF_LAZY_MMU_PENDING first and skip the set_thread_flag() when it is
already set.
I haven't checked: does the compiler generate multiple mrs from sp_el0
for subsequent test_thread_flag() calls?
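To illustrate the suggestion, here is a rough userspace model, not kernel
code. The flag bits, set_flag() and the counters are hypothetical stand-ins
for the thread-flag helpers, and emit_pte_barriers() only counts calls
instead of issuing dsb/isb. Loading the flags word once into a local also
sidesteps the question of repeated reads, at least in this sketch:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's per-thread TIF_* flag bits. */
#define TIF_LAZY_MMU		(1u << 0)
#define TIF_LAZY_MMU_PENDING	(1u << 1)

static atomic_uint thread_flags;	/* models the current thread's flags word */
static unsigned int nr_flag_writes;	/* atomic read-modify-write operations issued */
static unsigned int nr_barriers;	/* calls that would emit dsb + isb */

static void set_flag(unsigned int flag)
{
	atomic_fetch_or(&thread_flags, flag);
	nr_flag_writes++;
}

static void emit_pte_barriers(void)
{
	nr_barriers++;			/* placeholder for dsb(ishst); isb(); */
}

static void queue_pte_barriers(void)
{
	/* Load the flags word once and test both bits on the local copy. */
	unsigned int flags = atomic_load_explicit(&thread_flags,
						  memory_order_relaxed);

	if (flags & TIF_LAZY_MMU) {
		/* Skip the atomic write when the bit is already set. */
		if (!(flags & TIF_LAZY_MMU_PENDING))
			set_flag(TIF_LAZY_MMU_PENDING);
	} else {
		emit_pte_barriers();
	}
}

/* Usage: returns the number of flag writes for nr_updates lazy-mode updates. */
static unsigned int count_flag_writes(unsigned int nr_updates)
{
	atomic_store(&thread_flags, TIF_LAZY_MMU);
	nr_flag_writes = 0;
	nr_barriers = 0;

	for (unsigned int i = 0; i < nr_updates; i++)
		queue_pte_barriers();

	assert(nr_barriers == 0);	/* nothing emitted while in lazy mode */
	return nr_flag_writes;
}
```

In this model only the first update in a lazy section performs the atomic
write; every later call is a plain load and test.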
> +	else
> +		emit_pte_barriers();
> +}
> +
> +#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
> +static inline void arch_enter_lazy_mmu_mode(void)
> +{
> +	VM_WARN_ON(in_interrupt());
> +	VM_WARN_ON(test_thread_flag(TIF_LAZY_MMU));
> +
> +	set_thread_flag(TIF_LAZY_MMU);
> +}
> +
> +static inline void arch_flush_lazy_mmu_mode(void)
> +{
> +	if (test_and_clear_thread_flag(TIF_LAZY_MMU_PENDING))
> +		emit_pte_barriers();
> +}
> +
> +static inline void arch_leave_lazy_mmu_mode(void)
> +{
> +	arch_flush_lazy_mmu_mode();
> +	clear_thread_flag(TIF_LAZY_MMU);
> +}
> +
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
>
> @@ -323,10 +372,8 @@ static inline void __set_pte_complete(pte_t pte)
>  	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
>  	 * has the necessary barriers.
>  	 */
> -	if (pte_valid_not_user(pte)) {
> -		dsb(ishst);
> -		isb();
> -	}
> +	if (pte_valid_not_user(pte))
> +		queue_pte_barriers();
>  }
I think this scheme works; I couldn't find a counter-example unless
__set_pte() gets called in interrupt context. You could add
VM_WARN_ON(in_interrupt()) to queue_pte_barriers() as well.
With preemption, the newly mapped range shouldn't be used before
arch_flush_lazy_mmu_mode() is called, so that looks safe too. I think
x86 uses a per-CPU variable to track this, but per-thread state is easier
to reason about if there's no nesting.
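As a sanity check of the reasoning above, here is a small userspace sketch of
the per-thread scheme. Everything in it is a model I made up for
illustration: struct thread_model replaces the TIF_* bits, in_irq_model
replaces in_interrupt(), a warning counter replaces VM_WARN_ON(), and
emit_pte_barriers() just counts:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread state, standing in for the two TIF_* flags. */
struct thread_model {
	bool lazy_mmu;
	bool lazy_mmu_pending;
};

static bool in_irq_model;		/* stands in for in_interrupt() */
static unsigned int nr_warnings;	/* stands in for VM_WARN_ON() firing */
static unsigned int nr_barriers;	/* calls that would emit dsb + isb */

static void emit_pte_barriers(void)
{
	nr_barriers++;
}

static void queue_pte_barriers(struct thread_model *t)
{
	/* The extra check suggested above: flag use from interrupt context. */
	if (in_irq_model)
		nr_warnings++;

	if (t->lazy_mmu)
		t->lazy_mmu_pending = true;	/* defer until flush/leave */
	else
		emit_pte_barriers();		/* not batching: emit now */
}

static void flush_lazy_mmu(struct thread_model *t)
{
	if (t->lazy_mmu_pending) {
		t->lazy_mmu_pending = false;
		emit_pte_barriers();
	}
}

static void enter_lazy_mmu(struct thread_model *t)
{
	assert(!t->lazy_mmu);			/* no nesting in this model */
	t->lazy_mmu = true;
}

static void leave_lazy_mmu(struct thread_model *t)
{
	flush_lazy_mmu(t);
	t->lazy_mmu = false;
}
```

Since the state lives in the thread rather than the CPU, a pending flag
follows the thread across a migration and is resolved by that same thread's
flush or leave call, which is what makes the no-nesting case easy to follow.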
>  static inline void __set_pte(pte_t *ptep, pte_t pte)
> @@ -778,10 +825,8 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
>
>  	WRITE_ONCE(*pmdp, pmd);
>
> -	if (pmd_valid(pmd)) {
> -		dsb(ishst);
> -		isb();
> -	}
> +	if (pmd_valid(pmd))
> +		queue_pte_barriers();
>  }
We discussed this on a previous series: for pmd/pud we end up with
barriers even for user mappings, but they are at a much coarser
granularity (and I wasn't keen on 'user' attributes for the table entries).
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>