From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
David Hildenbrand <david@redhat.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Alexandre Ghiti <alexghiti@rivosinc.com>,
Kevin Brodsky <kevin.brodsky@arm.com>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 07/14] arm64/mm: Avoid barriers for invalid or userspace mappings
Date: Sat, 22 Feb 2025 13:17:47 +0000
Message-ID: <Z7nOe78W4JXFAkMb@arm.com>
In-Reply-To: <20250217140809.1702789-8-ryan.roberts@arm.com>
On Mon, Feb 17, 2025 at 02:07:59PM +0000, Ryan Roberts wrote:
> __set_pte_complete(), set_pmd(), set_pud(), set_p4d() and set_pgd() are
> used to write entries into pgtables. And they issue barriers (currently
> dsb and isb) to ensure that the written values are observed by the table
> walker prior to any program-order-future memory access to the mapped
> location.
>
> Over the years some of these functions have received optimizations: In
> particular, commit 7f0b1bf04511 ("arm64: Fix barriers used for page
> table modifications") made it so that the barriers were only emitted for
> valid-kernel mappings for set_pte() (now __set_pte_complete()). And
> commit 0795edaf3f1f ("arm64: pgtable: Implement p[mu]d_valid() and check
> in set_p[mu]d()") made it so that set_pmd()/set_pud() only emitted the
> barriers for valid mappings.
The assumption probably was that set_pmd/pud() are called a lot less
often than set_pte() as they cover larger address ranges (whether table
or leaf).
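For reference, the pattern under discussion looks roughly like this (a
simplified sketch of the arm64 accessors in
arch/arm64/include/asm/pgtable.h; exact details vary across kernel
versions):

	static inline void set_pte(pte_t *ptep, pte_t pte)
	{
		WRITE_ONCE(*ptep, pte);

		/*
		 * Only for valid kernel mappings; otherwise TLB
		 * maintenance or update_mmu_cache() supplies the
		 * necessary barriers.
		 */
		if (pte_valid_not_user(pte)) {
			dsb(ishst);	/* write visible to the walker... */
			isb();		/* ...before later instructions */
		}
	}

	static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
	{
		WRITE_ONCE(*pmdp, pmd);

		if (pmd_valid(pmd)) {	/* any valid entry, user or kernel */
			dsb(ishst);
			isb();
		}
	}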
> set_p4d()/set_pgd() continue to emit the barriers unconditionally.
We probably missed them; they should have been the same as set_pmd().
> This is all very confusing to the casual observer; surely the rules
> should be invariant to the level? Let's change this so that every level
> consistently emits the barriers only when setting valid, non-user
> entries (both table and leaf).
Also see commit d0b7a302d58a ("Revert "arm64: Remove unnecessary ISBs
from set_{pte,pmd,pud}"") for why we added back the ISBs to the pmd/pud
accessors, and the last paragraph there on why we are ok with the
spurious faults for PTEs.
For user mappings, the translation fault is routed through the usual
path that can handle mapping new entries, so I think we are fine. But
it's worth double-checking Will's comment (unless he only referred to
kernel table entries).
> It seems obvious that if it is ok to elide barriers for all but valid
> kernel mappings at pte level, it must also be ok to do this for leaf
> entries at other levels: If setting an entry to invalid, a TLB maintenance
> operation must surely follow to synchronise the TLB and this contains
> the required barriers.
Setting to invalid is fine indeed, handled by the TLB flushing code,
hence the pmd_valid() checks.
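(pmd_valid() only checks the valid bit, user or kernel; roughly, a
sketch of the current arm64 definitions:

	#define pte_valid(pte)	(!!(pte_val(pte) & PTE_VALID))
	#define pmd_valid(pmd)	pte_valid(pmd_pte(pmd))

so it deliberately does not distinguish user from kernel entries the
way pte_valid_not_user() does.)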
> If setting a valid user mapping, the previous
> mapping must have been invalid and there must have been a TLB
> maintenance operation (complete with barriers) to honour
> break-before-make.
That's not entirely true of change_protection(), for example, or of the
fork() path, where we make entries read-only from writeable without
BBM. We could improve these cases as well; I haven't looked in detail.
ptep_modify_prot_commit(), via change_pte_range(), can defer the
barriers to tlb_end_vma(), and something similar applies on the
copy_present_ptes() path.
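To make that concrete, the fork() path does roughly the following
(simplified from copy_present_pte() in mm/memory.c; a sketch, not the
exact current code):

	/*
	 * CoW: write-protect the parent's PTE in place. The entry goes
	 * writeable -> read-only with no break-before-make and no
	 * intervening TLB maintenance, so "the previous mapping must
	 * have been invalid" does not hold on this path.
	 */
	if (is_cow_mapping(vm_flags) && pte_write(pte)) {
		ptep_set_wrprotect(src_mm, addr, src_pte);
		pte = pte_wrprotect(pte);
	}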
> So the worst that can happen is we take an extra
> fault (which will imply the DSB + ISB) and conclude that there is
> nothing to do. These are the arguments for doing this optimization at
> pte level and they also apply to leaf mappings at other levels.
It's worth clarifying Will's comment in the commit I mentioned above.
> For table entries, the same arguments hold: If unsetting a table entry,
> a TLB maintenance operation is required and this will emit the required
> barriers. If setting a
> table entry, the previous value must have been invalid and the table
> walker must already be able to observe that. Additionally the contents
> of the pgtable being pointed to in the newly set entry must be visible
> before the entry is written and this is enforced via smp_wmb() (dmb) in
> the pgtable allocation functions and in __split_huge_pmd_locked(). But
> this last part could never have been enforced by the barriers in
> set_pXd() because they occur after updating the entry. So ultimately,
> the worst that can happen by eliding these barriers for user table
> entries is an extra fault.
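For reference, the ordering in question is the one in pmd_install() in
mm/memory.c, roughly (a sketch, locking elided):

	pgtable_t new = pte_alloc_one(mm);	/* freshly zeroed page */

	/*
	 * Make the zeroed table contents visible to the walker before
	 * the entry that points to them is published. The dsb/isb in
	 * set_pXd() come *after* the store, so they could never order
	 * this.
	 */
	smp_wmb();				/* dmb on arm64 */
	pmd_populate(mm, pmd, new);		/* publish the table entry */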
>
> I observe roughly the same number of page faults (107M) with and without
> this change when compiling the kernel on Apple M2.
That's microarch specific, highly dependent on timing, so you may never
see a difference.
> +static inline bool pmd_valid_not_user(pmd_t pmd)
> +{
> + /*
> + * User-space table entries always have (PXN && !UXN). All other
> + * combinations indicate it's a table entry for kernel space.
> + * Valid-not-user leaf entries follow the same rules as
> + * pte_valid_not_user().
> + */
> + if (pmd_table(pmd))
> + return !((pmd_val(pmd) & (PMD_TABLE_PXN | PMD_TABLE_UXN)) == PMD_TABLE_PXN);
> + return pte_valid_not_user(pmd_pte(pmd));
> +}
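For comparison, pte_valid_not_user() is currently along these lines (a
sketch; the exact definition varies by kernel version):

	#define pte_valid_not_user(pte) \
		((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == \
		 (PTE_VALID | PTE_UXN))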
With the 128-bit format I think we lost the PXN/UXNTable bits, though we
have software bits if we need to. I just wonder whether it's worth the
hassle of skipping some barriers for user non-leaf entries. Did you see
any improvement in practice?
--
Catalin