linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Will Deacon <will@kernel.org>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Alexandre Ghiti <alexghiti@rivosinc.com>,
	Kevin Brodsky <kevin.brodsky@arm.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel mappings
Date: Tue, 15 Apr 2025 11:51:45 +0100	[thread overview]
Message-ID: <Z_46QUFXVI69zRZR@arm.com> (raw)
In-Reply-To: <aabc9fb1-4e74-409a-b25b-8e844e65c502@arm.com>

On Mon, Apr 14, 2025 at 07:28:46PM +0100, Ryan Roberts wrote:
> On 14/04/2025 18:38, Catalin Marinas wrote:
> > On Tue, Mar 04, 2025 at 03:04:41PM +0000, Ryan Roberts wrote:
> >> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> >> index 1898c3069c43..149df945c1ab 100644
> >> --- a/arch/arm64/include/asm/pgtable.h
> >> +++ b/arch/arm64/include/asm/pgtable.h
> >> @@ -40,6 +40,55 @@
> >>  #include <linux/sched.h>
> >>  #include <linux/page_table_check.h>
> >>  
> >> +static inline void emit_pte_barriers(void)
> >> +{
> >> +	/*
> >> +	 * These barriers are emitted under certain conditions after a pte entry
> >> +	 * was modified (see e.g. __set_pte_complete()). The dsb makes the store
> >> +	 * visible to the table walker. The isb ensures that any previous
> >> +	 * speculative "invalid translation" marker that is in the CPU's
> >> +	 * pipeline gets cleared, so that any access to that address after
> >> +	 * setting the pte to valid won't cause a spurious fault. If the thread
> >> +	 * gets preempted after storing to the pgtable but before emitting these
> >> +	 * barriers, __switch_to() emits a dsb which ensure the walker gets to
> >> +	 * see the store. There is no guarrantee of an isb being issued though.
> >> +	 * This is safe because it will still get issued (albeit on a
> >> +	 * potentially different CPU) when the thread starts running again,
> >> +	 * before any access to the address.
> >> +	 */
> >> +	dsb(ishst);
> >> +	isb();
> >> +}
> >> +
> >> +static inline void queue_pte_barriers(void)
> >> +{
> >> +	if (test_thread_flag(TIF_LAZY_MMU))
> >> +		set_thread_flag(TIF_LAZY_MMU_PENDING);
> > 
> > As we can have lots of calls here, it might be slightly cheaper to test
> > TIF_LAZY_MMU_PENDING and avoid setting it unnecessarily.
> 
> Yes, good point.
> 
> > I haven't checked - does the compiler generate multiple mrs from sp_el0
> > for subsequent test_thread_flag()?
> 
> It emits a single mrs but it loads from the pointer twice.

It's not that bad if only do the set_thread_flag() once.

> I think v3 is the version we want?
> 
> 
> void TEST_queue_pte_barriers_v1(void)
> {
> 	if (test_thread_flag(TIF_LAZY_MMU))
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }
> 
> void TEST_queue_pte_barriers_v2(void)
> {
> 	if (test_thread_flag(TIF_LAZY_MMU) &&
> 	    !test_thread_flag(TIF_LAZY_MMU_PENDING))
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }
> 
> void TEST_queue_pte_barriers_v3(void)
> {
> 	unsigned long flags = read_thread_flags();
> 
> 	if ((flags & (_TIF_LAZY_MMU | _TIF_LAZY_MMU_PENDING)) == _TIF_LAZY_MMU)
> 		set_thread_flag(TIF_LAZY_MMU_PENDING);
> 	else
> 		emit_pte_barriers();
> }

Doesn't v3 emit barriers once _TIF_LAZY_MMU_PENDING has been set? We
need something like:

	if (flags & _TIF_LAZY_MMU) {
		if (!(flags & _TIF_LAZY_MMU_PENDING))
			set_thread_flag(TIF_LAZY_MMU_PENDING);
	} else {
		emit_pte_barriers();
	}

-- 
Catalin


  reply	other threads:[~2025-04-15 10:51 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-04 15:04 [PATCH v3 00/11] Perf improvements for hugetlb and vmalloc on arm64 Ryan Roberts
2025-03-04 15:04 ` [PATCH v3 01/11] arm64: hugetlb: Cleanup huge_pte size discovery mechanisms Ryan Roberts
2025-04-03 20:46   ` Catalin Marinas
2025-04-04  3:03   ` Anshuman Khandual
2025-03-04 15:04 ` [PATCH v3 02/11] arm64: hugetlb: Refine tlb maintenance scope Ryan Roberts
2025-04-03 20:47   ` Catalin Marinas
2025-04-04  3:50   ` Anshuman Khandual
2025-03-04 15:04 ` [PATCH v3 03/11] mm/page_table_check: Batch-check pmds/puds just like ptes Ryan Roberts
2025-03-26 14:48   ` Pasha Tatashin
2025-03-26 14:54     ` Ryan Roberts
2025-04-03 20:46   ` Catalin Marinas
2025-03-04 15:04 ` [PATCH v3 04/11] arm64/mm: Refactor __set_ptes() and __ptep_get_and_clear() Ryan Roberts
2025-03-06  5:08   ` kernel test robot
2025-03-06 11:54     ` Ryan Roberts
2025-04-14 16:25   ` Catalin Marinas
2025-03-04 15:04 ` [PATCH v3 05/11] arm64: hugetlb: Use set_ptes_anysz() and ptep_get_and_clear_anysz() Ryan Roberts
2025-03-05 16:00   ` kernel test robot
2025-03-05 16:32     ` Ryan Roberts
2025-04-03 20:47   ` Catalin Marinas
2025-03-04 15:04 ` [PATCH v3 06/11] arm64/mm: Hoist barriers out of set_ptes_anysz() loop Ryan Roberts
2025-04-03 20:46   ` Catalin Marinas
2025-04-04  4:11   ` Anshuman Khandual
2025-03-04 15:04 ` [PATCH v3 07/11] mm/vmalloc: Warn on improper use of vunmap_range() Ryan Roberts
2025-03-27 13:05   ` Uladzislau Rezki
2025-03-04 15:04 ` [PATCH v3 08/11] mm/vmalloc: Gracefully unmap huge ptes Ryan Roberts
2025-03-04 15:04 ` [PATCH v3 09/11] arm64/mm: Support huge pte-mapped pages in vmap Ryan Roberts
2025-03-04 15:04 ` [PATCH v3 10/11] mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes Ryan Roberts
2025-03-27 13:06   ` Uladzislau Rezki
2025-04-03 20:47   ` Catalin Marinas
2025-04-04  4:54   ` Anshuman Khandual
2025-03-04 15:04 ` [PATCH v3 11/11] arm64/mm: Batch barriers when updating kernel mappings Ryan Roberts
2025-04-04  6:02   ` Anshuman Khandual
2025-04-14 17:38   ` Catalin Marinas
2025-04-14 18:28     ` Ryan Roberts
2025-04-15 10:51       ` Catalin Marinas [this message]
2025-04-15 17:28         ` Ryan Roberts
2025-03-27 12:16 ` [PATCH v3 00/11] Perf improvements for hugetlb and vmalloc on arm64 Uladzislau Rezki
2025-03-27 13:46   ` Ryan Roberts
2025-04-14 13:56 ` Ryan Roberts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_46QUFXVI69zRZR@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexghiti@rivosinc.com \
    --cc=anshuman.khandual@arm.com \
    --cc=david@redhat.com \
    --cc=hch@infradead.org \
    --cc=kevin.brodsky@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=ryan.roberts@arm.com \
    --cc=urezki@gmail.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox