From: Will Deacon <will@kernel.org>
To: "Yin, Fengwei" <fengwei.yin@intel.com>
Cc: Barry Song <21cnbao@gmail.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
catalin.marinas@arm.com, akpm@linux-foundation.org,
v-songbaohua@oppo.com, yuzhao@google.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64: mm: drop tlb flush operation when clearing the access bit
Date: Tue, 7 Nov 2023 10:12:21 +0000 [thread overview]
Message-ID: <20231107101221.GB18944@willie-the-truck> (raw)
In-Reply-To: <44e32b0e-0e41-4055-bdb9-15bc7d47197c@intel.com>
On Wed, Oct 25, 2023 at 09:39:19AM +0800, Yin, Fengwei wrote:
>
> >> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> >> index 0bd18de9fd97..2979d796ba9d 100644
> >> --- a/arch/arm64/include/asm/pgtable.h
> >> +++ b/arch/arm64/include/asm/pgtable.h
> >> @@ -905,21 +905,22 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
> >> static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> >> unsigned long address, pte_t *ptep)
> >> {
> >> - int young = ptep_test_and_clear_young(vma, address, ptep);
> >> -
> >> - if (young) {
> >> - /*
> >> - * We can elide the trailing DSB here since the worst that can
> >> - * happen is that a CPU continues to use the young entry in its
> >> - * TLB and we mistakenly reclaim the associated page. The
> >> - * window for such an event is bounded by the next
> >> - * context-switch, which provides a DSB to complete the TLB
> >> - * invalidation.
> >> - */
> >> - flush_tlb_page_nosync(vma, address);
> >> - }
> >> -
> >> - return young;
> >> + /*
> >> + * This comment is borrowed from x86, but applies equally to ARM64:
> >> + *
> >> + * Clearing the accessed bit without a TLB flush doesn't cause
> >> + * data corruption. [ It could cause incorrect page aging and
> >> + * the (mistaken) reclaim of hot pages, but the chance of that
> >> + * should be relatively low. ]
> >> + *
> >> + * So as a performance optimization don't flush the TLB when
> >> + * clearing the accessed bit, it will eventually be flushed by
> >> + * a context switch or a VM operation anyway. [ In the rare
> >> + * event of it not getting flushed for a long time the delay
> >> + * shouldn't really matter because there's no real memory
> >> + * pressure for swapout to react to. ]
> >> + */
> >> + return ptep_test_and_clear_young(vma, address, ptep);
> >> }
> From https://lore.kernel.org/lkml/20181029105515.GD14127@arm.com/:
>
> This is blindly copied from x86 and isn't true for us: we don't invalidate
> the TLB on context switch. That means our window for keeping the stale
> entries around is potentially much bigger and might not be a great idea.
I completely agree.
> My understanding is that arm64 doesn't do invalidate the TLB during
> context switch. The flush_tlb_page_nosync() here + DSB during context
> switch make sure the TLB is invalidated during context switch.
> So we can't remove flush_tlb_page_nosync() here? Or something was changed
> for arm64 (I have zero knowledge to TLB on arm64. So some obvious thing
> may be missed)? Thanks.
As you point out, we already elide the DSB here but I don't think we should
remove the TLB invalidation entirely because then we lose the guarantee
that the update ever becomes visible to the page-table walker.
I'm surprised that the TLBI is showing up as a performance issue without
the DSB present. Is it because we're walking over a large VA range and
invalidating on a per-page basis? If so, we'd be better off batching them
up and doing the invalidation at the end (which will be upgraded to a
full-mm invalidation if the range is large enough).
Will
next prev parent reply other threads:[~2023-11-07 10:12 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-24 12:56 Baolin Wang
2023-10-24 13:48 ` Kefeng Wang
2023-10-25 1:44 ` Baolin Wang
2023-10-24 22:32 ` Yu Zhao
2023-10-24 23:16 ` Barry Song
2023-10-24 23:31 ` Barry Song
2023-10-25 1:07 ` Alistair Popple
2023-10-25 1:44 ` Barry Song
2023-10-25 1:58 ` Alistair Popple
2023-10-25 2:43 ` Baolin Wang
2023-10-25 3:09 ` Alistair Popple
2023-10-25 6:17 ` Yu Zhao
2023-10-25 6:27 ` Barry Song
2023-10-25 10:12 ` Alistair Popple
2023-10-25 18:22 ` Yu Zhao
2023-10-25 23:32 ` Alistair Popple
2023-10-26 23:48 ` Barry Song
2023-10-25 2:02 ` Baolin Wang
2023-10-25 1:39 ` Yin, Fengwei
2023-10-25 3:03 ` Baolin Wang
2023-10-25 3:08 ` Yin, Fengwei
2023-10-25 3:15 ` Baolin Wang
2023-10-25 4:34 ` Barry Song
2023-11-07 10:12 ` Will Deacon [this message]
2023-11-07 20:50 ` Barry Song
2023-10-26 4:55 ` Anshuman Khandual
2023-10-26 5:54 ` Barry Song
2023-10-26 6:01 ` Anshuman Khandual
2023-10-26 12:30 ` Robin Murphy
2023-10-26 12:32 ` Baolin Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231107101221.GB18944@willie-the-truck \
--to=will@kernel.org \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=catalin.marinas@arm.com \
--cc=fengwei.yin@intel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=v-songbaohua@oppo.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox