From: Barry Song <21cnbao@gmail.com>
To: Will Deacon <will@kernel.org>
Cc: "Yin, Fengwei" <fengwei.yin@intel.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
catalin.marinas@arm.com, akpm@linux-foundation.org,
v-songbaohua@oppo.com, yuzhao@google.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64: mm: drop tlb flush operation when clearing the access bit
Date: Wed, 8 Nov 2023 04:50:46 +0800 [thread overview]
Message-ID: <CAGsJ_4z3hTeOnMct4wZwDPQzWRn2h4as=-+82DSTT45RH7dQeA@mail.gmail.com> (raw)
In-Reply-To: <20231107101221.GB18944@willie-the-truck>
On Tue, Nov 7, 2023 at 6:12 PM Will Deacon <will@kernel.org> wrote:
>
> On Wed, Oct 25, 2023 at 09:39:19AM +0800, Yin, Fengwei wrote:
> >
> > >> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > >> index 0bd18de9fd97..2979d796ba9d 100644
> > >> --- a/arch/arm64/include/asm/pgtable.h
> > >> +++ b/arch/arm64/include/asm/pgtable.h
> > >> @@ -905,21 +905,22 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
> > >> static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> > >> unsigned long address, pte_t *ptep)
> > >> {
> > >> - int young = ptep_test_and_clear_young(vma, address, ptep);
> > >> -
> > >> - if (young) {
> > >> - /*
> > >> - * We can elide the trailing DSB here since the worst that can
> > >> - * happen is that a CPU continues to use the young entry in its
> > >> - * TLB and we mistakenly reclaim the associated page. The
> > >> - * window for such an event is bounded by the next
> > >> - * context-switch, which provides a DSB to complete the TLB
> > >> - * invalidation.
> > >> - */
> > >> - flush_tlb_page_nosync(vma, address);
> > >> - }
> > >> -
> > >> - return young;
> > >> + /*
> > >> + * This comment is borrowed from x86, but applies equally to ARM64:
> > >> + *
> > >> + * Clearing the accessed bit without a TLB flush doesn't cause
> > >> + * data corruption. [ It could cause incorrect page aging and
> > >> + * the (mistaken) reclaim of hot pages, but the chance of that
> > >> + * should be relatively low. ]
> > >> + *
> > >> + * So as a performance optimization don't flush the TLB when
> > >> + * clearing the accessed bit, it will eventually be flushed by
> > >> + * a context switch or a VM operation anyway. [ In the rare
> > >> + * event of it not getting flushed for a long time the delay
> > >> + * shouldn't really matter because there's no real memory
> > >> + * pressure for swapout to react to. ]
> > >> + */
> > >> + return ptep_test_and_clear_young(vma, address, ptep);
> > >> }
> > From https://lore.kernel.org/lkml/20181029105515.GD14127@arm.com/:
> >
> > This is blindly copied from x86 and isn't true for us: we don't invalidate
> > the TLB on context switch. That means our window for keeping the stale
> > entries around is potentially much bigger and might not be a great idea.
>
> I completely agree.
>
> > My understanding is that arm64 doesn't do invalidate the TLB during
> > context switch. The flush_tlb_page_nosync() here + DSB during context
> > switch make sure the TLB is invalidated during context switch.
> > So we can't remove flush_tlb_page_nosync() here? Or something was changed
> > for arm64 (I have zero knowledge to TLB on arm64. So some obvious thing
> > may be missed)? Thanks.
>
> As you point out, we already elide the DSB here but I don't think we should
> remove the TLB invalidation entirely because then we lose the guarantee
> that the update ever becomes visible to the page-table walker.
>
> I'm surprised that the TLBI is showing up as a performance issue without
> the DSB present. Is it because we're walking over a large VA range and
> invalidating on a per-page basis? If so, we'd be better off batching them
nop. in lru cases, there are thousands of pages in LRU list. doing vmscan,
we depend on rmap to find their PTEs, then read and clear AF to figure out
if a page is young. So it is not from a big VM area to those pages in this VA
range. There are just too many pages from lots of processes in LRU to be
scanned. The thing is done by rmap.
> up and doing the invalidation at the end (which will be upgraded to a
> full-mm invalidation if the range is large enough).
Those pages in LRU could be from hundreds of different processes, they are
not in just one process. i guess one possibility is that hardware has a limited
tlbi/nosync buffer, once the buffer is full, something similar with dsb will be
done automatically by hardware. So too many tlbi even without dsb can still
harm performance.
>
> Will
Thanks
Barry
next prev parent reply other threads:[~2023-11-07 20:51 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-24 12:56 Baolin Wang
2023-10-24 13:48 ` Kefeng Wang
2023-10-25 1:44 ` Baolin Wang
2023-10-24 22:32 ` Yu Zhao
2023-10-24 23:16 ` Barry Song
2023-10-24 23:31 ` Barry Song
2023-10-25 1:07 ` Alistair Popple
2023-10-25 1:44 ` Barry Song
2023-10-25 1:58 ` Alistair Popple
2023-10-25 2:43 ` Baolin Wang
2023-10-25 3:09 ` Alistair Popple
2023-10-25 6:17 ` Yu Zhao
2023-10-25 6:27 ` Barry Song
2023-10-25 10:12 ` Alistair Popple
2023-10-25 18:22 ` Yu Zhao
2023-10-25 23:32 ` Alistair Popple
2023-10-26 23:48 ` Barry Song
2023-10-25 2:02 ` Baolin Wang
2023-10-25 1:39 ` Yin, Fengwei
2023-10-25 3:03 ` Baolin Wang
2023-10-25 3:08 ` Yin, Fengwei
2023-10-25 3:15 ` Baolin Wang
2023-10-25 4:34 ` Barry Song
2023-11-07 10:12 ` Will Deacon
2023-11-07 20:50 ` Barry Song [this message]
2023-10-26 4:55 ` Anshuman Khandual
2023-10-26 5:54 ` Barry Song
2023-10-26 6:01 ` Anshuman Khandual
2023-10-26 12:30 ` Robin Murphy
2023-10-26 12:32 ` Baolin Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGsJ_4z3hTeOnMct4wZwDPQzWRn2h4as=-+82DSTT45RH7dQeA@mail.gmail.com' \
--to=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=catalin.marinas@arm.com \
--cc=fengwei.yin@intel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=v-songbaohua@oppo.com \
--cc=will@kernel.org \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox