linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Barry Song <21cnbao@gmail.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,  Ard Biesheuvel <ardb@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	 James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	 Zenghui Yu <yuzenghui@huawei.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	 Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	 Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Yu Zhao <yuzhao@google.com>,  Mark Rutland <mark.rutland@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 00/14] Transparent Contiguous PTEs for User Mappings
Date: Mon, 10 Jul 2023 20:05:19 +0800	[thread overview]
Message-ID: <CAGsJ_4x4b5Qe2RNTHyR2MQqSkRxcuchrUgrap9WTaDtuMUttcA@mail.gmail.com> (raw)
In-Reply-To: <20230622144210.2623299-1-ryan.roberts@arm.com>

On Thu, Jun 22, 2023 at 11:00 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Hi All,
>
> This is a series to opportunistically and transparently use contpte mappings
> (set the contiguous bit in ptes) for user memory when those mappings meet the
> requirements. It is part of a wider effort to improve performance of the 4K
> kernel with the aim of approaching the performance of the 16K kernel, but
> without breaking compatibility and without the associated increase in memory. It
> also benefits the 16K and 64K kernels by enabling 2M THP, since this is the
> contpte size for those kernels.
>
> Of course this is only one half of the change. We require the mapped physical
> memory to be the correct size and alignment for this to actually be useful (i.e.
> 64K for 4K pages, or 2M for 16K/64K pages). Fortunately folios are solving this
> problem for us. Filesystems that support it (XFS, AFS, EROFS, tmpfs) will
> allocate large folios up to the PMD size today, and more filesystems are coming.
> And the other half of my work, to enable the use of large folios for anonymous
> memory, aims to make contpte sized folios prevalent for anonymous memory too.
>
>
> Dependencies
> ------------
>
> While there is a complicated set of hard and soft dependencies that this patch
> set depends on, I wanted to split it out as best I could and kick off proper
> review independently.
>
> The series applies on top of these other patch sets, with a tree at:
> https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/contpte-lkml_v1
>
> v6.4-rc6
>   - base
>
> set_ptes()
>   - hard dependency
>   - Patch set from Matthew Wilcox to set multiple ptes with a single API call
>   - Allows arch backend to more optimally apply contpte mappings
>   - https://lore.kernel.org/linux-mm/20230315051444.3229621-1-willy@infradead.org/
>
> ptep_get() pte encapsulation
>   - hard dependency
>   - Enabler series from me to ensure none of the core code ever directly
>     dereferences a pte_t that lies within a live page table.
>   - Enables gathering access/dirty bits from across the whole contpte range
>   - in mm-stable and linux-next at time of writing
>   - https://lore.kernel.org/linux-mm/d38dc237-6093-d4c5-993e-e8ffdd6cb6fa@arm.com/
>
> Report on physically contiguous memory in smaps
>   - soft dependency
>   - Enables visibility on how much memory is physically contiguous and how much
>     is contpte-mapped - useful for debug
>   - https://lore.kernel.org/linux-mm/20230613160950.3554675-1-ryan.roberts@arm.com/
>
> Additionally there are a couple of other dependencies:
>
> anonfolio
>   - soft dependency
>   - ensures more anonymous memory is allocated in contpte-sized folios, so
>     needed to realize the performance improvements (this is the "other half"
>     mentioned above).
>   - RFC: https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
>   - Intending to post v1 shortly.
>
> exefolio
>   - soft dependency
>   - Tweak readahead to ensure executable memory are in 64K-sized folios, so
>     needed to see reduction in iTLB pressure.
>   - Don't intend to post this until we are further down the track with contpte
>     and anonfolio.
>
> Arm ARM Clarification
>   - hard dependency
>   - Current wording disallows the fork() optimization in the final patch.
>   - Arm (ATG) have proposed tightening the wording to permit it.
>   - In conversation with partners to check this wouldn't cause problems for any
>     existing HW deployments
>
> All of the _hard_ dependencies need to be resolved before this can be considered
> for merging.
>
>
> Performance
> -----------
>
> Below results show 2 benchmarks; kernel compilation and speedometer 2.0 (a
> javascript benchmark running in Chromium). Both cases are running on Ampere
> Altra with 1 NUMA node enabled, Ubuntu 22.04 and XFS filesystem. Each benchmark
> is repeated 15 times over 5 reboots and averaged.
>
> All improvements are relative to baseline-4k. anonfolio and exefolio are as
> described above. contpte is this series. (Note that exefolio only gives an
> improvement because contpte is already in place).
>
> Kernel Compilation (smaller is better):
>
> | kernel       |   real-time |   kern-time |   user-time |
> |:-------------|------------:|------------:|------------:|
> | baseline-4k  |        0.0% |        0.0% |        0.0% |
> | anonfolio    |       -5.4% |      -46.0% |       -0.3% |
> | contpte      |       -6.8% |      -45.7% |       -2.1% |
> | exefolio     |       -8.4% |      -46.4% |       -3.7% |

sorry i am a bit confused. in exefolio case, is anonfolio included?
or it only has large cont-pte folios on exe code? in the other words,
Does the 8.4% improvement come from iTLB miss reduction only,
or from both dTLB and iTLB miss reduction?

> | baseline-16k |       -8.7% |      -49.2% |       -3.7% |
> | baseline-64k |      -10.5% |      -66.0% |       -3.5% |
>
> Speedometer 2.0 (bigger is better):
>
> | kernel       |   runs_per_min |
> |:-------------|---------------:|
> | baseline-4k  |           0.0% |
> | anonfolio    |           1.2% |
> | contpte      |           3.1% |
> | exefolio     |           4.2% |

same question as above.

> | baseline-16k |           5.3% |
>
> I've also run Speedometer 2.0 on Pixel 6 with an Ubuntu SW stack and see similar
> gains.
>
> I've also verified that running the contpte changes without anonfolio and
> exefolio does not cause any regression vs baseline-4k.
>
>
> Opens
> -----
>
> The only potential issue that I see right now is that due to there only being 1
> access/dirty bit per contpte range, if a single page in the range is
> accessed/dirtied then all the adjacent pages are reported as accessed/dirtied
> too. Access/dirty is managed by the kernel per _folio_, so this information gets
> collapsed down anyway, and nothing changes there. However, the per _page_
> access/dirty information is reported through pagemap to user space. I'm not sure
> if this would/should be considered a break? Thoughts?
>
> Thanks,
> Ryan

Thanks
Barry


  parent reply	other threads:[~2023-07-10 12:05 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-22 14:41 Ryan Roberts
2023-06-22 14:41 ` [PATCH v1 01/14] arm64/mm: set_pte(): New layer to manage contig bit Ryan Roberts
2023-06-22 14:41 ` [PATCH v1 02/14] arm64/mm: set_ptes()/set_pte_at(): " Ryan Roberts
2023-06-22 14:41 ` [PATCH v1 03/14] arm64/mm: pte_clear(): " Ryan Roberts
2023-06-22 14:41 ` [PATCH v1 04/14] arm64/mm: ptep_get_and_clear(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 05/14] arm64/mm: ptep_test_and_clear_young(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 06/14] arm64/mm: ptep_clear_flush_young(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 07/14] arm64/mm: ptep_set_wrprotect(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 08/14] arm64/mm: ptep_set_access_flags(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 09/14] arm64/mm: ptep_get(): " Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 10/14] arm64/mm: Split __flush_tlb_range() to elide trailing DSB Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 11/14] arm64/mm: Wire up PTE_CONT for user mappings Ryan Roberts
2023-06-30  1:54   ` John Hubbard
2023-07-03  9:48     ` Ryan Roberts
2023-07-03 15:17   ` Catalin Marinas
2023-07-04 11:09     ` Ryan Roberts
2023-07-05 13:13       ` Ryan Roberts
2023-07-16 15:09       ` Catalin Marinas
2023-06-22 14:42 ` [PATCH v1 12/14] arm64/mm: Add ptep_get_and_clear_full() to optimize process teardown Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 13/14] mm: Batch-copy PTE ranges during fork() Ryan Roberts
2023-06-22 14:42 ` [PATCH v1 14/14] arm64/mm: Implement ptep_set_wrprotects() to optimize fork() Ryan Roberts
2023-07-10 12:05 ` Barry Song [this message]
2023-07-10 13:28   ` [PATCH v1 00/14] Transparent Contiguous PTEs for User Mappings Ryan Roberts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGsJ_4x4b5Qe2RNTHyR2MQqSkRxcuchrUgrap9WTaDtuMUttcA@mail.gmail.com \
    --to=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=anshuman.khandual@arm.com \
    --cc=ardb@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=yuzenghui@huawei.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox