linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ryan Roberts <ryan.roberts@arm.com>
To: David Hildenbrand <david@redhat.com>, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>,
	Yin Fengwei <fengwei.yin@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <muchun.song@linux.dev>, Peter Xu <peterx@redhat.com>
Subject: Re: [PATCH RFC 00/39] mm/rmap: interface overhaul
Date: Tue, 5 Dec 2023 12:29:59 +0000	[thread overview]
Message-ID: <33fb5a78-1432-4b43-af62-afeac84cf1ca@arm.com> (raw)
In-Reply-To: <993ea322-8cdb-4ab1-84d3-0a1cb40049c9@arm.com>

On 04/12/2023 19:53, Ryan Roberts wrote:
> On 04/12/2023 14:21, David Hildenbrand wrote:
>> Baed on mm-stable from a couple of days.
>>
>> This series proposes an overhaul to our rmap interface, to get rid of the
>> "bool compound" / RMAP_COMPOUND parameter with the goal of making the
>> interface less error prone, more future proof, and more natural to extend
>> to "batching". Also, this converts the interface to always consume
>> folio+subpage, which speeds up operations on large folios.
>>
>> Further, this series adds PTE-batching variants for 4 rmap functions,
>> whereby only folio_add_anon_rmap_ptes() is used for batching in this series
>> when PTE-remapping a PMD-mapped THP.
> 
> I certainly support the objective you have here; making the interfaces clearer,
> more consistent and more amenable to batching. I'll try to find some time this
> week to review.
> 
>>
>> Ryan has series where we would make use of folio_remove_rmap_ptes() [1]
>> -- he carries his own batching variant right now -- and
>> folio_try_dup_anon_rmap_ptes()/folio_dup_file_rmap_ptes() [2].
> 
> Note that the contpte series at [2] has a new patch in v3 (patch 2), which could
> benefit from folio_remove_rmap_ptes() or equivalent. My plan was to revive [1]
> on top of [2] once it is merged.
> 
>>
>> There is some overlap with both series (and some other work, like
>> multi-size THP [3]), so that will need some coordination, and likely a
>> stepwise inclusion.
> 
> Selfishly, I'd really like to get my stuff merged as soon as there is no
> technical reason not to. I'd prefer not to add this as a dependency if we can
> help it.
> 
>>
>> I got that started [4], but it made sense to show the whole picture. The
>> patches of [4] are contained in here, with one additional patch added
>> ("mm/rmap: introduce and use hugetlb_try_share_anon_rmap()") and some
>> slight patch description changes.
>>
>> In general, RMAP batching is an important optimization for PTE-mapped
>> THP, especially once we want to move towards a total mapcount or further,
>> as shown with my WIP patches on "mapped shared vs. mapped exclusively" [5].
>>
>> The rmap batching part of [5] is also contained here in a slightly reworked
>> fork [and I found a bug du to the "compound" parameter handling in these
>> patches that should be fixed here :) ].
>>
>> This series performs a lot of folio conversion, that could be separated
>> if there is a good reason. Most of the added LOC in the diff are only due
>> to documentation.
>>
>> As we're moving to a pte/pmd interface where we clearly express the
>> mapping granularity we are dealing with, we first get the remainder of
>> hugetlb out of the way, as it is special and expected to remain special: it
>> treats everything as a "single logical PTE" and only currently allows
>> entire mappings.
>>
>> Even if we'd ever support partial mappings, I strongly
>> assume the interface and implementation will still differ heavily:
>> hopefull we can avoid working on subpages/subpage mapcounts completely and
>> only add a "count" parameter for them to enable batching.
>>
>>
>> New (extended) hugetlb interface that operate on entire folio:
>>  * hugetlb_add_new_anon_rmap() -> Already existed
>>  * hugetlb_add_anon_rmap() -> Already existed
>>  * hugetlb_try_dup_anon_rmap()
>>  * hugetlb_try_share_anon_rmap()
>>  * hugetlb_add_file_rmap()
>>  * hugetlb_remove_rmap()
>>
>> New "ordinary" interface for small folios / THP::
>>  * folio_add_new_anon_rmap() -> Already existed
>>  * folio_add_anon_rmap_[pte|ptes|pmd]()
>>  * folio_try_dup_anon_rmap_[pte|ptes|pmd]()
>>  * folio_try_share_anon_rmap_[pte|pmd]()
>>  * folio_add_file_rmap_[pte|ptes|pmd]()
>>  * folio_dup_file_rmap_[pte|ptes|pmd]()
>>  * folio_remove_rmap_[pte|ptes|pmd]()
> 
> I'm not sure if there are official guidelines, but personally if we are
> reworking the API, I'd take the opportunity to move "rmap" to the front of the
> name, rather than having it burried in the middle as it is for some of these:
> 
> rmap_hugetlb_*()
> 
> rmap_folio_*()

In fact, I'd be inclined to drop the "folio" to shorten the name; everything is
a folio, so its not telling us much. e.g.:

New (extended) hugetlb interface that operate on entire folio:
 * rmap_hugetlb_add_new_anon() -> Already existed
 * rmap_hugetlb_add_anon() -> Already existed
 * rmap_hugetlb_try_dup_anon()
 * rmap_hugetlb_try_share_anon()
 * rmap_hugetlb_add_file()
 * rmap_hugetlb_remove()

New "ordinary" interface for small folios / THP::
 * rmap_add_new_anon() -> Already existed
 * rmap_add_anon_[pte|ptes|pmd]()
 * rmap_try_dup_anon_[pte|ptes|pmd]()
 * rmap_try_share_anon_[pte|pmd]()
 * rmap_add_file_[pte|ptes|pmd]()
 * rmap_dup_file_[pte|ptes|pmd]()
 * rmap_remove_[pte|ptes|pmd]()


> 
> I guess reading the patches will tell me, but what's the point of "ptes"? Surely
> you're either mapping at pte or pmd level, and the number of pages is determined
> by the folio size? (or presumably nr param passed in)
> 
> Thanks,
> Ryan
> 
>>
>> folio_add_new_anon_rmap() will always map at the biggest granularity
>> possible (currently, a single PMD to cover a PMD-sized THP). Could be
>> extended if ever required.
>>
>> In the future, we might want "_pud" variants and eventually "_pmds" variants
>> for batching. Further, if hugepd is ever a thing outside hugetlb code,
>> we might want some variants for that. All stuff for the distant future.
>>
>>
>> I ran some simple microbenchmarks from [5] on an Intel(R) Xeon(R) Silver
>> 4210R: munmap(), fork(), cow, MADV_DONTNEED on each PTE ... and PTE
>> remapping PMD-mapped THPs on 1 GiB of memory.
>>
>> For small folios, there is barely a change (< 1 % performance improvement),
>> whereby fork() still stands out with 0.74% performance improvement, but
>> it might be just some noise. Folio optimizations don't help that much
>> with small folios.
>>
>> For PTE-mapped THP:
>> * PTE-remapping a PMD-mapped THP is more than 10% faster.
>>   -> RMAP batching
>> * fork() is more than 4% faster.
>>   -> folio conversion
>> * MADV_DONTNEED is 2% faster
>>   -> folio conversion
>> * COW by writing only a single byte on a COW-shared PTE
>>   -> folio conversion
>> * munmap() is only slightly faster (< 1%).
>>
>> [1] https://lkml.kernel.org/r/20230810103332.3062143-1-ryan.roberts@arm.com
>> [2] https://lkml.kernel.org/r/20231204105440.61448-1-ryan.roberts@arm.com
>> [3] https://lkml.kernel.org/r/20231204102027.57185-1-ryan.roberts@arm.com
>> [4] https://lkml.kernel.org/r/20231128145205.215026-1-david@redhat.com
>> [5] https://lkml.kernel.org/r/20231124132626.235350-1-david@redhat.com
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: Yin Fengwei <fengwei.yin@intel.com>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Muchun Song <muchun.song@linux.dev>
>> Cc: Peter Xu <peterx@redhat.com>
>>
>> David Hildenbrand (39):
>>   mm/rmap: rename hugepage_add* to hugetlb_add*
>>   mm/rmap: introduce and use hugetlb_remove_rmap()
>>   mm/rmap: introduce and use hugetlb_add_file_rmap()
>>   mm/rmap: introduce and use hugetlb_try_dup_anon_rmap()
>>   mm/rmap: introduce and use hugetlb_try_share_anon_rmap()
>>   mm/rmap: add hugetlb sanity checks
>>   mm/rmap: convert folio_add_file_rmap_range() into
>>     folio_add_file_rmap_[pte|ptes|pmd]()
>>   mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()
>>   mm/huge_memory: page_add_file_rmap() -> folio_add_file_rmap_pmd()
>>   mm/migrate: page_add_file_rmap() -> folio_add_file_rmap_pte()
>>   mm/userfaultfd: page_add_file_rmap() -> folio_add_file_rmap_pte()
>>   mm/rmap: remove page_add_file_rmap()
>>   mm/rmap: factor out adding folio mappings into __folio_add_rmap()
>>   mm/rmap: introduce folio_add_anon_rmap_[pte|ptes|pmd]()
>>   mm/huge_memory: batch rmap operations in __split_huge_pmd_locked()
>>   mm/huge_memory: page_add_anon_rmap() -> folio_add_anon_rmap_pmd()
>>   mm/migrate: page_add_anon_rmap() -> folio_add_anon_rmap_pte()
>>   mm/ksm: page_add_anon_rmap() -> folio_add_anon_rmap_pte()
>>   mm/swapfile: page_add_anon_rmap() -> folio_add_anon_rmap_pte()
>>   mm/memory: page_add_anon_rmap() -> folio_add_anon_rmap_pte()
>>   mm/rmap: remove page_add_anon_rmap()
>>   mm/rmap: remove RMAP_COMPOUND
>>   mm/rmap: introduce folio_remove_rmap_[pte|ptes|pmd]()
>>   kernel/events/uprobes: page_remove_rmap() -> folio_remove_rmap_pte()
>>   mm/huge_memory: page_remove_rmap() -> folio_remove_rmap_pmd()
>>   mm/khugepaged: page_remove_rmap() -> folio_remove_rmap_pte()
>>   mm/ksm: page_remove_rmap() -> folio_remove_rmap_pte()
>>   mm/memory: page_remove_rmap() -> folio_remove_rmap_pte()
>>   mm/migrate_device: page_remove_rmap() -> folio_remove_rmap_pte()
>>   mm/rmap: page_remove_rmap() -> folio_remove_rmap_pte()
>>   Documentation: stop referring to page_remove_rmap()
>>   mm/rmap: remove page_remove_rmap()
>>   mm/rmap: convert page_dup_file_rmap() to
>>     folio_dup_file_rmap_[pte|ptes|pmd]()
>>   mm/rmap: introduce folio_try_dup_anon_rmap_[pte|ptes|pmd]()
>>   mm/huge_memory: page_try_dup_anon_rmap() ->
>>     folio_try_dup_anon_rmap_pmd()
>>   mm/memory: page_try_dup_anon_rmap() -> folio_try_dup_anon_rmap_pte()
>>   mm/rmap: remove page_try_dup_anon_rmap()
>>   mm: convert page_try_share_anon_rmap() to
>>     folio_try_share_anon_rmap_[pte|pmd]()
>>   mm/rmap: rename COMPOUND_MAPPED to ENTIRELY_MAPPED
>>
>>  Documentation/mm/transhuge.rst       |   4 +-
>>  Documentation/mm/unevictable-lru.rst |   4 +-
>>  include/linux/mm.h                   |   6 +-
>>  include/linux/rmap.h                 | 380 +++++++++++++++++++-----
>>  kernel/events/uprobes.c              |   2 +-
>>  mm/gup.c                             |   2 +-
>>  mm/huge_memory.c                     |  85 +++---
>>  mm/hugetlb.c                         |  21 +-
>>  mm/internal.h                        |  12 +-
>>  mm/khugepaged.c                      |  17 +-
>>  mm/ksm.c                             |  15 +-
>>  mm/memory-failure.c                  |   4 +-
>>  mm/memory.c                          |  60 ++--
>>  mm/migrate.c                         |  12 +-
>>  mm/migrate_device.c                  |  41 +--
>>  mm/mmu_gather.c                      |   2 +-
>>  mm/rmap.c                            | 422 ++++++++++++++++-----------
>>  mm/swapfile.c                        |   2 +-
>>  mm/userfaultfd.c                     |   2 +-
>>  19 files changed, 709 insertions(+), 384 deletions(-)
>>
> 



      parent reply	other threads:[~2023-12-05 12:30 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 14:21 David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 01/39] mm/rmap: rename hugepage_add* to hugetlb_add* David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 02/39] mm/rmap: introduce and use hugetlb_remove_rmap() David Hildenbrand
2023-12-06  1:22   ` Yin Fengwei
2023-12-06 12:11     ` David Hildenbrand
2023-12-07  0:56       ` Yin Fengwei
2023-12-04 14:21 ` [PATCH RFC 03/39] mm/rmap: introduce and use hugetlb_add_file_rmap() David Hildenbrand
2023-12-06  1:22   ` Yin Fengwei
2023-12-04 14:21 ` [PATCH RFC 04/39] mm/rmap: introduce and use hugetlb_try_dup_anon_rmap() David Hildenbrand
2023-12-06  1:22   ` Yin Fengwei
2023-12-04 14:21 ` [PATCH RFC 05/39] mm/rmap: introduce and use hugetlb_try_share_anon_rmap() David Hildenbrand
2023-12-06  1:23   ` Yin Fengwei
2023-12-04 14:21 ` [PATCH RFC 06/39] mm/rmap: add hugetlb sanity checks David Hildenbrand
2023-12-06  1:23   ` Yin Fengwei
2023-12-04 14:21 ` [PATCH RFC 07/39] mm/rmap: convert folio_add_file_rmap_range() into folio_add_file_rmap_[pte|ptes|pmd]() David Hildenbrand
2023-12-05 12:04   ` Ryan Roberts
2023-12-05 12:25     ` David Hildenbrand
2023-12-06  1:30   ` Yin Fengwei
2023-12-06  9:17     ` David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 08/39] mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]() David Hildenbrand
2023-12-08  1:40   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 09/39] mm/huge_memory: page_add_file_rmap() -> folio_add_file_rmap_pmd() David Hildenbrand
2023-12-08  1:41   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 10/39] mm/migrate: page_add_file_rmap() -> folio_add_file_rmap_pte() David Hildenbrand
2023-12-08  1:42   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 11/39] mm/userfaultfd: " David Hildenbrand
2023-12-08  1:42   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 12/39] mm/rmap: remove page_add_file_rmap() David Hildenbrand
2023-12-08  1:42   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 13/39] mm/rmap: factor out adding folio mappings into __folio_add_rmap() David Hildenbrand
2023-12-08  1:44   ` Yin, Fengwei
2023-12-04 14:21 ` [PATCH RFC 14/39] mm/rmap: introduce folio_add_anon_rmap_[pte|ptes|pmd]() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 15/39] mm/huge_memory: batch rmap operations in __split_huge_pmd_locked() David Hildenbrand
2023-12-05 12:22   ` Ryan Roberts
2023-12-05 12:26     ` David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 16/39] mm/huge_memory: page_add_anon_rmap() -> folio_add_anon_rmap_pmd() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 17/39] mm/migrate: page_add_anon_rmap() -> folio_add_anon_rmap_pte() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 18/39] mm/ksm: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 19/39] mm/swapfile: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 20/39] mm/memory: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 21/39] mm/rmap: remove page_add_anon_rmap() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 22/39] mm/rmap: remove RMAP_COMPOUND David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 23/39] mm/rmap: introduce folio_remove_rmap_[pte|ptes|pmd]() David Hildenbrand
2023-12-05 12:52   ` Ryan Roberts
2023-12-05 13:09     ` David Hildenbrand
2023-12-05 13:37       ` Ryan Roberts
2023-12-04 14:21 ` [PATCH RFC 24/39] kernel/events/uprobes: page_remove_rmap() -> folio_remove_rmap_pte() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 25/39] mm/huge_memory: page_remove_rmap() -> folio_remove_rmap_pmd() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 26/39] mm/khugepaged: page_remove_rmap() -> folio_remove_rmap_pte() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 27/39] mm/ksm: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 28/39] mm/memory: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 29/39] mm/migrate_device: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 30/39] mm/rmap: " David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 31/39] Documentation: stop referring to page_remove_rmap() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 32/39] mm/rmap: remove page_remove_rmap() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 33/39] mm/rmap: convert page_dup_file_rmap() to folio_dup_file_rmap_[pte|ptes|pmd]() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 34/39] mm/rmap: introduce folio_try_dup_anon_rmap_[pte|ptes|pmd]() David Hildenbrand
2023-12-04 17:59   ` David Hildenbrand
2023-12-05 13:11   ` David Hildenbrand
2023-12-05 13:12   ` Ryan Roberts
2023-12-05 13:17     ` David Hildenbrand
2023-12-05 13:18       ` David Hildenbrand
2023-12-05 13:40         ` Ryan Roberts
2023-12-05 13:50           ` David Hildenbrand
2023-12-05 14:02             ` Ryan Roberts
2023-12-05 14:12               ` David Hildenbrand
2023-12-05 13:32       ` David Hildenbrand
2023-12-05 13:40         ` Ryan Roberts
2023-12-04 14:21 ` [PATCH RFC 35/39] mm/huge_memory: page_try_dup_anon_rmap() -> folio_try_dup_anon_rmap_pmd() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 36/39] mm/memory: page_try_dup_anon_rmap() -> folio_try_dup_anon_rmap_pte() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 37/39] mm/rmap: remove page_try_dup_anon_rmap() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 38/39] mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]() David Hildenbrand
2023-12-04 14:21 ` [PATCH RFC 39/39] mm/rmap: rename COMPOUND_MAPPED to ENTIRELY_MAPPED David Hildenbrand
2023-12-04 19:53 ` [PATCH RFC 00/39] mm/rmap: interface overhaul Ryan Roberts
2023-12-05  9:56   ` David Hildenbrand
2023-12-05 13:31     ` Ryan Roberts
2023-12-05 13:39       ` David Hildenbrand
2023-12-05 13:49         ` Ryan Roberts
2023-12-05 13:55           ` David Hildenbrand
2023-12-08 11:24           ` David Hildenbrand
2023-12-08 11:38             ` Ryan Roberts
2023-12-05 12:29   ` Ryan Roberts [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=33fb5a78-1432-4b43-af62-afeac84cf1ca@arm.com \
    --to=ryan.roberts@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=fengwei.yin@intel.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=peterx@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox