Re: [PATCH v5 0/5] variable-order, large folios for anonymous memory


From: "Yin, Fengwei" <fengwei.yin@intel.com>
To: Itaru Kitayama <itaru.kitayama@gmail.com>,
	Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	"Anshuman Khandual" <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v5 0/5] variable-order, large folios for anonymous memory
Date: Wed, 16 Aug 2023 17:25:20 +0800
Message-ID: <3a0ada31-0ec5-4a7a-ab9d-d59c3684b662@intel.com>
In-Reply-To: <EFC00B0B-CB45-40F0-A55E-0F110961A5B9@gmail.com>



On 8/16/2023 4:11 PM, Itaru Kitayama wrote:
> 
> 
>> On Aug 10, 2023, at 23:29, Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> Hi All,
>>
>> This is v5 of a series to implement variable-order, large folios for anonymous
>> memory (currently called "LARGE_ANON_FOLIO", previously called "FLEXIBLE_THP").
>> The objective of this is to improve performance by allocating larger chunks of
>> memory during anonymous page faults:
>>
>> 1) Since SW (the kernel) is dealing with larger chunks of memory than base
>>   pages, there are efficiency savings to be had: fewer page faults, batched PTE
>>   and RMAP manipulation, a shorter LRU list, etc. In short, we reduce kernel
>>   overhead. This should benefit all architectures.
>> 2) Since we are now mapping physically contiguous chunks of memory, we can take
>>   advantage of HW TLB compression techniques. A reduction in TLB pressure
>>   speeds up kernel and user space. arm64 systems have 2 mechanisms to coalesce
>>   TLB entries; "the contiguous bit" (architectural) and HPA (uarch).
>>
>> This patch set deals with the SW side of things (1). (2) is being tackled in a
>> separate series. The new behaviour is hidden behind a new Kconfig switch,
>> LARGE_ANON_FOLIO, which is disabled by default, although the eventual aim is to
>> enable it by default.
>>
>> My hope is that we are pretty much there with the changes at this point;
>> hopefully this is sufficient to get an initial version merged so that we can
>> scale up characterization efforts. The patches should not be merged until the
>> prerequisites are complete, though; these are in progress and tracked at [5].
>>
>> This series is based on mm-unstable (ad3232df3e41).
>>
>> I'm going to be out on holiday from the end of today, returning on 29th
>> August. So responses will likely be patchy, as I'm terrified of posting
>> to list from my phone!
>>
>>
>> Testing
>> -------
>>
>> This version adds patches to mm selftests so that the cow tests explicitly test
>> large anon folios, in the same way that thp is tested. When enabled you should
>> see something similar at the start of the test suite:
>>
>>  # [INFO] detected large anon folio size: 32 KiB
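
As a quick check on the reported size: a folio of order n spans 2^n base
pages, so 32 KiB corresponds to order-3 if base pages are 4 KiB. The page
size is an assumption here, since the cover letter does not state it for this
run. A minimal sketch of the arithmetic, using hypothetical `sketch_` names
that are not part of the series:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages (page shift of 12). Other configs differ. */
#define SKETCH_PAGE_SHIFT 12

/* Size in bytes of a folio of the given order: base page size times 2^order. */
static size_t sketch_folio_size(unsigned int order)
{
	/* Keep the shift well inside the width of size_t. */
	assert(order < 20);
	return (size_t)1 << (SKETCH_PAGE_SHIFT + order);
}
```

Under that assumption, order-3 gives 32768 bytes (32 KiB) and order-4 gives
65536 bytes (64 KiB).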
>>
>> Then the following results are expected. The fails and skips are due to existing
>> issues in mm-unstable:
>>
>>  # Totals: pass:207 fail:16 xfail:0 xpass:0 skip:85 error:0
>>
>> Existing mm selftests reveal 1 regression in khugepaged tests when
>> LARGE_ANON_FOLIO is enabled:
>>
>>  Run test: collapse_max_ptes_none (khugepaged:anon)
>>  Maybe collapse with max_ptes_none exceeded.... Fail
>>  Unexpected huge page
>>
>> I believe this is because khugepaged currently skips non-order-0 pages when
>> looking for collapse opportunities and should get fixed with the help of
>> DavidH's work to create a mechanism to precisely determine shared vs exclusive
>> pages.
>>
>>
>> Changes since v4 [4]
>> --------------------
>>
>>  - Removed "arm64: mm: Override arch_wants_pte_order()" patch; arm64
>>    now uses the default order-3 size. I have moved this patch over to
>>    the contpte series.
>>  - Added "mm: Allow deferred splitting of arbitrary large anon folios" back
>>    into series. I originally removed this at v2 to add to a separate series,
>>    but that series has transformed significantly and it no longer fits, so
>>    bringing it back here.
>>  - Reintroduced dependency on set_ptes(); Originally dropped this at v2, but
>>    set_ptes() is in mm-unstable now.
>>  - Updated policy for when to allocate LAF; only fall back to order-0 if
>>    MADV_NOHUGEPAGE is present or if THP disabled via prctl; no longer rely on
>>    sysfs's never/madvise/always knob.
>>  - Fall back to order-0 whenever uffd is armed for the vma, not just when
>>    uffd-wp is set on the pte.
>>  - alloc_anon_folio() now returns `struct folio *`, where errors are encoded
>>    with ERR_PTR().
>>
>>  The last 3 changes were proposed by Yu Zhao - thanks!
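
For readers unfamiliar with the ERR_PTR() convention mentioned in the last
item: a small negative errno is stored in the pointer value itself, so one
return value can carry either a valid pointer or an error code. The sketch
below is a userspace imitation of that idea, not the kernel's
include/linux/err.h and not the code in this series. Every `sketch_` name,
including the toy allocator and its `simulate_oom` flag, is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Errors occupy the top 4095 values of the address space (kernel convention). */
#define SKETCH_MAX_ERRNO 4095

struct sketch_folio {
	unsigned int order;
};

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer value lies in the reserved error range. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical allocator: returns a folio or an encoded -ENOMEM. */
static struct sketch_folio *sketch_alloc_folio(unsigned int order,
					       int simulate_oom)
{
	struct sketch_folio *folio;

	if (simulate_oom)
		return sketch_err_ptr(-ENOMEM);

	folio = malloc(sizeof(*folio));
	if (!folio)
		return sketch_err_ptr(-ENOMEM);

	folio->order = order;
	return folio;
}
```

A caller checks the result with sketch_is_err() before dereferencing it, and
uses sketch_ptr_err() to get the error code back.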
>>
>>
>> Changes since v3 [3]
>> --------------------
>>
>>  - Renamed feature from FLEXIBLE_THP to LARGE_ANON_FOLIO.
>>  - Removed `flexthp_unhinted_max` boot parameter. Discussion concluded that a
>>    sysctl is preferable but we will wait until real workload needs it.
>>  - Fixed uninitialized `addr` on read fault path in do_anonymous_page().
>>  - Added mm selftests for large anon folios in cow test suite.
>>
>>
>> Changes since v2 [2]
>> --------------------
>>
>>  - Dropped commit "Allow deferred splitting of arbitrary large anon folios"
>>      - Huang, Ying suggested the "batch zap" work (which I dropped from this
>>        series after v1) is a prerequisite for merging FLEXIBLE_THP, so I've
>>        moved the deferred split patch to a separate series along with the batch
>>        zap changes. I plan to submit this series early next week.
>>  - Changed folio order fallback policy
>>      - We no longer iterate from preferred to 0 looking for acceptable policy
>>      - Instead we iterate through preferred, PAGE_ALLOC_COSTLY_ORDER and 0 only
>>  - Removed vma parameter from arch_wants_pte_order()
>>  - Added command line parameter `flexthp_unhinted_max`
>>      - clamps preferred order when vma hasn't explicitly opted-in to THP
>>  - Never allocate large folio for MADV_NOHUGEPAGE vma (or when THP is disabled
>>    for process or system).
>>  - Simplified implementation and integration with do_anonymous_page()
>>  - Removed dependency on set_ptes()
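
The order fallback policy described in this changelog (try the preferred
order, then PAGE_ALLOC_COSTLY_ORDER, then 0) can be sketched roughly as
follows. This is an illustration of the stated policy only, not the code from
the patch. The `sketch_` names are hypothetical, and skipping the costly
order when the preferred order is not above it is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* PAGE_ALLOC_COSTLY_ORDER is 3 in mainline; mirrored here for the sketch. */
#define SKETCH_COSTLY_ORDER 3

/*
 * Fill out[] with the orders to try, highest first and without duplicates.
 * out[] must have room for 3 entries. Returns the number of entries written.
 */
static size_t sketch_candidate_orders(unsigned int preferred,
				      unsigned int out[3])
{
	size_t n = 0;

	assert(out != NULL);

	out[n++] = preferred;
	if (preferred > SKETCH_COSTLY_ORDER)
		out[n++] = SKETCH_COSTLY_ORDER;
	if (preferred > 0)
		out[n++] = 0;

	return n;
}
```

With a preferred order of 4 this yields the sequence 4, 3, 0; with a
preferred order of 3 it yields 3, 0.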
>>
>>
>> Changes since v1 [1]
>> --------------------
>>
>>  - removed changes to arch-dependent vma_alloc_zeroed_movable_folio()
>>  - replaced with arch-independent alloc_anon_folio()
>>      - follows THP allocation approach
>>  - no longer retry with intermediate orders if allocation fails
>>      - fall back directly to order-0
>>  - remove folio_add_new_anon_rmap_range() patch
>>      - instead add its new functionality to folio_add_new_anon_rmap()
>>  - remove batch-zap pte mappings optimization patch
>>      - remove enabler folio_remove_rmap_range() patch too
>>      - These offer real perf improvement so will submit separately
>>  - simplify Kconfig
>>      - single FLEXIBLE_THP option, which is independent of arch
>>      - depends on TRANSPARENT_HUGEPAGE
>>      - when enabled default to max anon folio size of 64K unless arch
>>        explicitly overrides
>>  - simplify changes to do_anonymous_page():
>>      - no more retry loop
>>
>>
>> [1] https://lore.kernel.org/linux-mm/20230626171430.3167004-1-ryan.roberts@arm.com/
>> [2] https://lore.kernel.org/linux-mm/20230703135330.1865927-1-ryan.roberts@arm.com/
>> [3] https://lore.kernel.org/linux-mm/20230714160407.4142030-1-ryan.roberts@arm.com/
>> [4] https://lore.kernel.org/linux-mm/20230726095146.2826796-1-ryan.roberts@arm.com/
>> [5] https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/
>>
>>
>> Thanks,
>> Ryan
>>
>> Ryan Roberts (5):
>>  mm: Allow deferred splitting of arbitrary large anon folios
>>  mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()
>>  mm: LARGE_ANON_FOLIO for improved performance
>>  selftests/mm/cow: Generalize do_run_with_thp() helper
>>  selftests/mm/cow: Add large anon folio tests
>>
>> include/linux/pgtable.h          |  13 ++
>> mm/Kconfig                       |  10 ++
>> mm/memory.c                      | 144 +++++++++++++++++--
>> mm/rmap.c                        |  31 +++--
>> tools/testing/selftests/mm/cow.c | 229 ++++++++++++++++++++++---------
>> 5 files changed, 347 insertions(+), 80 deletions(-)
>>
>> --
>> 2.25.1
>>
> 
> I know Ryan is currently away, but I can't find the base commit mentioned in the cover letter. Can anybody point me to it so I can use b4 to apply the series and test it?
> 
Ryan mentioned: This series is based on mm-unstable (ad3232df3e41).

I believe you can apply the patchset to the latest mm-unstable.


Regards
Yin, Fengwei

> Thanks,
> Itaru.
