linux-mm.kvack.org archive mirror
From: Zi Yan <ziy@nvidia.com>
To: "\"Huang, Ying\"" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"\"Matthew Wilcox (Oracle)\"" <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	"\"Yin, Fengwei\"" <fengwei.yin@intel.com>,
	Yu Zhao <yuzhao@google.com>, Vlastimil Babka <vbabka@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rohan Puri <rohan.puri15@gmail.com>,
	Mcgrof Chamberlain <mcgrof@kernel.org>,
	Adam Manzanares <a.manzanares@samsung.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
Date: Mon, 09 Oct 2023 09:43:38 -0400
Message-ID: <14089E95-251E-43A4-AF32-C9773723C810@nvidia.com>
In-Reply-To: <87a5ssjmld.fsf@yhuang6-desk2.ccr.corp.intel.com>

On 9 Oct 2023, at 3:12, Huang, Ying wrote:

> Hi, Zi,
>
> Thanks for your patch!
>
> Zi Yan <zi.yan@sent.com> writes:
>
>> From: Zi Yan <ziy@nvidia.com>
>>
>> Hi all,
>>
>> This patchset enables >0 order folio memory compaction, which is one of
>> the prerequisites for large folio support[1]. It is on top of
>> mm-everything-2023-09-11-22-56.
>>
>> Overview
>> ===
>>
>> To support >0 order folio compaction, the patchset changes how free pages used
>> for migration are kept during compaction.
>
> migrate_pages() can split a large folio on allocation failure.  So
> the minimal implementation could be
>
> - allow migrating large folios in compaction
> - return -ENOMEM for order > 0 in compaction_alloc()
>
> The performance may not be desirable.  But that may be a baseline for
> further optimization.

I would imagine it might cause a regression since compaction might gradually
split high order folios in the system. But I can move Patch 4 first to make this
the baseline and see how system performance changes.
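
For reference, I read that minimal baseline as roughly the following untested
sketch in compaction_alloc() (returning NULL there is what makes
migrate_pages() see -ENOMEM and fall back to splitting the source folio):

static struct folio *compaction_alloc(struct folio *src, unsigned long data)
{
	struct compact_control *cc = (struct compact_control *)data;

	/*
	 * Minimal baseline: only hand out order-0 free pages and let
	 * migrate_pages() split the large source folio when this fails.
	 */
	if (folio_order(src) > 0)
		return NULL;

	/* ... existing order-0 free page handling unchanged ... */
}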

>
> And, if we can measure the performance for each step of optimization,
> that will be even better.

Do you have any benchmark in mind for the performance tests? vm-scalability?

>
>> Free pages used to be split into order-0 pages that are post-allocation
>> processed (i.e., the PageBuddy flag is cleared, the page order stored in
>> page->private is zeroed, and the page reference count is set to 1). Now all
>> free pages are kept in a MAX_ORDER+1 array of page lists based on their
>> order, without post-allocation processing. When migrate_pages() asks for a
>> new page, one of the free pages, based on the requested page order, is then
>> processed and given out.
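
To make this more concrete, the bookkeeping roughly becomes the following
(a simplified sketch, not the exact patch code; the freepages array lives in
struct compact_control):

/* free pages isolated for compaction, kept unprocessed, one list per order */
struct list_head freepages[MAX_ORDER + 1];

static struct folio *compaction_alloc(struct folio *src, unsigned long data)
{
	struct compact_control *cc = (struct compact_control *)data;
	int order = folio_order(src);
	struct page *page;

	if (list_empty(&cc->freepages[order]))
		return NULL;	/* or try a free page split, see below */

	page = list_first_entry(&cc->freepages[order], struct page, lru);
	list_del(&page->lru);

	/*
	 * Post allocation processing happens here, only for the page that is
	 * actually handed out: clear the stored order in page->private, set
	 * the page reference count to 1, etc.
	 */
	post_alloc_hook(page, order, __GFP_MOVABLE);

	return page_folio(page);
}
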
>>
>>
>> Optimizations
>> ===
>>
>> 1. Free page split is added to increase the migration success rate in case
>> a source page does not have a matching free page in the free page lists.
>> Free page merge is possible but not implemented, since the existing
>> PFN-based buddy page merge algorithm requires the identification of
>> buddy pages, but free pages kept for memory compaction cannot have
>> PageBuddy set, to avoid confusing other PFN scanners.
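
The free page split then works roughly along these lines (again a simplified
sketch with an illustrative function name; the real code also has to keep the
free page counters in sync):

/* No free page at @order: carve one out of the smallest larger free page. */
static struct page *compaction_split_free_page(struct compact_control *cc,
					       int order)
{
	int cur;

	for (cur = order + 1; cur <= MAX_ORDER; cur++) {
		struct page *page;

		if (list_empty(&cc->freepages[cur]))
			continue;

		page = list_first_entry(&cc->freepages[cur], struct page, lru);
		list_del(&page->lru);

		/* Split down to @order, returning unused halves to the lists. */
		while (cur > order) {
			struct page *half;

			cur--;
			half = page + (1 << cur);
			set_page_private(half, cur);	/* remember its order */
			list_add(&half->lru, &cc->freepages[cur]);
		}
		return page;
	}

	return NULL;
}
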
>>
>> 2. Sort source pages in ascending order before migration is added to
>
> Trivial.
>
> s/ascending/descending/
>
>> reduce free page split. Otherwise, high order free pages might be
>> prematurely split, causing undesired high order folio migration failures.
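
The sorting itself is little more than a list_sort() call on the isolated
source folios before they are handed to migrate_pages(); a sketch (with the
s/ascending/descending/ fix above applied):

#include <linux/list_sort.h>

static int compare_folio_order(void *priv, const struct list_head *a,
			       const struct list_head *b)
{
	int order_a = folio_order(list_entry(a, struct folio, lru));
	int order_b = folio_order(list_entry(b, struct folio, lru));

	/* larger folios sort first, so they get matching free pages first */
	return order_b - order_a;
}

/* before handing cc->migratepages to migrate_pages(): */
list_sort(NULL, &cc->migratepages, compare_folio_order);
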
>>
>>
>> TODOs
>> ===
>>
>> 1. Refactor free page post allocation and free page preparation code so
>> that compaction_alloc() and compaction_free() can call functions instead
>> of hard coding.
>>
>> 2. One possible optimization is to allow migrate_pages() to continue
>> even if get_new_folio() returns NULL. In general, that means there is
>> not enough memory. But in the >0 order folio compaction case, it means
>> there is no suitable free page at the source page order. It might be better
>> to skip that page and finish the rest of the migration to achieve a better
>> compaction result.
>
> We can split the source folio if get_new_folio() returns NULL.  So, do
> we really need this?

It depends. The case where it helps is when the system is about to allocate
a high order free page and triggers a compaction: it may be possible to get
the high order free page by migrating a bunch of base pages instead of
splitting an existing high order folio.
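
Roughly, the idea would be the following (pure pseudo-code for the policy,
not how the migrate_pages() loop is actually structured):

struct folio *folio, *next, *dst;

list_for_each_entry_safe(folio, next, &cc->migratepages, lru) {
	dst = compaction_alloc(folio, (unsigned long)cc);
	if (!dst) {
		/*
		 * No free page at this order: skip this source folio and
		 * keep migrating the rest instead of failing the batch.
		 */
		continue;
	}
	/* ... migrate folio into dst ... */
}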

>
> In general, we may reconsider all further optimizations given splitting
> is available already.

In my mind, splits should be avoided as much as possible. But it really depends
on the actual situation, e.g., how much effort and cost the compaction wants
to pay to get memory defragmented. If the system really wants to get a high
order free page at any cost, split can be used without any issue. But applications
might lose performance because their existing large folios are split just to
form a new one.

Like I said in the email, there are tons of optimizations and policies for us
to explore. We can start with the bare minimum support (if no performance
regression is observed, we can even start with splitting all high order folios
like you suggested) and add optimizations one by one.

>
>> 3. Another possible optimization is to enable free page merge. It is
>> possible that a to-be-migrated page causes a free page split and then
>> eventually fails to migrate. We would lose a high order free page without
>> a free page merge function. But a way of identifying free pages for memory
>> compaction is needed to reuse the existing PFN-based buddy page merge.
>>
>> 4. The implemented >0 order folio compaction algorithm is quite naive
>> and does not consider all possible situations. A better algorithm can
>> improve compaction success rate.
>>
>>
>> Feel free to give comments and ask questions.
>>
>> Thanks.
>>
>>
>> [1] https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/
>>
>> Zi Yan (4):
>>   mm/compaction: add support for >0 order folio memory compaction.
>>   mm/compaction: optimize >0 order folio compaction with free page
>>     split.
>>   mm/compaction: optimize >0 order folio compaction by sorting source
>>     pages.
>>   mm/compaction: enable compacting >0 order folios.
>>
>>  mm/compaction.c | 205 +++++++++++++++++++++++++++++++++++++++---------
>>  mm/internal.h   |   7 +-
>>  2 files changed, 176 insertions(+), 36 deletions(-)
>
> --
> Best Regards,
> Huang, Ying


--
Best Regards,
Yan, Zi


Thread overview: 33+ messages
2023-09-12 16:28 Zi Yan
2023-09-12 16:28 ` [RFC PATCH 1/4] mm/compaction: add support for " Zi Yan
2023-09-12 17:32   ` Johannes Weiner
2023-09-12 17:38     ` Zi Yan
2023-09-15  9:33   ` Baolin Wang
2023-09-18 17:06     ` Zi Yan
2023-10-10  8:07   ` Huang, Ying
2023-09-12 16:28 ` [RFC PATCH 2/4] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan
2023-09-18  7:34   ` Baolin Wang
2023-09-18 17:20     ` Zi Yan
2023-09-20  8:15       ` Baolin Wang
2023-09-12 16:28 ` [RFC PATCH 3/4] mm/compaction: optimize >0 order folio compaction by sorting source pages Zi Yan
2023-09-12 17:56   ` Johannes Weiner
2023-09-12 20:31     ` Zi Yan
2023-09-12 16:28 ` [RFC PATCH 4/4] mm/compaction: enable compacting >0 order folios Zi Yan
2023-09-15  9:41   ` Baolin Wang
2023-09-18 17:17     ` Zi Yan
2023-09-20 14:44   ` kernel test robot
2023-09-21  0:55 ` [RFC PATCH 0/4] Enable >0 order folio memory compaction Luis Chamberlain
2023-09-21  1:16   ` Luis Chamberlain
2023-09-21  2:05     ` John Hubbard
2023-09-21  3:14       ` Luis Chamberlain
2023-09-21 15:56         ` Zi Yan
2023-10-02 12:32 ` Ryan Roberts
2023-10-09 13:24   ` Zi Yan
2023-10-09 14:10     ` Ryan Roberts
2023-10-09 15:42       ` Zi Yan
2023-10-09 15:52       ` Zi Yan
2023-10-10 10:00         ` Ryan Roberts
2023-10-09  7:12 ` Huang, Ying
2023-10-09 13:43   ` Zi Yan [this message]
2023-10-10  6:08     ` Huang, Ying
2023-10-10 16:48       ` Zi Yan
