From: "Huang, Ying" <ying.huang@intel.com>
To: Zi Yan <zi.yan@sent.com>
Cc: linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	 Zi Yan <ziy@nvidia.com>,  Ryan Roberts <ryan.roberts@arm.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	 David Hildenbrand <david@redhat.com>,
	 "Yin, Fengwei" <fengwei.yin@intel.com>,
	 Yu Zhao <yuzhao@google.com>,  Vlastimil Babka <vbabka@suse.cz>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	 Baolin Wang <baolin.wang@linux.alibaba.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	 Mel Gorman <mgorman@techsingularity.net>,
	Rohan Puri <rohan.puri15@gmail.com>,
	 Mcgrof Chamberlain <mcgrof@kernel.org>,
	 Adam Manzanares <a.manzanares@samsung.com>,
	 John Hubbard <jhubbard@nvidia.com>
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
Date: Mon, 09 Oct 2023 15:12:30 +0800	[thread overview]
Message-ID: <87a5ssjmld.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <20230912162815.440749-1-zi.yan@sent.com> (Zi Yan's message of "Tue, 12 Sep 2023 12:28:11 -0400")

Hi, Zi,

Thanks for your patch!

Zi Yan <zi.yan@sent.com> writes:

> From: Zi Yan <ziy@nvidia.com>
>
> Hi all,
>
> This patchset enables >0 order folio memory compaction, which is one of
> the prerequisites for large folio support[1]. It is on top of
> mm-everything-2023-09-11-22-56.
>
> Overview
> ===
>
> To support >0 order folio compaction, the patchset changes how free pages used
> for migration are kept during compaction.

migrate_pages() can split a large folio on allocation failure.  So
a minimal implementation could be

- allow migrating large folios in compaction
- return -ENOMEM for order > 0 in compaction_alloc()

The performance may not be desirable, but it could serve as a baseline
for further optimization.

And if we can measure the performance impact of each optimization step,
that would be even better.
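
For concreteness, a minimal sketch of that baseline could look like the
following (this only tweaks the current compaction_alloc() shape in
mm/compaction.c for illustration; it is not the posted patches):

static struct folio *compaction_alloc(struct folio *src, unsigned long data)
{
	struct compact_control *cc = (struct compact_control *)data;
	struct folio *dst;

	/*
	 * Only order-0 free pages are kept on cc->freepages, so refuse
	 * large sources here; migrate_pages() treats the NULL return as
	 * -ENOMEM and can split @src into order-0 pages and retry.
	 */
	if (folio_test_large(src))
		return NULL;

	if (list_empty(&cc->freepages)) {
		isolate_freepages(cc);
		if (list_empty(&cc->freepages))
			return NULL;
	}

	dst = list_entry(cc->freepages.next, struct folio, lru);
	list_del(&dst->lru);
	cc->nr_freepages--;

	return dst;
}

That alone should be enough for correctness; the cost is that every
large source folio ends up being split, which is what the later
optimizations would then try to avoid.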

> Free pages used to be split into
> order-0 pages that were post-allocation processed (i.e., PageBuddy flag
> cleared, page order stored in page->private zeroed, and page reference
> set to 1). Now all free pages are kept in a MAX_ORDER+1 array of page
> lists based on their order, without the post-allocation processing. When
> migrate_pages() asks for a new page, a free page of the requested order
> is processed and handed out.
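
Just to check that I read this right, a sketch of that bookkeeping
(the field layout and helper name below are mine, not necessarily the
patchset's) would be something like:

struct compact_control {
	/* ... existing fields ... */
	struct list_head freepages[MAX_ORDER + 1];	/* isolated free pages, by order */
};

/* Hand out an isolated free page of @order, preparing it only now. */
static struct page *compaction_take_free_page(struct compact_control *cc,
					      int order)
{
	struct page *page;

	if (list_empty(&cc->freepages[order]))
		return NULL;

	page = list_first_entry(&cc->freepages[order], struct page, lru);
	list_del(&page->lru);

	/* Deferred post-allocation processing: clear buddy metadata, set refcount. */
	post_alloc_hook(page, order, __GFP_MOVABLE);
	if (order)
		prep_compound_page(page, order);

	return page;
}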
>
>
> Optimizations
> ===
>
> 1. Free page split is added to increase the migration success rate in
> case a source page has no matching free page in the free page lists.
> Free page merge is possible but not implemented, since the existing
> PFN-based buddy page merge algorithm requires identifying buddy pages,
> and free pages kept for memory compaction cannot have PageBuddy set
> without confusing other PFN scanners.
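
The split fallback could then look roughly like the following (again
only a sketch; it reuses the per-order list from the sketch above, and
the helper name is an assumption), splitting the smallest larger block
and queueing the remainders back:

static struct page *compaction_split_free_page(struct compact_control *cc,
					       int order)
{
	int cur, split;
	struct page *page;

	for (cur = order + 1; cur <= MAX_ORDER; cur++) {
		if (list_empty(&cc->freepages[cur]))
			continue;

		page = list_first_entry(&cc->freepages[cur], struct page, lru);
		list_del(&page->lru);

		/* Peel off buddy-sized tails until an order-@order head is left. */
		for (split = cur - 1; split >= order; split--) {
			struct page *tail = page + (1 << split);

			set_page_private(tail, split);	/* remember its order */
			list_add(&tail->lru, &cc->freepages[split]);
		}
		return page;
	}

	return NULL;	/* no larger free page available either */
}

And because these pages cannot carry PageBuddy, the peeled-off tails
cannot be merged back with the existing buddy code, which is exactly
the limitation described above.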
>
> 2. Sort source pages in ascending order before migration is added to

Trivial.

s/ascending/descending/

> reduce free page split. Otherwise, high order free pages might be
> prematurely split, causing undesired high order folio migration failures.
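
FWIW, the sorting itself could be as simple as a list_sort() of
cc->migratepages before migrate_pages() is called, with higher-order
folios first so they can grab matching free pages before anything gets
split (illustrative only, and assuming largest-first is the intent per
the s/ascending/descending/ above):

#include <linux/list_sort.h>

static int cmp_folio_order(void *priv, const struct list_head *a,
			   const struct list_head *b)
{
	struct folio *fa = list_entry(a, struct folio, lru);
	struct folio *fb = list_entry(b, struct folio, lru);

	/* Positive return sorts @a after @b, i.e. larger folios come first. */
	return folio_order(fb) - folio_order(fa);
}

followed by something like list_sort(NULL, &cc->migratepages,
cmp_folio_order) right before the migrate_pages() call in compact_zone().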
>
>
> TODOs
> ===
>
> 1. Refactor the free page post-allocation and free page preparation code
> so that compaction_alloc() and compaction_free() can call shared helpers
> instead of open-coding the steps.
>
> 2. One possible optimization is to allow migrate_pages() to continue
> even if get_new_folio() returns NULL. In general, that means there is
> not enough memory. But in the >0 order folio compaction case, it means
> there is no suitable free page at the source page order. It might be
> better to skip that page and finish the rest of the migration to achieve
> a better compaction result.

We can split the source folio if get_new_folio() returns NULL.  So do
we really need this?

In general, we may want to reconsider all further optimizations, given
that splitting is already available.

> 3. Another possible optimization is to enable free page merging. It is
> possible that a to-be-migrated page causes a free page split and then
> eventually fails to migrate. Without a free page merge function, we
> would lose a high order free page. But reusing the existing PFN-based
> buddy page merge requires a way of identifying free pages kept for
> memory compaction.
>
> 4. The implemented >0 order folio compaction algorithm is quite naive
> and does not consider all possible situations. A better algorithm can
> improve compaction success rate.
>
>
> Feel free to give comments and ask questions.
>
> Thanks.
>
>
> [1] https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/
>
> Zi Yan (4):
>   mm/compaction: add support for >0 order folio memory compaction.
>   mm/compaction: optimize >0 order folio compaction with free page
>     split.
>   mm/compaction: optimize >0 order folio compaction by sorting source
>     pages.
>   mm/compaction: enable compacting >0 order folios.
>
>  mm/compaction.c | 205 +++++++++++++++++++++++++++++++++++++++---------
>  mm/internal.h   |   7 +-
>  2 files changed, 176 insertions(+), 36 deletions(-)

--
Best Regards,
Huang, Ying


Thread overview: 33+ messages
2023-09-12 16:28 Zi Yan
2023-09-12 16:28 ` [RFC PATCH 1/4] mm/compaction: add support for " Zi Yan
2023-09-12 17:32   ` Johannes Weiner
2023-09-12 17:38     ` Zi Yan
2023-09-15  9:33   ` Baolin Wang
2023-09-18 17:06     ` Zi Yan
2023-10-10  8:07   ` Huang, Ying
2023-09-12 16:28 ` [RFC PATCH 2/4] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan
2023-09-18  7:34   ` Baolin Wang
2023-09-18 17:20     ` Zi Yan
2023-09-20  8:15       ` Baolin Wang
2023-09-12 16:28 ` [RFC PATCH 3/4] mm/compaction: optimize >0 order folio compaction by sorting source pages Zi Yan
2023-09-12 17:56   ` Johannes Weiner
2023-09-12 20:31     ` Zi Yan
2023-09-12 16:28 ` [RFC PATCH 4/4] mm/compaction: enable compacting >0 order folios Zi Yan
2023-09-15  9:41   ` Baolin Wang
2023-09-18 17:17     ` Zi Yan
2023-09-20 14:44   ` kernel test robot
2023-09-21  0:55 ` [RFC PATCH 0/4] Enable >0 order folio memory compaction Luis Chamberlain
2023-09-21  1:16   ` Luis Chamberlain
2023-09-21  2:05     ` John Hubbard
2023-09-21  3:14       ` Luis Chamberlain
2023-09-21 15:56         ` Zi Yan
2023-10-02 12:32 ` Ryan Roberts
2023-10-09 13:24   ` Zi Yan
2023-10-09 14:10     ` Ryan Roberts
2023-10-09 15:42       ` Zi Yan
2023-10-09 15:52       ` Zi Yan
2023-10-10 10:00         ` Ryan Roberts
2023-10-09  7:12 ` Huang, Ying [this message]
2023-10-09 13:43   ` Zi Yan
2023-10-10  6:08     ` Huang, Ying
2023-10-10 16:48       ` Zi Yan
