Re: [RFC PATCH 0/1] Buddy allocator like folio split

From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Hugh Dickins <hughd@google.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Yang Shi <yang@os.amperecomputing.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Yu Zhao <yuzhao@google.com>, John Hubbard <jhubbard@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/1] Buddy allocator like folio split
Date: Fri, 18 Oct 2024 15:10:21 -0400	[thread overview]
Message-ID: <9A314663-43F1-49B5-9225-0E326A4DB315@nvidia.com> (raw)
In-Reply-To: <7ec81ff8-5645-42a1-a048-c8700aff07fa@redhat.com>

On 18 Oct 2024, at 14:42, David Hildenbrand wrote:

> On 09.10.24 00:37, Zi Yan wrote:
>> Hi all,
>
> Hi!
>
>>
>> Matthew and I have discussed a different way of splitting large
>> folios. Instead of splitting one folio uniformly into smaller folios of
>> the same order, a buddy allocator like split can reduce the total number
>> of resulting folios and the amount of memory needed for multi-index
>> xarray split, and keep more large folios after a split. In addition, both
>> Hugh[1] and Ryan[2] had similar suggestions before.
>>
>> The patch is an initial implementation. It passes simple order-9 to
>> lower order split tests for anonymous folios and pagecache folios.
>> There are still a lot of TODOs before it is ready for upstream. But I
>> would like to gather feedback before that.
>
> Interesting, but I don't see any actual users besides the debug/test interface wired up.

Right. I am working on that now. The two potential users, anon large folios
and truncate, might need a more sophisticated implementation to take full
advantage of this new split.

For anon large folios (this might be open to debate), if only a subset of
orders is enabled, I assume folio_split() can only split to smaller
folios of the enabled orders. For example, to get one order-0 out of
an order-9 when only order-4 (64KB on x86) is enabled, folio_split()
can only split the order-9 into 16 order-0s and 31 order-4s, unless we are
OK with anon large folios of non-enabled orders appearing in the system.
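
The arithmetic above can be checked with a small userspace model. This is
only a sketch: the helper names are hypothetical and are not kernel APIs,
and it assumes order-0 is always allowed and that any piece of a
non-enabled order is split uniformly into the largest enabled order that
fits.

```python
def buddy_split_orders(order, target_order=0):
    """Orders left after a buddy allocator like split of one folio.

    Each halving keeps one untouched half, so the result has one folio
    per order from order-1 down to target_order, plus the folio that
    holds the target page (also of target_order).
    """
    if not 0 <= target_order <= order:
        raise ValueError("target_order must be in [0, order]")
    if target_order == order:
        return [order]
    orders = list(range(order - 1, target_order - 1, -1))
    orders.append(target_order)  # the piece containing the target page
    return orders


def restrict_to_enabled(orders, enabled):
    """Break pieces of non-enabled orders into enabled ones.

    Assumption: order-0 is always allowed, and each piece is cut
    uniformly into the largest enabled order not above its own.
    Returns {order: number_of_folios}.
    """
    allowed = set(enabled) | {0}
    result = {}
    for o in orders:
        fit = max(a for a in allowed if a <= o)
        result[fit] = result.get(fit, 0) + (1 << (o - fit))
    return result


pieces = buddy_split_orders(9, 0)
print(pieces)                            # [8, 7, 6, 5, 4, 3, 2, 1, 0, 0]
print(restrict_to_enabled(pieces, {4}))  # {4: 31, 0: 16}
```

The orders 8 down to 4 contribute 16 + 8 + 4 + 2 + 1 = 31 order-4 folios,
and orders 3, 2, 1 plus the two order-0s contribute 8 + 4 + 2 + 1 + 1 = 16
order-0 folios, which together cover all 512 pages.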

For truncate, the example you give below is an easy one. For cases like
punching out the 3rd to 5th order-0 pages of an order-3,
[O0, O0, __, __, __, O0, O0, O0], I am wondering which approach is better:

1. two folio_split()s:
  1) split the second order-1 from the order-3, 2) split an order-0 from
  the second order-2;

2. one folio_split(), by making folio_split() support arbitrary range
splits, so that the two steps in 1 can be done in one shot, which saves the
unmapping and remapping cost.

Maybe I should go with 1 first as the easy route, but I still need an
algorithm in truncate to work out the sequence of folio_split() calls.
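
One possible way to derive that sequence for approach 1 is sketched below
as a userspace model. Everything in it is an assumption made for
illustration: folios are plain (start_page, order) tuples with naturally
aligned starts, model_folio_split() only mimics the shape of a buddy
allocator like split and is not the kernel's folio_split(), and the greedy
choice (always isolate the largest aligned block that lies inside the
hole) is just one heuristic.

```python
def model_folio_split(start, order, page, new_order):
    """Model one buddy-like split of the folio (start, order).

    The piece containing `page` ends up at new_order; every halving on
    the way leaves the other half intact. Returns pieces sorted by start.
    """
    if not start <= page < start + (1 << order):
        raise ValueError("page is outside the folio")
    if not 0 <= new_order <= order:
        raise ValueError("new_order must be in [0, order]")
    pieces = []
    while order > new_order:
        order -= 1
        half = 1 << order
        if page < start + half:
            pieces.append((start + half, order))  # upper half untouched
        else:
            pieces.append((start, order))         # lower half untouched
            start += half
    pieces.append((start, order))
    return sorted(pieces)


def largest_block_inside(lo, hi):
    """Largest naturally aligned power-of-two block within [lo, hi)."""
    order = (hi - lo).bit_length() - 1
    while True:
        size = 1 << order
        start = (lo + size - 1) & ~(size - 1)  # round lo up to alignment
        if start + size <= hi:
            return start, order
        order -= 1  # order 0 always fits because lo < hi


def punch_hole(folios, lo, hi):
    """Drop pages [lo, hi); returns (kept folios, number of splits)."""
    work, kept, splits = list(folios), [], 0
    while work:
        start, order = work.pop()
        end = start + (1 << order)
        if end <= lo or hi <= start:
            kept.append((start, order))   # entirely outside the hole
        elif lo <= start and end <= hi:
            continue                      # entirely inside: dropped
        else:
            # Straddles a hole boundary: isolate the largest aligned
            # block of this folio that lies inside the hole.
            blk, blk_order = largest_block_inside(max(lo, start),
                                                  min(hi, end))
            work.extend(model_folio_split(start, order, blk, blk_order))
            splits += 1
    return sorted(kept), splits


# Punch the 3rd to 5th pages (indices 2..4) out of an order-3 folio.
print(punch_hole([(0, 3)], 2, 5))      # ([(0, 1), (5, 0), (6, 1)], 2)
# Punch the second 1M out of a 2M (order-9) folio.
print(punch_hole([(0, 9)], 256, 512))  # ([(0, 8)], 1)
```

In this model the first case needs the same two splits as approach 1
(isolate the order-1 at pages 2-3, then the order-0 at page 4) and keeps
the two outer order-1 folios intact. Whether a single range-aware split
would be cheaper is exactly the open question in approach 2.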

>
> I assume ftruncate() / fallocate(PUNCH_HOLE) might be good use cases? For example, when punching 1M of a 2M folio, we can just leave a 1M folio in the pagecache.

Yes, I am trying to make this work.

>
> Any other obvious users you have in mind?

Presumably, folio_split() should replace all split_huge*() calls to reduce
the total number of folios after a split. But for swapcache folios, I need
to figure out whether the swap system works well with buddy allocator like
splits.
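
To put rough numbers on the reduction, here is a tiny sketch comparing
folio counts. The function names are hypothetical, and it assumes a
uniform split turns an order-n folio into 2^(n-m) order-m folios, while a
buddy allocator like split leaves one folio per intermediate order plus
two of the target order.

```python
def uniform_split_count(order, new_order):
    """Folios after splitting uniformly down to new_order."""
    return 1 << (order - new_order)


def buddy_split_count(order, new_order):
    """Folios after one buddy allocator like split down to new_order."""
    return (order - new_order) + 1


for new_order in (0, 4):
    print(new_order,
          uniform_split_count(9, new_order),
          buddy_split_count(9, new_order))
# 0 512 10
# 4 32 6
```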

Best Regards,
Yan, Zi


