linux-mm.kvack.org archive mirror
From: "Zi Yan" <ziy@nvidia.com>
To: "David Hildenbrand" <david@redhat.com>,
	"Barry Song" <21cnbao@gmail.com>,
	"Juan Yescas" <jyescas@google.com>
Cc: <linux-mm@kvack.org>, <muchun.song@linux.dev>, <rppt@kernel.org>,
	<osalvador@suse.de>, <akpm@linux-foundation.org>,
	<lorenzo.stoakes@oracle.com>, "Jann Horn" <jannh@google.com>,
	<Liam.Howlett@oracle.com>, <minchan@kernel.org>,
	<jaewon31.kim@samsung.com>, <charante@codeaurora.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Kalesh Singh" <kaleshsingh@google.com>,
	"T.J. Mercier" <tjmercier@google.com>,
	"Isaac Manjarres" <isaacmanjarres@google.com>,
	<iamjoonsoo.kim@lge.com>, <quic_charante@quicinc.com>
Subject: Re: mm: CMA reservations require 32MiB alignment in 16KiB page size kernels instead of 8MiB in 4KiB page size kernel.
Date: Mon, 20 Jan 2025 10:29:08 -0500
Message-ID: <D7709TNY6J94.1F4TOL8QC55I4@nvidia.com>
In-Reply-To: <463eb421-ac16-435c-b0a0-51a6a92168f6@redhat.com>

On Mon Jan 20, 2025 at 3:14 AM EST, David Hildenbrand wrote:
> On 20.01.25 01:39, Zi Yan wrote:
>> On Sun Jan 19, 2025 at 6:55 PM EST, Barry Song wrote:
>> <snip>
>>>>>>>
>>>>>>>
>>>>>>> However, with this workaround, we can't use transparent huge pages.
>>>>>>>
>>>>>>> Is the CMA_MIN_ALIGNMENT_BYTES requirement alignment only to support huge pages?
>>>> No. CMA_MIN_ALIGNMENT_BYTES is limited by CMA_MIN_ALIGNMENT_PAGES, which
>>>> is equal to pageblock size. Enabling THP just bumps the pageblock size.
>>>
>>> Currently, THP might be mTHP, which can have a significantly smaller
>>> size than 32MB. For
>>> example, on arm64 systems with a 16KiB page size, a 2MB CONT-PTE mTHP
>>> is possible.
>>> Additionally, mTHP relies on the CONFIG_TRANSPARENT_HUGEPAGE configuration.
>>>
>>> I wonder if it's possible to enable CONFIG_TRANSPARENT_HUGEPAGE
>>> without necessarily
>>> using 32MiB THP. If we use other sizes, such as 64KiB, perhaps a large
>>> pageblock size wouldn't
>>> be necessary?
>> 
>> I think this should work by reducing MAX_PAGE_ORDER like Juan did for
>> the experiment. But MAX_PAGE_ORDER is a macro right now, Kconfig needs
>> to be changed and kernel needs to be recompiled. Not sure if it is OK
>> for Juan's use case.
>
>
> IIRC, we set pageblock size == THP size because this is the granularity 
> we want to optimize defragmentation for. ("try keep pageblock 
> granularity of the same memory type: movable vs. unmovable")

Right. In the past, it was optimized for PMD THP. Now we have mTHP. If the
user does not care about PMD THP (32MB in the arm64 16KB base page case)
and mTHP (2MB mTHP here) is good enough, reducing the pageblock size works.
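
For reference, the sizes mentioned in this thread can be reproduced with a
little arithmetic. This is only a rough sketch, assuming 8-byte page table
entries (so one page table page holds PAGE_SIZE / 8 entries). The helper
names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Assumption: each page table entry is 8 bytes (2^3), as on arm64. */
#define SKETCH_PTE_SIZE_SHIFT 3

/*
 * Order of a PMD-mapped THP: a PMD entry covers one full page of PTEs,
 * i.e. 2^(page_shift - 3) base pages.
 */
unsigned int sketch_pmd_thp_order(unsigned int page_shift)
{
	return page_shift - SKETCH_PTE_SIZE_SHIFT;
}

/* Bytes covered by a block of 2^order base pages. */
uint64_t sketch_order_bytes(unsigned int page_shift, unsigned int order)
{
	return (uint64_t)1 << (page_shift + order);
}

/* Bytes covered by one PMD-mapped THP for the given base page shift. */
uint64_t sketch_pmd_thp_bytes(unsigned int page_shift)
{
	return sketch_order_bytes(page_shift, sketch_pmd_thp_order(page_shift));
}
```

With page_shift = 14 (16KB pages), sketch_pmd_thp_bytes() gives 32MB, and an
order-7 block (128 base pages) gives the 2MB mTHP size. With page_shift = 16
(64KB pages), the PMD THP size comes out at 512MB.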

>
> However, the buddy already supports having different pagetypes for large 
> allocations.

Right. To be clear, only MIGRATE_UNMOVABLE, MIGRATE_RECLAIMABLE, and
MIGRATE_MOVABLE can be merged.
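
The merge rule could be sketched roughly as below. This is a simplified
user-space illustration, not the kernel code: the enum is a cut-down
stand-in with made-up names and values, and treating two buddies of the
same type as always mergeable is my assumption.

```c
#include <stdbool.h>

/* Cut-down, hypothetical stand-in for the kernel's migratetypes. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
	SKETCH_MERGEABLE_END,	/* types below this marker may be mixed */
	SKETCH_CMA = SKETCH_MERGEABLE_END,
	SKETCH_ISOLATE,
};

/* True for the three types that may share a large buddy page. */
bool sketch_is_mergeable(enum sketch_migratetype mt)
{
	return mt < SKETCH_MERGEABLE_END;
}

/*
 * Two buddies of different types may merge only if both types are
 * mergeable. Assumption: same-type buddies may always merge.
 */
bool sketch_buddies_may_merge(enum sketch_migratetype a,
			      enum sketch_migratetype b)
{
	if (a == b)
		return true;
	return sketch_is_mergeable(a) && sketch_is_mergeable(b);
}
```

So an unmovable and a movable buddy can merge into a larger page, while a
CMA or isolated buddy cannot merge with a buddy of a different type.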

>
> So we could leave MAX_ORDER alone and try adjusting the pageblock size 
> in these setups. pageblock size is already variable on some 
> architectures IIRC.

Making pageblock size a boot time variable? We might want to warn the
sysadmin/user that creating THP/mTHP larger than pageblock_order will
suffer.

>
> We'd only have to check if all of the THP logic can deal with pageblock 
> size < THP size.

Probably yes. Pageblock size should be independent of the THP logic,
although the compaction logic (used to create THPs) is based on pageblocks.
>
> This issue is even more severe on arm64 with 64k (pageblock = 512MiB).

This is also good for virtio-mem, since the offline memory block size
can also be reduced. I remember you complained about it before.

-- 
Best Regards,
Yan, Zi


