Re: mm: CMA reservations require 32MiB alignment in 16KiB page size kernels instead of 8MiB in 4KiB page size kernel.

linux-mm.kvack.org archive mirror

From: "Zi Yan" <ziy@nvidia.com>
To: "Barry Song" <21cnbao@gmail.com>
Cc: "Juan Yescas" <jyescas@google.com>, <linux-mm@kvack.org>,
	<muchun.song@linux.dev>, <rppt@kernel.org>, <david@redhat.com>,
	<osalvador@suse.de>, <akpm@linux-foundation.org>,
	<lorenzo.stoakes@oracle.com>, "Jann Horn" <jannh@google.com>,
	<Liam.Howlett@oracle.com>, <minchan@kernel.org>,
	<jaewon31.kim@samsung.com>, <charante@codeaurora.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Kalesh Singh" <kaleshsingh@google.com>,
	"T.J. Mercier" <tjmercier@google.com>,
	"Isaac Manjarres" <isaacmanjarres@google.com>,
	<iamjoonsoo.kim@lge.com>, <quic_charante@quicinc.com>
Subject: Re: mm: CMA reservations require 32MiB alignment in 16KiB page size kernels instead of 8MiB in 4KiB page size kernel.
Date: Sun, 19 Jan 2025 19:45:44 -0500	[thread overview]
Message-ID: <D76HHG1X0DM0.18V1AMGBS5RVY@nvidia.com> (raw)
In-Reply-To: <CAGsJ_4yN4xfuX0EXaqiqTx_fjp2wN_=qEY9VB7CU=ZFUke94Ww@mail.gmail.com>

On Sun Jan 19, 2025 at 7:38 PM EST, Barry Song wrote:
> On Mon, Jan 20, 2025 at 1:26 PM Zi Yan <ziy@nvidia.com> wrote:
>>
>> On 19 Jan 2025, at 19:17, Barry Song wrote:
>>
>> > On Sat, Jan 18, 2025 at 12:00 PM Juan Yescas <jyescas@google.com> wrote:
>> >>
>> >> + iamjoonsoo.kim@lge.com
>> >> + quic_charante@quicinc.com
>> >>
>> >> On Fri, Jan 17, 2025 at 2:52 PM Juan Yescas <jyescas@google.com> wrote:
>> >>>
>> >>> +Suren Baghdasaryan
>> >>> +Kalesh Singh
>> >>> +T.J. Mercier
>> >>> +Isaac Manjarres
>> >>>
>> >>> On Fri, Jan 17, 2025 at 2:51 PM Juan Yescas <jyescas@google.com> wrote:
>> >>>>
>> >>>> Hi Linux memory team
>> >>>>
>> >>>> When the drivers reserve CMA memory in 16KiB kernels, the minimum
>> >>>> alignment is 32 MiB as per CMA_MIN_ALIGNMENT_BYTES. However, in 4KiB
>> >>>> kernels, the CMA alignment is 4MiB.
>> >>>>
>> >>>> This is forcing the drivers to reserve more memory in 16KiB kernels,
>> >>>> even if they only require 4MiB or 8MiB.
>> >>>>
>> >>>> reserved-memory {
>> >>>>       #address-cells = <2>;
>> >>>>       #size-cells = <2>;
>> >>>>       ranges;
>> >>>>       tpu_cma_reserve: tpu_cma_reserve {
>> >>>>             compatible = "shared-dma-pool";
>> >>>>             reusable;
>> >>>>             size = <0x0 0x2000000>; /* 32 MiB */
>> >>>>       };
>> >>>> };
>> >>>>
>> >>>> One workaround to continue using 4MiB alignment is:
>> >>>>
>> >>>> - Disable CONFIG_TRANSPARENT_HUGEPAGE so the buddy allocator does NOT
>> >>>> have to allocate huge pages (32 MiB in 16KiB page sizes)
>> >>>> - Set ARCH_FORCE_MAX_ORDER for ARM64_16K_PAGES to "8", instead of
>> >>>> "11", so CMA_MIN_ALIGNMENT_BYTES is equal to 4 MiB
>> >>>>
>> >>>>     config ARCH_FORCE_MAX_ORDER
>> >>>>         int
>> >>>>         default "13" if ARM64_64K_PAGES
>> >>>>         default "8" if ARM64_16K_PAGES
>> >>>>         default "10"
>> >>>>
>> >>>> #define MAX_PAGE_ORDER CONFIG_ARCH_FORCE_MAX_ORDER      // 8
>> >>>> #define pageblock_order MAX_PAGE_ORDER              // 8
>> >>>> #define pageblock_nr_pages (1UL << pageblock_order)    // 256
>> >>>> #define CMA_MIN_ALIGNMENT_PAGES pageblock_nr_pages      // 256
>> >>>> #define CMA_MIN_ALIGNMENT_BYTES (PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES)
>> >>>>   // 16384 * 256 = 4194304 = 4 MiB
>> >>>>
>> >>>> After compiling the kernel with these changes, the kernel boots without
>> >>>> warnings and the memory is reserved:
>> >>>>
>> >>>> [    0.000000] Reserved memory: created CMA memory pool at 0x000000007f800000, size 8 MiB
>> >>>> [    0.000000] OF: reserved mem: initialized node tpu_cma_reserve, compatible id shared-dma-pool
>> >>>> [    0.000000] OF: reserved mem: 0x000000007f800000..0x000000007fffffff (8192 KiB) map reusable tpu_cma_reserve
>> >>>>
>> >>>> #  uname -a
>> >>>> Linux buildroot 6.12.9-dirty
>> >>>> # zcat /proc/config.gz | grep ARM64_16K
>> >>>> CONFIG_ARM64_16K_PAGES=y
>> >>>> # zcat /proc/config.gz | grep TRANSPARENT_HUGE
>> >>>> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
>> >>>> # CONFIG_TRANSPARENT_HUGEPAGE is not set
>> >>>> # cat /proc/pagetypeinfo
>> >>>> Page block order: 8
>> >>>> Pages per block:  256
>> >>>>
>> >>>> Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8
>> >>>> Node    0, zone      DMA, type    Unmovable      1      1     13      6      5      2      0      0      1
>> >>>> Node    0, zone      DMA, type      Movable      9     16     19     13     13      5      2      0    182
>> >>>> Node    0, zone      DMA, type  Reclaimable      0      1      0      1      1      0      0      1      0
>> >>>> Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0
>> >>>> Node    0, zone      DMA, type          CMA      1      0      0      0      0      0      0      0     49
>> >>>> Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0
>> >>>>
>> >>>> Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate
>> >>>> Node 0, zone      DMA            6          199            1            0           50            0
>> >>>>
>> >>>>
>> >>>> However, with this workaround, we can't use transparent huge pages.
>> >
>> > I don’t think this is accurate. You can still use mTHP with a size
>> > equal to or smaller than 4MiB, right?
>> >
>> > By the way, what specific regression have you observed when reserving
>> > a larger size like 32MB?
>> > For CMA, the over-reserved memory is still available to the system for
>> > movable folios. 28MiB
>>
>> The fallbacks table does not have MIGRATE_CMA as a fallback for any
>> migratetype. How can it be used for movable folios? Am I missing something?
>
> The whole purpose of CMA is to allow the memory reserved for a
> device's dma_alloc_coherent or other contiguous memory needs to
> be freely used by movable allocations when the device doesn't
> require it. When the device's DMA needs the memory, the movable
> folios can be migrated to make it available for the device.
>
> /* Must be called after current_gfp_context() which can change gfp_mask */
> static inline unsigned int gfp_to_alloc_flags_cma(gfp_t gfp_mask,
>                                                   unsigned int alloc_flags)
> {
> #ifdef CONFIG_CMA
>         if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
>                 alloc_flags |= ALLOC_CMA;
> #endif
>
>         return alloc_flags;
> }
>
> So there’s no waste here. CMA memory can be used by the normal buddy
> allocator.

Ah, you are right. I missed the above code, which adds ALLOC_CMA. I
agree with you that there is no waste unless the system makes heavy use
of unmovable allocations.

-- 
Best Regards,
Yan, Zi
