From: Juan Yescas <jyescas@google.com>
To: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
lsf-pc@lists.linux-foundation.org,
Suren Baghdasaryan <surenb@google.com>,
linux-mm@kvack.org, Kalesh Singh <kaleshsingh@google.com>,
Isaac Manjarres <isaacmanjarres@google.com>,
"T.J. Mercier" <tjmercier@google.com>,
Barry Song <21cnbao@gmail.com>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [LSF/MM/BPF TOPIC] CMA reservation optimizations
Date: Wed, 2 Apr 2025 11:01:22 -0700
Message-ID: <CAJDx_rhwEFVyt2jvREnMy0OeJCTC2osEuTaMSzJiUErQXuwLsQ@mail.gmail.com>
In-Reply-To: <DBAA76C0-1EF5-425E-A31E-4B98D84AF75D@nvidia.com>
Thanks all for the comments and suggestions.
The slides for my presentation "CMA optimization alignment" are attached.
Thanks,
Juan
On Tue, Jan 28, 2025 at 12:15 PM Zi Yan <ziy@nvidia.com> wrote:
> On 28 Jan 2025, at 13:33, David Hildenbrand wrote:
>
> > On 28.01.25 18:07, Juan Yescas wrote:
> >> On Tue, Jan 28, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote:
> >>>
> >>> On 28.01.25 02:04, Juan Yescas wrote:
> >>>> Hi LSF organizers,
> >>>>
> >>>> I would like to continue discussing this topic with the mm community:
> >>>>
> >>>> "CMA reservation optimizations"
> >>>>
> >>>> Note: There is already an email in the linux-mm mailing list that is
> >>>> discussing this issue. The title is:
> >>>>
> >>>> "CMA reservations require 32MiB alignment in 16KiB page size kernels
> >>>> instead of 8MiB in 4KiB page size kernel"
> >>>>
> >>>> Background
> >>>>
> >>>> When the drivers reserve CMA memory in 16KiB kernels, the minimum
> >>>> alignment is 32 MiB as per CMA_MIN_ALIGNMENT_BYTES. However, in 4KiB
> >>>> kernels, the CMA alignment is 4MiB.
> >>>
> >>> I'm curious, here you say 4 MiB, above 8 MiB.
> >>>
> >>
> >> My bad, it is a typo. I meant 4 MiB.
> >
> > That makes sense.
> >
> >>
> >>> But nowadays it's usually 2 MiB (pageblock size), no?
> >>
> >> That's right for the case when THPs are enabled in 4KiB page size configs.
> >>
> >> #define pageblock_order MIN_T(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
> >> https://elixir.bootlin.com/linux/v6.13/source/include/linux/pageblock-flags.h#L50
> >>
> >> This evaluates to pageblock_order = min(21 - 12, 10) = 9
> >>
> >> #define CMA_MIN_ALIGNMENT_PAGES pageblock_nr_pages
> >> #define CMA_MIN_ALIGNMENT_BYTES (PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES)
> >> https://elixir.bootlin.com/linux/v6.13/source/include/linux/cma.h#L21
> >>
> >> CMA_MIN_ALIGNMENT_BYTES = (4096 * 2 ^ 9) = (4096 * 512) = 2097152 = 2 MiB
> >>
> >> However, when THPs are disabled, we get:
> >>
> >> #define pageblock_order MAX_PAGE_ORDER // 10
> >> https://elixir.bootlin.com/linux/v6.13/source/arch/arm64/Kconfig#L1630
> >> https://elixir.bootlin.com/linux/v6.13/source/include/linux/pageblock-flags.h#L55
> >>
> >> CMA_MIN_ALIGNMENT_BYTES = (4096 * 2 ^ 10) = (4096 * 1024) = 4194304 = 4 MiB
> >
> > Right, and it can depend on ARCH_FORCE_MAX_ORDER.
> >
> > I've been wondering for a while if pageblock_order should nowadays
> > default to HPAGE_PMD_ORDER, with the option to make it smaller/larger
> > (likely smaller), as discussed.
> >
> > As discussed, the topic you are touching on is also relevant for
> > virtio-mem, which can currently add/remove memory in pageblock
> > granularity: 512 MiB on arm64 is not particularly helpful. I think we
> > could support adding/removing smaller granularity, but it requires a
> > bit of work, and always isolating 512 MiB worth of pages just to
> > effectively allocate, e.g., 2 MiB worth of pages is rather suboptimal.
> > The same applies to CMA, I assume.
> >
> > So there is more infrastructure that could benefit from pageblocks
> > being on the smaller side, even when hugetlb+THP might not be around
> > in a config.
>
>
> It is related to the anti-fragmentation mechanism in the kernel (Mel and
> Vlastimil are cc’d; feel free to add more, since I must be missing others).
>
> If the pageblock size is smaller than the PMD THP size, the current
> compaction code will not be able to efficiently defragment memory for PMD
> THP creation, since compaction works at pageblock granularity. This means
> we need to decouple compaction granularity from pageblock size. Pageblocks
> use different migratetypes (UNMOVABLE, MOVABLE, RECLAIMABLE, ...) to help
> reduce memory fragmentation by grouping pages by mobility. When compaction
> works on multiple pageblocks to generate a PMD THP, it needs all pageblocks
> within the range to share the same migratetype. Otherwise, compacting
> memory within a region that contains MIGRATE_UNMOVABLE pageblocks would
> very likely be a waste of time, since unmovable pages simply prevent a
> large free page from being created. So the key question is: how do we
> prevent MIGRATE_UNMOVABLE pageblocks from fragmenting a PMD-THP-sized
> pageblock range, i.e., anti-fragmentation for pageblocks?
>
> Do we want a super-pageblock for that? Or do we allow sub-pageblocks for
> virtio-mem and CMA reservations? That is probably what we want to discuss
> on the infrastructure side.
>
>
> Best Regards,
> Yan, Zi
>
[-- Attachment #2: CMA-optimizations-alignment.pdf --]
[-- Type: application/pdf, Size: 575707 bytes --]