linux-mm.kvack.org archive mirror
* [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
@ 2026-02-27  2:42 Juan Yescas
  2026-03-16 15:52 ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Yescas @ 2026-02-27  2:42 UTC (permalink / raw)
  To: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, David Hildenbrand (Red Hat),
	Lorenzo Stoakes, lsf-pc

Hi LSF MM organizers

I would like to propose a discussion on improving our ability to
reproduce complex memory allocation and reclaim scenarios, and solicit
feedback on a debugfs-based testing interface to help trigger these
edge cases.

== The Problem ==

We frequently encounter complex memory management issues in the wild, including:

- CMA allocation failures due to pinned MIGRATE_MOVABLE pages.
- Page migration and compaction failing during reclaim.
- Excessive reclaim loops triggered by specific workloads.
- OOM kills.

Reproducing these specific memory states for debugging is currently
cumbersome. For instance, consuming most of the available
MIGRATE_MOVABLE memory, or forcing MIGRATE_UNMOVABLE allocations
specifically from Node 1 and Zone DMA directly from userspace,
requires writing custom kernel modules or relying on unreliable
userspace memory pressure tactics.

== Proposed Approach ==

To simplify reproducer setups, we are exploring a debugfs driver that
allows us to perform highly targeted allocations using a
straightforward path-based API. The interface exposes the node, zone,
order, and migrate type.

Example 1: Allocating 2^11 pages from Node 1, Zone Normal, MIGRATE_MOVABLE


$ echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-11/migrate-Movable/alloc

This generates a unique handle (file) for the allocation:

$ ls /sys/kernel/debug/mm/node-1/zone-Normal/order-11/migrate-Movable/
8343cb1e-cc57-4753-a060-e152e0584e36

Example 2: Allocating 2^3 pages from Node 0, Zone DMA, MIGRATE_UNMOVABLE

$ echo 1 > /sys/kernel/debug/mm/node-0/zone-DMA/order-3/migrate-Unmovable/alloc

This gives:

$ ls /sys/kernel/debug/mm/node-0/zone-DMA/order-3/migrate-Unmovable/
b5f607ec-eae3-4aca-b8ab-4335a4338a1f

To release the memory, userspace simply writes 0 to the generated handle:

$ echo 0 > /sys/kernel/debug/mm/node-0/zone-DMA/order-3/migrate-Unmovable/b5f607ec-eae3-4aca-b8ab-4335a4338a1f
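
The allocate/release cycle above can be wrapped in a few shell helpers.
This is a minimal sketch, assuming the interface behaves as described in
this proposal (writing 1 to "alloc" creates a handle file next to it,
writing 0 to a handle frees it). The function names and the MM_DEBUGFS
variable are hypothetical and not part of any existing kernel interface.

```shell
#!/bin/sh
# Hypothetical helpers for the proposed interface; nothing here exists upstream.
# MM_DEBUGFS can be overridden, e.g. to point at a scratch directory for dry runs.
MM_DEBUGFS="${MM_DEBUGFS:-/sys/kernel/debug/mm}"

# mm_dir NODE ZONE ORDER MIGRATETYPE
# Print the directory for one (node, zone, order, migratetype) tuple.
mm_dir() {
    printf '%s/node-%s/zone-%s/order-%s/migrate-%s\n' \
        "$MM_DEBUGFS" "$1" "$2" "$3" "$4"
}

# mm_alloc NODE ZONE ORDER MIGRATETYPE
# Request one allocation of 2^ORDER pages.
mm_alloc() {
    echo 1 > "$(mm_dir "$@")/alloc"
}

# mm_release_all NODE ZONE ORDER MIGRATETYPE
# Write 0 to every handle in the directory; the "alloc" control file is skipped.
mm_release_all() {
    dir=$(mm_dir "$@")
    for handle in "$dir"/*; do
        [ -f "$handle" ] || continue
        [ "${handle##*/}" = "alloc" ] && continue
        echo 0 > "$handle"
    done
}
```

With these, Example 2 becomes "mm_alloc 0 DMA 3 Unmovable" followed later
by "mm_release_all 0 DMA 3 Unmovable".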

== Discussion Points ==

Rather than presenting this as a finalized driver, I would like to use
this session to discuss the design with the mm community:

- API Semantics: Does this path-based structure make sense for
targeted allocations? How should we handle metadata (e.g., reading the
generated file with cat to show allocation details/status)?

- Extensibility: What other memory shaping or fault-injection
functionality would be valuable to add to this driver for the broader
community?

- Alternative Approaches: Are there better existing mechanisms to
achieve this level of deterministic, user-controlled page allocation
for testing?

Thanks
Juan



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-02-27  2:42 [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs Juan Yescas
@ 2026-03-16 15:52 ` David Hildenbrand (Arm)
  2026-03-19  0:56   ` Juan Yescas
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-16 15:52 UTC (permalink / raw)
  To: Juan Yescas, Suren Baghdasaryan, Kalesh Singh, T.J. Mercier,
	Isaac Manjarres, android-mm, Linux Memory Management List,
	Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, lsf-pc

On 2/27/26 03:42, Juan Yescas wrote:
> Hi LSF MM organizers

Hi,

I'm late ...

> 
> I would like to propose a discussion on improving our ability to
> reproduce complex memory allocation and reclaim scenarios, and solicit
> feedback on a debugfs-based testing interface to help trigger these
> edge cases.
> 
> == The Problem ==
> 
> We frequently encounter complex memory management issues in the wild, including:
> 
> - CMA allocation failures due to pinned MIGRATE_MOVABLE pages.
> - Page migration and compaction failing during reclaim.
> - Excessive reclaim loops triggered by specific workloads.
> - OOM kills.
> 
> Reproducing these specific memory states for debugging is currently
> cumbersome. For instance, consuming most of the available
> MIGRATE_MOVABLE memory, or forcing MIGRATE_UNMOVABLE allocations
> specifically from Node 1 and Zone DMA directly from userspace,
> requires writing custom kernel modules or relying on unreliable
> userspace memory pressure tactics.

I'm wondering whether an OOT module for this purpose would be sufficient?

IOW, do we really have to have this in the upstream kernel, or could we
have a public OOT module to perform these allocations?

Then, there are no worries about API/Extensibility etc.

Or would you want to fire up this debugging on a production kernel? I
would assume not.

-- 
Cheers,

David



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-03-16 15:52 ` David Hildenbrand (Arm)
@ 2026-03-19  0:56   ` Juan Yescas
  2026-03-23  9:14     ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Yescas @ 2026-03-19  0:56 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, Lorenzo Stoakes, lsf-pc

Thanks, David, for your comments.


On Mon, Mar 16, 2026 at 8:52 AM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 2/27/26 03:42, Juan Yescas wrote:
> > Hi LSF MM organizers
>
> Hi,
>
> I'm late ...
>
> >
> > I would like to propose a discussion on improving our ability to
> > reproduce complex memory allocation and reclaim scenarios, and solicit
> > feedback on a debugfs-based testing interface to help trigger these
> > edge cases.
> >
> > == The Problem ==
> >
> > We frequently encounter complex memory management issues in the wild, including:
> >
> > - CMA allocation failures due to pinned MIGRATE_MOVABLE pages.
> > - Page migration and compaction failing during reclaim.
> > - Excessive reclaim loops triggered by specific workloads.
> > - OOM kills.
> >
> > Reproducing these specific memory states for debugging is currently
> > cumbersome. For instance, consuming most of the available
> > MIGRATE_MOVABLE memory, or forcing MIGRATE_UNMOVABLE allocations
> > specifically from Node 1 and Zone DMA directly from userspace,
> > requires writing custom kernel modules or relying on unreliable
> > userspace memory pressure tactics.
>
> I'm wondering whether an OOT module for this purpose would be sufficient?
>
> IOW, do we really have to have this in the upstream kernel, or could we
> have a public OOT module to perform these allocations?
>
> Then, there are no worries about API/Extensibility etc.
>
You’re right that going OOT would bypass the strict API stability and
extensibility requirements that come with being in-tree.

However, there are some symbols that we would need to be exported in
order for the module to compile.

> Or would you want to fire up this debugging on a production kernel? I
> would assume now.
>

Yes, that is actually one of our goals. We often encounter
"heisenbugs" that only manifest under specific workloads, and we want
the ability to stress the memory subsystem.

For example, to increase the unmovable allocations by 16 MiB on a
kernel with 4 KiB pages, we can do:

$ for i in {1..4}; do
    echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc
  done

And this is far more convenient than writing a test driver that makes
only unmovable allocations.
The same applies to the other migrate types.
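
The arithmetic behind the loop above: an order-10 block is 2^10 = 1024
pages, which is 4 MiB with 4 KiB pages, so four allocations add up to
16 MiB. A small sketch of that calculation follows; "blocks_needed" is a
hypothetical helper name, and the page size is passed in rather than
detected.

```shell
#!/bin/sh
# blocks_needed TARGET_MIB ORDER PAGE_KIB
# Print how many order-ORDER allocations cover TARGET_MIB, rounding up.
blocks_needed() {
    target_kib=$(( $1 * 1024 ))
    block_kib=$(( (1 << $2) * $3 ))
    echo $(( (target_kib + block_kib - 1) / block_kib ))
}

blocks_needed 16 10 4   # prints 4
```

The result can be used as the loop bound instead of a hand-computed
constant when the order or page size changes.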

Having this driver upstream allows us to trigger these allocations
on demand without the friction (or security risk) of loading unsigned
OOT modules onto a locked-down device.

Thanks
Juan

> --
> Cheers,
>
> David



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-03-19  0:56   ` Juan Yescas
@ 2026-03-23  9:14     ` David Hildenbrand (Arm)
  2026-04-08  0:12       ` Juan Yescas
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-23  9:14 UTC (permalink / raw)
  To: Juan Yescas
  Cc: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, Lorenzo Stoakes, lsf-pc

On 3/19/26 01:56, Juan Yescas wrote:
> Thanks David for you comments,
> 
> 
> On Mon, Mar 16, 2026 at 8:52 AM David Hildenbrand (Arm)
> <david@kernel.org> wrote:
>>
>> On 2/27/26 03:42, Juan Yescas wrote:
>>> Hi LSF MM organizers
>>
>> Hi,
>>
>> I'm late ...
>>
>>>
>>> I would like to propose a discussion on improving our ability to
>>> reproduce complex memory allocation and reclaim scenarios, and solicit
>>> feedback on a debugfs-based testing interface to help trigger these
>>> edge cases.
>>>
>>> == The Problem ==
>>>
>>> We frequently encounter complex memory management issues in the wild, including:
>>>
>>> - CMA allocation failures due to pinned MIGRATE_MOVABLE pages.
>>> - Page migration and compaction failing during reclaim.
>>> - Excessive reclaim loops triggered by specific workloads.
>>> - OOM kills.
>>>
>>> Reproducing these specific memory states for debugging is currently
>>> cumbersome. For instance, consuming most of the available
>>> MIGRATE_MOVABLE memory, or forcing MIGRATE_UNMOVABLE allocations
>>> specifically from Node 1 and Zone DMA directly from userspace,
>>> requires writing custom kernel modules or relying on unreliable
>>> userspace memory pressure tactics.
>>
>> I'm wondering whether an OOT module for this purpose would be sufficient?
>>
>> IOW, do we really have to have this in the upstream kernel, or could we
>> have a public OOT module to perform these allocations?
>>
>> Then, there are no worries about API/Extensibility etc.
>>
> You’re right that going OOT would bypass the strict API stability and
> extensibility requirements that come with being in-tree.
> 
> However, there are some symbols that we would need to be exported in
> order for the module to compile.

Reason I am asking is because we had similar discussions around memory
hot(un)plug in the past, where we decided that an OOT kernel module to
simulate add/remove was a better choice than exposing weird APIs to user
space.

Which symbols would you need? I guess we'd want to call the buddy by
specifying node+zone+order.

Is specifying the migratetype really relevant?

> 
>> Or would you want to fire up this debugging on a production kernel? I
>> would assume now.
>>
> 
> Yes, that is actually one of our goals. We often encounter
> "heisenbugs" that only manifest
> under specific workloads and we want the ability to stress the memory subystem.
> 
> For example, if we want to increase the unmovable allocations by 16 MiB,
> a 4 KiB kernel, we can do
> 
> $ for i in {1..4} \
> do  \
>   echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc

How will we handle unmovable allocations ending up on movable memory
(e.g., when allocating from ZONE_MOVABLE)?

Also, is there any reason why we can't do it similar to hugetlb and use
a simple "nr_pages" variable that can be set and read?

Why did you decide to use the "handle" approach?



>  \
> done
> 
> And this is way more convenient than writing a test driver to make
> only unmovable allocations.
> The same apply for the other migrate types.

Right, but the interface you provide looks like it would allow
allocating from movable areas etc, and I am not sure that is generally
helpful (or adds more complexity to handle).

-- 
Cheers,

David



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-03-23  9:14     ` David Hildenbrand (Arm)
@ 2026-04-08  0:12       ` Juan Yescas
  2026-04-08  7:47         ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Yescas @ 2026-04-08  0:12 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, Lorenzo Stoakes, lsf-pc

On Mon, Mar 23, 2026 at 2:14 AM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 3/19/26 01:56, Juan Yescas wrote:
> > Thanks David for you comments,
> >
> >
> > On Mon, Mar 16, 2026 at 8:52 AM David Hildenbrand (Arm)
> > <david@kernel.org> wrote:
> >>
> >> On 2/27/26 03:42, Juan Yescas wrote:
> >>> Hi LSF MM organizers
> >>
> >> Hi,
> >>
> >> I'm late ...
> >>
> >>>
> >>> I would like to propose a discussion on improving our ability to
> >>> reproduce complex memory allocation and reclaim scenarios, and solicit
> >>> feedback on a debugfs-based testing interface to help trigger these
> >>> edge cases.
> >>>
> >>> == The Problem ==
> >>>
> >>> We frequently encounter complex memory management issues in the wild, including:
> >>>
> >>> - CMA allocation failures due to pinned MIGRATE_MOVABLE pages.
> >>> - Page migration and compaction failing during reclaim.
> >>> - Excessive reclaim loops triggered by specific workloads.
> >>> - OOM kills.
> >>>
> >>> Reproducing these specific memory states for debugging is currently
> >>> cumbersome. For instance, consuming most of the available
> >>> MIGRATE_MOVABLE memory, or forcing MIGRATE_UNMOVABLE allocations
> >>> specifically from Node 1 and Zone DMA directly from userspace,
> >>> requires writing custom kernel modules or relying on unreliable
> >>> userspace memory pressure tactics.
> >>
> >> I'm wondering whether an OOT module for this purpose would be sufficient?
> >>
> >> IOW, do we really have to have this in the upstream kernel, or could we
> >> have a public OOT module to perform these allocations?
> >>
> >> Then, there are no worries about API/Extensibility etc.
> >>
> > You’re right that going OOT would bypass the strict API stability and
> > extensibility requirements that come with being in-tree.
> >
> > However, there are some symbols that we would need to be exported in
> > order for the module to compile.
>
> Reason I am asking is because we had similar discussions around memory
> hot(un)plug in the past, where we decided that an OOT kernel module to
> simulate add/remove was a better choice than exposing weird APIs to user
> space.
>

Hi David, I apologize for the late reply. It’s been a bit of a
whirlwind over here with some internal issues.

> Which symbols would you need?

These are the required symbols:

ERROR: modpost: "cma_alloc" [page_alloc_debugfs.ko] undefined!
ERROR: modpost: "migratetype_names" [page_alloc_debugfs.ko] undefined!

> I guess we'd want to call the buddy by
> specifying node+zone+order.
>
That's correct; for the non-CMA allocations we'll call "alloc_pages_node_noprof".

> Is specifying the migratetype really relevant?
>

Yes, we want to be able to allocate these types of memory:

MIGRATE_MOVABLE,
MIGRATE_RECLAIMABLE,
MIGRATE_CMA,

When the request is for MIGRATE_CMA, the "default_cma_region" will be
used for that allocation.

> >
> >> Or would you want to fire up this debugging on a production kernel? I
> >> would assume now.
> >>
> >
> > Yes, that is actually one of our goals. We often encounter
> > "heisenbugs" that only manifest
> > under specific workloads and we want the ability to stress the memory subystem.
> >
> > For example, if we want to increase the unmovable allocations by 16 MiB,
> > a 4 KiB kernel, we can do
> >
> > $ for i in {1..4} \
> > do  \
> >   echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc
>
> How will we handle unmovable allocations ending up on movable memory
> (e.g., ZONE_MOVABLE)? (e.g., allocating from ZONE_MOVABLE)
>

Once the allocation is requested using

echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc

we don't care whether the allocation comes from movable/CMA memory.

> Also, is there any reason why we can't do it similar to hugetlb and use
> a simple "nr_pages" variable, that can be set and read.
>

We could use a "nr_pages" variable, but we would also need to set the
node, zone and migrate type.

It would be cumbersome and error-prone to have something like this:

echo "Node1/zone-Normal/MIGRATE_RECLAIMABLE/8" > /sys/kernel/debug/mm/nr_pages

However, with a directory tree it is harder to make mistakes
regarding the node, zone, migrate type, or order.
The tree would look like this:

# tree /sys/kernel/debug/mm/
/sys/kernel/debug/mm/
`-- node-0
    |-- zone-DMA
    |   |-- order-0
    |   |   |-- migrate-CMA
    |   |   |   |-- alloc
    |   |   |   `-- release
    |   |   |-- migrate-HighAtomic
    |   |   |   |-- alloc
    |   |   |   `-- release
    |   |   ...
    |   |   |   |-- alloc
    |   |   |   `-- release
    |   |   |-- migrate-Unmovable
    |   |   |   |-- alloc
    |   |   |   `-- release
    ...
        |-- order-8
        |   |-- migrate-CMA
        |   |   |-- alloc
        |   |   `-- release
        |   |-- migrate-HighAtomic
        |   |   |-- alloc
        |   |   `-- release
        |   ...
        |   |   |-- alloc
        |   |   `-- release
        |   |-- migrate-Unmovable
        |   |   |-- alloc
        |   |   `-- release
        `-- order-9
            |-- migrate-CMA
            |   |-- alloc
            |   `-- release
            ...
            |-- migrate-Movable
            |   |-- alloc
            |   `-- release
            |-- migrate-Reclaimable
            |   |-- alloc
            |   `-- release
            |-- migrate-Unmovable
            |   |-- alloc
            |   `-- release
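
One way to read the "harder to make mistakes" argument: with a tree, a
mistyped component names a directory that does not exist, so the write
fails before anything is allocated. A minimal sketch, assuming the layout
shown above; "mm_alloc_checked" and its first argument (the debugfs root)
are hypothetical.

```shell
#!/bin/sh
# mm_alloc_checked ROOT NODE ZONE ORDER MIGRATETYPE
# Refuse to write unless the target "alloc" file exists in the tree.
mm_alloc_checked() {
    dir="$1/node-$2/zone-$3/order-$4/migrate-$5"
    if [ ! -e "$dir/alloc" ]; then
        echo "no such allocation target: $dir" >&2
        return 1
    fi
    echo 1 > "$dir/alloc"
}
```

For example, "mm_alloc_checked /sys/kernel/debug/mm 1 Normal 10 Umovable"
(note the typo) would return an error instead of allocating from an
unintended migrate type.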

> Why did you decide to use the "handle" approach?

I think it is more convenient for the user and less error-prone. Many
userspace developers want to create memory pressure by allocating
CMA/Movable memory, but they are not familiar with nodes, zones, or
orders. This debugfs interface will make things a bit easier for them.
>
>
>
> >  \
> > done
> >
> > And this is way more convenient than writing a test driver to make
> > only unmovable allocations.
> > The same apply for the other migrate types.
>
> Right, but the interface you provide looks like it would allow
> allocating from movable areas etc, and I am not sure that is generally
> helpful (or adds more complexity to handle).
>

This will be a self-contained driver that does not require changes in
the memory subsystem.

Thanks
Juan

> --
> Cheers,
>
> David



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-04-08  0:12       ` Juan Yescas
@ 2026-04-08  7:47         ` David Hildenbrand (Arm)
  2026-04-08 21:32           ` Juan Yescas
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-04-08  7:47 UTC (permalink / raw)
  To: Juan Yescas
  Cc: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, Lorenzo Stoakes, lsf-pc

On 4/8/26 02:12, Juan Yescas wrote:
> On Mon, Mar 23, 2026 at 2:14 AM David Hildenbrand (Arm)
> <david@kernel.org> wrote:
>>
>> On 3/19/26 01:56, Juan Yescas wrote:
>>> Thanks David for you comments,
>>>
>>>
>>> On Mon, Mar 16, 2026 at 8:52 AM David Hildenbrand (Arm)
>>> <david@kernel.org> wrote:
>>> You’re right that going OOT would bypass the strict API stability and
>>> extensibility requirements that come with being in-tree.
>>>
>>> However, there are some symbols that we would need to be exported in
>>> order for the module to compile.
>>
>> Reason I am asking is because we had similar discussions around memory
>> hot(un)plug in the past, where we decided that an OOT kernel module to
>> simulate add/remove was a better choice than exposing weird APIs to user
>> space.
>>
> 
> Hi David, I apologize for the late reply. It’s been a bit of a
> whirlwind over here with some internal issues.
> 
>> Which symbols would you need?
> 
> These are the required symbols:
> 
> ERROR: modpost: "cma_alloc" [page_alloc_debugfs.ko] undefined!

cma_alloc() will be exported soon:

https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-5-e18fda504419@kernel.org


> ERROR: modpost: "migratetype_names" [page_alloc_debugfs.ko] undefined!

I'd assume that you can work around that?

> 
>  > I guess we'd want to call the buddy by
>> specifying node+zone+order.
>>
> That's correct, for the no cma allocations we'll call "alloc_pages_node_noprof"
> 
>> Is specifying the migratetype really relevant?
>>
> 
> Yes, we want to be able to allocate these types of memory:
> 
> MIGRATE_MOVABLE,
> MIGRATE_RECLAIMABLE,
> MIGRATE_CMA,
> 
> When the request is for MIGRATE_CMA, the "default_cma_region" will be
> used for that allocation.
> 
>>>
>>>
>>> Yes, that is actually one of our goals. We often encounter
>>> "heisenbugs" that only manifest
>>> under specific workloads and we want the ability to stress the memory subystem.
>>>
>>> For example, if we want to increase the unmovable allocations by 16 MiB,
>>> a 4 KiB kernel, we can do
>>>
>>> $ for i in {1..4} \
>>> do  \
>>>   echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc
>>
>> How will we handle unmovable allocations ending up on movable memory
>> (e.g., ZONE_MOVABLE)? (e.g., allocating from ZONE_MOVABLE)
>>
> 
> Once the allocation is requested using
> 
> echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Umovable/alloc
> 
> We don't care whether the allocation comes from movable/cma memory.

Well, you should.

If you end up doing something like

echo 1 > /sys/kernel/debug/mm/node-1/zone-Movable/order-10/migrate-Movable/alloc

You are just breaking ZONE_MOVABLE guarantees. Or what am I missing?

> 
>> Also, is there any reason why we can't do it similar to hugetlb and use
>> a simple "nr_pages" variable, that can be set and read.
>>
> 
> We could use a "nr_pages" variable, but we would also need to set the
> node, zone and migrate type.
> 
> It would be cumbersome and error prone to have something like this:
> 
> echo "Node1/zone-Normal/MIGRATE_RECLAIMABLE/8" > /proc/kernel/debug/mm/nr_pages

I meant something like:

echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/nr_pages

(not sure if we really want to specify the migratetype)

-- 
Cheers,

David



* Re: [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs
  2026-04-08  7:47         ` David Hildenbrand (Arm)
@ 2026-04-08 21:32           ` Juan Yescas
  0 siblings, 0 replies; 7+ messages in thread
From: Juan Yescas @ 2026-04-08 21:32 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Suren Baghdasaryan, Kalesh Singh, T.J. Mercier, Isaac Manjarres,
	android-mm, Linux Memory Management List, Matthew Wilcox,
	Vlastimil Babka, Lorenzo Stoakes, lsf-pc

On Wed, Apr 8, 2026 at 12:47 AM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 4/8/26 02:12, Juan Yescas wrote:
> > On Mon, Mar 23, 2026 at 2:14 AM David Hildenbrand (Arm)
> > <david@kernel.org> wrote:
> >>
> >> On 3/19/26 01:56, Juan Yescas wrote:
> >>> Thanks David for you comments,
> >>>
> >>>
> >>> On Mon, Mar 16, 2026 at 8:52 AM David Hildenbrand (Arm)
> >>> <david@kernel.org> wrote:
> >>> You’re right that going OOT would bypass the strict API stability and
> >>> extensibility requirements that come with being in-tree.
> >>>
> >>> However, there are some symbols that we would need to be exported in
> >>> order for the module to compile.
> >>
> >> Reason I am asking is because we had similar discussions around memory
> >> hot(un)plug in the past, where we decided that an OOT kernel module to
> >> simulate add/remove was a better choice than exposing weird APIs to user
> >> space.
> >>
> >
> > Hi David, I apologize for the late reply. It’s been a bit of a
> > whirlwind over here with some internal issues.
> >
> >> Which symbols would you need?
> >
> > These are the required symbols:
> >
> > ERROR: modpost: "cma_alloc" [page_alloc_debugfs.ko] undefined!
>
> cma_alloc() will be exported soon:
>
> https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-5-e18fda504419@kernel.org
>

Excellent, cma_alloc() and cma_release() will come in handy for this use case :)

>
> > ERROR: modpost: "migratetype_names" [page_alloc_debugfs.ko] undefined!
>
> I'd assume that you can work around that?
>

That's true, we can just hardcode the strings.

> >
> >  > I guess we'd want to call the buddy by
> >> specifying node+zone+order.
> >>
> > That's correct, for the no cma allocations we'll call "alloc_pages_node_noprof"
> >
> >> Is specifying the migratetype really relevant?
> >>
> >
> > Yes, we want to be able to allocate these types of memory:
> >
> > MIGRATE_MOVABLE,
> > MIGRATE_RECLAIMABLE,
> > MIGRATE_CMA,
> >
> > When the request is for MIGRATE_CMA, the "default_cma_region" will be
> > used for that allocation.
> >
> >>>
> >>>
> >>> Yes, that is actually one of our goals. We often encounter
> >>> "heisenbugs" that only manifest
> >>> under specific workloads and we want the ability to stress the memory subystem.
> >>>
> >>> For example, if we want to increase the unmovable allocations by 16 MiB,
> >>> a 4 KiB kernel, we can do
> >>>
> >>> $ for i in {1..4} \
> >>> do  \
> >>>   echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Unmovable/alloc
> >>
> >> How will we handle unmovable allocations ending up on movable memory
> >> (e.g., ZONE_MOVABLE)? (e.g., allocating from ZONE_MOVABLE)
> >>
> >
> > Once the allocation is requested using
> >
> > echo 1 > /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Umovable/alloc
> >
> > We don't care whether the allocation comes from movable/cma memory.
>
> Well, you should.
>
> If you end up doing something like
>
> echo 1 >
> /sys/kernel/debug/mm/node-1/zone-movable/order-10/migrate-Movable/alloc
>
> You are just breaking ZONE_MOVABLE guarantees. Or what am I missing?

I see. How should the ZONE_MOVABLE allocations be handled? Should they
be excluded?
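
If the answer turns out to be exclusion, one place to enforce it is the
tooling around the interface. A minimal sketch, assuming a policy (not
decided in this thread) that refuses any target under the Movable zone; a
kernel-side version would more likely simply not create that directory.
"mm_zone_allowed" is a hypothetical name.

```shell
#!/bin/sh
# mm_zone_allowed ZONE
# Return 0 if allocations from ZONE are permitted under the assumed policy,
# non-zero for the Movable zone (either spelling used in this thread).
mm_zone_allowed() {
    case "$1" in
        Movable|movable) return 1 ;;
        *) return 0 ;;
    esac
}
```

A wrapper script would call this before writing to "alloc" and skip the
request when it fails.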

>
> >
> >> Also, is there any reason why we can't do it similar to hugetlb and use
> >> a simple "nr_pages" variable, that can be set and read.
> >>
> >
> > We could use a "nr_pages" variable, but we would also need to set the
> > node, zone and migrate type.
> >
> > It would be cumbersome and error prone to have something like this:
> >
> > echo "Node1/zone-Normal/MIGRATE_RECLAIMABLE/8" > /proc/kernel/debug/mm/nr_pages
>
> I meant something like:
>
> echo 1 >
> /sys/kernel/debug/mm/node-1/zone-Normal/order-10/migrate-Umovable/nr_pages

Got it, we can do something like that. It makes more sense.
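
A sketch of how such a counter could be driven from a script, assuming
"nr_pages" behaves like hugetlb's nr_hugepages (write the desired count,
then read back what was actually obtained). The file does not exist
today; its semantics and the helper name "mm_set_nr_pages" are
assumptions.

```shell
#!/bin/sh
# mm_set_nr_pages DIR COUNT
# Write COUNT to DIR/nr_pages, read it back, and fail if the values differ.
mm_set_nr_pages() {
    file="$1/nr_pages"
    want="$2"
    echo "$want" > "$file" || return 1
    got=$(cat "$file")
    if [ "$got" != "$want" ]; then
        echo "requested $want, kernel reports $got" >&2
        return 1
    fi
    echo "$got"
}
```

Under this model, releasing everything is just "mm_set_nr_pages DIR 0",
with no per-allocation handles to track.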

>
> (not sure if we really want to specify the migratetype)
>

The migrate type is important because we want to make both MIGRATE_CMA
and MIGRATE_RECLAIMABLE allocations.

Thanks David for your comments.

> --
> Cheers,
>
> David




Thread overview: 7+ messages
2026-02-27  2:42 [LSF/MM/BPF TOPIC] Discussion: Targeted memory allocation via debugfs Juan Yescas
2026-03-16 15:52 ` David Hildenbrand (Arm)
2026-03-19  0:56   ` Juan Yescas
2026-03-23  9:14     ` David Hildenbrand (Arm)
2026-04-08  0:12       ` Juan Yescas
2026-04-08  7:47         ` David Hildenbrand (Arm)
2026-04-08 21:32           ` Juan Yescas
