* [LSF/MM/BPF TOPIC] Guaranteed CMA
@ 2025-02-02 0:19 Suren Baghdasaryan
2025-02-04 5:46 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-02-02 0:19 UTC (permalink / raw)
To: lsf-pc
Cc: SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar,
Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka,
Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand,
Michal Hocko, linux-mm, android-kernel-team
Hi,
I would like to discuss the Guaranteed Contiguous Memory Allocator
(GCMA) mechanism that many Android vendors ship as an out-of-tree
feature, collect input on its possible usefulness to others and the
feasibility of upstreaming it, and gather suggestions for possible
better alternatives.

Problem statement: Some workloads/hardware require physically
contiguous memory, and carving out reserved memory areas for such
allocations often leads to inefficient usage of those carveouts. CMA
was designed to solve this inefficiency by allowing movable memory
allocations to use the reserved memory when it is otherwise unused.
When a contiguous memory allocation is requested, CMA finds the
requested contiguous area, possibly migrating some of the movable
pages out of that area.

In latency-sensitive use cases, like face unlock on phones, we need to
allocate contiguous memory quickly, and page migration in CMA takes
long enough to cause user-perceptible lag. Such allocations can also
fail outright if page migration is not possible.

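For reference, this is roughly what the allocation side looks like for
a driver today (a minimal sketch: cma_alloc()/cma_release() are the
existing upstream API, the wrapper functions are illustrative only):

#include <linux/cma.h>
#include <linux/mm.h>

/*
 * Minimal sketch: allocate a physically contiguous buffer from a CMA
 * area declared at boot (e.g. via cma_declare_contiguous()).
 * cma_alloc() may have to migrate movable pages out of the chosen
 * range first, which is where the latency (and the failure mode)
 * comes from.
 */
static struct page *alloc_contig_buffer(struct cma *cma,
					unsigned long nr_pages)
{
	/* align to the buffer's own order; warn on failure */
	return cma_alloc(cma, nr_pages,
			 get_order(nr_pages << PAGE_SHIFT), false);
}

static void free_contig_buffer(struct cma *cma, struct page *pages,
			       unsigned long nr_pages)
{
	cma_release(cma, pages, nr_pages);
}
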
GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which
was not upstreamed but was later adopted by many Android vendors as an
out-of-tree feature. It is similar to CMA, but its backing memory acts
as a cleancache backend, holding only clean file-backed pages. Most
importantly, the kernel cannot take a reference to pages in the
cleancache and therefore cannot prevent GCMA from quickly dropping
them when required. This guarantees low allocation latency for GCMA
and improves its allocation success rate.

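To make the mechanism more concrete, here is a rough sketch of what
such a backend could look like, modeled on the pre-5.17 cleancache_ops
interface (the gcma_store()/gcma_load() helpers are hypothetical):

#include <linux/cleancache.h>

/*
 * Hypothetical sketch of GCMA acting as a cleancache backend for its
 * reserved area. put_page() only *copies* a clean file-backed page
 * into the carveout, and get_page() copies it back out, so the kernel
 * never holds a reference into the carveout and GCMA may drop any
 * cached page instantly when a contiguous allocation needs the range.
 */
static void gcma_cc_put_page(int pool_id, struct cleancache_filekey key,
			     pgoff_t index, struct page *page)
{
	gcma_store(pool_id, key, index, page);	/* copy into carveout */
}

static int gcma_cc_get_page(int pool_id, struct cleancache_filekey key,
			    pgoff_t index, struct page *page)
{
	/* copy back out; fails if the slot was already dropped */
	return gcma_load(pool_id, key, index, page);
}

static const struct cleancache_ops gcma_cleancache_ops = {
	.put_page	= gcma_cc_put_page,
	.get_page	= gcma_cc_get_page,
	/* init_fs and the invalidate_* hooks omitted for brevity */
};

Because nothing outside GCMA ever holds a reference into the carveout,
dropping a cached page is a simple index update, which is what keeps
the allocation path fast.
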
We would like to standardize GCMA implementation and upstream it since
many Android vendors are asking to include it as a generic feature.

Note: the removal of cleancache in the 5.17 kernel due to a lack of
users (sorry, we didn't know about this use case at the time) might
complicate upstreaming.

Desired participants:
GCMA authors: SeongJae Park <sj@kernel.org>, Minchan Kim <minchan@kernel.org>
CMA authors: Marek Szyprowski <m.szyprowski@samsung.com>, Aneesh Kumar
K.V <aneesh.kumar@kernel.org>, Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Michal Nazarewicz <mina86@mina86.com>
The usual suspects (Willy, Vlastimil, Lorenzo, Liam, Michal, David H),
other mm folks
[1] https://lore.kernel.org/lkml/1424721263-25314-2-git-send-email-sj38.park@gmail.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-02 0:19 [LSF/MM/BPF TOPIC] Guaranteed CMA Suren Baghdasaryan
@ 2025-02-04 5:46 ` Christoph Hellwig
  2025-02-04 7:47   ` Lorenzo Stoakes
  2025-02-04 9:03   ` Vlastimil Babka
  2025-02-04 8:18 ` David Hildenbrand
  2025-02-04 9:07 ` Vlastimil Babka
  2 siblings, 2 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-02-04 5:46 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar,
    Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka,
    Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand,
    Michal Hocko, linux-mm, android-kernel-team

On Sat, Feb 01, 2025 at 04:19:25PM -0800, Suren Baghdasaryan wrote:
> Hi,
> I would like to discuss the Guaranteed Contiguous Memory Allocator
> (GCMA) mechanism that many Android vendors ship as an out-of-tree
> feature, [...]

Well, start by having code on the list. We really need to stop all
these super hypothetical "wouldn't it be nice" discussions clogging up
the conferences.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 5:46 ` Christoph Hellwig
@ 2025-02-04 7:47   ` Lorenzo Stoakes
  2025-02-04 7:48     ` Christoph Hellwig
  2025-02-04 9:03   ` Vlastimil Babka
  1 sibling, 1 reply; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-02-04 7:47 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Suren Baghdasaryan, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Liam R. Howlett, David Hildenbrand,
    Michal Hocko, linux-mm, android-kernel-team

On Mon, Feb 03, 2025 at 09:46:21PM -0800, Christoph Hellwig wrote:
> On Sat, Feb 01, 2025 at 04:19:25PM -0800, Suren Baghdasaryan wrote:
> > Hi,
> > I would like to discuss the Guaranteed Contiguous Memory Allocator
> > (GCMA) mechanism that many Android vendors ship as an out-of-tree
> > feature, [...]
>
> Well, start by having code on the list. We really need to stop all
> these super hypothetical "wouldn't it be nice" discussions clogging up
> the conferences.

With respect, what's the point of discussing things that you've already
submitted upstream?... Isn't it useful to discuss things _ahead_ of
submitting them?

And I don't think this is some pie-in-the-sky 'wouldn't it be nice'
discussion. More so discussing a practical feature that is under
consideration.

I think it's actually extremely useful to get a heads up that somebody
is considering doing something; that's where a lot of the value is in
having a face-to-face conversation.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 7:47 ` Lorenzo Stoakes
@ 2025-02-04 7:48   ` Christoph Hellwig
  0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-02-04 7:48 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Christoph Hellwig, Suren Baghdasaryan, lsf-pc, SeongJae Park,
    Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86,
    Matthew Wilcox, Vlastimil Babka, Liam R. Howlett,
    David Hildenbrand, Michal Hocko, linux-mm, android-kernel-team

On Tue, Feb 04, 2025 at 07:47:43AM +0000, Lorenzo Stoakes wrote:
> With respect, what's the point of discussing things that you've already
> submitted upstream?... Isn't it useful to discuss things _ahead_ of
> submitting them?

Not really. That's what the mailinglists are for.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 5:46 ` Christoph Hellwig
  2025-02-04 7:47   ` Lorenzo Stoakes
@ 2025-02-04 9:03   ` Vlastimil Babka
  2025-02-04 15:56     ` Suren Baghdasaryan
  1 sibling, 1 reply; 22+ messages in thread
From: Vlastimil Babka @ 2025-02-04 9:03 UTC (permalink / raw)
To: Christoph Hellwig, Suren Baghdasaryan
Cc: lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar,
    Joonsoo Kim, mina86, Matthew Wilcox, Lorenzo Stoakes,
    Liam R. Howlett, David Hildenbrand, Michal Hocko, linux-mm,
    android-kernel-team

On 2/4/25 06:46, Christoph Hellwig wrote:
> Well, start by having code on the list. We really need to stop all
> these super hypothetical "wouldn't it be nice" discussions clogging up
> the conferences.

A bit later the email says:

> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which
> was not upstreamed but was later adopted by many Android vendors as an
> out-of-tree feature.

[1] https://lore.kernel.org/lkml/1424721263-25314-2-git-send-email-sj38.park@gmail.com/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 9:03 ` Vlastimil Babka
@ 2025-02-04 15:56   ` Suren Baghdasaryan
  0 siblings, 0 replies; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-02-04 15:56 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Christoph Hellwig, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand,
    Michal Hocko, linux-mm, android-kernel-team

On Tue, Feb 4, 2025 at 1:03 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> A bit later the email says:
>
> > GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which
> > was not upstreamed but was later adopted by many Android vendors as an
> > out-of-tree feature.
>
> [1] https://lore.kernel.org/lkml/1424721263-25314-2-git-send-email-sj38.park@gmail.com/

Yeah, the link above can be used as a reference to get the idea behind
this feature. I'll try to post a new version before the conference, but
that requires some discussions with Android vendors to ensure we don't
miss anything they need (each of them might have tweaked their
implementation).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-02 0:19 [LSF/MM/BPF TOPIC] Guaranteed CMA Suren Baghdasaryan
  2025-02-04 5:46 ` Christoph Hellwig
@ 2025-02-04 8:18 ` David Hildenbrand
  2025-02-04 11:23   ` Alexandru Elisei
  2025-02-04 9:07 ` Vlastimil Babka
  2 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-02-04 8:18 UTC (permalink / raw)
To: Suren Baghdasaryan, lsf-pc
Cc: SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar,
    Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka,
    Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm,
    android-kernel-team, Alexandru Elisei

On 02.02.25 01:19, Suren Baghdasaryan wrote:
> Hi,

Hi,

> I would like to discuss the Guaranteed Contiguous Memory Allocator
> (GCMA) mechanism that many Android vendors ship as an out-of-tree
> feature, [...]
>
> Note: the removal of cleancache in the 5.17 kernel due to a lack of
> users (sorry, we didn't know about this use case at the time) might
> complicate upstreaming.

we discussed another possible user last year: using MTE tag storage
memory while the storage is not getting used to store MTE tags [1].

As long as the "ordinary RAM" that maps to a given MTE tag storage
area does not use MTE tagging, we can reuse the MTE tag storage
("almost ordinary RAM, just that it doesn't support MTE itself") for
different purposes.

We need a guarantee that that memory can be freed up / migrated once
the tag storage gets activated.

We continued that discussion offline, and two users of such memory we
discussed would be frontswap, and using it as a memory backend for
something like swap/zswap: where the pages cannot get pinned / turned
unmovable.

[1] https://lore.kernel.org/linux-mm/ZOc0fehF02MohuWr@arm.com/

-- 
Cheers,

David / dhildenb

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 8:18 ` David Hildenbrand
@ 2025-02-04 11:23   ` Alexandru Elisei
  2025-02-04 16:33     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2025-02-04 11:23 UTC (permalink / raw)
To: David Hildenbrand
Cc: Suren Baghdasaryan, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

Hi,

On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote:
> we discussed another possible user last year: using MTE tag storage
> memory while the storage is not getting used to store MTE tags [1].
> [...]
> We need a guarantee that that memory can be freed up / migrated once
> the tag storage gets activated.

If I remember correctly, one of the issues with the MTE project that
might be relevant to GCMA was that userspace, once it gets hold of a
page, can pin it for a very long time without specifying FOLL_LONGTERM.

There were two examples given for this; there might be more, or they
might have been eliminated since then (see the sketch after this list):

* The page is used as a buffer for accesses to a file opened with
  O_DIRECT.

* 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' -
  that's a direct quote from David [1].

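Roughly, the distinction looks like this from the GUP side (a sketch
only; pin_user_pages_fast() and the FOLL_* flags are the upstream API,
the wrapper function is illustrative):

#include <linux/mm.h>

/*
 * Sketch: pin nr_pages of user memory starting at start. With
 * FOLL_LONGTERM, GUP migrates the pages out of CMA/ZONE_MOVABLE
 * before pinning, so the pin cannot later block a contiguous
 * allocation. O_DIRECT and (currently) vmsplice() pin without it,
 * so no migration happens first -- but nothing bounds how long such
 * a "short-term" pin is actually held.
 */
static int pin_example(unsigned long start, int nr_pages,
		       struct page **pages, bool longterm)
{
	unsigned int gup_flags = FOLL_WRITE;

	if (longterm)
		gup_flags |= FOLL_LONGTERM;

	return pin_user_pages_fast(start, nr_pages, gup_flags, pages);
}
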
Depending on your usecases, failing the allocation might be acceptable,
but for MTE that wasn't the case.

Hope some of this is useful.

[1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/

Thanks,
Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 11:23 ` Alexandru Elisei
@ 2025-02-04 16:33   ` Suren Baghdasaryan
  2025-03-20 18:06     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-02-04 16:33 UTC (permalink / raw)
To: Alexandru Elisei
Cc: David Hildenbrand, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei
<alexandru.elisei@arm.com> wrote:
> If I remember correctly, one of the issues with the MTE project that
> might be relevant to GCMA was that userspace, once it gets hold of a
> page, can pin it for a very long time without specifying
> FOLL_LONGTERM.
> [...]
> Depending on your usecases, failing the allocation might be
> acceptable, but for MTE that wasn't the case.
>
> Hope some of this is useful.
>
> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/

Thanks for the references! I'll read through these discussions to see
how much useful information for GCMA I can extract.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-02-04 16:33 ` Suren Baghdasaryan
@ 2025-03-20 18:06   ` Suren Baghdasaryan
  2025-04-02 16:35     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-03-20 18:06 UTC (permalink / raw)
To: Alexandru Elisei
Cc: David Hildenbrand, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote:
> Thanks for the references! I'll read through these discussions to see
> how much useful information for GCMA I can extract.

I wanted to get RFC code out ahead of LSF/MM and just finished putting
it together. Sorry for the last-minute posting. You can find it here:
https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/

Thanks,
Suren.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-03-20 18:06 ` Suren Baghdasaryan
@ 2025-04-02 16:35   ` Suren Baghdasaryan
  2025-08-22 22:14     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-04-02 16:35 UTC (permalink / raw)
To: Alexandru Elisei
Cc: David Hildenbrand, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

[-- Attachment #1: Type: text/plain, Size: 5073 bytes --]

On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote:
> I wanted to get RFC code out ahead of LSF/MM and just finished putting
> it together. Sorry for the last-minute posting. You can find it here:
> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/

Sorry about the delay. Attached are the slides from my GCMA
presentation at the conference.

Thanks,
Suren.

[-- Attachment #2: GCMA_LSFMM2025.pdf --]
[-- Type: application/pdf, Size: 561692 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-04-02 16:35 ` Suren Baghdasaryan
@ 2025-08-22 22:14   ` Suren Baghdasaryan
  2025-08-26 8:58     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-08-22 22:14 UTC (permalink / raw)
To: Alexandru Elisei
Cc: David Hildenbrand, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote:
> Sorry about the delay. Attached are the slides from my GCMA
> presentation at the conference.

Hi Folks,
As I'm getting close to finalizing the GCMA patchset, one question
keeps bugging me: how do we account for memory that is allocated from
GCMA? CMA allocations are backed by system memory, so accounting is
straightforward: allocations contribute to RSS, count towards memcg
limits, etc. In the case of GCMA, the backing memory is reserved
memory (a carveout) that is not directly accessible by the rest of the
system and is not part of total_memory. So, if a process allocates a
buffer from GCMA, should it be accounted as a normal allocation from
system memory or as something else entirely? Any thoughts?

Thanks,
Suren.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-08-22 22:14 ` Suren Baghdasaryan
@ 2025-08-26 8:58   ` David Hildenbrand
  2025-08-27 0:17     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-08-26 8:58 UTC (permalink / raw)
To: Suren Baghdasaryan, Alexandru Elisei
Cc: lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar,
    Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka,
    Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm,
    android-kernel-team

On 23.08.25 00:14, Suren Baghdasaryan wrote:
> As I'm getting close to finalizing the GCMA patchset, one question
> keeps bugging me: how do we account for memory that is allocated from
> GCMA?
> [...]
> So, if a process allocates a buffer from GCMA, should it be accounted
> as a normal allocation from system memory or as something else
> entirely? Any thoughts?

You mean, an application allocates the memory and maps it into its page
tables?

Can that memory get reclaimed somehow?

How would we be mapping these pages into processes (VM_PFNMAP or
"normal" mappings)?

memcg doesn't quite make sense, I assume.

RSS ... hm ...

-- 
Cheers

David / dhildenb

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-08-26 8:58 ` David Hildenbrand
@ 2025-08-27 0:17   ` Suren Baghdasaryan
  2025-09-01 16:01     ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Suren Baghdasaryan @ 2025-08-27 0:17 UTC (permalink / raw)
To: David Hildenbrand
Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote:
> You mean, an application allocates the memory and maps it into its
> page tables?

Allocation will happen via cma_alloc() or a similar interface, so
applications would have to use some driver to allocate from GCMA. Once
allocated, an application can map that memory if the driver supports
mapping.

> Can that memory get reclaimed somehow?

Hmm. I assume that once a driver allocates pages from GCMA it won't put
them on the system-managed LRU or free them into the buddy allocator
for the kernel to use. If it did, then at the time of cma_release() it
couldn't guarantee that there are no remaining users of such pages.

> How would we be mapping these pages into processes (VM_PFNMAP or
> "normal" mappings)?

They would be normal mappings, as the pages do have a `struct page`,
but I expect these pages to be managed by the driver that allocated
them rather than by the core kernel itself.

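For example, a driver could implement such a mapping roughly like this
(a sketch; struct gcma_buf is hypothetical, while vm_insert_page() is
the existing API for mapping driver-owned pages that have a struct
page):

#include <linux/mm.h>

/* Hypothetical: pages returned by cma_alloc() plus their count,
 * owned and lifetime-managed by the driver. */
struct gcma_buf {
	struct page *pages;
	unsigned long nr_pages;
};

/*
 * Sketch of a driver mmap path for a GCMA-backed buffer.
 * vm_insert_page() creates a "normal" (non-VM_PFNMAP) mapping, so the
 * usual refcounting and RSS accounting apply to these pages.
 */
static int gcma_buf_mmap(struct gcma_buf *buf, struct vm_area_struct *vma)
{
	unsigned long addr = vma->vm_start;
	unsigned long i;
	int ret;

	for (i = 0; i < buf->nr_pages; i++) {
		ret = vm_insert_page(vma, addr, &buf->pages[i]);
		if (ret)
			return ret;
		addr += PAGE_SIZE;
	}
	return 0;
}
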
I was trying to design GCMA to be used as closely to CMA as possible,
so that we can use the same cma_alloc()/cma_release() API and reuse
CMA's page management code, but the fact that CMA is backed by system
memory while GCMA is backed by a carveout makes it a bit difficult.

> memcg doesn't quite make sense, I assume.
>
> RSS ... hm ...

Yeah, I'm also unsure. I agree that memcg would not make sense, because
this is not memory that can be reclaimed and used by others.

Thanks for the feedback, David! Hope we can figure out some rules that
make sense here...
Suren.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA
  2025-08-27 0:17 ` Suren Baghdasaryan
@ 2025-09-01 16:01   ` David Hildenbrand
  2025-10-10 1:30     ` Suren Baghdasaryan
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2025-09-01 16:01 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim,
    m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox,
    Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko,
    linux-mm, android-kernel-team

On 27.08.25 02:17, Suren Baghdasaryan wrote:
> Allocation will happen via cma_alloc() or a similar interface, so
> applications would have to use some driver to allocate from GCMA.
> Once allocated, an application can map that memory if the driver
> supports mapping.

Right, and that might happen either through a VM_PFNMAP or a !VM_PFNMAP
(ordinarily ref- and currently map-counted) mapping. In the
insert_page() case we do an inc_mm_counter(), which increases the RSS.
That could happen with pages from carveouts (memblock allocations)
already, but we don't run into that in general, I assume.

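For reference, that path looks roughly like this (simplified from
mm/memory.c; not the exact upstream code):

/*
 * Simplified sketch of mm/memory.c:insert_page_into_pte_locked() --
 * what vm_insert_page() boils down to once the PTE lock is held. The
 * inc_mm_counter() call is what makes the page show up in the
 * process's RSS.
 */
static int insert_page_into_pte_locked(struct vm_area_struct *vma,
		pte_t *pte, unsigned long addr, struct page *page,
		pgprot_t prot)
{
	struct folio *folio = page_folio(page);

	if (!pte_none(ptep_get(pte)))
		return -EBUSY;

	folio_get(folio);
	inc_mm_counter(vma->vm_mm, mm_counter_file(folio));	/* RSS++ */
	folio_add_file_rmap_pte(folio, page, vma);
	set_pte_at(vma->vm_mm, addr, pte, mk_pte(page, prot));
	return 0;
}
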
> > They would be normal mappings as the pages do have `struct page` but I > expect these pages to be managed by the driver that allocated them > rather than the core kernel itself. > > I was trying to design GCMA to be used as close to CMA as possible so > that we can use the same cma_alloc/cma_release API and reuse CMA's > page management code but the fact that CMA is backed by the system > memory and GCMA is backed by a carveout makes it a bit difficult. Makes sense. So I assume memcg does not apply here already -- memcg does not apply on the CMA layer IIRC. The RSS is a bit tricky. We would have to modify things like inc_mm_counter() to special-case on these things. But then, smaps output would still count these pages towards the rss/pss (e.g., mss->resident). So that needs care as well ... -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 22+ messages in thread
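To make the two mapping flavors above concrete, here is a minimal driver-side sketch. cma_alloc()/cma_release(), remap_pfn_range() and vm_insert_page() are the real kernel interfaces under discussion; everything prefixed demo_ is hypothetical glue, and the error path is simplified (a real driver would unmap any already-inserted pages before releasing the range):

#include <linux/cma.h>
#include <linux/fs.h>
#include <linux/mm.h>

static struct cma *demo_cma;            /* hypothetical: set up from a reserved area at boot */
static bool demo_use_pfnmap;            /* hypothetical policy switch */

static int demo_mmap(struct file *file, struct vm_area_struct *vma)
{
        unsigned long nr_pages = vma_pages(vma);
        unsigned long addr, i;
        struct page *pages;
        int ret = 0;

        pages = cma_alloc(demo_cma, nr_pages, 0, false);
        if (!pages)
                return -ENOMEM;

        if (demo_use_pfnmap) {
                /* VM_PFNMAP mapping: no rmap, never counted in the process RSS. */
                ret = remap_pfn_range(vma, vma->vm_start, page_to_pfn(pages),
                                      nr_pages << PAGE_SHIFT, vma->vm_page_prot);
        } else {
                /*
                 * "Normal" mapping: each page is ref- and map-counted, and
                 * insert_page() bumps the RSS via inc_mm_counter().
                 */
                for (addr = vma->vm_start, i = 0; i < nr_pages; i++, addr += PAGE_SIZE) {
                        ret = vm_insert_page(vma, addr, pages + i);
                        if (ret)
                                break;
                }
        }
        if (ret)
                cma_release(demo_cma, pages, nr_pages);  /* simplified cleanup */
        return ret;
}

The accounting questions in the rest of the thread concern the second branch, where each inserted page is charged to the process RSS.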
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-09-01 16:01 ` David Hildenbrand @ 2025-10-10 1:30 ` Suren Baghdasaryan 2025-10-10 13:58 ` David Hildenbrand 0 siblings, 1 reply; 22+ messages in thread From: Suren Baghdasaryan @ 2025-10-10 1:30 UTC (permalink / raw) To: David Hildenbrand Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm, android-kernel-team On Mon, Sep 1, 2025 at 9:01 AM David Hildenbrand <david@redhat.com> wrote: > > On 27.08.25 02:17, Suren Baghdasaryan wrote: > > On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 23.08.25 00:14, Suren Baghdasaryan wrote: > >>> On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>> > >>>> On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>> > >>>>> On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>> > >>>>>> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei > >>>>>> <alexandru.elisei@arm.com> wrote: > >>>>>>> > >>>>>>> Hi, > >>>>>>> > >>>>>>> On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote: > >>>>>>>> On 02.02.25 01:19, Suren Baghdasaryan wrote: > >>>>>>>>> Hi, > >>>>>>>> > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>>> I would like to discuss the Guaranteed Contiguous Memory Allocator > >>>>>>>>> (GCMA) mechanism that is being used by many Android vendors as an > >>>>>>>>> out-of-tree feature, collect input on its possible usefulness for > >>>>>>>>> others, feasibility to upstream and suggestions for possible better > >>>>>>>>> alternatives. > >>>>>>>>> > >>>>>>>>> Problem statement: Some workloads/hardware require physically > >>>>>>>>> contiguous memory and carving out reserved memory areas for such > >>>>>>>>> allocations often lead to inefficient usage of those carveouts. CMA > >>>>>>>>> was designed to solve this inefficiency by allowing movable memory > >>>>>>>>> allocations to use this reserved memory when it’s otherwise unused. > >>>>>>>>> When a contiguous memory allocation is requested, CMA finds the > >>>>>>>>> requested contiguous area, possibly migrating some of the movable > >>>>>>>>> pages out of that area. > >>>>>>>>> In latency-sensitive use cases, like face unlock on phones, we need to > >>>>>>>>> allocate contiguous memory quickly and page migration in CMA takes > >>>>>>>>> enough time to cause user-perceptible lag. Such allocations can also > >>>>>>>>> fail if page migration is not possible. > >>>>>>>>> > >>>>>>>>> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which > >>>>>>>>> was not upstreamed but got adopted later by many Android vendors as an > >>>>>>>>> out-of-tree feature. It is similar to CMA but backing memory is > >>>>>>>>> cleancache backend, containing only clean file-backed pages. Most > >>>>>>>>> importantly, the kernel can’t take a reference to pages from the > >>>>>>>>> cleancache, therefore can’t prevent GCMA from quickly dropping them > >>>>>>>>> when required. This guarantees GCMA low allocation latency and > >>>>>>>>> improves allocation success rate. > >>>>>>>>> > >>>>>>>>> We would like to standardize GCMA implementation and upstream it since > >>>>>>>>> many Android vendors are asking to include it as a generic feature. 
> >>>>>>>>> > >>>>>>>>> Note: removal of cleancache in 5.17 kernel due to no users (sorry, we > >>>>>>>>> didn’t know at the time about this use case) might complicate > >>>>>>>>> upstreaming. > >>>>>>>> > >>>>>>>> we discussed another possible user last year: using MTE tag storage memory > >>>>>>>> while the storage is not getting used to store MTE tags [1]. > >>>>>>>> > >>>>>>>> As long as the "ordinary RAM" that maps to a given MTE tag storage area does > >>>>>>>> not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM, > >>>>>>>> just that it doesn't support MTE itself") for different purposes. > >>>>>>>> > >>>>>>>> We need a guarantee that that memory can be freed up / migrated once the tag > >>>>>>>> storage gets activated. > >>>>>>> > >>>>>>> If I remember correctly, one of the issues with the MTE project that might be > >>>>>>> relevant to GCMA, was that userspace, once it gets a hold of a page, it can pin > >>>>>>> it for a very long time without specifying FOLL_LONGTERM. > >>>>>>> > >>>>>>> If I remember things correctly, there were two examples given for this; there > >>>>>>> might be more, or they might have been eliminated since then: > >>>>>>> > >>>>>>> * The page is used as a buffer for accesses to a file opened with > >>>>>>> O_DIRECT. > >>>>>>> > >>>>>>> * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's > >>>>>>> a direct quote from David [1]. > >>>>>>> > >>>>>>> Depending on your usecases, failing the allocation might be acceptable, but for > >>>>>>> MTE that wasn't the case. > >>>>>>> > >>>>>>> Hope some of this is useful. > >>>>>>> > >>>>>>> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/ > >>>>>> > >>>>>> Thanks for the references! I'll read through these discussions to see > >>>>>> how much useful information for GCMA I can extract. > >>>>> > >>>>> I wanted to get an RFC code ahead of LSF/MM and just finished putting > >>>>> it together. Sorry for the last minute posting. You can find it here: > >>>>> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/ > >>>> > >>>> Sorry about the delay. Attached are the slides from my GCMA > >>>> presentation at the conference. > >>> > >>> Hi Folks, > >> > >> Hi, > >> > >>> As I'm getting close to finalizing the GCMA patchset, one question > >>> keeps bugging me. How do we account the memory that is allocated from > >>> GCMA... In case of CMA allocations, they are backed by the system > >>> memory, so accounting is straightforward, allocations contribute to > >>> RSS, counted towards memcg limits, etc. In case of GCMA, the backing > >>> memory is reserved memory (a carveout) not directly accessible by the > >>> rest of the system and not part of the total_memory. So, if a process > >>> allocates a buffer from GCMA, should it be accounted as a normal > >>> allocation from system memory or as something else entirely? Any > >>> thoughts? > >> > >> You mean, an application allocates the memory and maps it into its page > >> tables? > > > > Allocation will happen via cma_alloc() or a similar interface, so > > applications would have to use some driver to allocate from GCMA. Once > > allocated, an application can map that memory if the driver supports > > mapping. > > Right, and that might happen either through a VM_PFNMAP or !VM_PFNMAP > (ordinarily ref- and currently map-counted). > > In the insert_page() case we do an inc_mm_counter, which increases the RSS. 
> > That could happen with pages from carevouts (memblock allocations) > already, but we don't run into that in general I assume. > > > > >> > >> Can that memory get reclaimed somehow? > > > > Hmm. I assume that once a driver allocates pages from GCMA it won't > > put them into system-managed LRU or free them into buddy allocator for > > kernel to use. If it does then at the time of cma_release() it can't > > guarantee there are no more users for such pages. > > > >> > >> How would we be mapping these pages into processes (VM_PFNMAP or > >> "normal" mappings)? > > > > They would be normal mappings as the pages do have `struct page` but I > > expect these pages to be managed by the driver that allocated them > > rather than the core kernel itself. > > > > I was trying to design GCMA to be used as close to CMA as possible so > > that we can use the same cma_alloc/cma_release API and reuse CMA's > > page management code but the fact that CMA is backed by the system > > memory and GCMA is backed by a carveout makes it a bit difficult. > > Makes sense. So I assume memcg does not apply here already -- memcg does > not apply on the CMA layer IIRC. > > The RSS is a bit tricky. We would have to modify things like > inc_mm_counter() to special-case on these things. > > But then, smaps output would still count these pages towards the rss/pss > (e.g., mss->resident). So that needs care as well ... In the end I decided to follow CMA as closely as possible, including accounting. GCMA and CMA both use a reserved area; the difference is that CMA donates its memory to the kernel to use for movable allocations while GCMA donates it to the cleancache. But once that donation is taken back by CMA/GCMA to satisfy a cma_alloc() request, the memory usage is pretty much the same, and therefore the accounting should probably be the same. Anyway, that was the reasoning I eventually arrived at. I posted the GCMA patchset at [1] and included this reasoning in the cover letter. Happy to discuss this further in that patchset. Thanks! [1] https://lore.kernel.org/all/20251010011951.2136980-1-surenb@google.com/ > > -- > Cheers > > David / dhildenb > ^ permalink raw reply [flat|nested] 22+ messages in thread
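The "donated to the cleancache" model is what keeps that allocation path migration-free. A conceptual sketch, with every gcma_* name below invented for illustration rather than taken from the posted patchset:

/*
 * Illustrative only: what satisfying cma_alloc() from a GCMA area
 * roughly amounts to. The carveout holds nothing but clean,
 * kernel-unreferenced copies of file-backed pages, so reclaiming a
 * range means simply forgetting those copies.
 */
static struct page *gcma_alloc_range(struct gcma_area *area,    /* hypothetical type */
                                     unsigned long pfn, unsigned long count)
{
        unsigned long i;

        for (i = 0; i < count; i++)
                gcma_drop_clean_copy(area, pfn + i);    /* hypothetical helper; cannot fail */

        return pfn_to_page(pfn);        /* contiguous range, available immediately */
}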
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-10-10 1:30 ` Suren Baghdasaryan @ 2025-10-10 13:58 ` David Hildenbrand 2025-10-10 15:07 ` Suren Baghdasaryan 0 siblings, 1 reply; 22+ messages in thread From: David Hildenbrand @ 2025-10-10 13:58 UTC (permalink / raw) To: Suren Baghdasaryan Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm, android-kernel-team On 10.10.25 03:30, Suren Baghdasaryan wrote: > On Mon, Sep 1, 2025 at 9:01 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 27.08.25 02:17, Suren Baghdasaryan wrote: >>> On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> On 23.08.25 00:14, Suren Baghdasaryan wrote: >>>>> On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>> >>>>>> On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>>> >>>>>>> On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>>>> >>>>>>>> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei >>>>>>>> <alexandru.elisei@arm.com> wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote: >>>>>>>>>> On 02.02.25 01:19, Suren Baghdasaryan wrote: >>>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>>> I would like to discuss the Guaranteed Contiguous Memory Allocator >>>>>>>>>>> (GCMA) mechanism that is being used by many Android vendors as an >>>>>>>>>>> out-of-tree feature, collect input on its possible usefulness for >>>>>>>>>>> others, feasibility to upstream and suggestions for possible better >>>>>>>>>>> alternatives. >>>>>>>>>>> >>>>>>>>>>> Problem statement: Some workloads/hardware require physically >>>>>>>>>>> contiguous memory and carving out reserved memory areas for such >>>>>>>>>>> allocations often lead to inefficient usage of those carveouts. CMA >>>>>>>>>>> was designed to solve this inefficiency by allowing movable memory >>>>>>>>>>> allocations to use this reserved memory when it’s otherwise unused. >>>>>>>>>>> When a contiguous memory allocation is requested, CMA finds the >>>>>>>>>>> requested contiguous area, possibly migrating some of the movable >>>>>>>>>>> pages out of that area. >>>>>>>>>>> In latency-sensitive use cases, like face unlock on phones, we need to >>>>>>>>>>> allocate contiguous memory quickly and page migration in CMA takes >>>>>>>>>>> enough time to cause user-perceptible lag. Such allocations can also >>>>>>>>>>> fail if page migration is not possible. >>>>>>>>>>> >>>>>>>>>>> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which >>>>>>>>>>> was not upstreamed but got adopted later by many Android vendors as an >>>>>>>>>>> out-of-tree feature. It is similar to CMA but backing memory is >>>>>>>>>>> cleancache backend, containing only clean file-backed pages. Most >>>>>>>>>>> importantly, the kernel can’t take a reference to pages from the >>>>>>>>>>> cleancache, therefore can’t prevent GCMA from quickly dropping them >>>>>>>>>>> when required. This guarantees GCMA low allocation latency and >>>>>>>>>>> improves allocation success rate. >>>>>>>>>>> >>>>>>>>>>> We would like to standardize GCMA implementation and upstream it since >>>>>>>>>>> many Android vendors are asking to include it as a generic feature. 
>>>>>>>>>>> >>>>>>>>>>> Note: removal of cleancache in 5.17 kernel due to no users (sorry, we >>>>>>>>>>> didn’t know at the time about this use case) might complicate >>>>>>>>>>> upstreaming. >>>>>>>>>> >>>>>>>>>> we discussed another possible user last year: using MTE tag storage memory >>>>>>>>>> while the storage is not getting used to store MTE tags [1]. >>>>>>>>>> >>>>>>>>>> As long as the "ordinary RAM" that maps to a given MTE tag storage area does >>>>>>>>>> not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM, >>>>>>>>>> just that it doesn't support MTE itself") for different purposes. >>>>>>>>>> >>>>>>>>>> We need a guarantee that that memory can be freed up / migrated once the tag >>>>>>>>>> storage gets activated. >>>>>>>>> >>>>>>>>> If I remember correctly, one of the issues with the MTE project that might be >>>>>>>>> relevant to GCMA, was that userspace, once it gets a hold of a page, it can pin >>>>>>>>> it for a very long time without specifying FOLL_LONGTERM. >>>>>>>>> >>>>>>>>> If I remember things correctly, there were two examples given for this; there >>>>>>>>> might be more, or they might have been eliminated since then: >>>>>>>>> >>>>>>>>> * The page is used as a buffer for accesses to a file opened with >>>>>>>>> O_DIRECT. >>>>>>>>> >>>>>>>>> * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's >>>>>>>>> a direct quote from David [1]. >>>>>>>>> >>>>>>>>> Depending on your usecases, failing the allocation might be acceptable, but for >>>>>>>>> MTE that wasn't the case. >>>>>>>>> >>>>>>>>> Hope some of this is useful. >>>>>>>>> >>>>>>>>> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/ >>>>>>>> >>>>>>>> Thanks for the references! I'll read through these discussions to see >>>>>>>> how much useful information for GCMA I can extract. >>>>>>> >>>>>>> I wanted to get an RFC code ahead of LSF/MM and just finished putting >>>>>>> it together. Sorry for the last minute posting. You can find it here: >>>>>>> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/ >>>>>> >>>>>> Sorry about the delay. Attached are the slides from my GCMA >>>>>> presentation at the conference. >>>>> >>>>> Hi Folks, >>>> >>>> Hi, >>>> >>>>> As I'm getting close to finalizing the GCMA patchset, one question >>>>> keeps bugging me. How do we account the memory that is allocated from >>>>> GCMA... In case of CMA allocations, they are backed by the system >>>>> memory, so accounting is straightforward, allocations contribute to >>>>> RSS, counted towards memcg limits, etc. In case of GCMA, the backing >>>>> memory is reserved memory (a carveout) not directly accessible by the >>>>> rest of the system and not part of the total_memory. So, if a process >>>>> allocates a buffer from GCMA, should it be accounted as a normal >>>>> allocation from system memory or as something else entirely? Any >>>>> thoughts? >>>> >>>> You mean, an application allocates the memory and maps it into its page >>>> tables? >>> >>> Allocation will happen via cma_alloc() or a similar interface, so >>> applications would have to use some driver to allocate from GCMA. Once >>> allocated, an application can map that memory if the driver supports >>> mapping. >> >> Right, and that might happen either through a VM_PFNMAP or !VM_PFNMAP >> (ordinarily ref- and currently map-counted). >> >> In the insert_page() case we do an inc_mm_counter, which increases the RSS. 
>> >> That could happen with pages from carevouts (memblock allocations) >> already, but we don't run into that in general I assume. >> >>> >>>> >>>> Can that memory get reclaimed somehow? >>> >>> Hmm. I assume that once a driver allocates pages from GCMA it won't >>> put them into system-managed LRU or free them into buddy allocator for >>> kernel to use. If it does then at the time of cma_release() it can't >>> guarantee there are no more users for such pages. >>> >>>> >>>> How would we be mapping these pages into processes (VM_PFNMAP or >>>> "normal" mappings)? >>> >>> They would be normal mappings as the pages do have `struct page` but I >>> expect these pages to be managed by the driver that allocated them >>> rather than the core kernel itself. >>> >>> I was trying to design GCMA to be used as close to CMA as possible so >>> that we can use the same cma_alloc/cma_release API and reuse CMA's >>> page management code but the fact that CMA is backed by the system >>> memory and GCMA is backed by a carveout makes it a bit difficult. >> >> Makes sense. So I assume memcg does not apply here already -- memcg does >> not apply on the CMA layer IIRC. >> >> The RSS is a bit tricky. We would have to modify things like >> inc_mm_counter() to special-case on these things. >> >> But then, smaps output would still count these pages towards the rss/pss >> (e.g., mss->resident). So that needs care as well ... > > In the end I decided to follow CMA as closely as possible, including > accounting. GCMA and CMA both use reserved area and the difference is > that CMA donates its memory to kernel to use for movable allocations > while GCMA donates it to the cleancache. But once that donation is > taken back by CMA/GCMA to satisfy cma_alloc() request, the memory > usage is pretty much the same and therefore accounting should probably > be the same. Anyway, that was the reasoning I eventually arrived at. I > posted the GCMA patchset at [1] and included this reasoning in the > cover letter. Happy to discuss this further in that patchset. Right, probably best to keep it simple. Will these GCMA pages be accounted towards MemTotal like CMA pages would? -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-10-10 13:58 ` David Hildenbrand @ 2025-10-10 15:07 ` Suren Baghdasaryan 2025-10-10 15:37 ` David Hildenbrand 0 siblings, 1 reply; 22+ messages in thread From: Suren Baghdasaryan @ 2025-10-10 15:07 UTC (permalink / raw) To: David Hildenbrand Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm, android-kernel-team On Fri, Oct 10, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote: > > On 10.10.25 03:30, Suren Baghdasaryan wrote: > > On Mon, Sep 1, 2025 at 9:01 AM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 27.08.25 02:17, Suren Baghdasaryan wrote: > >>> On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote: > >>>> > >>>> On 23.08.25 00:14, Suren Baghdasaryan wrote: > >>>>> On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>> > >>>>>> On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>>> > >>>>>>> On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>>>> > >>>>>>>> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei > >>>>>>>> <alexandru.elisei@arm.com> wrote: > >>>>>>>>> > >>>>>>>>> Hi, > >>>>>>>>> > >>>>>>>>> On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote: > >>>>>>>>>> On 02.02.25 01:19, Suren Baghdasaryan wrote: > >>>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>>> I would like to discuss the Guaranteed Contiguous Memory Allocator > >>>>>>>>>>> (GCMA) mechanism that is being used by many Android vendors as an > >>>>>>>>>>> out-of-tree feature, collect input on its possible usefulness for > >>>>>>>>>>> others, feasibility to upstream and suggestions for possible better > >>>>>>>>>>> alternatives. > >>>>>>>>>>> > >>>>>>>>>>> Problem statement: Some workloads/hardware require physically > >>>>>>>>>>> contiguous memory and carving out reserved memory areas for such > >>>>>>>>>>> allocations often lead to inefficient usage of those carveouts. CMA > >>>>>>>>>>> was designed to solve this inefficiency by allowing movable memory > >>>>>>>>>>> allocations to use this reserved memory when it’s otherwise unused. > >>>>>>>>>>> When a contiguous memory allocation is requested, CMA finds the > >>>>>>>>>>> requested contiguous area, possibly migrating some of the movable > >>>>>>>>>>> pages out of that area. > >>>>>>>>>>> In latency-sensitive use cases, like face unlock on phones, we need to > >>>>>>>>>>> allocate contiguous memory quickly and page migration in CMA takes > >>>>>>>>>>> enough time to cause user-perceptible lag. Such allocations can also > >>>>>>>>>>> fail if page migration is not possible. > >>>>>>>>>>> > >>>>>>>>>>> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which > >>>>>>>>>>> was not upstreamed but got adopted later by many Android vendors as an > >>>>>>>>>>> out-of-tree feature. It is similar to CMA but backing memory is > >>>>>>>>>>> cleancache backend, containing only clean file-backed pages. Most > >>>>>>>>>>> importantly, the kernel can’t take a reference to pages from the > >>>>>>>>>>> cleancache, therefore can’t prevent GCMA from quickly dropping them > >>>>>>>>>>> when required. This guarantees GCMA low allocation latency and > >>>>>>>>>>> improves allocation success rate. 
> >>>>>>>>>>> > >>>>>>>>>>> We would like to standardize GCMA implementation and upstream it since > >>>>>>>>>>> many Android vendors are asking to include it as a generic feature. > >>>>>>>>>>> > >>>>>>>>>>> Note: removal of cleancache in 5.17 kernel due to no users (sorry, we > >>>>>>>>>>> didn’t know at the time about this use case) might complicate > >>>>>>>>>>> upstreaming. > >>>>>>>>>> > >>>>>>>>>> we discussed another possible user last year: using MTE tag storage memory > >>>>>>>>>> while the storage is not getting used to store MTE tags [1]. > >>>>>>>>>> > >>>>>>>>>> As long as the "ordinary RAM" that maps to a given MTE tag storage area does > >>>>>>>>>> not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM, > >>>>>>>>>> just that it doesn't support MTE itself") for different purposes. > >>>>>>>>>> > >>>>>>>>>> We need a guarantee that that memory can be freed up / migrated once the tag > >>>>>>>>>> storage gets activated. > >>>>>>>>> > >>>>>>>>> If I remember correctly, one of the issues with the MTE project that might be > >>>>>>>>> relevant to GCMA, was that userspace, once it gets a hold of a page, it can pin > >>>>>>>>> it for a very long time without specifying FOLL_LONGTERM. > >>>>>>>>> > >>>>>>>>> If I remember things correctly, there were two examples given for this; there > >>>>>>>>> might be more, or they might have been eliminated since then: > >>>>>>>>> > >>>>>>>>> * The page is used as a buffer for accesses to a file opened with > >>>>>>>>> O_DIRECT. > >>>>>>>>> > >>>>>>>>> * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's > >>>>>>>>> a direct quote from David [1]. > >>>>>>>>> > >>>>>>>>> Depending on your usecases, failing the allocation might be acceptable, but for > >>>>>>>>> MTE that wasn't the case. > >>>>>>>>> > >>>>>>>>> Hope some of this is useful. > >>>>>>>>> > >>>>>>>>> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/ > >>>>>>>> > >>>>>>>> Thanks for the references! I'll read through these discussions to see > >>>>>>>> how much useful information for GCMA I can extract. > >>>>>>> > >>>>>>> I wanted to get an RFC code ahead of LSF/MM and just finished putting > >>>>>>> it together. Sorry for the last minute posting. You can find it here: > >>>>>>> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/ > >>>>>> > >>>>>> Sorry about the delay. Attached are the slides from my GCMA > >>>>>> presentation at the conference. > >>>>> > >>>>> Hi Folks, > >>>> > >>>> Hi, > >>>> > >>>>> As I'm getting close to finalizing the GCMA patchset, one question > >>>>> keeps bugging me. How do we account the memory that is allocated from > >>>>> GCMA... In case of CMA allocations, they are backed by the system > >>>>> memory, so accounting is straightforward, allocations contribute to > >>>>> RSS, counted towards memcg limits, etc. In case of GCMA, the backing > >>>>> memory is reserved memory (a carveout) not directly accessible by the > >>>>> rest of the system and not part of the total_memory. So, if a process > >>>>> allocates a buffer from GCMA, should it be accounted as a normal > >>>>> allocation from system memory or as something else entirely? Any > >>>>> thoughts? > >>>> > >>>> You mean, an application allocates the memory and maps it into its page > >>>> tables? > >>> > >>> Allocation will happen via cma_alloc() or a similar interface, so > >>> applications would have to use some driver to allocate from GCMA. 
Once > >>> allocated, an application can map that memory if the driver supports > >>> mapping. > >> > >> Right, and that might happen either through a VM_PFNMAP or !VM_PFNMAP > >> (ordinarily ref- and currently map-counted). > >> > >> In the insert_page() case we do an inc_mm_counter, which increases the RSS. > >> > >> That could happen with pages from carevouts (memblock allocations) > >> already, but we don't run into that in general I assume. > >> > >>> > >>>> > >>>> Can that memory get reclaimed somehow? > >>> > >>> Hmm. I assume that once a driver allocates pages from GCMA it won't > >>> put them into system-managed LRU or free them into buddy allocator for > >>> kernel to use. If it does then at the time of cma_release() it can't > >>> guarantee there are no more users for such pages. > >>> > >>>> > >>>> How would we be mapping these pages into processes (VM_PFNMAP or > >>>> "normal" mappings)? > >>> > >>> They would be normal mappings as the pages do have `struct page` but I > >>> expect these pages to be managed by the driver that allocated them > >>> rather than the core kernel itself. > >>> > >>> I was trying to design GCMA to be used as close to CMA as possible so > >>> that we can use the same cma_alloc/cma_release API and reuse CMA's > >>> page management code but the fact that CMA is backed by the system > >>> memory and GCMA is backed by a carveout makes it a bit difficult. > >> > >> Makes sense. So I assume memcg does not apply here already -- memcg does > >> not apply on the CMA layer IIRC. > >> > >> The RSS is a bit tricky. We would have to modify things like > >> inc_mm_counter() to special-case on these things. > >> > >> But then, smaps output would still count these pages towards the rss/pss > >> (e.g., mss->resident). So that needs care as well ... > > > > In the end I decided to follow CMA as closely as possible, including > > accounting. GCMA and CMA both use reserved area and the difference is > > that CMA donates its memory to kernel to use for movable allocations > > while GCMA donates it to the cleancache. But once that donation is > > taken back by CMA/GCMA to satisfy cma_alloc() request, the memory > > usage is pretty much the same and therefore accounting should probably > > be the same. Anyway, that was the reasoning I eventually arrived at. I > > posted the GCMA patchset at [1] and included this reasoning in the > > cover letter. Happy to discuss this further in that patchset. > > Right, probably best to keep it simple. Will these GCMA pages be > accounted towards MemTotal like CMA pages would? I thought CMA pages are accounted towards CmaTotal and if that's what you mean then yes, they are added to that metric in the patch [1], see the change in gcma_register_area(). I'm not adding the GcmaTotal metric because I think it's simpler to consider GCMA as just a flavor of CMA, as both are used via the same API (cma_alloc/cma_release) and serve the same purpose. The GCMA area can be distinguished from the CMA area using the /sys/kernel/mm/cma/<area>/gcma attribute, but otherwise, it should appear to users as yet another CMA area. Does that make sense? [1] https://lore.kernel.org/all/20251010011951.2136980-9-surenb@google.com/ > > -- > Cheers > > David / dhildenb > ^ permalink raw reply [flat|nested] 22+ messages in thread
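Based on that description, the registration step presumably amounts to something like the following sketch (not copied from the patch; totalcma_pages and cma_get_size() are the real symbols behind CmaTotal in /proc/meminfo):

static void __init gcma_register_area_sketch(struct cma *cma)
{
        /* CmaTotal grows, exactly as it would for an ordinary CMA area... */
        totalcma_pages += cma_get_size(cma) >> PAGE_SHIFT;

        /*
         * ...but, unlike CMA activation, there is no
         * adjust_managed_page_count()/totalram_pages_add() call here,
         * so MemTotal is left untouched (see the follow-up below).
         */
}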
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-10-10 15:07 ` Suren Baghdasaryan @ 2025-10-10 15:37 ` David Hildenbrand 2025-10-10 15:47 ` Suren Baghdasaryan 0 siblings, 1 reply; 22+ messages in thread From: David Hildenbrand @ 2025-10-10 15:37 UTC (permalink / raw) To: Suren Baghdasaryan Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm, android-kernel-team On 10.10.25 17:07, Suren Baghdasaryan wrote: > On Fri, Oct 10, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 10.10.25 03:30, Suren Baghdasaryan wrote: >>> On Mon, Sep 1, 2025 at 9:01 AM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> On 27.08.25 02:17, Suren Baghdasaryan wrote: >>>>> On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote: >>>>>> >>>>>> On 23.08.25 00:14, Suren Baghdasaryan wrote: >>>>>>> On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>>>> >>>>>>>> On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>>>>> >>>>>>>>> On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote: >>>>>>>>>> >>>>>>>>>> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei >>>>>>>>>> <alexandru.elisei@arm.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote: >>>>>>>>>>>> On 02.02.25 01:19, Suren Baghdasaryan wrote: >>>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>>> I would like to discuss the Guaranteed Contiguous Memory Allocator >>>>>>>>>>>>> (GCMA) mechanism that is being used by many Android vendors as an >>>>>>>>>>>>> out-of-tree feature, collect input on its possible usefulness for >>>>>>>>>>>>> others, feasibility to upstream and suggestions for possible better >>>>>>>>>>>>> alternatives. >>>>>>>>>>>>> >>>>>>>>>>>>> Problem statement: Some workloads/hardware require physically >>>>>>>>>>>>> contiguous memory and carving out reserved memory areas for such >>>>>>>>>>>>> allocations often lead to inefficient usage of those carveouts. CMA >>>>>>>>>>>>> was designed to solve this inefficiency by allowing movable memory >>>>>>>>>>>>> allocations to use this reserved memory when it’s otherwise unused. >>>>>>>>>>>>> When a contiguous memory allocation is requested, CMA finds the >>>>>>>>>>>>> requested contiguous area, possibly migrating some of the movable >>>>>>>>>>>>> pages out of that area. >>>>>>>>>>>>> In latency-sensitive use cases, like face unlock on phones, we need to >>>>>>>>>>>>> allocate contiguous memory quickly and page migration in CMA takes >>>>>>>>>>>>> enough time to cause user-perceptible lag. Such allocations can also >>>>>>>>>>>>> fail if page migration is not possible. >>>>>>>>>>>>> >>>>>>>>>>>>> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which >>>>>>>>>>>>> was not upstreamed but got adopted later by many Android vendors as an >>>>>>>>>>>>> out-of-tree feature. It is similar to CMA but backing memory is >>>>>>>>>>>>> cleancache backend, containing only clean file-backed pages. Most >>>>>>>>>>>>> importantly, the kernel can’t take a reference to pages from the >>>>>>>>>>>>> cleancache, therefore can’t prevent GCMA from quickly dropping them >>>>>>>>>>>>> when required. This guarantees GCMA low allocation latency and >>>>>>>>>>>>> improves allocation success rate. 
>>>>>>>>>>>>> >>>>>>>>>>>>> We would like to standardize GCMA implementation and upstream it since >>>>>>>>>>>>> many Android vendors are asking to include it as a generic feature. >>>>>>>>>>>>> >>>>>>>>>>>>> Note: removal of cleancache in 5.17 kernel due to no users (sorry, we >>>>>>>>>>>>> didn’t know at the time about this use case) might complicate >>>>>>>>>>>>> upstreaming. >>>>>>>>>>>> >>>>>>>>>>>> we discussed another possible user last year: using MTE tag storage memory >>>>>>>>>>>> while the storage is not getting used to store MTE tags [1]. >>>>>>>>>>>> >>>>>>>>>>>> As long as the "ordinary RAM" that maps to a given MTE tag storage area does >>>>>>>>>>>> not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM, >>>>>>>>>>>> just that it doesn't support MTE itself") for different purposes. >>>>>>>>>>>> >>>>>>>>>>>> We need a guarantee that that memory can be freed up / migrated once the tag >>>>>>>>>>>> storage gets activated. >>>>>>>>>>> >>>>>>>>>>> If I remember correctly, one of the issues with the MTE project that might be >>>>>>>>>>> relevant to GCMA, was that userspace, once it gets a hold of a page, it can pin >>>>>>>>>>> it for a very long time without specifying FOLL_LONGTERM. >>>>>>>>>>> >>>>>>>>>>> If I remember things correctly, there were two examples given for this; there >>>>>>>>>>> might be more, or they might have been eliminated since then: >>>>>>>>>>> >>>>>>>>>>> * The page is used as a buffer for accesses to a file opened with >>>>>>>>>>> O_DIRECT. >>>>>>>>>>> >>>>>>>>>>> * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's >>>>>>>>>>> a direct quote from David [1]. >>>>>>>>>>> >>>>>>>>>>> Depending on your usecases, failing the allocation might be acceptable, but for >>>>>>>>>>> MTE that wasn't the case. >>>>>>>>>>> >>>>>>>>>>> Hope some of this is useful. >>>>>>>>>>> >>>>>>>>>>> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/ >>>>>>>>>> >>>>>>>>>> Thanks for the references! I'll read through these discussions to see >>>>>>>>>> how much useful information for GCMA I can extract. >>>>>>>>> >>>>>>>>> I wanted to get an RFC code ahead of LSF/MM and just finished putting >>>>>>>>> it together. Sorry for the last minute posting. You can find it here: >>>>>>>>> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/ >>>>>>>> >>>>>>>> Sorry about the delay. Attached are the slides from my GCMA >>>>>>>> presentation at the conference. >>>>>>> >>>>>>> Hi Folks, >>>>>> >>>>>> Hi, >>>>>> >>>>>>> As I'm getting close to finalizing the GCMA patchset, one question >>>>>>> keeps bugging me. How do we account the memory that is allocated from >>>>>>> GCMA... In case of CMA allocations, they are backed by the system >>>>>>> memory, so accounting is straightforward, allocations contribute to >>>>>>> RSS, counted towards memcg limits, etc. In case of GCMA, the backing >>>>>>> memory is reserved memory (a carveout) not directly accessible by the >>>>>>> rest of the system and not part of the total_memory. So, if a process >>>>>>> allocates a buffer from GCMA, should it be accounted as a normal >>>>>>> allocation from system memory or as something else entirely? Any >>>>>>> thoughts? >>>>>> >>>>>> You mean, an application allocates the memory and maps it into its page >>>>>> tables? >>>>> >>>>> Allocation will happen via cma_alloc() or a similar interface, so >>>>> applications would have to use some driver to allocate from GCMA. 
Once >>>>> allocated, an application can map that memory if the driver supports >>>>> mapping. >>>> >>>> Right, and that might happen either through a VM_PFNMAP or !VM_PFNMAP >>>> (ordinarily ref- and currently map-counted). >>>> >>>> In the insert_page() case we do an inc_mm_counter, which increases the RSS. >>>> >>>> That could happen with pages from carevouts (memblock allocations) >>>> already, but we don't run into that in general I assume. >>>> >>>>> >>>>>> >>>>>> Can that memory get reclaimed somehow? >>>>> >>>>> Hmm. I assume that once a driver allocates pages from GCMA it won't >>>>> put them into system-managed LRU or free them into buddy allocator for >>>>> kernel to use. If it does then at the time of cma_release() it can't >>>>> guarantee there are no more users for such pages. >>>>> >>>>>> >>>>>> How would we be mapping these pages into processes (VM_PFNMAP or >>>>>> "normal" mappings)? >>>>> >>>>> They would be normal mappings as the pages do have `struct page` but I >>>>> expect these pages to be managed by the driver that allocated them >>>>> rather than the core kernel itself. >>>>> >>>>> I was trying to design GCMA to be used as close to CMA as possible so >>>>> that we can use the same cma_alloc/cma_release API and reuse CMA's >>>>> page management code but the fact that CMA is backed by the system >>>>> memory and GCMA is backed by a carveout makes it a bit difficult. >>>> >>>> Makes sense. So I assume memcg does not apply here already -- memcg does >>>> not apply on the CMA layer IIRC. >>>> >>>> The RSS is a bit tricky. We would have to modify things like >>>> inc_mm_counter() to special-case on these things. >>>> >>>> But then, smaps output would still count these pages towards the rss/pss >>>> (e.g., mss->resident). So that needs care as well ... >>> >>> In the end I decided to follow CMA as closely as possible, including >>> accounting. GCMA and CMA both use reserved area and the difference is >>> that CMA donates its memory to kernel to use for movable allocations >>> while GCMA donates it to the cleancache. But once that donation is >>> taken back by CMA/GCMA to satisfy cma_alloc() request, the memory >>> usage is pretty much the same and therefore accounting should probably >>> be the same. Anyway, that was the reasoning I eventually arrived at. I >>> posted the GCMA patchset at [1] and included this reasoning in the >>> cover letter. Happy to discuss this further in that patchset. >> >> Right, probably best to keep it simple. Will these GCMA pages be >> accounted towards MemTotal like CMA pages would? > > I thought CMA pages are accounted towards CmaTotal and if that's what > you mean then yes, they are added to that metric in the patch [1], see > the change in gcma_register_area(). I'm not adding the GcmaTotal > metric because I think it's simpler to consider GCMA as just a flavor > of CMA, as both are used via the same API (cma_alloc/cma_release) and > serve the same purpose. The GCMA area can be distinguished from the > CMA area using the /sys/kernel/mm/cma/<area>/gcma attribute, but > otherwise, it should appear to users as yet another CMA area. Does > that make sense? I was rather wondering whether these pages will be part of /proc/meminfo MemTotal: so that totalram_pages_add() is called for them. For ordinary CMA that happens in cma_activate_area() -> init_cma_reserved_pageblock() through adjust_managed_page_count(), where we also have the __free_pages() call. If nothing changed on that front, then yes, it would behave just like ordinary CMA. 
I should probably take a look at your v1; unfortunately, that might have to wait for next week, as I'm out of review capacity for today, I'm afraid. -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 22+ messages in thread
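That activation path, abridged from mainline mm/page_alloc.c (details vary across kernel versions): each CMA pageblock is handed to the buddy allocator and counted as managed RAM, which is what makes ordinary CMA show up in MemTotal:

/* Called from cma_activate_area() for every pageblock in the area. */
void __init init_cma_reserved_pageblock(struct page *page)
{
        unsigned i = pageblock_nr_pages;
        struct page *p = page;

        do {
                __ClearPageReserved(p);
                set_page_count(p, 0);
        } while (++p, --i);

        set_pageblock_migratetype(page, MIGRATE_CMA);
        set_page_refcounted(page);
        __free_pages(page, pageblock_order);    /* donate the block to the buddy allocator */

        adjust_managed_page_count(page, pageblock_nr_pages);    /* feeds MemTotal */
        page_zone(page)->cma_pages += pageblock_nr_pages;
}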
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-10-10 15:37 ` David Hildenbrand @ 2025-10-10 15:47 ` Suren Baghdasaryan 0 siblings, 0 replies; 22+ messages in thread From: Suren Baghdasaryan @ 2025-10-10 15:47 UTC (permalink / raw) To: David Hildenbrand Cc: Alexandru Elisei, lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Vlastimil Babka, Lorenzo Stoakes, Liam R. Howlett, Michal Hocko, linux-mm, android-kernel-team On Fri, Oct 10, 2025 at 8:37 AM David Hildenbrand <david@redhat.com> wrote: > > On 10.10.25 17:07, Suren Baghdasaryan wrote: > > On Fri, Oct 10, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 10.10.25 03:30, Suren Baghdasaryan wrote: > >>> On Mon, Sep 1, 2025 at 9:01 AM David Hildenbrand <david@redhat.com> wrote: > >>>> > >>>> On 27.08.25 02:17, Suren Baghdasaryan wrote: > >>>>> On Tue, Aug 26, 2025 at 1:58 AM David Hildenbrand <david@redhat.com> wrote: > >>>>>> > >>>>>> On 23.08.25 00:14, Suren Baghdasaryan wrote: > >>>>>>> On Wed, Apr 2, 2025 at 9:35 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>>>> > >>>>>>>> On Thu, Mar 20, 2025 at 11:06 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>>>>> > >>>>>>>>> On Tue, Feb 4, 2025 at 8:33 AM Suren Baghdasaryan <surenb@google.com> wrote: > >>>>>>>>>> > >>>>>>>>>> On Tue, Feb 4, 2025 at 3:23 AM Alexandru Elisei > >>>>>>>>>> <alexandru.elisei@arm.com> wrote: > >>>>>>>>>>> > >>>>>>>>>>> Hi, > >>>>>>>>>>> > >>>>>>>>>>> On Tue, Feb 04, 2025 at 09:18:20AM +0100, David Hildenbrand wrote: > >>>>>>>>>>>> On 02.02.25 01:19, Suren Baghdasaryan wrote: > >>>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>>> I would like to discuss the Guaranteed Contiguous Memory Allocator > >>>>>>>>>>>>> (GCMA) mechanism that is being used by many Android vendors as an > >>>>>>>>>>>>> out-of-tree feature, collect input on its possible usefulness for > >>>>>>>>>>>>> others, feasibility to upstream and suggestions for possible better > >>>>>>>>>>>>> alternatives. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Problem statement: Some workloads/hardware require physically > >>>>>>>>>>>>> contiguous memory and carving out reserved memory areas for such > >>>>>>>>>>>>> allocations often lead to inefficient usage of those carveouts. CMA > >>>>>>>>>>>>> was designed to solve this inefficiency by allowing movable memory > >>>>>>>>>>>>> allocations to use this reserved memory when it’s otherwise unused. > >>>>>>>>>>>>> When a contiguous memory allocation is requested, CMA finds the > >>>>>>>>>>>>> requested contiguous area, possibly migrating some of the movable > >>>>>>>>>>>>> pages out of that area. > >>>>>>>>>>>>> In latency-sensitive use cases, like face unlock on phones, we need to > >>>>>>>>>>>>> allocate contiguous memory quickly and page migration in CMA takes > >>>>>>>>>>>>> enough time to cause user-perceptible lag. Such allocations can also > >>>>>>>>>>>>> fail if page migration is not possible. > >>>>>>>>>>>>> > >>>>>>>>>>>>> GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which > >>>>>>>>>>>>> was not upstreamed but got adopted later by many Android vendors as an > >>>>>>>>>>>>> out-of-tree feature. It is similar to CMA but backing memory is > >>>>>>>>>>>>> cleancache backend, containing only clean file-backed pages. Most > >>>>>>>>>>>>> importantly, the kernel can’t take a reference to pages from the > >>>>>>>>>>>>> cleancache, therefore can’t prevent GCMA from quickly dropping them > >>>>>>>>>>>>> when required. 
This guarantees GCMA low allocation latency and > >>>>>>>>>>>>> improves allocation success rate. > >>>>>>>>>>>>> > >>>>>>>>>>>>> We would like to standardize GCMA implementation and upstream it since > >>>>>>>>>>>>> many Android vendors are asking to include it as a generic feature. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Note: removal of cleancache in 5.17 kernel due to no users (sorry, we > >>>>>>>>>>>>> didn’t know at the time about this use case) might complicate > >>>>>>>>>>>>> upstreaming. > >>>>>>>>>>>> > >>>>>>>>>>>> we discussed another possible user last year: using MTE tag storage memory > >>>>>>>>>>>> while the storage is not getting used to store MTE tags [1]. > >>>>>>>>>>>> > >>>>>>>>>>>> As long as the "ordinary RAM" that maps to a given MTE tag storage area does > >>>>>>>>>>>> not use MTE tagging, we can reuse the MTE tag storage ("almost ordinary RAM, > >>>>>>>>>>>> just that it doesn't support MTE itself") for different purposes. > >>>>>>>>>>>> > >>>>>>>>>>>> We need a guarantee that that memory can be freed up / migrated once the tag > >>>>>>>>>>>> storage gets activated. > >>>>>>>>>>> > >>>>>>>>>>> If I remember correctly, one of the issues with the MTE project that might be > >>>>>>>>>>> relevant to GCMA, was that userspace, once it gets a hold of a page, it can pin > >>>>>>>>>>> it for a very long time without specifying FOLL_LONGTERM. > >>>>>>>>>>> > >>>>>>>>>>> If I remember things correctly, there were two examples given for this; there > >>>>>>>>>>> might be more, or they might have been eliminated since then: > >>>>>>>>>>> > >>>>>>>>>>> * The page is used as a buffer for accesses to a file opened with > >>>>>>>>>>> O_DIRECT. > >>>>>>>>>>> > >>>>>>>>>>> * 'vmsplice() can pin pages forever and doesn't use FOLL_LONGTERM yet' - that's > >>>>>>>>>>> a direct quote from David [1]. > >>>>>>>>>>> > >>>>>>>>>>> Depending on your usecases, failing the allocation might be acceptable, but for > >>>>>>>>>>> MTE that wasn't the case. > >>>>>>>>>>> > >>>>>>>>>>> Hope some of this is useful. > >>>>>>>>>>> > >>>>>>>>>>> [1] https://lore.kernel.org/linux-arm-kernel/4e7a4054-092c-4e34-ae00-0105d7c9343c@redhat.com/ > >>>>>>>>>> > >>>>>>>>>> Thanks for the references! I'll read through these discussions to see > >>>>>>>>>> how much useful information for GCMA I can extract. > >>>>>>>>> > >>>>>>>>> I wanted to get an RFC code ahead of LSF/MM and just finished putting > >>>>>>>>> it together. Sorry for the last minute posting. You can find it here: > >>>>>>>>> https://lore.kernel.org/all/20250320173931.1583800-1-surenb@google.com/ > >>>>>>>> > >>>>>>>> Sorry about the delay. Attached are the slides from my GCMA > >>>>>>>> presentation at the conference. > >>>>>>> > >>>>>>> Hi Folks, > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>>> As I'm getting close to finalizing the GCMA patchset, one question > >>>>>>> keeps bugging me. How do we account the memory that is allocated from > >>>>>>> GCMA... In case of CMA allocations, they are backed by the system > >>>>>>> memory, so accounting is straightforward, allocations contribute to > >>>>>>> RSS, counted towards memcg limits, etc. In case of GCMA, the backing > >>>>>>> memory is reserved memory (a carveout) not directly accessible by the > >>>>>>> rest of the system and not part of the total_memory. So, if a process > >>>>>>> allocates a buffer from GCMA, should it be accounted as a normal > >>>>>>> allocation from system memory or as something else entirely? Any > >>>>>>> thoughts? 
> >>>>>> > >>>>>> You mean, an application allocates the memory and maps it into its page > >>>>>> tables? > >>>>> > >>>>> Allocation will happen via cma_alloc() or a similar interface, so > >>>>> applications would have to use some driver to allocate from GCMA. Once > >>>>> allocated, an application can map that memory if the driver supports > >>>>> mapping. > >>>> > >>>> Right, and that might happen either through a VM_PFNMAP or !VM_PFNMAP > >>>> (ordinarily ref- and currently map-counted). > >>>> > >>>> In the insert_page() case we do an inc_mm_counter, which increases the RSS. > >>>> > >>>> That could happen with pages from carevouts (memblock allocations) > >>>> already, but we don't run into that in general I assume. > >>>> > >>>>> > >>>>>> > >>>>>> Can that memory get reclaimed somehow? > >>>>> > >>>>> Hmm. I assume that once a driver allocates pages from GCMA it won't > >>>>> put them into system-managed LRU or free them into buddy allocator for > >>>>> kernel to use. If it does then at the time of cma_release() it can't > >>>>> guarantee there are no more users for such pages. > >>>>> > >>>>>> > >>>>>> How would we be mapping these pages into processes (VM_PFNMAP or > >>>>>> "normal" mappings)? > >>>>> > >>>>> They would be normal mappings as the pages do have `struct page` but I > >>>>> expect these pages to be managed by the driver that allocated them > >>>>> rather than the core kernel itself. > >>>>> > >>>>> I was trying to design GCMA to be used as close to CMA as possible so > >>>>> that we can use the same cma_alloc/cma_release API and reuse CMA's > >>>>> page management code but the fact that CMA is backed by the system > >>>>> memory and GCMA is backed by a carveout makes it a bit difficult. > >>>> > >>>> Makes sense. So I assume memcg does not apply here already -- memcg does > >>>> not apply on the CMA layer IIRC. > >>>> > >>>> The RSS is a bit tricky. We would have to modify things like > >>>> inc_mm_counter() to special-case on these things. > >>>> > >>>> But then, smaps output would still count these pages towards the rss/pss > >>>> (e.g., mss->resident). So that needs care as well ... > >>> > >>> In the end I decided to follow CMA as closely as possible, including > >>> accounting. GCMA and CMA both use reserved area and the difference is > >>> that CMA donates its memory to kernel to use for movable allocations > >>> while GCMA donates it to the cleancache. But once that donation is > >>> taken back by CMA/GCMA to satisfy cma_alloc() request, the memory > >>> usage is pretty much the same and therefore accounting should probably > >>> be the same. Anyway, that was the reasoning I eventually arrived at. I > >>> posted the GCMA patchset at [1] and included this reasoning in the > >>> cover letter. Happy to discuss this further in that patchset. > >> > >> Right, probably best to keep it simple. Will these GCMA pages be > >> accounted towards MemTotal like CMA pages would? > > > > I thought CMA pages are accounted towards CmaTotal and if that's what > > you mean then yes, they are added to that metric in the patch [1], see > > the change in gcma_register_area(). I'm not adding the GcmaTotal > > metric because I think it's simpler to consider GCMA as just a flavor > > of CMA, as both are used via the same API (cma_alloc/cma_release) and > > serve the same purpose. The GCMA area can be distinguished from the > > CMA area using the /sys/kernel/mm/cma/<area>/gcma attribute, but > > otherwise, it should appear to users as yet another CMA area. Does > > that make sense? 
> > I was rather wondering whether these pages will be part of /proc/meminfo > MemTotal: so that totalram_pages_add() is called for them. > > For ordinary CMA that happens in cma_activate_area() -> > init_cma_reserved_pageblock() through adjust_managed_page_count(), where > we also have the __free_pages() call. Ah, OK, I see what you mean. IIUC this combination of __free_pages() + adjust_managed_page_count() means the pages from the CMA area are donated to the system, and therefore MemTotal is increased. Since GCMA does not donate its memory to the system (it donates to the cleancache instead), we do not increase MemTotal. > > If nothing changed on that front, then yes, it would behave just like > ordinary CMA. > > I should probably take a look at your v1, unfortunately that might have to wait for next week as I'm out of review capacity for today I'm afraid. No worries, I appreciate your input whenever it arrives. Thanks! > > -- > Cheers > > David / dhildenb > ^ permalink raw reply [flat|nested] 22+ messages in thread
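For completeness, the helper in question, again abridged from mainline; this is the call GCMA deliberately skips, which is why its carveout never appears in MemTotal:

void adjust_managed_page_count(struct page *page, long count)
{
        atomic_long_add(count, &page_zone(page)->managed_pages);
        totalram_pages_add(count);      /* /proc/meminfo MemTotal */
#ifdef CONFIG_HIGHMEM
        if (PageHighMem(page))
                totalhigh_pages_add(count);
#endif
}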
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-02-02 0:19 [LSF/MM/BPF TOPIC] Guaranteed CMA Suren Baghdasaryan 2025-02-04 5:46 ` Christoph Hellwig 2025-02-04 8:18 ` David Hildenbrand @ 2025-02-04 9:07 ` Vlastimil Babka 2025-02-04 16:20 ` Suren Baghdasaryan 2 siblings, 1 reply; 22+ messages in thread From: Vlastimil Babka @ 2025-02-04 9:07 UTC (permalink / raw) To: Suren Baghdasaryan, lsf-pc Cc: SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand, Michal Hocko, linux-mm, android-kernel-team, David Hildenbrand On 2/2/25 01:19, Suren Baghdasaryan wrote: > Hi, > I would like to discuss the Guaranteed Contiguous Memory Allocator > (GCMA) mechanism that is being used by many Android vendors as an > out-of-tree feature, collect input on its possible usefulness for > others, feasibility to upstream and suggestions for possible better > alternatives. > > Problem statement: Some workloads/hardware require physically > contiguous memory and carving out reserved memory areas for such > allocations often lead to inefficient usage of those carveouts. CMA > was designed to solve this inefficiency by allowing movable memory > allocations to use this reserved memory when it’s otherwise unused. > When a contiguous memory allocation is requested, CMA finds the > requested contiguous area, possibly migrating some of the movable > pages out of that area. > In latency-sensitive use cases, like face unlock on phones, we need to > allocate contiguous memory quickly and page migration in CMA takes > enough time to cause user-perceptible lag. Such allocations can also > fail if page migration is not possible. > > GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which > was not upstreamed but got adopted later by many Android vendors as an > out-of-tree feature. It is similar to CMA but backing memory is > cleancache backend, containing only clean file-backed pages. Most > importantly, the kernel can’t take a reference to pages from the > cleancache, therefore can’t prevent GCMA from quickly dropping them By reference you mean a (long-term) pin? Otherwise "no reference" would mean no way to map the pages or read() from them etc. Also there might be speculative references by physical scanners... > when required. This guarantees GCMA low allocation latency and > improves allocation success rate. > > We would like to standardize GCMA implementation and upstream it since > many Android vendors are asking to include it as a generic feature. > > Note: removal of cleancache in 5.17 kernel due to no users (sorry, we > didn’t know at the time about this use case) might complicate > upstreaming. > > Desired participants: > GCMA authors: SeongJae Park <sj@kernel.org>, Minchan Kim <minchan@kernel.org> > CMA authors: Marek Szyprowski <m.szyprowski@samsung.com>, Aneesh Kumar > K.V <aneesh.kumar@kernel.org>, Joonsoo Kim <iamjoonsoo.kim@lge.com>, > Michal Nazarewicz <mina86@mina86.com> > The usual suspects (Willy, Vlastimil, Lorenzo, Liam, Michal, David H), > other mm folks > > [1] https://lore.kernel.org/lkml/1424721263-25314-2-git-send-email-sj38.park@gmail.com/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Guaranteed CMA 2025-02-04 9:07 ` Vlastimil Babka @ 2025-02-04 16:20 ` Suren Baghdasaryan 0 siblings, 0 replies; 22+ messages in thread From: Suren Baghdasaryan @ 2025-02-04 16:20 UTC (permalink / raw) To: Vlastimil Babka Cc: lsf-pc, SeongJae Park, Minchan Kim, m.szyprowski, aneesh.kumar, Joonsoo Kim, mina86, Matthew Wilcox, Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand, Michal Hocko, linux-mm, android-kernel-team On Tue, Feb 4, 2025 at 1:07 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > On 2/2/25 01:19, Suren Baghdasaryan wrote: > > Hi, > > I would like to discuss the Guaranteed Contiguous Memory Allocator > > (GCMA) mechanism that is being used by many Android vendors as an > > out-of-tree feature, collect input on its possible usefulness for > > others, feasibility to upstream and suggestions for possible better > > alternatives. > > > > Problem statement: Some workloads/hardware require physically > > contiguous memory and carving out reserved memory areas for such > > allocations often lead to inefficient usage of those carveouts. CMA > > was designed to solve this inefficiency by allowing movable memory > > allocations to use this reserved memory when it’s otherwise unused. > > When a contiguous memory allocation is requested, CMA finds the > > requested contiguous area, possibly migrating some of the movable > > pages out of that area. > > In latency-sensitive use cases, like face unlock on phones, we need to > > allocate contiguous memory quickly and page migration in CMA takes > > enough time to cause user-perceptible lag. Such allocations can also > > fail if page migration is not possible. > > > > GCMA (Guaranteed CMA) is a mechanism previously proposed in [1] which > > was not upstreamed but got adopted later by many Android vendors as an > > out-of-tree feature. It is similar to CMA but backing memory is > > cleancache backend, containing only clean file-backed pages. Most > > importantly, the kernel can’t take a reference to pages from the > > cleancache, therefore can’t prevent GCMA from quickly dropping them > > By reference you men a (long-term) pin? Otherwise "no reference" would mean > no way to map the pages or read() from them etc. Also there might be > speculative references by physical scanners... By that I mean that the cleancache memory is not addressable by the kernel, see: https://elixir.bootlin.com/linux/v5.16.20/source/Documentation/vm/cleancache.rst#L19. > > > when required. This guarantees GCMA low allocation latency and > > improves allocation success rate. > > > > We would like to standardize GCMA implementation and upstream it since > > many Android vendors are asking to include it as a generic feature. > > > > Note: removal of cleancache in 5.17 kernel due to no users (sorry, we > > didn’t know at the time about this use case) might complicate > > upstreaming. > > > > Desired participants: > > GCMA authors: SeongJae Park <sj@kernel.org>, Minchan Kim <minchan@kernel.org> > > CMA authors: Marek Szyprowski <m.szyprowski@samsung.com>, Aneesh Kumar > > K.V <aneesh.kumar@kernel.org>, Joonsoo Kim <iamjoonsoo.kim@lge.com>, > > Michal Nazarewicz <mina86@mina86.com> > > The usual suspects (Willy, Vlastimil, Lorenzo, Liam, Michal, David H), > > other mm folks > > > > [1] https://lore.kernel.org/lkml/1424721263-25314-2-git-send-email-sj38.park@gmail.com/ > ^ permalink raw reply [flat|nested] 22+ messages in thread
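As a footnote on the pinning distinction raised above, it shows up directly in the GUP API; in the sketch below, only the uaddr argument and the trivial wrapper are illustrative. A FOLL_LONGTERM pin makes GUP migrate the page off CMA/ZONE_MOVABLE up front, while a short-term pin (the O_DIRECT/vmsplice pattern) can hold a CMA page in place for an unbounded time. Cleancache-backed GCMA memory, by contrast, is never exposed to GUP at all:

#include <linux/mm.h>

static void demo_pin_contrast(unsigned long uaddr)      /* illustrative wrapper */
{
        struct page *page;

        /* Short-term pin: the page is pinned wherever it sits, CMA included. */
        if (pin_user_pages_fast(uaddr, 1, FOLL_WRITE, &page) == 1)
                unpin_user_page(page);

        /* Long-term pin: GUP migrates the page out of CMA/MOVABLE first. */
        if (pin_user_pages_fast(uaddr, 1, FOLL_WRITE | FOLL_LONGTERM, &page) == 1)
                unpin_user_page(page);
}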
Thread overview: 22+ messages
2025-02-02 0:19 [LSF/MM/BPF TOPIC] Guaranteed CMA Suren Baghdasaryan
2025-02-04 5:46 ` Christoph Hellwig
2025-02-04 7:47 ` Lorenzo Stoakes
2025-02-04 7:48 ` Christoph Hellwig
2025-02-04 9:03 ` Vlastimil Babka
2025-02-04 15:56 ` Suren Baghdasaryan
2025-02-04 8:18 ` David Hildenbrand
2025-02-04 11:23 ` Alexandru Elisei
2025-02-04 16:33 ` Suren Baghdasaryan
2025-03-20 18:06 ` Suren Baghdasaryan
2025-04-02 16:35 ` Suren Baghdasaryan
2025-08-22 22:14 ` Suren Baghdasaryan
2025-08-26 8:58 ` David Hildenbrand
2025-08-27 0:17 ` Suren Baghdasaryan
2025-09-01 16:01 ` David Hildenbrand
2025-10-10 1:30 ` Suren Baghdasaryan
2025-10-10 13:58 ` David Hildenbrand
2025-10-10 15:07 ` Suren Baghdasaryan
2025-10-10 15:37 ` David Hildenbrand
2025-10-10 15:47 ` Suren Baghdasaryan
2025-02-04 9:07 ` Vlastimil Babka
2025-02-04 16:20 ` Suren Baghdasaryan