Re: [PATCH v2 0/3] dma-buf: heaps: cma: enable dmem cgroup accounting

From: "Christian König" <christian.koenig@amd.com>
To: Dave Airlie <airlied@gmail.com>
Cc: Maxime Ripard <mripard@redhat.com>,
	"T.J. Mercier" <tjmercier@google.com>,
	Eric Chanudet <echanude@redhat.com>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Benjamin Gaignard <benjamin.gaignard@collabora.com>,
	Brian Starkey <Brian.Starkey@arm.com>,
	John Stultz <jstultz@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org,
	Albert Esteve <aesteve@redhat.com>,
	linux-mm@kvack.org, Yosry Ahmed <yosryahmed@google.com>,
	Shakeel Butt <shakeel.butt@linux.dev>
Subject: Re: [PATCH v2 0/3] dma-buf: heaps: cma: enable dmem cgroup accounting
Date: Thu, 26 Feb 2026 12:32:42 +0100
Message-ID: <d1b287c9-46ff-4345-a410-7e1cfefb5c66@amd.com>
In-Reply-To: <CAPM=9ty5mbMAVHPO4mRy1jKGnpChr7gK6uMtco2=j7MMJGpZdg@mail.gmail.com>

On 2/26/26 00:43, Dave Airlie wrote:
>>>>
>>>> Using module parameters to enable/disable it globally is just a
>>>> workaround as far as I can see.
>>>
>>> That's a pretty good idea! It would indeed be a solution that could
>>> satisfy everyone (I assume?).
>>
>> I think so yeah.
>>
>> From what I have seen we have three different use cases:
>>
>> 1. local device memory (VRAM), GTT/CMA and memcg are completely separate domains and you want to have completely separate values as limit for them.
>>
>> 2. local device memory (VRAM) is separate. GTT/CMA are accounted to memcg, you can still have separate values as limit so that nobody over allocates CMA (for example).
>>
>> 3. All three are accounted to memcg because system memory is actually used as fallback if applications over allocate device local memory.
>>
>> It's debatable what should be the default, but we clearly need to handle all three use cases. Potentially even on the same system.
> 
> 
> Give me cases where 1 or 3 actually make sense in the real world.
> 
> I can maybe take 1 if CMA is just old school CMA carved out preboot so
> it's not in the main memory pool, but in that case it's just equiv to
> device memory really

Well I think #1 is pretty much the default for dGPUs on a desktop. That's why I mentioned it first.

> If something is in the main memory pool, it should be accounted for
> using memcg. You cannot remove memory from the main memory pool
> without accounting for it.

That's what I strongly disagree with. See, the page cache is not accounted to memcg either, so when you open a file and the kernel caches the backing pages, that doesn't reduce the amount you can allocate through malloc, does it?

For dGPUs GTT is basically just the fallback when you over allocate local memory (plus a few things for uploads).

In other words, system memory becomes the swap of device local memory. Just think about why memcg doesn't limit swap but only how much is swapped out.

For those use cases you want to have a hard static limit on how much system memory can be used as swap. That's why we originally used to have the per-driver gttsize, the global TTM page limit, etc.

The problem is that we weakened those limitations because of the APU use case, and that in turn resulted in all those problems with browsers over-allocating system memory, etc.

Now cgroups should provide an alternative and I still think that this is the right approach to solve this, but in this alternative I think we want to preserve the original idea of separate domains for dGPUs.

> Now we can add gpu limits to memcg, that
> was going to be a next step in my series.
> 
> Whether we have that as a percentage or a hard limit, we would just
> say GPU can consume 95% of the configured max for this cgroup.

That is only useful on APUs which don't have local memory because those make all of their allocations through system memory.

dGPUs should be much more limited in that regard.

> 3 to me just sounds like we haven't figured out fallback or
> suspend/resume accounting yet, which is true, but I'm not sure there
> is a reason for 3 to exist outside of the we don't know how to account
> for temporary storage of swapped out VRAM objects.

Mario has fixed or is at least working on the suspend/resume problems. So I don't consider that an issue any more.

The use case 3 happens on HPC systems where device local memory is basically just a cache. For example this one here: https://en.wikipedia.org/wiki/Frontier_(supercomputer)

In this use case you don't care whether a buffer is in device local memory or system memory. What you care about is that things are reliable, and for that your task at hand shouldn't exceed a certain limit.

E.g. you run computation A which can use 100GB of resources and when computation B starts concurrently you don't want A to suddenly fail because it now fights with B for resources.
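
To make the A/B example concrete, here is a minimal toy model in Python. It is only a sketch and not kernel code: the Pool class, the 160GB capacity and the 100GB/60GB limits are assumptions picked for illustration. It shows that without per-task limits B can take what A needs, while with hard limits A can always reach its 100GB.

```python
class Pool:
    """Toy shared resource with a fixed capacity and optional per-task hard limits."""

    def __init__(self, capacity):
        self.capacity = capacity  # total size in GB (hypothetical)
        self.used = {}            # task name -> GB currently allocated
        self.limit = {}           # task name -> hard limit in GB

    def set_limit(self, task, limit):
        self.limit[task] = limit

    def alloc(self, task, size):
        """Allocate size GB for task; return False if a limit or the capacity is hit."""
        used = self.used.get(task, 0)
        # A task without an explicit limit is only bounded by the pool capacity.
        limit = self.limit.get(task, self.capacity)
        if used + size > limit:
            return False
        if sum(self.used.values()) + size > self.capacity:
            return False
        self.used[task] = used + size
        return True


# Without limits, B grabs 100GB first and A can no longer get its 100GB.
unlimited = Pool(160)
b_first = unlimited.alloc("B", 100)
a_after_b = unlimited.alloc("A", 100)

# With hard limits, B is capped at 60GB and A reliably gets its 100GB.
limited = Pool(160)
limited.set_limit("A", 100)
limited.set_limit("B", 60)
b_too_much = limited.alloc("B", 100)
b_within = limited.alloc("B", 60)
a_full = limited.alloc("A", 100)
```

The point of the sketch is only that reliability for A comes from a limit on B, independent of where the buffers physically live.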

> Like it might be we need to have it so we have a limited transfer pool
> of system memory for VRAM objects to "live in" but we move them to
> swap as soon as possible once we get to the limit on that. Now what we
> do on systems where no swap is available, that gets into I've no idea
> space.
> 
> Static partitioning memcg up into a dmem and memcg isn't going to
> solve this, we should solve it inside memcg.

Well it's certainly possible to solve all of this in memcg, but I don't think it's very elegant.

Static partitioning between memcg and dmem for the dGPU case, merged accounting for the APU case by default, and then giving the system administrator the option to eventually switch to use case 3 sounds much more flexible to me.
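
As a rough illustration of the three use cases, the following Python sketch models which counters an allocation would be charged to in each mode. Everything in it is hypothetical: the Mode and Cgroup names, the "vram"/"gtt"/"memcg" keys and the limits are assumptions for the sake of the example and do not correspond to an existing kernel interface.

```python
from dataclasses import dataclass, field
from enum import Enum

GiB = 1 << 30


class Mode(Enum):
    SEPARATE = 1      # use case 1: VRAM, GTT/CMA and memcg are separate domains
    GTT_IN_MEMCG = 2  # use case 2: VRAM separate, GTT/CMA also charged to memcg
    ALL_IN_MEMCG = 3  # use case 3: VRAM and GTT/CMA are both charged to memcg


@dataclass
class Cgroup:
    mode: Mode
    limits: dict  # hypothetical per-domain limits in bytes
    usage: dict = field(
        default_factory=lambda: {"vram": 0, "gtt": 0, "memcg": 0}
    )

    def _domains(self, kind):
        """Return the counters an allocation of the given kind is charged to."""
        if kind == "memcg":
            return ["memcg"]
        also_memcg = self.mode is Mode.ALL_IN_MEMCG or (
            self.mode is Mode.GTT_IN_MEMCG and kind == "gtt"
        )
        return [kind, "memcg"] if also_memcg else [kind]

    def charge(self, kind, size):
        """Charge size bytes; refuse without side effects if any limit is exceeded."""
        domains = self._domains(kind)
        if any(self.usage[d] + size > self.limits[d] for d in domains):
            return False
        for d in domains:
            self.usage[d] += size
        return True


limits = {"vram": 8 * GiB, "gtt": 4 * GiB, "memcg": 4 * GiB}

# dGPU default (use case 1): GTT usage does not eat into the memcg budget.
dgpu = Cgroup(Mode.SEPARATE, limits)
dgpu_gtt = dgpu.charge("gtt", 3 * GiB)
dgpu_sys = dgpu.charge("memcg", 4 * GiB)

# APU default (use case 2): GTT usage counts against memcg as well.
apu = Cgroup(Mode.GTT_IN_MEMCG, limits)
apu_gtt = apu.charge("gtt", 3 * GiB)
apu_sys = apu.charge("memcg", 2 * GiB)

# Use case 3: VRAM counts against memcg too, so the GTT fallback is refused.
hpc = Cgroup(Mode.ALL_IN_MEMCG, limits)
hpc_vram = hpc.charge("vram", 3 * GiB)
hpc_gtt = hpc.charge("gtt", 2 * GiB)
```

In this model, switching between the use cases is a per-cgroup setting rather than a module parameter, which is the flexibility argued for above.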

At least the obvious advantage is that you don't start adding module parameters to TTM, DMA-buf heaps and drivers to control whether they account to memcg, but rather keep all the logic inside cgroups.

Christian.

> 
> Dave.
