linux-mm.kvack.org archive mirror
From: David Airlie <airlied@redhat.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Dave Airlie <airlied@gmail.com>,
	dri-devel@lists.freedesktop.org,
	 Christian Koenig <christian.koenig@amd.com>,
	Matthew Brost <matthew.brost@intel.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org,  Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 1/2] mm: add gpu active/reclaim per-node stat counters
Date: Thu, 19 Jun 2025 10:42:37 +1000
Message-ID: <CAMwc25rd7Tgvmdm4b3HeEqi1Nw+NDSc1d6wQX4hrNVsQD-OQPw@mail.gmail.com>
In-Reply-To: <6yxpihotsrg73dmlr2fajga2b7qbdnsroi2tq7alohrqt56dx3@sjyoy4yg2ck7>

On Thu, Jun 19, 2025 at 10:33 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Wed, Jun 18, 2025 at 02:06:17PM +1000, Dave Airlie wrote:
> > From: Dave Airlie <airlied@redhat.com>
> >
> > While discussing memcg integration with gpu memory allocations,
> > it was pointed out that there were no numa/system counters for
> > GPU memory allocations.
> >
> > With more integrated memory GPU server systems turning up, and
> > more requirements for memory tracking it seems we should start
> > closing the gap.
> >
> > Add two counters to track GPU per-node system memory allocations.
> >
> > The first is currently allocated to GPU objects, and the second
> > is for memory that is stored in GPU page pools that can be reclaimed,
> > by the shrinker.
> >
> > Cc: Christian Koenig <christian.koenig@amd.com>
> > Cc: Matthew Brost <matthew.brost@intel.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: linux-mm@kvack.org
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Dave Airlie <airlied@redhat.com>
> >
> > ---
> >
> > I'd like to get acks to merge this via the drm tree, if possible,
> >
> > Dave.
> > ---
> >  Documentation/filesystems/proc.rst | 6 ++++++
> >  drivers/base/node.c                | 5 +++++
> >  fs/proc/meminfo.c                  | 6 ++++++
> >  include/linux/mmzone.h             | 2 ++
> >  mm/show_mem.c                      | 9 +++++++--
> >  mm/vmstat.c                        | 2 ++
> >  6 files changed, 28 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 5236cb52e357..45f61a19a790 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
> >      CmaFree:               0 kB
> >      Unaccepted:            0 kB
> >      Balloon:               0 kB
> > +    GPUActive:             0 kB
> > +    GPUReclaim:            0 kB
> >      HugePages_Total:       0
> >      HugePages_Free:        0
> >      HugePages_Rsvd:        0
> > @@ -1273,6 +1275,10 @@ Unaccepted
> >                Memory that has not been accepted by the guest
> >  Balloon
> >                Memory returned to Host by VM Balloon Drivers
> > +GPUActive
> > +              Memory allocated to GPU objects
> > +GPUReclaim
> > +              Memory in GPU allocator pools that is reclaimable
>
> Can you please explain a bit more about these GPUActive & GPUReclaim?
> Please correct me if I am wrong, GPUActive is the total memory used by
> GPU objects and GPUReclaim is the subset of GPUActive which is
> reclaimable (possibly through shrinkers).

Currently:
GPUActive is the total memory used by active GPU objects.
GPUReclaim is the amount of memory (not a subset of Active) that is
held in reusable GPU page pools and can be reclaimed by a simple
shrinker. (This memory usually has different page table attributes,
uncached or write-combined.)
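
For illustration only, here is a minimal user-space sketch of how the two fields could be read, assuming a kernel with this patch applied exposes them in /proc/meminfo exactly as in the proc.rst hunk quoted above. The function name parse_gpu_meminfo and the sample values are hypothetical, not output from a real system.

```python
import re

def parse_gpu_meminfo(text):
    """Return the GPUActive/GPUReclaim values (in kB) found in meminfo-style text.

    Fields that are absent (for example on a kernel without the patch)
    are simply left out of the result.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*(GPUActive|GPUReclaim):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

# Made-up sample in the layout shown in the proc.rst hunk.
SAMPLE = """\
Balloon:               0 kB
GPUActive:          2048 kB
GPUReclaim:          512 kB
HugePages_Total:       0
"""

stats = parse_gpu_meminfo(SAMPLE)
print(stats)
```

On a real system the text would come from reading /proc/meminfo. Since Reclaim is not a subset of Active, the two values are independent and would be added, not subtracted, to get total GPU-related system memory.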

Example workflow:

User allocates cached system RAM for a GPU object:
  Active increases.
User frees the cached system RAM:
  Active decreases.

User allocates write-combined (WC) system RAM for a GPU object:
  Active increases.
User frees the WC system RAM:
  Active decreases,
  Reclaim increases.
User allocates another WC system RAM object:
  Reclaim decreases,
  Active increases.
Shrinker shrinks the pool:
  Reclaim decreases.
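
The workflow above can be sketched as a toy model of the two counters. This is not kernel code: the class GpuNodeStats, its methods, and the page counts are hypothetical and only mirror the accounting rules described here, assuming freed write-combined pages go to the pool and later write-combined allocations reuse pool pages first.

```python
class GpuNodeStats:
    """Toy model of the per-node GPUActive / GPUReclaim counters (in pages)."""

    def __init__(self):
        self.active = 0
        self.reclaim = 0

    def alloc(self, pages, pooled):
        """Allocate pages for a GPU object.

        pooled=True models write-combined/uncached memory, which takes
        pages from the reusable pool first (Reclaim decreases).
        """
        if pooled:
            reused = min(pages, self.reclaim)
            self.reclaim -= reused
        self.active += pages

    def free(self, pages, pooled):
        """Free a GPU object; pooled pages move to the pool instead of the system."""
        self.active -= pages
        if pooled:
            self.reclaim += pages

    def shrink(self, pages):
        """Model the shrinker returning up to `pages` pool pages to the system."""
        freed = min(pages, self.reclaim)
        self.reclaim -= freed
        return freed


stats = GpuNodeStats()

stats.alloc(4, pooled=False)   # cached alloc: Active increases
stats.free(4, pooled=False)    # cached free: Active decreases
cached_result = (stats.active, stats.reclaim)

stats.alloc(4, pooled=True)    # WC alloc: Active increases
stats.free(4, pooled=True)     # WC free: Active decreases, Reclaim increases
stats.alloc(2, pooled=True)    # another WC alloc: Reclaim decreases, Active increases
after_reuse = (stats.active, stats.reclaim)

freed = stats.shrink(8)        # shrinker: Reclaim decreases
final = (stats.active, stats.reclaim)
print(cached_result, after_reuse, freed, final)
```

The model shows that the counters are disjoint: a page is counted in Active or in Reclaim, never both.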

In the future there could be a third type of memory, though I'm not
sure it is necessary to account for it at this level: Active memory
that the driver considers discardable and that could be shrunk
easily. I'm not seeing much consistency in how drivers use this, or
even what use case it is needed for, so I'm not going to address it
yet. It could end up in Reclaim, but I'd need to see the use cases
first.

Dave.



