From: David Airlie
Date: Thu, 26 Jun 2025 05:16:04 +1000
Subject: Re: [PATCH 1/2] mm: add gpu active/reclaim per-node stat counters (v2)
To: Christian König
Cc: Dave Airlie, dri-devel@lists.freedesktop.org, Matthew Brost, Johannes Weiner, linux-mm@kvack.org, Andrew Morton
References: <20250619072026.635133-1-airlied@gmail.com> <724720cd-eb05-4fc0-85a1-f6b60649b1ad@amd.com> <7dd0885a-7e7c-41a9-ae81-811fc344caf5@amd.com>
In-Reply-To: <7dd0885a-7e7c-41a9-ae81-811fc344caf5@amd.com>

On Wed, Jun 25, 2025 at 9:55 PM Christian König wrote:
>
> On 24.06.25 03:12, David Airlie wrote:
> > On Mon, Jun 23, 2025 at 6:54 PM Christian König wrote:
> >>
> >> On 6/19/25 09:20, Dave Airlie wrote:
> >>> From: Dave Airlie
> >>>
> >>> While discussing memcg integration with gpu memory allocations,
> >>> it was pointed out that there were no numa/system counters for
> >>> GPU memory allocations.
> >>>
> >>> With more integrated memory GPU server systems turning up, and
> >>> more requirements for memory tracking it seems we should start
> >>> closing the gap.
> >>>
> >>> Add two counters to track GPU per-node system memory allocations.
> >>>
> >>> The first is currently allocated to GPU objects, and the second
> >>> is for memory that is stored in GPU page pools that can be reclaimed,
> >>> by the shrinker.
> >>>
> >>> Cc: Christian Koenig
> >>> Cc: Matthew Brost
> >>> Cc: Johannes Weiner
> >>> Cc: linux-mm@kvack.org
> >>> Cc: Andrew Morton
> >>> Signed-off-by: Dave Airlie
> >>>
> >>> ---
> >>>
> >>> v2: add more info to the documentation on this memory.
> >>>
> >>> I'd like to get acks to merge this via the drm tree, if possible,
> >>>
> >>> Dave.
> >>> ---
> >>>  Documentation/filesystems/proc.rst | 8 ++++++++
> >>>  drivers/base/node.c                | 5 +++++
> >>>  fs/proc/meminfo.c                  | 6 ++++++
> >>>  include/linux/mmzone.h             | 2 ++
> >>>  mm/show_mem.c                      | 9 +++++++--
> >>>  mm/vmstat.c                        | 2 ++
> >>>  6 files changed, 30 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> >>> index 5236cb52e357..7cc5a9185190 100644
> >>> --- a/Documentation/filesystems/proc.rst
> >>> +++ b/Documentation/filesystems/proc.rst
> >>> @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
> >>>      CmaFree:        0 kB
> >>>      Unaccepted:     0 kB
> >>>      Balloon:        0 kB
> >>> +    GPUActive:      0 kB
> >>> +    GPUReclaim:     0 kB
> >>
> >> Active certainly makes sense, but I think we should rather disable the pool on newer CPUs than add reclaimable memory here.
> >
> > I'm not just concerned about newer platforms though, even on Fedora 42
> > on my test ryzen1+7900xt machine, with a desktop session running
> >
> > nr_gpu_active 7473
> > nr_gpu_reclaim 6656
> >
> > It's not an insignificant amount of memory.
>
> That was not what I meant; that you have quite a bunch of memory allocated to the GPU is correct.
>
> But the problem is more that we used the pool for way too many things, which is actually not necessary.
>
> But granted this is orthogonal to that patch here.

At least here these are all WC allocations, probably from userspace, so
it feels like we are using it correctly, since we stopped pooling
cached pages.

> > I also think if we get to
> > some sort of discardable GTT objects with a shrinker they should
> > probably be accounted in reclaim.
>
> The problem is that this is extremely driver specific.
>
> On amdgpu we have some temporary buffers which can be reclaimed immediately, but the really big chunk is for example what XE does with its shrinker.
>
> See Thomas' TTM patches from a few months ago.
> Whether memory is active or reclaimable does not depend on how it is allocated, but on how it is used.
>
> So the accounting needs to be at the driver level if you really want to distinguish between the two states.

Deciding how the counters are used can be done at the driver level on
top of this. That said, for discardable memory I think there are
grounds for ttm_tt having a discardable flag once we see a couple of
drivers using it, and then maybe the counters could be moved. It's
also fine to use these counters in drivers outside TTM, as long as
that is done appropriately, just so we can see the memory allocations
as part of the big picture.

Dave.