From: Dave Airlie <airlied@gmail.com>
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Johannes Weiner, Christian Koenig
Cc: Dave Chinner, Kairui Song, Dave Airlie
Subject: [PATCH 08/18] memcg: add support for GPU page counters. (v2)
Date: Mon, 14 Jul 2025 15:18:23 +1000
Message-ID: <20250714052243.1149732-9-airlied@gmail.com>
In-Reply-To: <20250714052243.1149732-1-airlied@gmail.com>
References: <20250714052243.1149732-1-airlied@gmail.com>

This introduces two new statistics and three new memcontrol APIs for
dealing with GPU system memory allocations.

The stats correspond to the same stats in the global vmstat: the number
of active GPU pages, and the number of pages in pools that can be
reclaimed.

The first API charges an order of pages to an objcg, sets the objcg on
the pages as kmem does, and updates the active/reclaim statistic.

The second API uncharges a page from the obj cgroup it is currently
charged to.

The third API allows moving a page to/from reclaim and between obj
cgroups. When pages are added to the pool lru, this just updates
accounting. When pages are being removed from a pool lru, they can be
taken from the parent objcg, so this allows them to be uncharged from
there and transferred to a new child objcg.

Signed-off-by: Dave Airlie

---
v2: use memcg_node_stat_items
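As a usage illustration (not part of this patch), a driver-side pool
allocator might wire these three entry points up roughly as below; the
gpu_* helper names are hypothetical:

/*
 * Hypothetical driver-side sketch; assumes <linux/gfp.h> and
 * <linux/memcontrol.h>.
 */

/* Allocate pages and charge them to @objcg's active GPU counter. */
static struct page *gpu_alloc_pages(struct obj_cgroup *objcg,
				    unsigned int order)
{
	struct page *page = alloc_pages(GFP_KERNEL, order);

	if (!page)
		return NULL;
	/* Fails if the charge does not fit the cgroup's limit. */
	if (!mem_cgroup_charge_gpu_page(objcg, page, order, GFP_KERNEL,
					false)) {
		__free_pages(page, order);
		return NULL;
	}
	return page;
}

/* Park a page in a driver pool: same objcg, active -> reclaim. */
static void gpu_pool_park_pages(struct page *page, unsigned int order)
{
	mem_cgroup_move_gpu_page_reclaim(NULL, page, order, true);
}

/* Final free: drop the charge, then the pages. */
static void gpu_free_pages(struct page *page, unsigned int order)
{
	mem_cgroup_uncharge_gpu_page(page, order, false);
	__free_pages(page, order);
}

Note that parking a page only shifts accounting between the active and
reclaim counters; it does not uncharge the page.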
 Documentation/admin-guide/cgroup-v2.rst |   6 ++
 include/linux/memcontrol.h              |  12 +++
 mm/memcontrol.c                         | 105 ++++++++++++++++++++++++
 3 files changed, 123 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 0cc35a14afbe..28088c4e52d3 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1542,6 +1542,12 @@ The following nested keys are defined.
 	  vmalloc (npn)
 		Amount of memory used for vmap backed memory.
 
+	  gpu_active (npn)
+		Amount of system memory used for GPU devices.
+
+	  gpu_reclaim (npn)
+		Amount of system memory cached for GPU devices.
+
 	  shmem
 		Amount of cached filesystem data that is swap-backed,
 		such as tmpfs, shm segments, shared anonymous mmap()s

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 87b6688f124a..21328f207d38 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1597,6 +1597,18 @@ struct sock;
 bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 			     gfp_t gfp_mask);
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
+
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+				unsigned int order,
+				gfp_t gfp_mask, bool reclaim);
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order,
+				  bool reclaim);
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim);
+
 #ifdef CONFIG_MEMCG
 extern struct static_key_false memcg_sockets_enabled_key;
 #define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 902da8a9c643..4c8ded9501c6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -330,6 +330,8 @@ static const unsigned int memcg_node_stat_items[] = {
 #ifdef CONFIG_HUGETLB_PAGE
 	NR_HUGETLB,
 #endif
+	NR_GPU_ACTIVE,
+	NR_GPU_RECLAIM,
 };
 
 static const unsigned int memcg_stat_items[] = {
@@ -1345,6 +1347,8 @@ static const struct memory_stat memory_stats[] = {
 	{ "percpu",			MEMCG_PERCPU_B			},
 	{ "sock",			MEMCG_SOCK			},
 	{ "vmalloc",			MEMCG_VMALLOC			},
+	{ "gpu_active",			NR_GPU_ACTIVE			},
+	{ "gpu_reclaim",		NR_GPU_RECLAIM			},
 	{ "shmem",			NR_SHMEM			},
 #ifdef CONFIG_ZSWAP
 	{ "zswap",			MEMCG_ZSWAP_B			},
@@ -5132,6 +5136,107 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 	refill_stock(memcg, nr_pages);
 }
 
+/**
+ * mem_cgroup_charge_gpu_page - charge a page to GPU memory tracking
+ * @objcg: objcg to charge, NULL charges root memcg
+ * @page: page to charge
+ * @order: page allocation order
+ * @gfp_mask: gfp mode
+ * @reclaim: charge the reclaim counter instead of the active one.
+ *
+ * Charge the order-sized @page to @objcg. Returns %true if the charge
+ * fits within @objcg's configured limit, %false if it doesn't.
+ */
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+				unsigned int order, gfp_t gfp_mask, bool reclaim)
+{
+	unsigned int nr_pages = 1 << order;
+	struct mem_cgroup *memcg = NULL;
+	struct lruvec *lruvec;
+	int ret;
+
+	if (objcg) {
+		memcg = get_mem_cgroup_from_objcg(objcg);
+
+		ret = try_charge_memcg(memcg, gfp_mask, nr_pages);
+		if (ret) {
+			css_put(&memcg->css);
+			return false;
+		}
+
+		obj_cgroup_get(objcg);
+		page_set_objcg(page, objcg);
+	}
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+
+	mem_cgroup_put(memcg);
+	return true;
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_charge_gpu_page);
+
+/**
+ * mem_cgroup_uncharge_gpu_page - uncharge a page from GPU memory tracking
+ * @page: page to uncharge
+ * @order: order of the page allocation
+ * @reclaim: uncharge the reclaim counter instead of the active one.
+ */
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order, bool reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
+	int nr_pages = 1 << order;
+
+	memcg = objcg ? get_mem_cgroup_from_objcg(objcg) : NULL;
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, -nr_pages);
+
+	page->memcg_data = 0;
+	obj_cgroup_put(objcg);
+	mem_cgroup_put(memcg);
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_uncharge_gpu_page);
+
+/**
+ * mem_cgroup_move_gpu_page_reclaim - move a page between gpu active and reclaim
+ * @new_objcg: objcg to move the page to, NULL if just a stats update.
+ * @page: page to move
+ * @order: order of the page allocation
+ * @to_reclaim: true moves the page into reclaim, false moves it back
+ */
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *new_objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+
+	if (!objcg)
+		return false;
+
+	if (!new_objcg || objcg == new_objcg) {
+		struct mem_cgroup *memcg = get_mem_cgroup_from_objcg(objcg);
+		struct lruvec *lruvec;
+		unsigned long flags;
+		int nr_pages = 1 << order;
+
+		lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+		local_irq_save(flags);
+		__mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+		__mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_ACTIVE : NR_GPU_RECLAIM, -nr_pages);
+		local_irq_restore(flags);
+		mem_cgroup_put(memcg);
+		return true;
+	} else {
+		mem_cgroup_uncharge_gpu_page(page, order, true);
+		return mem_cgroup_charge_gpu_page(new_objcg, page, order, 0, false);
+	}
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_move_gpu_page_reclaim);
+
 static int __init cgroup_memory(char *s)
 {
 	char *token;
-- 
2.49.0
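
P.S. For reviewers, a hedged sketch (hypothetical helper, not in the
patch) of the parent-to-child transfer the commit message describes,
using the third API's re-charge path:

/*
 * Hypothetical sketch: a pooled page may be charged to an ancestor
 * objcg's reclaim counter.  When a task in a child cgroup pulls it
 * from the pool, passing a different objcg makes
 * mem_cgroup_move_gpu_page_reclaim() uncharge the old objcg's reclaim
 * counter and charge the page to @child's active counter instead.
 */
static bool gpu_pool_claim_page(struct obj_cgroup *child,
				struct page *page, unsigned int order)
{
	/* Returns false if the charge does not fit @child's limit. */
	return mem_cgroup_move_gpu_page_reclaim(child, page, order, false);
}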