From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dave Airlie
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Johannes Weiner, Christian Koenig
Cc: Dave Chinner, Kairui Song, Dave Airlie
Subject: [PATCH 07/15] memcg: add support for GPU page counters. (v2)
Date: Tue, 22 Jul 2025 11:43:20 +1000
Message-ID: <20250722014942.1878844-8-airlied@gmail.com>
In-Reply-To: <20250722014942.1878844-1-airlied@gmail.com>
References: <20250722014942.1878844-1-airlied@gmail.com>
MIME-Version: 1.0

From: Dave Airlie

This introduces two new statistics and three new memcontrol APIs for
dealing with GPU system memory allocations.

The stats correspond to the same stats in the global vmstat: the number
of active GPU pages, and the number of pages in pools that can be
reclaimed.

The first API charges an order of pages to an objcg, sets the objcg on
the pages as kmem does, and updates the active/reclaim statistic.

The second API uncharges a page from the obj cgroup it is currently
charged to.

The third API allows moving a page to/from reclaim and between obj
cgroups. When pages are added to the pool lru, this just updates
accounting. When pages are being removed from a pool lru, they can be
taken from the parent objcg, so this allows them to be uncharged from
there and transferred to a new child objcg.
Signed-off-by: Dave Airlie

---
v2: use memcg_node_stat_items
---
 Documentation/admin-guide/cgroup-v2.rst |   6 ++
 include/linux/memcontrol.h              |  12 +++
 mm/memcontrol.c                         | 107 ++++++++++++++++++++++++
 3 files changed, 125 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 0cc35a14afbe..28088c4e52d3 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1542,6 +1542,12 @@ The following nested keys are defined.
 	  vmalloc (npn)
 		Amount of memory used for vmap backed memory.
 
+	  gpu_active (npn)
+		Amount of system memory used for GPU devices.
+
+	  gpu_reclaim (npn)
+		Amount of system memory cached for GPU devices.
+
 	  shmem
 		Amount of cached filesystem data that is swap-backed,
 		such as tmpfs, shm segments, shared anonymous mmap()s
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 87b6688f124a..21328f207d38 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1597,6 +1597,18 @@ struct sock;
 bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 			     gfp_t gfp_mask);
 void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
+
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+				unsigned int order,
+				gfp_t gfp_mask, bool reclaim);
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order,
+				  bool reclaim);
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim);
+
 #ifdef CONFIG_MEMCG
 extern struct static_key_false memcg_sockets_enabled_key;
 #define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 902da8a9c643..94fe825f0e0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -330,6 +330,8 @@ static const unsigned int memcg_node_stat_items[] = {
 #ifdef CONFIG_HUGETLB_PAGE
 	NR_HUGETLB,
 #endif
+	NR_GPU_ACTIVE,
+	NR_GPU_RECLAIM,
 };
 
 static const unsigned int memcg_stat_items[] = {
@@ -1345,6 +1347,8 @@ static const struct memory_stat memory_stats[] = {
 	{ "percpu",			MEMCG_PERCPU_B			},
 	{ "sock",			MEMCG_SOCK			},
 	{ "vmalloc",			MEMCG_VMALLOC			},
+	{ "gpu_active",			NR_GPU_ACTIVE			},
+	{ "gpu_reclaim",		NR_GPU_RECLAIM			},
 	{ "shmem",			NR_SHMEM			},
 #ifdef CONFIG_ZSWAP
 	{ "zswap",			MEMCG_ZSWAP_B			},
@@ -5132,6 +5136,109 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
 	refill_stock(memcg, nr_pages);
 }
 
+/**
+ * mem_cgroup_charge_gpu_page - charge a page to GPU memory tracking
+ * @objcg: objcg to charge, NULL charges root memcg
+ * @page: page to charge
+ * @order: page allocation order
+ * @gfp_mask: gfp mode
+ * @reclaim: charge the reclaim counter instead of the active one.
+ *
+ * Charge the order sized @page to the objcg. Returns %true if the charge
+ * fit within @objcg's configured limit, %false if it doesn't.
+ */
+bool mem_cgroup_charge_gpu_page(struct obj_cgroup *objcg, struct page *page,
+				unsigned int order, gfp_t gfp_mask, bool reclaim)
+{
+	unsigned int nr_pages = 1 << order;
+	struct mem_cgroup *memcg = NULL;
+	struct lruvec *lruvec;
+	int ret;
+
+	if (objcg) {
+		memcg = get_mem_cgroup_from_objcg(objcg);
+
+		ret = try_charge_memcg(memcg, gfp_mask, nr_pages);
+		if (ret) {
+			mem_cgroup_put(memcg);
+			return false;
+		}
+
+		obj_cgroup_get(objcg);
+		page_set_objcg(page, objcg);
+	}
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+
+	mem_cgroup_put(memcg);
+	return true;
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_charge_gpu_page);
+
+/**
+ * mem_cgroup_uncharge_gpu_page - uncharge a page from GPU memory tracking
+ * @page: page to uncharge
+ * @order: order of the page allocation
+ * @reclaim: uncharge the reclaim counter instead of the active one.
+ */
+void mem_cgroup_uncharge_gpu_page(struct page *page,
+				  unsigned int order, bool reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
+	int nr_pages = 1 << order;
+
+	memcg = objcg ? get_mem_cgroup_from_objcg(objcg) : NULL;
+
+	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+	mod_lruvec_state(lruvec, reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, -nr_pages);
+
+	if (!mem_cgroup_is_root(memcg))
+		refill_stock(memcg, nr_pages);
+	page->memcg_data = 0;
+	obj_cgroup_put(objcg);
+	mem_cgroup_put(memcg);
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_uncharge_gpu_page);
+
+/**
+ * mem_cgroup_move_gpu_page_reclaim - move a page between gpu active and gpu reclaim
+ * @new_objcg: objcg to move the page to, NULL if just a stats update.
+ * @page: page to move
+ * @order: order of the page allocation
+ * @to_reclaim: true moves the page into reclaim, false moves it back
+ */
+bool mem_cgroup_move_gpu_page_reclaim(struct obj_cgroup *new_objcg,
+				      struct page *page,
+				      unsigned int order,
+				      bool to_reclaim)
+{
+	struct obj_cgroup *objcg = page_objcg(page);
+
+	if (!objcg)
+		return false;
+
+	if (!new_objcg || objcg == new_objcg) {
+		struct mem_cgroup *memcg = get_mem_cgroup_from_objcg(objcg);
+		struct lruvec *lruvec;
+		unsigned long flags;
+		int nr_pages = 1 << order;
+
+		lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
+		local_irq_save(flags);
+		__mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_RECLAIM : NR_GPU_ACTIVE, nr_pages);
+		__mod_lruvec_state(lruvec, to_reclaim ? NR_GPU_ACTIVE : NR_GPU_RECLAIM, -nr_pages);
+		local_irq_restore(flags);
+		mem_cgroup_put(memcg);
+		return true;
+	} else {
+		mem_cgroup_uncharge_gpu_page(page, order, true);
+		return mem_cgroup_charge_gpu_page(new_objcg, page, order, 0, false);
+	}
+}
+EXPORT_SYMBOL_GPL(mem_cgroup_move_gpu_page_reclaim);
+
 static int __init cgroup_memory(char *s)
 {
 	char *token;
-- 
2.49.0