From: Kefeng Wang <wangkefeng.wang@huawei.com>
Date: Thu, 23 May 2024 16:57:21 +0800
Subject: Re: [PATCH] mm: memcontrol: remove page_memcg()
To: Shakeel Butt, Matthew Wilcox
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
    Muchun Song,
    Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, linux-mm@kvack.org
References: <20240521131556.142176-1-wangkefeng.wang@huawei.com>

On 2024/5/22 3:29, Shakeel Butt wrote:
> On Tue, May 21, 2024 at 03:44:21PM +0100, Matthew Wilcox wrote:
>> On Tue, May 21, 2024 at 09:15:56PM +0800, Kefeng Wang wrote:
>>> page_memcg() is only called by mod_memcg_page_state(), so squash it
>>> into its caller and remove page_memcg().
>>
>> This isn't wrong, except that the entire usage of memcg is wrong in the
>> only two callers of mod_memcg_page_state():
>>
>> $ git grep mod_memcg_page_state
>> include/linux/memcontrol.h:static inline void mod_memcg_page_state(struct page *page,
>> include/linux/memcontrol.h:static inline void mod_memcg_page_state(struct page *page,
>> mm/vmalloc.c:		mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
>> mm/vmalloc.c:			mod_memcg_page_state(area->pages[i], MEMCG_VMALLOC, 1);
>>
>> The memcg should not be attached to the individual pages that make up a
>> vmalloc allocation. Rather, it should be managed by the vmalloc
>> allocation itself. I don't have the knowledge to poke around inside
>> vmalloc right now, but maybe somebody else could take that on.
>
> Are you concerned about accessing just the memcg, or any field of the
> sub-pages? There are drivers that access fields of pages allocated
> through vmalloc. Some details in commit 3b8000ae185c ("mm/vmalloc: huge
> vmalloc backing pages should be split rather than compound").

Maybe Matthew wants something like the change below, which moves the
MEMCG_VMALLOC stat update from per-page to per-vmalloc-allocation? It
should also speed up the accounting, since the counter is updated once
per allocation instead of once per backing page.
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index e4a631ec430b..89f115623124 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -55,6 +55,9 @@ struct vm_struct {
 	unsigned long		size;
 	unsigned long		flags;
 	struct page		**pages;
+#ifdef CONFIG_MEMCG_KMEM
+	struct obj_cgroup	*objcg;
+#endif
 #ifdef CONFIG_HAVE_ARCH_HUGE_VMALLOC
 	unsigned int		page_order;
 #endif
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5d3aa2dc88a8..3e28c382f604 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3001,6 +3001,49 @@ static inline void set_vm_area_page_order(struct vm_struct *vm, unsigned int ord
 #endif
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+static void vmalloc_memcg_alloc_hook(struct vm_struct *area, gfp_t gfp,
+				     int nr_pages)
+{
+	struct obj_cgroup *objcg;
+
+	if (!memcg_kmem_online() || !(gfp & __GFP_ACCOUNT))
+		return;
+
+	objcg = get_obj_cgroup_from_current();
+	if (!objcg)
+		return;
+
+	area->objcg = objcg;
+
+	rcu_read_lock();
+	mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_VMALLOC, nr_pages);
+	rcu_read_unlock();
+}
+
+static void vmalloc_memcg_free_hook(struct vm_struct *area)
+{
+	struct obj_cgroup *objcg = area->objcg;
+
+	if (!objcg)
+		return;
+
+	rcu_read_lock();
+	mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_VMALLOC, -area->nr_pages);
+	rcu_read_unlock();
+
+	obj_cgroup_put(objcg);
+}
+#else
+static void vmalloc_memcg_alloc_hook(struct vm_struct *area, gfp_t gfp,
+				     int nr_pages)
+{
+}
+static void vmalloc_memcg_free_hook(struct vm_struct *area)
+{
+}
+#endif
+
 /**
  * vm_area_add_early - add vmap area early during boot
  * @vm: vm_struct to add
@@ -3338,7 +3381,6 @@ void vfree(const void *addr)
 		struct page *page = vm->pages[i];
 
 		BUG_ON(!page);
-		mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
 		/*
 		 * High-order allocs for huge vmallocs are split, so
 		 * can be freed as an array of order-0 allocations
@@ -3347,6 +3389,7 @@
 		cond_resched();
 	}
 	atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
+	vmalloc_memcg_free_hook(vm);
 	kvfree(vm->pages);
 	kfree(vm);
 }
@@ -3643,12 +3686,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 			node, page_order, nr_small_pages, area->pages);
 
 	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
-	if (gfp_mask & __GFP_ACCOUNT) {
-		int i;
-
-		for (i = 0; i < area->nr_pages; i++)
-			mod_memcg_page_state(area->pages[i], MEMCG_VMALLOC, 1);
-	}
+	vmalloc_memcg_alloc_hook(area, gfp_mask, area->nr_pages);
 
 	/*
 	 * If not enough pages were obtained to accomplish an
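As a rough sketch of the intended behaviour (not part of the patch;
alloc_accounted_buffer() and the 16-page size are made up for this
example, __vmalloc() is the existing API):

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Illustration only: charge a whole vmalloc area to the current memcg. */
static void *alloc_accounted_buffer(void)
{
	/* 16 backing pages, accounted because of __GFP_ACCOUNT */
	void *buf = __vmalloc(16 * PAGE_SIZE, GFP_KERNEL | __GFP_ACCOUNT);

	/*
	 * Before the change: 16 mod_memcg_page_state() calls here, and
	 * another 16 at vfree() time.
	 * After the change: a single mod_memcg_state(..., MEMCG_VMALLOC, 16)
	 * from vmalloc_memcg_alloc_hook(), and one -16 update from
	 * vmalloc_memcg_free_hook() when the area is freed.
	 */
	return buf;
}

The "vmalloc" value reported in memory.stat should end up the same;
only the number of counter updates changes.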