Message-ID: <1fc4f91c-8142-44dd-0f1d-a75420bf450d@openvz.org>
Date: Sat, 11 Jun 2022 21:32:47 +0300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1
Subject: Re: [PATCH] mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe
Content-Language: en-US
To: Roman Gushchin, Andrew Morton, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Naresh Kamboju, Qian Cai, Kefeng Wang, Linux Kernel Functional Testing, Shakeel Butt
References: <20220610180310.1725111-1-roman.gushchin@linux.dev>
From: Vasily Averin
In-Reply-To: <20220610180310.1725111-1-roman.gushchin@linux.dev>
Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: 7bit

On 6/10/22 21:03, Roman Gushchin wrote:
> Currently mem_cgroup_from_obj() is not working properly with objects
> allocated using vmalloc(). It creates problems in some cases, when
> it's called for static objects belonging to modules or generally
> allocated using vmalloc().
>
> This patch makes mem_cgroup_from_obj() safe to be called on objects
> allocated using vmalloc().
>
> It also introduces mem_cgroup_from_slab_obj(), which is a faster
> version to use in places when we know the object is either a slab
> object or a generic slab page (e.g. when adding an object to a lru
> list).
>
> Suggested-by: Kefeng Wang
> Signed-off-by: Roman Gushchin
> Tested-by: Linux Kernel Functional Testing
> Acked-by: Shakeel Butt

I've tested this patch together with my patch
"net: set proper memcg for net_init hooks allocations"
and successfully booted a test kernel on an arm64 VM
without any memcg-related warnings.

[root@fedora ~]# uname -a
Linux fedora 5.19.0-rc1-next-20220610+ #1 SMP Sat Jun 11 16:06:23 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

Tested-by: Vasily Averin

> ---
>  include/linux/memcontrol.h |  6 ++++
>  mm/list_lru.c              |  2 +-
>  mm/memcontrol.c            | 71 +++++++++++++++++++++++++++-----------
>  3 files changed, 57 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0d7584e2f335..4d31ce55b1c0 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1761,6 +1761,7 @@ static inline int memcg_kmem_id(struct mem_cgroup *memcg)
>  }
>
>  struct mem_cgroup *mem_cgroup_from_obj(void *p);
> +struct mem_cgroup *mem_cgroup_from_slab_obj(void *p);
>
>  static inline void count_objcg_event(struct obj_cgroup *objcg,
>  				     enum vm_event_item idx)
> @@ -1858,6 +1859,11 @@ static inline struct mem_cgroup *mem_cgroup_from_obj(void *p)
>  	return NULL;
>  }
>
> +static inline struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
> +{
> +	return NULL;
> +}
> +
>  static inline void count_objcg_event(struct obj_cgroup *objcg,
>  				     enum vm_event_item idx)
>  {
> diff --git a/mm/list_lru.c b/mm/list_lru.c
> index ba76428ceece..a05e5bef3b40 100644
> --- a/mm/list_lru.c
> +++ b/mm/list_lru.c
> @@ -71,7 +71,7 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
>  	if (!list_lru_memcg_aware(lru))
>  		goto out;
>
> -	memcg = mem_cgroup_from_obj(ptr);
> +	memcg = mem_cgroup_from_slab_obj(ptr);
>  	if (!memcg)
>  		goto out;
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4093062c5c9b..8c408d681377 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -783,7 +783,7 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
>  	struct lruvec *lruvec;
>
>  	rcu_read_lock();
> -	memcg = mem_cgroup_from_obj(p);
> +	memcg = mem_cgroup_from_slab_obj(p);
>
>  	/*
>  	 * Untracked pages have no memcg, no lruvec. Update only the
> @@ -2833,27 +2833,9 @@ int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s,
>  	return 0;
>  }
>
> -/*
> - * Returns a pointer to the memory cgroup to which the kernel object is charged.
> - *
> - * A passed kernel object can be a slab object or a generic kernel page, so
> - * different mechanisms for getting the memory cgroup pointer should be used.
> - * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
> - * can not know for sure how the kernel object is implemented.
> - * mem_cgroup_from_obj() can be safely used in such cases.
> - *
> - * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> - * cgroup_mutex, etc.
> - */
> -struct mem_cgroup *mem_cgroup_from_obj(void *p)
> +static __always_inline
> +struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p)
>  {
> -	struct folio *folio;
> -
> -	if (mem_cgroup_disabled())
> -		return NULL;
> -
> -	folio = virt_to_folio(p);
> -
>  	/*
>  	 * Slab objects are accounted individually, not per-page.
>  	 * Memcg membership data for each individual object is saved in
> @@ -2886,6 +2868,53 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
>  	return page_memcg_check(folio_page(folio, 0));
>  }
>
> +/*
> + * Returns a pointer to the memory cgroup to which the kernel object is charged.
> + *
> + * A passed kernel object can be a slab object, vmalloc object or a generic
> + * kernel page, so different mechanisms for getting the memory cgroup pointer
> + * should be used.
> + *
> + * In certain cases (e.g. kernel stacks or large kmallocs with SLUB) the caller
> + * can not know for sure how the kernel object is implemented.
> + * mem_cgroup_from_obj() can be safely used in such cases.
> + *
> + * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> + * cgroup_mutex, etc.
> + */
> +struct mem_cgroup *mem_cgroup_from_obj(void *p)
> +{
> +	struct folio *folio;
> +
> +	if (mem_cgroup_disabled())
> +		return NULL;
> +
> +	if (unlikely(is_vmalloc_addr(p)))
> +		folio = page_folio(vmalloc_to_page(p));
> +	else
> +		folio = virt_to_folio(p);
> +
> +	return mem_cgroup_from_obj_folio(folio, p);
> +}
> +
> +/*
> + * Returns a pointer to the memory cgroup to which the kernel object is charged.
> + * Similar to mem_cgroup_from_obj(), but faster and not suitable for objects,
> + * allocated using vmalloc().
> + *
> + * A passed kernel object must be a slab object or a generic kernel page.
> + *
> + * The caller must ensure the memcg lifetime, e.g. by taking rcu_read_lock(),
> + * cgroup_mutex, etc.
> + */
> +struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
> +{
> +	if (mem_cgroup_disabled())
> +		return NULL;
> +
> +	return mem_cgroup_from_obj_folio(virt_to_folio(p), p);
> +}
> +
>  static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
>  {
>  	struct obj_cgroup *objcg = NULL;
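
For readers following the thread, the split between the two lookup
functions in the quoted patch can be modelled in userspace roughly as
below. This is only a sketch under stated assumptions, not kernel code:
every model_* name, the two address ranges and the tiny page table are
made up for illustration, and the mem_cgroup_disabled() check is left
out. It shows why the generic lookup has to resolve vmalloc addresses
through a mapping, while the slab-only variant can use plain address
arithmetic and therefore must never be handed a vmalloc address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a linear-mapped range and a vmalloc-like range. */
#define MODEL_PAGE_SIZE    4096u
#define MODEL_NR_PAGES     4u
#define MODEL_LINEAR_BASE  0x10000000u
#define MODEL_VMALLOC_BASE 0x80000000u

struct model_memcg { const char *name; };
struct model_folio { struct model_memcg *memcg; };

static struct model_memcg model_cg_a = { "cg-a" };
static struct model_memcg model_cg_b = { "cg-b" };

/* Backing pages: pages 0 and 2 are charged, pages 1 and 3 are not. */
static struct model_folio model_pages[MODEL_NR_PAGES] = {
	{ &model_cg_a }, { NULL }, { &model_cg_b }, { NULL },
};

/* Made-up page table: vmalloc virtual page i -> backing page index. */
static const size_t model_vmalloc_pt[MODEL_NR_PAGES] = { 2, 0, 3, 1 };

static bool model_in_range(uintptr_t addr, uintptr_t base)
{
	return addr >= base && addr - base < MODEL_NR_PAGES * MODEL_PAGE_SIZE;
}

static bool model_is_vmalloc_addr(uintptr_t addr)
{
	return model_in_range(addr, MODEL_VMALLOC_BASE);
}

/* Linear map: the folio follows from address arithmetic alone. */
static struct model_folio *model_virt_to_folio(uintptr_t addr)
{
	assert(model_in_range(addr, MODEL_LINEAR_BASE));
	return &model_pages[(addr - MODEL_LINEAR_BASE) / MODEL_PAGE_SIZE];
}

/* vmalloc range: the folio has to be looked up through the page table. */
static struct model_folio *model_vmalloc_to_folio(uintptr_t addr)
{
	size_t vpage = (addr - MODEL_VMALLOC_BASE) / MODEL_PAGE_SIZE;

	return &model_pages[model_vmalloc_pt[vpage]];
}

/* Common tail shared by both entry points. */
static struct model_memcg *model_memcg_from_folio(struct model_folio *folio)
{
	return folio->memcg;
}

/* Generic variant: accepts addresses from either range. */
struct model_memcg *model_memcg_from_obj(uintptr_t addr)
{
	struct model_folio *folio;

	if (model_is_vmalloc_addr(addr))
		folio = model_vmalloc_to_folio(addr);
	else
		folio = model_virt_to_folio(addr);

	return model_memcg_from_folio(folio);
}

/* Fast variant: skips the range check, linear-map addresses only. */
struct model_memcg *model_memcg_from_slab_obj(uintptr_t addr)
{
	return model_memcg_from_folio(model_virt_to_folio(addr));
}
```

In this model, model_memcg_from_obj(MODEL_VMALLOC_BASE) goes through the
page table to backing page 2, while passing the same address to
model_memcg_from_slab_obj() trips the range assertion. That mirrors the
restriction stated in the comment above mem_cgroup_from_slab_obj() in the
patch.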