From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 25 Feb 2023 18:14:46 +0300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0
Subject: Re: [PATCH v2 1/7] mm: vmscan: add a map_nr_max field to shrinker_info
To: Qi Zheng
Cc: sultan@kerneltoast.com, dave@stgolabs.net, penguin-kernel@I-love.SAKURA.ne.jp, paulmck@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Johannes Weiner, Shakeel Butt, Michal Hocko, Roman Gushchin, Muchun Song, David Hildenbrand, Yang Shi
References: <20230223132725.11685-1-zhengqi.arch@bytedance.com> <20230223132725.11685-2-zhengqi.arch@bytedance.com> <6f8f01b5-d802-db64-7725-8481c67c13a2@bytedance.com>
Content-Language: en-US
From: Kirill Tkhai
In-Reply-To: <6f8f01b5-d802-db64-7725-8481c67c13a2@bytedance.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Hi Qi,

On 25.02.2023 11:18, Qi Zheng wrote:
>
>
> On 2023/2/23 21:27, Qi Zheng wrote:
>> To prepare for the subsequent lockless memcg slab shrink,
>> add a map_nr_max field to struct shrinker_info to records
>> its own real shrinker_nr_max.
>>
>> No functional changes.
>>
>> Signed-off-by: Qi Zheng
>
> I missed Suggested-by here, hi Kirill, can I add it?
>
> Suggested-by: Kirill Tkhai

Yes, feel free to add this tag.

There is a comment below.

>> ---
>>  include/linux/memcontrol.h |  1 +
>>  mm/vmscan.c                | 29 ++++++++++++++++++-----------
>>  2 files changed, 19 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index b6eda2ab205d..aa69ea98e2d8 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -97,6 +97,7 @@ struct shrinker_info {
>>  	struct rcu_head rcu;
>>  	atomic_long_t *nr_deferred;
>>  	unsigned long *map;
>> +	int map_nr_max;
>>  };
>>  
>>  struct lruvec_stats_percpu {
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 9c1c5e8b24b8..9f895ca6216c 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -224,9 +224,16 @@ static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
>>  					 lockdep_is_held(&shrinker_rwsem));
>>  }
>>  
>> +static inline bool need_expand(int new_nr_max, int old_nr_max)
>> +{
>> +	return round_up(new_nr_max, BITS_PER_LONG) >
>> +	       round_up(old_nr_max, BITS_PER_LONG);
>> +}
>> +
>>  static int expand_one_shrinker_info(struct mem_cgroup *memcg,
>>  				    int map_size, int defer_size,
>> -				    int old_map_size, int old_defer_size)
>> +				    int old_map_size, int old_defer_size,
>> +				    int new_nr_max)
>>  {
>>  	struct shrinker_info *new, *old;
>>  	struct mem_cgroup_per_node *pn;
>> @@ -240,12 +247,16 @@ static int expand_one_shrinker_info(struct mem_cgroup *memcg,
>>  		if (!old)
>>  			return 0;
>>  
>> +		if (!need_expand(new_nr_max, old->map_nr_max))
>> +			return 0;
>> +
>>  		new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid);
>>  		if (!new)
>>  			return -ENOMEM;
>>  
>>  		new->nr_deferred = (atomic_long_t *)(new + 1);
>>  		new->map = (void *)new->nr_deferred + defer_size;
>> +		new->map_nr_max = new_nr_max;
>>  
>>  		/* map: set all old bits, clear all new bits */
>>  		memset(new->map, (int)0xff, old_map_size);
>> @@ -295,6 +306,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
>>  		}
>>  		info->nr_deferred = (atomic_long_t *)(info + 1);
>>  		info->map = (void *)info->nr_deferred + defer_size;
>> +		info->map_nr_max = shrinker_nr_max;
>>  		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
>>  	}
>>  	up_write(&shrinker_rwsem);
>> @@ -302,12 +314,6 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
>>  	return ret;
>>  }
>>  
>> -static inline bool need_expand(int nr_max)
>> -{
>> -	return round_up(nr_max, BITS_PER_LONG) >
>> -	       round_up(shrinker_nr_max, BITS_PER_LONG);
>> -}
>> -
>>  static int expand_shrinker_info(int new_id)
>>  {
>>  	int ret = 0;
>> @@ -316,7 +322,7 @@ static int expand_shrinker_info(int new_id)
>>  	int old_map_size, old_defer_size = 0;
>>  	struct mem_cgroup *memcg;
>>  
>> -	if (!need_expand(new_nr_max))
>> +	if (!need_expand(new_nr_max, shrinker_nr_max))
>>  		goto out;
>>  
>>  	if (!root_mem_cgroup)
>> @@ -332,7 +338,8 @@ static int expand_shrinker_info(int new_id)
>>  	memcg = mem_cgroup_iter(NULL, NULL, NULL);
>>  	do {
>>  		ret = expand_one_shrinker_info(memcg, map_size, defer_size,
>> -					       old_map_size, old_defer_size);
>> +					       old_map_size, old_defer_size,
>> +					       new_nr_max);
>>  		if (ret) {
>>  			mem_cgroup_iter_break(NULL, memcg);
>>  			goto out;
>> @@ -432,7 +439,7 @@ void reparent_shrinker_deferred(struct mem_cgroup *memcg)
>>  	for_each_node(nid) {
>>  		child_info = shrinker_info_protected(memcg, nid);
>>  		parent_info = shrinker_info_protected(parent, nid);
>> -		for (i = 0; i < shrinker_nr_max; i++) {
>> +		for (i = 0; i < child_info->map_nr_max; i++) {
>>  			nr = atomic_long_read(&child_info->nr_deferred[i]);
>>  			atomic_long_add(nr, &parent_info->nr_deferred[i]);
>>  		}
>> @@ -899,7 +906,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>>  	if (unlikely(!info))
>>  		goto unlock;
>>  
>> -	for_each_set_bit(i, info->map, shrinker_nr_max) {
>> +	for_each_set_bit(i, info->map, info->map_nr_max) {
>>  		struct shrink_control sc = {
>>  			.gfp_mask = gfp_mask,
>>  			.nid = nid,

The patch as a whole won't work as expected. It will never call shrinkers
with ids in

  [round_down(shrinker_nr_max, BITS_PER_LONG) + 1, shrinker_nr_max - 1]

because info->map_nr_max is updated only when the map is reallocated, while
shrinker_nr_max grows with every new id.

Just replay the sequence in which we add new shrinkers:

1) We add shrinker #0:
    shrinker_nr_max = 0;

    prealloc_memcg_shrinker()
      id = 0;
      expand_shrinker_info(0)
        new_nr_max = 1;
        expand_one_shrinker_info(new_nr_max = 1)
          new->map_nr_max = 1;
        shrinker_nr_max = 1;

2) We add shrinker #1:
    prealloc_memcg_shrinker()
      id = 1;
      expand_shrinker_info(1)
        new_nr_max = 2;
        need_expand(2, 1) => false => ignore expand
        shrinker_nr_max = 2;

3) Then we call shrinker:
    shrink_slab_memcg()
      for_each_set_bit(i, info->map, 1 /* info->map_nr_max */) {
      } => ignore shrinker #1

I'd fix this patch with something like the below:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9f895ca6216c..bb617a3871f1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -224,12 +224,6 @@ static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
 					 lockdep_is_held(&shrinker_rwsem));
 }
 
-static inline bool need_expand(int new_nr_max, int old_nr_max)
-{
-	return round_up(new_nr_max, BITS_PER_LONG) >
-	       round_up(old_nr_max, BITS_PER_LONG);
-}
-
 static int expand_one_shrinker_info(struct mem_cgroup *memcg,
 				    int map_size, int defer_size,
 				    int old_map_size, int old_defer_size,
@@ -247,9 +241,6 @@ static int expand_one_shrinker_info(struct mem_cgroup *memcg,
 		if (!old)
 			return 0;
 
-		if (!need_expand(new_nr_max, old->map_nr_max))
-			return 0;
-
 		new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid);
 		if (!new)
 			return -ENOMEM;
@@ -317,14 +308,11 @@ int alloc_shrinker_info(struct mem_cgroup *memcg)
 static int expand_shrinker_info(int new_id)
 {
 	int ret = 0;
-	int new_nr_max = new_id + 1;
+	int new_nr_max = round_up(new_id + 1, BITS_PER_LONG);
 	int map_size, defer_size = 0;
 	int old_map_size, old_defer_size = 0;
 	struct mem_cgroup *memcg;
 
-	if (!need_expand(new_nr_max, shrinker_nr_max))
-		goto out;
-
 	if (!root_mem_cgroup)
 		goto out;
 
@@ -359,9 +347,11 @@ void set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
 
 		rcu_read_lock();
 		info = rcu_dereference(memcg->nodeinfo[nid]->shrinker_info);
-		/* Pairs with smp mb in shrink_slab() */
-		smp_mb__before_atomic();
-		set_bit(shrinker_id, info->map);
+		if (!WARN_ON_ONCE(shrinker_id >= info->map_nr_max)) {
+			/* Pairs with smp mb in shrink_slab() */
+			smp_mb__before_atomic();
+			set_bit(shrinker_id, info->map);
+		}
 		rcu_read_unlock();
 	}
 }

(I also added a new check to set_shrinker_bit() for safety.)

Kirill
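
P.S. The replay above can also be checked in userspace. The following is only a rough sketch, not kernel code: struct sim, round_up_int(), add_shrinker_v2(), add_shrinker_fixed() and shrinker_visited() are hypothetical names made up for the illustration, BITS_PER_LONG is assumed to be 64, and the model tracks a single memcg whose shrinker ids are registered in increasing order. It also assumes the caller only attempts an expand when the id is at or beyond shrinker_nr_max.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_LONG 64	/* assumption: 64-bit build */

/* Userspace stand-in for the kernel's round_up(); y must be a power of two. */
static int round_up_int(int x, int y)
{
	return (x + y - 1) & ~(y - 1);
}

/* need_expand() as introduced by the v2 patch. */
static bool need_expand(int new_nr_max, int old_nr_max)
{
	return round_up_int(new_nr_max, BITS_PER_LONG) >
	       round_up_int(old_nr_max, BITS_PER_LONG);
}

/* Hypothetical model: the global counter plus one memcg's info. */
struct sim {
	int shrinker_nr_max;	/* models the global shrinker_nr_max */
	int map_nr_max;		/* models info->map_nr_max */
};

/* v2 behaviour: map_nr_max changes only when the map is reallocated. */
static void add_shrinker_v2(struct sim *s, int id)
{
	int new_nr_max = id + 1;

	assert(id >= s->shrinker_nr_max);	/* ids arrive in increasing order */
	if (need_expand(new_nr_max, s->shrinker_nr_max) &&
	    need_expand(new_nr_max, s->map_nr_max))
		s->map_nr_max = new_nr_max;
	s->shrinker_nr_max = new_nr_max;
}

/* Proposed behaviour: round new_nr_max up and always record it. */
static void add_shrinker_fixed(struct sim *s, int id)
{
	int new_nr_max;

	assert(id >= 0);
	if (id < s->shrinker_nr_max)
		return;		/* assumed caller check: id already covered */
	new_nr_max = round_up_int(id + 1, BITS_PER_LONG);
	s->map_nr_max = new_nr_max;
	s->shrinker_nr_max = new_nr_max;
}

/* for_each_set_bit(i, map, map_nr_max) can only reach bits below map_nr_max. */
static bool shrinker_visited(const struct sim *s, int id)
{
	return id < s->map_nr_max;
}
```

Starting from a zeroed struct sim and registering ids 0 and 1, the v2 model leaves map_nr_max at 1, so shrinker_visited() is false for id 1. With the rounded-up variant map_nr_max becomes BITS_PER_LONG after the first registration and id 1 is reachable.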