From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E366C64EC4 for ; Wed, 22 Feb 2023 08:16:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E33FD6B0071; Wed, 22 Feb 2023 03:16:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DE3456B0073; Wed, 22 Feb 2023 03:16:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C83C86B0074; Wed, 22 Feb 2023 03:16:22 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id BB19E6B0071 for ; Wed, 22 Feb 2023 03:16:22 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 7671B120C57 for ; Wed, 22 Feb 2023 08:16:22 +0000 (UTC) X-FDA: 80494220604.12.B5D1731 Received: from mail-pl1-f176.google.com (mail-pl1-f176.google.com [209.85.214.176]) by imf25.hostedemail.com (Postfix) with ESMTP id 142A7A001F for ; Wed, 22 Feb 2023 08:16:19 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=VJso8tsI; spf=pass (imf25.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=none) header.from=bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1677053780; a=rsa-sha256; cv=none; b=CJJkMt0x6jMlfWsEoxqM+Lovggg+hNk17l+AauFOv3CjWBatx6pjqCqyT4y/AJBTJ8KhQi jBR8yGi2gg5Fm90xZqpp2deogJHweGJXSRpGCHBpuWffs8Q3sP6HwfBctZNlEsb0iUjPSz dl0NV8EAD2s4ewHlRNeRVepe0oZYYKk= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=VJso8tsI; spf=pass (imf25.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.214.176 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=none) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677053780; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=HARGPugVb93dQnSioFTB2zuQ96e3hBJwOx/rwqR3n4k=; b=zwTtMEzqSF3hfX7e7BZH/eHs9i8Qszp9UQMvI29u5tPXhjyI6YW5cD0/Q+WEvUl9/jkohZ Cpp5ODeps7CeRUVUuv12W2hVZuVFZe4ZfTLOAW5u0U+0zIHpHvM7cSDpEhP7XqqRESjdsS S+Q9MdHLDlzZDhNU996ZYHxU0pN0Qws= Received: by mail-pl1-f176.google.com with SMTP id c1so8145750plg.4 for ; Wed, 22 Feb 2023 00:16:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; t=1677053779; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=HARGPugVb93dQnSioFTB2zuQ96e3hBJwOx/rwqR3n4k=; b=VJso8tsI93eu35pc3VxYqOg7hOmThZF2C6/VVNoF/2NBx4nwh1hlpTshvQjpNKifMR k2wtQ7gaEtFlOsRaU+xLl349t9Bnukdscz8vzKmZQapoEcvswUlfMc+d9T84Cgfjfeg1 kppw6zK4CuwHBKCbJdnN/syETBLjHRDi5D6jNHA2ll3fZw9g3YKkcYZrYertnN2qkDey g1UlooJP7pc7B0Ftu+l00TA5rEJqhhEA+jvdVraN64uySgsYm2/3NrFw8ovzmBmC1MGL a/JzFk3Y4X39Oc7L/efFR7a1v5AAi0sXkihkTx4lXbanCKkHZmE8oRCBi0keiNBWTZoZ rvLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677053779; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=HARGPugVb93dQnSioFTB2zuQ96e3hBJwOx/rwqR3n4k=; b=aox7fwysdVGVm9ZIro8Kd73bQOtyMfOOW7MaBt2ZS5lFLa56+QLRl7FihQS8vjKt7T 2NC0XYjLqALYGtSR0dGs7a9Yop9zVK5HASgFzR1W9JaEGE5zIBMiwuuRJQGfYyakLb/I ZnwNx0QdbocLskJ7XkeoCA4YUSiPJJ7I03Nur5wvupfpvwOHkgRxgtMTEvHDM1Fs4iRb Zz7Z4B9hSjm7JNvYSJOW2lyc7atlORrj+3bBTqJek2y3rkyknTJrqqbmKPvlty9DbSwx RB9fVBp/foCNmP94AMmyiUUIwANdwZZwqv4MT8DIEUVG1icZ+LxsME59L5foLToXp7Pl sS2w== X-Gm-Message-State: AO0yUKUEEbJzwlFQGVQRPbJnUi/sy5VMgAyyT6xYZebS+3sSN8GGRNaA mesql9MM7QeBvzOwKIwywHmyAA== X-Google-Smtp-Source: AK7set+EJQYbkAdrA8MCp8gP35wFOZPOvgMBqUZ/bEdAI6Sc+CHosq21YsseWSCKe5ecZlgYogqi1A== X-Received: by 2002:a05:6a21:3392:b0:cb:92d1:12fa with SMTP id yy18-20020a056a21339200b000cb92d112famr4249247pzb.5.1677053778697; Wed, 22 Feb 2023 00:16:18 -0800 (PST) Received: from [10.200.11.31] ([139.177.225.226]) by smtp.gmail.com with ESMTPSA id e187-20020a6369c4000000b004b1fef0bf16sm3288144pgc.73.2023.02.22.00.16.12 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 22 Feb 2023 00:16:18 -0800 (PST) Message-ID: <39cf7fa7-5dd6-0424-23de-1c33e948a201@bytedance.com> Date: Wed, 22 Feb 2023 16:16:09 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.7.2 Subject: Re: [PATCH 2/5] mm: vmscan: make memcg slab shrink lockless Content-Language: en-US To: Kirill Tkhai Cc: sultan@kerneltoast.com, dave@stgolabs.net, penguin-kernel@I-love.SAKURA.ne.jp, paulmck@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Johannes Weiner , Shakeel Butt , Michal Hocko , Roman Gushchin , Muchun Song , David Hildenbrand , Yang Shi References: <20230220091637.64865-1-zhengqi.arch@bytedance.com> <20230220091637.64865-3-zhengqi.arch@bytedance.com> From: Qi Zheng In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspam-User: X-Rspamd-Queue-Id: 142A7A001F X-Rspamd-Server: rspam01 X-Stat-Signature: bp6nr3u9ez39qwfp3dh9fyo5pcnfgn4a X-HE-Tag: 1677053779-682692 X-HE-Meta: U2FsdGVkX1/M5v3HrayQ3KhWDRE8STjtzsB/UcBV69PNgmD9nlYQUWk2o0xjP6qGfXfnRosvPecYzR256nHu8ln91zau9b4cKsiq7vWpWoe1jR19axIpPA+SbvEK+M52K9pWi3e9jzRnxrxuzNQXHKsKEbuNOhYDXiZjOdzWI5CkBoZjFbEuPFBjzNmM61fiVUmPVpIJ/IcQJBfVrkE0fn9xMOdf71K7eALVtpEbkQGnGLATmsw51VQMnRvB81y7ORPWdh7/YtjXufRjWsuMJvDP232aUQD2EsbQ+7YWas0K+ik6vv75DnF7ZUbD5HPYsDfgTyIM+nUVQAGpud1Cf+krl+fW+oeyIzO/iTZd5zPy3YF69Q7/jyvBzxuE78aYB/x2n7C9WC97FnY2pjHiFpdC/XSvbREoMwZftluO3EWoreASZFCKYPxJJOMApZ9uFS6JQZmajT2TiBG2388AZRD7KK9OF+Fgf3wC7mwIvZv1xcIVJRjgjNNkwsZyqr45SyLiEjTYUlCOU+QtL3thQp7hyg4nsQFHDxMS1TUuDStr9pQ8ZV+9Hqvli+Sa2o1s3vDhqoHP+BmZeFV5rhi6/0aOm+zZ8acasKv8Hv/iZusRJQfdJLaeEXRNN3gDR4l/UsqGfkFvgFJcIYAFE1UT8LAnNhO2lJa9j0RWRFMLBLrTedY/L59pUgEcLEqZGrRPoqa8jJJlnBXh010aO43G6jrS6PYnwngnFsW3lXKXKxn50BJuQzJZLx6ydKtwNgtaLudJqi9yVKbWcmM5uMfh1MLNreh3+X3owVhC3gUTzjafa9kpVRFlJoiIEPBkDwjWtFJNsX8b7BniyGc4rXG38foMgP3JnlarmOLy255GSpwKFSsOKfse+mgiYj+CSfD/kQz5S9/tNR7UzhP9V0/q8e46hdyT7mkSjtIZMSsUS4ppl/6ACWCap8Af+ezISZmippZ1+ImH52ntHWPznkg +0oA6Va1 Rg7jU3gOaqU0k6v3bbsIbdOdv2sSPJAAweXMZaO6SLlD3L7987Dxj2WUQFPsN2ePp5TF59v1CUJ917KnZ1wF+EIbl0+npSQaNC1uYvg+wKIaXXU7o7BVAwjU8T89NCAB0lmKIAmGvEUUAd9LkveRo5HXEA1L73mYcEKStDwJGFQUkrA0Q7q2oGsp8JG1bC0LI+/RqWrOpKEkfo7qvTlORSmHjv0JCWD2tpSM0gGiJAmCbP7yg+ZEp8FY+twNgZDRY1k+PWhGJX9fl1+yK5H68auZq0VTzudKxt4WlRjiPWlp505++P6pAal+bi3aKf2cMTFHZdqqMf1Lptz4Ni8hb8Ysh5O/UlSyDDadU5cxpIt088BW5rUPPm2HWDlNcFXkau/0AQPNZ0+HE8saNLn+3jLKajvtlqvUkTd46t/8ppYPJm+tvcohDDKbdG4JUZLTSI5wD X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi Kirill, On 2023/2/22 05:43, Kirill Tkhai wrote: > On 20.02.2023 12:16, Qi Zheng wrote: >> Like global slab shrink, since commit 1cd0bd06093c<...> >> static bool cgroup_reclaim(struct scan_control *sc) >> @@ -891,15 +905,14 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, >> { >> struct shrinker_info *info; >> unsigned long ret, freed = 0; >> + int srcu_idx; >> int i; >> >> if (!mem_cgroup_online(memcg)) >> return 0; >> >> - if (!down_read_trylock(&shrinker_rwsem)) >> - return 0; >> - >> - info = shrinker_info_protected(memcg, nid); >> + srcu_idx = srcu_read_lock(&shrinker_srcu); >> + info = shrinker_info_srcu(memcg, nid); >> if (unlikely(!info)) >> goto unlock; > > There is shrinker_nr_max dereference under this hunk. It's not in the patch: > > for_each_set_bit(i, info->map, shrinker_nr_max) { > > Since shrinker_nr_max may grow in parallel, this leads to access beyond allocated memory :( Oh, indeed. > > It looks like we should save size of info->map as a new member of struct shrinker_info. Agree, then we only traverse info->map_size each time. Like below: diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index b6eda2ab205d..f1b5d2803007 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -97,6 +97,7 @@ struct shrinker_info { struct rcu_head rcu; atomic_long_t *nr_deferred; unsigned long *map; + int map_size; }; struct lruvec_stats_percpu { diff --git a/mm/vmscan.c b/mm/vmscan.c index f94bfe540675..dd07eb107915 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -239,14 +239,20 @@ static void free_shrinker_info_rcu(struct rcu_head *head) kvfree(container_of(head, struct shrinker_info, rcu)); } -static int expand_one_shrinker_info(struct mem_cgroup *memcg, - int map_size, int defer_size, - int old_map_size, int old_defer_size) +static int expand_one_shrinker_info(struct mem_cgroup *memcg, int new_nr_max, + int old_nr_max) { + int map_size, defer_size, old_map_size, old_defer_size; struct shrinker_info *new, *old; struct mem_cgroup_per_node *pn; int nid; - int size = map_size + defer_size; + int size; + + map_size = shrinker_map_size(new_nr_max); + defer_size = shrinker_defer_size(new_nr_max); + old_map_size = shrinker_map_size(shrinker_nr_max); + old_defer_size = shrinker_defer_size(shrinker_nr_max); + size = map_size + defer_size; for_each_node(nid) { pn = memcg->nodeinfo[nid]; @@ -261,6 +267,7 @@ static int expand_one_shrinker_info(struct mem_cgroup *memcg, new->nr_deferred = (atomic_long_t *)(new + 1); new->map = (void *)new->nr_deferred + defer_size; + new->map_size = new_nr_max; /* map: set all old bits, clear all new bits */ memset(new->map, (int)0xff, old_map_size); @@ -310,6 +317,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) } info->nr_deferred = (atomic_long_t *)(info + 1); info->map = (void *)info->nr_deferred + defer_size; + info->map_size = shrinker_nr_max; rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex); @@ -327,8 +335,6 @@ static int expand_shrinker_info(int new_id) { int ret = 0; int new_nr_max = new_id + 1; - int map_size, defer_size = 0; - int old_map_size, old_defer_size = 0; struct mem_cgroup *memcg; if (!need_expand(new_nr_max)) @@ -339,15 +345,9 @@ static int expand_shrinker_info(int new_id) lockdep_assert_held(&shrinker_mutex); - map_size = shrinker_map_size(new_nr_max); - defer_size = shrinker_defer_size(new_nr_max); - old_map_size = shrinker_map_size(shrinker_nr_max); - old_defer_size = shrinker_defer_size(shrinker_nr_max); - memcg = mem_cgroup_iter(NULL, NULL, NULL); do { - ret = expand_one_shrinker_info(memcg, map_size, defer_size, - old_map_size, old_defer_size); + ret = expand_one_shrinker_info(memcg, new_nr_max, shrinker_nr_max); if (ret) { mem_cgroup_iter_break(NULL, memcg); goto out; @@ -912,7 +912,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, if (unlikely(!info)) goto unlock; - for_each_set_bit(i, info->map, shrinker_nr_max) { + for_each_set_bit(i, info->map, info->map_size) { struct shrink_control sc = { .gfp_mask = gfp_mask, .nid = nid, I will send the v2. Thanks, Qi > >> >> @@ -949,14 +962,9 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid, >> set_shrinker_bit(memcg, nid, i); >> } >> freed += ret; >> - >> - if (rwsem_is_contended(&shrinker_rwsem)) { >> - freed = freed ? : 1; >> - break; >> - } >> } >> unlock: >> - up_read(&shrinker_rwsem); >> + srcu_read_unlock(&shrinker_srcu, srcu_idx); >> return freed; >> } >> #else /* CONFIG_MEMCG */ > -- Thanks, Qi