Date: Sun, 26 Feb 2023 21:56:10 +0800
Subject: Re: [PATCH v2 2/7] mm: vmscan: make global slab shrink lockless
To: Kirill Tkhai
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, shakeelb@google.com,
 mhocko@kernel.org, roman.gushchin@linux.dev, muchun.song@linux.dev,
 david@redhat.com, shy828301@gmail.com, dave@stgolabs.net,
 penguin-kernel@i-love.sakura.ne.jp, paulmck@kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sultan Alsawaf
References: <20230223132725.11685-1-zhengqi.arch@bytedance.com>
 <20230223132725.11685-3-zhengqi.arch@bytedance.com>
 <8049b6ed-435f-b518-f947-5516a514aec2@bytedance.com>
 <1aa70926-af39-0ce1-ae23-d86deb74d1c6@ya.ru>
 <74c4cf95-9506-98b3-9fc0-0814f63d5d7f@bytedance.com>
 <5663b349-8f6f-874a-eb9b-63d3179dcab7@ya.ru>
 <2ba86f45-f0a5-3a85-4aa6-f8beb50491b3@bytedance.com>
 <8dab3b27-7282-f8bc-7d04-ca63c9b872cf@ya.ru>
From: Qi Zheng
In-Reply-To: <8dab3b27-7282-f8bc-7d04-ca63c9b872cf@ya.ru>

On 2023/2/26 05:28, Kirill Tkhai wrote:
> On 25.02.2023 19:37, Qi Zheng wrote:
>>
>>
>> On 2023/2/26 00:17, Kirill Tkhai wrote:
>>> On 25.02.2023 18:57, Qi Zheng wrote:
>>>>
>> <...>
>>>> How about this?
>>>>>>
>>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>>> index ffddbd204259..9d8c53075298 100644
>>>>>> --- a/mm/vmscan.c
>>>>>> +++ b/mm/vmscan.c
>>>>>> @@ -1012,7 +1012,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>                                    int priority)
>>>>>>    {
>>>>>>           unsigned long ret, freed = 0;
>>>>>> -       struct shrinker *shrinker;
>>>>>> +       struct shrinker *shrinker = NULL;
>>>>>>           int srcu_idx, generation;
>>>>>>
>>>>>>           /*
>>>>>> @@ -1025,11 +1025,15 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>           if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
>>>>>>                   return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
>>>>>>
>>>>>> +again:
>>>>>>           srcu_idx = srcu_read_lock(&shrinker_srcu);
>>>>>>
>>>>>>           generation = atomic_read(&shrinker_srcu_generation);
>>>>>> -       list_for_each_entry_srcu(shrinker, &shrinker_list, list,
>>>>>> -                                srcu_read_lock_held(&shrinker_srcu)) {
>>>>>> +       if (!shrinker)
>>>>>> +               shrinker = list_entry_rcu(shrinker_list.next, struct shrinker, list);
>>>>>> +       else
>>>>>> +               shrinker = list_entry_rcu(shrinker->list.next, struct shrinker, list);
>>>>>> +       list_for_each_entry_from_rcu(shrinker, &shrinker_list, list) {
>>>>>>                   struct shrink_control sc = {
>>>>>>                           .gfp_mask = gfp_mask,
>>>>>>                           .nid = nid,
>>>>>> @@ -1042,8 +1046,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>                   freed += ret;
>>>>>>
>>>>>>                   if (atomic_read(&shrinker_srcu_generation) != generation) {
>>>>>> -                       freed = freed ? : 1;
>>>>>> -                       break;
>>>>>> +                       srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>
>>> After SRCU in unlocked we can't believe @shrinker anymore. So, above list_entry_rcu(shrinker->list.next)
>>> dereferences some random memory.
>>
>> Indeed.
>>
>>>
>>>>>> +                       cond_resched();
>>>>>> +                       goto again;
>>>>>>                   }
>>>>>>           }
>>>>>>
>>>>>>>
>>>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>>>> index 27ef9946ae8a..0b197bba1257 100644
>>>>>>> --- a/mm/vmscan.c
>>>>>>> +++ b/mm/vmscan.c
>>>>>>> @@ -204,6 +204,7 @@ static void set_task_reclaim_state(struct task_struct *task,
>>>>>>>     LIST_HEAD(shrinker_list);
>>>>>>>     DEFINE_MUTEX(shrinker_mutex);
>>>>>>>     DEFINE_SRCU(shrinker_srcu);
>>>>>>> +static atomic_t shrinker_srcu_generation = ATOMIC_INIT(0);
>>>>>>>       #ifdef CONFIG_MEMCG
>>>>>>>     static int shrinker_nr_max;
>>>>>>> @@ -782,6 +783,7 @@ void unregister_shrinker(struct shrinker *shrinker)
>>>>>>>         debugfs_entry = shrinker_debugfs_remove(shrinker);
>>>>>>>         mutex_unlock(&shrinker_mutex);
>>>>>>>     +    atomic_inc(&shrinker_srcu_generation);
>>>>>>>         synchronize_srcu(&shrinker_srcu);
>>>>>>>           debugfs_remove_recursive(debugfs_entry);
>>>>>>> @@ -799,6 +801,7 @@ EXPORT_SYMBOL(unregister_shrinker);
>>>>>>>      */
>>>>>>>     void synchronize_shrinkers(void)
>>>>>>>     {
>>>>>>> +    atomic_inc(&shrinker_srcu_generation);
>>>>>>>         synchronize_srcu(&shrinker_srcu);
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL(synchronize_shrinkers);
>>>>>>> @@ -908,18 +911,19 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>>>>>>>     {
>>>>>>>         struct shrinker_info *info;
>>>>>>>         unsigned long ret, freed = 0;
>>>>>>> -    int srcu_idx;
>>>>>>> -    int i;
>>>>>>> +    int srcu_idx, generation;
>>>>>>> +    int i = 0;
>>>>>>>           if (!mem_cgroup_online(memcg))
>>>>>>>             return 0;
>>>>>>> -
>>>>>>> +again:
>>>>>>>         srcu_idx = srcu_read_lock(&shrinker_srcu);
>>>>>>>         info = shrinker_info_srcu(memcg, nid);
>>>>>>>         if (unlikely(!info))
>>>>>>>             goto unlock;
>>>>>>>     -    for_each_set_bit(i, info->map, info->map_nr_max) {
>>>>>>> +    generation = atomic_read(&shrinker_srcu_generation);
>>>>>>> +    for_each_set_bit_from(i, info->map, info->map_nr_max) {
>>>>>>>             struct shrink_control sc = {
>>>>>>>                 .gfp_mask = gfp_mask,
>>>>>>>                 .nid = nid,
>>>>>>> @@ -965,6 +969,11 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>>>>>>>                     set_shrinker_bit(memcg, nid, i);
>>>>>>>             }
>>>>>>>             freed += ret;
>>>>>>> +
>>>>>>> +        if (atomic_read(&shrinker_srcu_generation) != generation) {
>>>>>>> +            srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>>>>
>>>>>> Maybe we can add the following code here, so as to avoid repeating the
>>>>>> current id and avoid triggering softlockup:
>>>>>>
>>>>>>               i++;
>>>
>>> This is OK.
>>>
>>>>>>               cond_resched();
>>>
>>> Possible, existing cond_resched() in do_shrink_slab() is enough.
>>
>> Yeah.
>>
>> I will add this patch in the next version. May I mark you as the author
>> of this patch?
>
> I think, yes

Thanks. :)

Qi

>
>>>
>>>> And this. :)
>>>>
>>>> Thanks,
>>>> Qi
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Qi
>>>>>>
>>>>>>> +            goto again;
>>>>>>> +        }
>>>>>>>         }
>>>>>>>     unlock:
>>>>>>>         srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>>>>> @@ -1004,7 +1013,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>>     {
>>>>>>>         unsigned long ret, freed = 0;
>>>>>>>         struct shrinker *shrinker;
>>>>>>> -    int srcu_idx;
>>>>>>> +    int srcu_idx, generation;
>>>>>>>           /*
>>>>>>>          * The root memcg might be allocated even though memcg is disabled
>>>>>>> @@ -1017,6 +1026,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>>             return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
>>>>>>>           srcu_idx = srcu_read_lock(&shrinker_srcu);
>>>>>>> +    generation = atomic_read(&shrinker_srcu_generation);
>>>>>>>           list_for_each_entry_srcu(shrinker, &shrinker_list, list,
>>>>>>>                      srcu_read_lock_held(&shrinker_srcu)) {
>>>>>>> @@ -1030,6 +1040,11 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>>>>>>             if (ret == SHRINK_EMPTY)
>>>>>>>                 ret = 0;
>>>>>>>             freed += ret;
>>>>>>> +
>>>>>>> +        if (atomic_read(&shrinker_srcu_generation) != generation) {
>>>>>>> +            freed = freed ? : 1;
>>>>>>> +            break;
>>>>>>> +        }
>>>>>>>         }
>>>>>>>           srcu_read_unlock(&shrinker_srcu, srcu_idx);
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 
Thanks,
Qi
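
For readers following the thread, here is a minimal userspace sketch of the restart pattern agreed on above for shrink_slab_memcg(): snapshot a generation counter, walk the set bits of a map, and if the counter changed after handling an entry, drop the read side and resume from the next index (the suggested `i++`) so the current entry is not repeated. It is only an illustration under stated assumptions, not the kernel patch: `generation`, `slot_map`, `visit_slot()` and `walk_slots()` are hypothetical stand-ins for `shrinker_srcu_generation`, `info->map`, `do_shrink_slab()` and the loop body, and the SRCU calls appear only as comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 64

/* Hypothetical stand-in for shrinker_srcu_generation. */
static atomic_int generation;

/* Hypothetical stand-in for info->map: bit i set means slot i has work. */
static unsigned long long slot_map;

/* Visit count per slot, so a caller can check that no slot is repeated. */
static int visits[NR_SLOTS];

/* Ensures the simulated concurrent update below happens only once. */
static bool bumped;

/*
 * Hypothetical stand-in for do_shrink_slab(). While slot 3 is being
 * handled, it simulates a concurrent unregister by bumping the generation.
 */
static void visit_slot(int i)
{
	assert(i >= 0 && i < NR_SLOTS);
	visits[i]++;
	if (i == 3 && !bumped) {
		bumped = true;
		atomic_fetch_add(&generation, 1);
	}
}

/* Walk all set bits in slot_map; returns how many times the walk restarted. */
static int walk_slots(void)
{
	int i = 0, restarts = 0;
	int seen;

again:
	/* The kernel code would take srcu_read_lock() here. */
	seen = atomic_load(&generation);
	for (; i < NR_SLOTS; i++) {
		if (!(slot_map & (1ULL << i)))
			continue;
		visit_slot(i);
		if (atomic_load(&generation) != seen) {
			/* The kernel code would call srcu_read_unlock() here. */
			i++;	/* resume after the slot just handled */
			restarts++;
			goto again;
		}
	}
	/* The kernel code would call srcu_read_unlock() here. */
	return restarts;
}
```

With bits 1, 3 and 5 set in `slot_map`, a call to `walk_slots()` restarts once after slot 3 and still visits each of the three slots exactly once. Without the `i++`, the restart would visit slot 3 a second time. Resuming by index is safe here because an index stays meaningful across the unlock, whereas a saved list pointer does not, which is the objection raised above to the list_entry_rcu(shrinker->list.next) variant.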