From: Lai Jiangshan
Date: Wed, 6 Dec 2023 15:47:43 +0800
Subject: Re: [PATCH v6 42/45] mm: shrinker: make global slab shrink lockless
To: Qi Zheng, akpm@linux-foundation.org, paulmck@kernel.org
Cc: david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz,
 roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org,
 tytso@mit.edu, steven.price@arm.com, cel@kernel.org,
 senozhatsky@chromium.org, yujie.liu@intel.com,
 gregkh@linuxfoundation.org, muchun.song@linux.dev,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org
References: <20230911094444.68966-1-zhengqi.arch@bytedance.com>
 <20230911094444.68966-43-zhengqi.arch@bytedance.com>
In-Reply-To: <20230911094444.68966-43-zhengqi.arch@bytedance.com>

On Tue, Sep 12, 2023 at 9:57 PM Qi Zheng wrote:
> -       if (!down_read_trylock(&shrinker_rwsem))
> -               goto out;
> -
> -       list_for_each_entry(shrinker, &shrinker_list, list) {
> +       /*
> +        * lockless algorithm of global shrink.
> +        *
> +        * In the unregistration setp, the shrinker will be freed asynchronously
> +        * via RCU after its refcount reaches 0.
> +        * So both rcu_read_lock() and
> +        * shrinker_try_get() can be used to ensure the existence of the shrinker.
> +        *
> +        * So in the global shrink:
> +        *  step 1: use rcu_read_lock() to guarantee existence of the shrinker
> +        *          and the validity of the shrinker_list walk.
> +        *  step 2: use shrinker_try_get() to try get the refcount, if successful,
> +        *          then the existence of the shrinker can also be guaranteed,
> +        *          so we can release the RCU lock to do do_shrink_slab() that
> +        *          may sleep.
> +        *  step 3: *MUST* to reacquire the RCU lock before calling shrinker_put(),
> +        *          which ensures that neither this shrinker nor the next shrinker
> +        *          will be freed in the next traversal operation.

Hello, Qi, Andrew, Paul,

I wonder how RCU can ensure the lifespan of the next shrinker.
This seems to diverge from the common usage pattern of RCU plus a
reference count.

cpu1:
rcu_read_lock();
shrinker_try_get(this_shrinker);
rcu_read_unlock();
    cpu2: shrinker_free(this_shrinker);
    cpu2: shrinker_free(next_shrinker); the memory of next_shrinker is
          then freed
    cpu2: shrinker_free(next_shrinker) does not update this_shrinker's
          next pointer, because this_shrinker was already removed from
          the list.
rcu_read_lock();
shrinker_put(this_shrinker);
The walk then moves on to the freed next_shrinker.

A quick, simple fix:

// called while holding a reference other than RCU (i.e. the refcount)
static inline bool rcu_list_deleted(struct list_head *entry)
{
        // something like this:
        return entry->prev == LIST_POISON2;
}

// in the loop
if (rcu_list_deleted(&shrinker->list)) {
        shrinker_put(shrinker);
        goto restart;
}
rcu_read_lock();
shrinker_put(shrinker);

Thanks
Lai

> +        *  step 4: do shrinker_put() paired with step 2 to put the refcount,
> +        *          if the refcount reaches 0, then wake up the waiter in
> +        *          shrinker_free() by calling complete().
> +        */
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>                 struct shrink_control sc = {
>                         .gfp_mask = gfp_mask,
>                         .nid = nid,
>                         .memcg = memcg,
>                 };
>
> +               if (!shrinker_try_get(shrinker))
> +                       continue;
> +
> +               rcu_read_unlock();
> +
>                 ret = do_shrink_slab(&sc, shrinker, priority);
>                 if (ret == SHRINK_EMPTY)
>                         ret = 0;
>                 freed += ret;
> -               /*
> -                * Bail out if someone want to register a new shrinker to
> -                * prevent the registration from being stalled for long periods
> -                * by parallel ongoing shrinking.
> -                */
> -               if (rwsem_is_contended(&shrinker_rwsem)) {
> -                       freed = freed ? : 1;
> -                       break;
> -               }
> +
> +               rcu_read_lock();
> +               shrinker_put(shrinker);
>         }
>
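
To make the pointer state in the cpu1/cpu2 scenario concrete, here is a
minimal userspace sketch in C. It is not kernel code and only an
illustration under stated assumptions: struct list_head, list_add_tail()
and list_del_rcu_sim() are simplified stand-ins written for this example,
the LIST_POISON2 value is an arbitrary placeholder, and struct race_result
and simulate_double_unregister() are hypothetical names. It assumes, as in
the scenario, that both entries get unlinked while the walker holds only a
refcount on the first one. There is no real concurrency and nothing is
freed; it only shows which pointers go stale and what the poisoned prev
check would see.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Placeholder poison value for this sketch (assumption, not the kernel macro). */
#define LIST_POISON2 ((struct list_head *)0x122)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/*
 * Models the RCU-style delete: unlink the entry from its neighbours,
 * leave entry->next intact for readers still standing on it, and
 * poison entry->prev.
 */
static void list_del_rcu_sim(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->prev = LIST_POISON2;
}

/* The check suggested in the mail: has this entry been unlinked? */
static bool rcu_list_deleted(const struct list_head *entry)
{
    return entry->prev == LIST_POISON2;
}

/* Hypothetical result record for the simulation below. */
struct race_result {
    bool stale_next;       /* this->next still points at the unlinked neighbour */
    bool head_skips_both;  /* the list head no longer reaches either entry */
    bool this_deleted;     /* the poison check fires for the walker's entry */
    bool survivor_deleted; /* the poison check stays quiet for a live entry */
};

struct race_result simulate_double_unregister(void)
{
    struct list_head head, this_s, next_s, last_s;
    struct race_result r;

    list_init(&head);
    list_add_tail(&this_s, &head);
    list_add_tail(&next_s, &head);
    list_add_tail(&last_s, &head);
    assert(this_s.next == &next_s);

    /* "cpu2": unlink this_shrinker first, then next_shrinker. */
    list_del_rcu_sim(&this_s);
    list_del_rcu_sim(&next_s);

    /* "cpu1" comes back holding only its reference on this_s. */
    r.stale_next = (this_s.next == &next_s);
    r.head_skips_both = (head.next == &last_s);
    r.this_deleted = rcu_list_deleted(&this_s);
    r.survivor_deleted = rcu_list_deleted(&last_s);
    return r;
}
```

Called from a main(), simulate_double_unregister() returns with
stale_next, head_skips_both and this_deleted set and survivor_deleted
clear. The second delete never touches the first entry's next pointer, so
a walker resuming from that entry would step onto the unlinked neighbour;
the poisoned prev is what lets the loop notice this and restart from the
list head instead.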