From: Qi Zheng
Date: Wed, 6 Dec 2023 15:55:28 +0800
Subject: Re: [PATCH v6 42/45] mm: shrinker: make global slab shrink lockless
To: Lai Jiangshan
Cc: akpm@linux-foundation.org, paulmck@kernel.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org, muchun.song@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Message-ID: <93c36097-5266-4fc5-84a8-d770ab344361@bytedance.com>
References: <20230911094444.68966-1-zhengqi.arch@bytedance.com> <20230911094444.68966-43-zhengqi.arch@bytedance.com>

Hi,

On 2023/12/6 15:47, Lai Jiangshan wrote:
> On Tue, Sep 12, 2023 at 9:57 PM Qi Zheng wrote:
>
>> -    if (!down_read_trylock(&shrinker_rwsem))
>> -        goto out;
>> -
>> -    list_for_each_entry(shrinker, &shrinker_list, list) {
>> +    /*
>> +     * lockless algorithm of global shrink.
>> +     *
>> +     * In the unregistration step, the shrinker will be freed asynchronously
>> +     * via RCU after its refcount reaches 0. So both rcu_read_lock() and
>> +     * shrinker_try_get() can be used to ensure the existence of the shrinker.
>> +     *
>> +     * So in the global shrink:
>> +     *  step 1: use rcu_read_lock() to guarantee existence of the shrinker
>> +     *          and the validity of the shrinker_list walk.
>> +     *  step 2: use shrinker_try_get() to try get the refcount, if successful,
>> +     *          then the existence of the shrinker can also be guaranteed,
>> +     *          so we can release the RCU lock to do do_shrink_slab() that
>> +     *          may sleep.
>> +     *  step 3: *MUST* reacquire the RCU lock before calling shrinker_put(),
>> +     *          which ensures that neither this shrinker nor the next shrinker
>> +     *          will be freed in the next traversal operation.
>
> Hello, Qi, Andrew, Paul,
>
> I wonder how RCU can ensure the lifespan of the next shrinker.
> It seems to diverge from the common usage pattern of RCU+reference.
>
> cpu1:
> rcu_read_lock();
> shrinker_try_get(this_shrinker);
> rcu_read_unlock();
>     cpu2: shrinker_free(this_shrinker);
>     cpu2: shrinker_free(next_shrinker); and free the memory of next_shrinker
>     cpu2: when shrinker_free(next_shrinker), no one updates this_shrinker's next
>     cpu2: since this_shrinker has been removed first.

No, this_shrinker will not be removed from the shrinker_list until the
last refcount is released. See below:

> rcu_read_lock();
> shrinker_put(this_shrinker);

        CPU 1                                        CPU 2

  --> if (refcount_dec_and_test(&shrinker->refcount))
              complete(&shrinker->done);

                                             wait_for_completion(&shrinker->done);
                                             list_del_rcu(&shrinker->list);

> travel to the freed next_shrinker.
>
> a quick simple fix:
>
> // called with references other than RCU (i.e. refcount)
> static inline bool rcu_list_deleted(struct list_head *entry)
> {
>     // something like this:
>     return entry->prev == LIST_POISON2;
> }
>
> // in the loop
> if (rcu_list_deleted(&shrinker->list)) {
>     shrinker_put(shrinker);
>     goto restart;
> }
> rcu_read_lock();
> shrinker_put(shrinker);
>
> Thanks
> Lai
>
>> +     *  step 4: do shrinker_put() paired with step 2 to put the refcount,
>> +     *          if the refcount reaches 0, then wake up the waiter in
>> +     *          shrinker_free() by calling complete().
>> +     */
>> +    rcu_read_lock();
>> +    list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>>          struct shrink_control sc = {
>>              .gfp_mask = gfp_mask,
>>              .nid = nid,
>>              .memcg = memcg,
>>          };
>>
>> +        if (!shrinker_try_get(shrinker))
>> +            continue;
>> +
>> +        rcu_read_unlock();
>> +
>>          ret = do_shrink_slab(&sc, shrinker, priority);
>>          if (ret == SHRINK_EMPTY)
>>              ret = 0;
>>          freed += ret;
>> -        /*
>> -         * Bail out if someone want to register a new shrinker to
>> -         * prevent the registration from being stalled for long periods
>> -         * by parallel ongoing shrinking.
>> -         */
>> -        if (rwsem_is_contended(&shrinker_rwsem)) {
>> -            freed = freed ? : 1;
>> -            break;
>> -        }
>> +
>> +        rcu_read_lock();
>> +        shrinker_put(shrinker);
>>      }
>>
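
To make the ordering in the CPU 1 / CPU 2 diagram concrete, here is a
minimal single-threaded userspace sketch. It is only a model, not the
kernel code: struct model_shrinker, model_try_get(), model_put(),
model_unlink() and run_demo() are hypothetical names, the sleeping
wait_for_completion() is reduced to checking a flag, and RCU itself is
not modeled. It also assumes that the unregistration path drops a
registration reference before waiting, which the quoted hunk does not
show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; this is not the kernel's struct shrinker. */
struct model_shrinker {
    int refcount;   /* starts at 1: the registration reference (assumed) */
    bool done;      /* stands in for complete(&shrinker->done) */
    bool on_list;   /* cleared at the point where list_del_rcu() would run */
};

/* Model of shrinker_try_get(): fails once the refcount has reached 0. */
static bool model_try_get(struct model_shrinker *s)
{
    if (s->refcount == 0)
        return false;
    s->refcount++;
    return true;
}

/* Model of shrinker_put(): the last put signals the waiter. */
static void model_put(struct model_shrinker *s)
{
    if (--s->refcount == 0)
        s->done = true;
}

/* Model of the tail of shrinker_free(): unlink only after completion. */
static void model_unlink(struct model_shrinker *s)
{
    assert(s->done);    /* the real code would have slept until this holds */
    s->on_list = false;
}

/* Replays the interleaving from the diagram; returns 0 if the order holds. */
static int run_demo(void)
{
    struct model_shrinker cur = { .refcount = 1, .done = false, .on_list = true };

    /* CPU 1, step 2: take a reference, then leave the RCU section. */
    if (!model_try_get(&cur))
        return -1;

    /* CPU 2: shrinker_free(cur) drops the registration reference ... */
    model_put(&cur);

    /* ... but cannot get past the wait: CPU 1 still holds a reference. */
    if (cur.done || !cur.on_list)
        return -2;

    /* CPU 1, steps 3 and 4: back under the RCU lock, drop the reference. */
    model_put(&cur);
    if (!cur.done)
        return -3;

    /* CPU 2: only now does the list_del_rcu() equivalent run. */
    model_unlink(&cur);

    /* A later walker can no longer pin the dying shrinker. */
    if (model_try_get(&cur))
        return -4;

    return cur.on_list ? -5 : 0;
}
```

In this model the unlink can never precede the final model_put(), which
is the point of the diagram: while CPU 1 holds its reference,
this_shrinker stays on shrinker_list. If that carries over to the real
code, an earlier removal of next_shrinker still updates this_shrinker's
next pointer, and a later removal happens while the walker is already
back inside the RCU read-side section taken in step 3.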