Date: Mon, 23 Oct 2023 10:00:11 -0700 (PDT)
From: "Christoph Lameter (Ampere)" <cl@gentwo.org>
To: Vlastimil Babka
cc: chengming.zhou@linux.dev, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, willy@infradead.org, pcc@google.com, tytso@mit.edu, maz@kernel.org, ruansy.fnst@fujitsu.com, vishal.moola@gmail.com, lrh2000@pku.edu.cn, hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Chengming Zhou
Subject: Re: [RFC PATCH v2 0/6] slub: Delay freezing of CPU partial slabs
In-Reply-To: <4134b039-fa99-70cd-3486-3d0c7632e4a3@suse.cz>
References: <20231021144317.3400916-1-chengming.zhou@linux.dev> <4134b039-fa99-70cd-3486-3d0c7632e4a3@suse.cz>
On Mon, 23 Oct 2023, Vlastimil Babka wrote:

>>
>> The slab will be delay-frozen when it's picked for active use by the
>> CPU; if it becomes full at the same time, we still need to rely on the
>> "frozen" bit to avoid manipulating its list. So the slab will be
>> frozen only on activation and unfrozen only on deactivation.
>
> Interesting solution! I wonder if we could go a bit further and remove
> acquire_slab() completely. Because AFAICS even after your changes,
> acquire_slab() is still attempted, including freezing the slab, which
> means still doing a cmpxchg_double under the list_lock, and now also
> handling the special case where it fails but we have at least filled
> the percpu partial lists. What if we only filled the partial list
> without freezing, and then froze the first slab outside of the
> list_lock?
>
> Or more precisely, instead of returning the acquired "object" we would
> return the first slab removed from the partial list. I think it would
> simplify the code a bit, and further reduce list_lock holding times.
>
> I'll also point out a few more details, but this is not a full detailed
> review, as the suggestion above, and another one for 4/5, could mean a
> rather significant change for v3.

This is not that easy. The frozen bit indicates that no list management
has to be done for a slab when it is processed in free. If you take a
slab off the list without setting that bit, then something else needs to
provide the information that "frozen" provided.

If the frozen bit changes can be handled in a different way than with a
cmpxchg, then that is a good optimization.

For much of the frozen handling we must be holding the node list_lock
anyway in order to add/remove the slab from the list. So we already have
a lock that could be used to protect the flag operations.
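
To make that last point concrete, here is a rough userspace sketch; it
is not the actual mm/slub.c code, and names like toy_slab, toy_node and
on_partial are invented for illustration only. It contrasts a free path
that trusts the frozen bit to skip all list management with one that
decides list membership under the node's list_lock, the lock that has to
be held for add/remove anyway:

/*
 * Rough userspace model, NOT the kernel's mm/slub.c: the names and the
 * data layout here are invented for illustration only.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_slab {
	bool frozen;		/* slab is owned by a CPU */
	bool on_partial;	/* slab sits on the node partial list */
	int inuse;		/* allocated objects on this slab */
};

struct toy_node {
	pthread_mutex_t list_lock;	/* stands in for the node list_lock */
	int nr_partial;			/* slabs on the partial list */
};

/* Variant A: the frozen bit alone tells free to skip list management. */
static void toy_free_frozen_bit(struct toy_node *n, struct toy_slab *s)
{
	s->inuse--;
	if (s->frozen)
		return;		/* a CPU owns the slab, no list work */

	pthread_mutex_lock(&n->list_lock);
	if (!s->on_partial) {
		s->on_partial = true;
		n->nr_partial++;
	}
	pthread_mutex_unlock(&n->list_lock);
}

/*
 * Variant B: the "something else" is decided under list_lock. Because
 * add/remove already requires this lock, the flag can be read and
 * updated under it instead of via a cmpxchg.
 */
static void toy_free_under_list_lock(struct toy_node *n, struct toy_slab *s)
{
	s->inuse--;

	pthread_mutex_lock(&n->list_lock);
	if (!s->frozen && !s->on_partial) {
		s->on_partial = true;
		n->nr_partial++;
	}
	pthread_mutex_unlock(&n->list_lock);
}

int main(void)
{
	struct toy_node node = { PTHREAD_MUTEX_INITIALIZER, 0 };
	struct toy_slab slab = { .frozen = false, .on_partial = false, .inuse = 1 };

	toy_free_frozen_bit(&node, &slab);
	toy_free_under_list_lock(&node, &slab);
	printf("slabs on partial list: %d\n", node.nr_partial);
	return 0;
}

In the real code the frozen update is of course part of the slab
freelist cmpxchg and the percpu partial handling is more involved; the
sketch only shows which mechanism would cover the flag in each variant.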