From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Date: Sun, 3 Dec 2023 20:19:38 +0900
Subject: Re: [PATCH v5 7/9] slub: Optimize deactivate_slab()
To: Chengming Zhou
Cc: vbabka@suse.cz, cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	roman.gushchin@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Chengming Zhou
References: <20231102032330.1036151-1-chengming.zhou@linux.dev>
	<20231102032330.1036151-8-chengming.zhou@linux.dev>
On Sun, Dec 3, 2023 at 7:26 PM Chengming Zhou wrote:
>
> On 2023/12/3 17:23, Hyeonggon Yoo wrote:
> > On Thu, Nov 2, 2023 at 12:25 PM wrote:
> >>
> >> From: Chengming Zhou
> >>
> >> Since the introduction of unfrozen slabs on the cpu partial list, we don't
> >> need to synchronize the slab frozen state under the node list_lock.
> >>
> >> The caller of deactivate_slab() and the caller of __slab_free() won't
> >> manipulate the slab list concurrently.
> >>
> >> So we can take the node list_lock only in the last stage, if we really
> >> need to manipulate the slab list in this path.
> >>
> >> Signed-off-by: Chengming Zhou
> >> Reviewed-by: Vlastimil Babka
> >> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> >> ---
> >>  mm/slub.c | 79 ++++++++++++++++++-------------------------------------
> >>  1 file changed, 26 insertions(+), 53 deletions(-)
> >>
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index bcb5b2c4e213..d137468fe4b9 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -2468,10 +2468,8 @@ static void init_kmem_cache_cpus(struct kmem_cache *s)
> >>  static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
> >>                             void *freelist)
> >>  {
> >> -       enum slab_modes { M_NONE, M_PARTIAL, M_FREE, M_FULL_NOLIST };
> >>         struct kmem_cache_node *n = get_node(s, slab_nid(slab));
> >>         int free_delta = 0;
> >> -       enum slab_modes mode = M_NONE;
> >>         void *nextfree, *freelist_iter, *freelist_tail;
> >>         int tail = DEACTIVATE_TO_HEAD;
> >>         unsigned long flags = 0;
> >> @@ -2509,65 +2507,40 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
> >>         /*
> >>          * Stage two: Unfreeze the slab while splicing the per-cpu
> >>          * freelist to the head of slab's freelist.
> >> -        *
> >> -        * Ensure that the slab is unfrozen while the list presence
> >> -        * reflects the actual number of objects during unfreeze.
> >> -        *
> >> -        * We first perform cmpxchg holding lock and insert to list
> >> -        * when it succeed. If there is mismatch then the slab is not
> >> -        * unfrozen and number of objects in the slab may have changed.
> >> -        * Then release lock and retry cmpxchg again.
> >>          */
> >> -redo:
> >> -
> >> -       old.freelist = READ_ONCE(slab->freelist);
> >> -       old.counters = READ_ONCE(slab->counters);
> >> -       VM_BUG_ON(!old.frozen);
> >> -
> >> -       /* Determine target state of the slab */
> >> -       new.counters = old.counters;
> >> -       if (freelist_tail) {
> >> -               new.inuse -= free_delta;
> >> -               set_freepointer(s, freelist_tail, old.freelist);
> >> -               new.freelist = freelist;
> >> -       } else
> >> -               new.freelist = old.freelist;
> >> -
> >> -       new.frozen = 0;
> >> +       do {
> >> +               old.freelist = READ_ONCE(slab->freelist);
> >> +               old.counters = READ_ONCE(slab->counters);
> >> +               VM_BUG_ON(!old.frozen);
> >> +
> >> +               /* Determine target state of the slab */
> >> +               new.counters = old.counters;
> >> +               new.frozen = 0;
> >> +               if (freelist_tail) {
> >> +                       new.inuse -= free_delta;
> >> +                       set_freepointer(s, freelist_tail, old.freelist);
> >> +                       new.freelist = freelist;
> >> +               } else {
> >> +                       new.freelist = old.freelist;
> >> +               }
> >> +       } while (!slab_update_freelist(s, slab,
> >> +                               old.freelist, old.counters,
> >> +                               new.freelist, new.counters,
> >> +                               "unfreezing slab"));
> >>
> >> +       /*
> >> +        * Stage three: Manipulate the slab list based on the updated state.
> >> +        */
> >
> > deactivate_slab() might unconsciously put empty slabs into partial list, like:
> >
> > deactivate_slab()              __slab_free()
> > cmpxchg(), slab's not empty
> >                                cmpxchg(), slab's empty
> >                                           and unfrozen
>
> Hi,
>
> Sorry, but I don't get it here how __slab_free() can see the slab empty,
> since the slab is not empty from deactivate_slab() path, and it can't be
> used by any CPU at that time?

The scenario is: CPU B previously allocated an object from slab X,
slab X was later put on the node partial list, and then CPU A took
slab X as its cpu slab.

While slab X is CPU A's cpu slab, CPU B can free an object from slab X,
putting the object onto slab X's freelist using cmpxchg.

Let's say on CPU A the deactivation path performs its cmpxchg while
X.inuse is 1, and then CPU B frees its object (__slab_free()) onto
slab X's freelist using cmpxchg, _before_ CPU A puts slab X on the
partial list.

Then CPU A thinks slab X is not empty and puts it on the partial list,
even though CPU B's free has already made it empty.

Maybe I am confused; in that case please tell me I'm wrong :)

Thanks!

--
Hyeonggon
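P.S. To make the interleaving I have in mind concrete, here is a minimal
userspace sketch. It is not the SLUB code: the counters packing,
fake_cmpxchg() and all the names are made up, and the two CPUs' steps are
simply replayed in the problematic order from a single thread. It only
shows how the "empty or not" decision taken at cmpxchg time can be stale
by the time the list is manipulated:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Made-up packing of a slab's counters into one word:
 * bit 0 = frozen, the remaining bits = inuse. */
#define FROZEN_BIT   1u
#define INUSE_SHIFT  1
#define INUSE(c)     ((c) >> INUSE_SHIFT)

static _Atomic unsigned int slab_x_counters;

/* Stand-in for the single cmpxchg both paths do on slab->counters. */
static bool fake_cmpxchg(unsigned int old, unsigned int new)
{
        return atomic_compare_exchange_strong(&slab_x_counters, &old, new);
}

int main(void)
{
        unsigned int old, new;
        bool put_on_partial;

        /* Slab X: one object still allocated (owned by CPU B), frozen
         * because it is currently CPU A's cpu slab. */
        atomic_store(&slab_x_counters, (1u << INUSE_SHIFT) | FROZEN_BIT);

        /* CPU A, deactivate_slab() stage two: unfreeze with cmpxchg.
         * The per-cpu freelist is empty here, so inuse is unchanged. */
        do {
                old = atomic_load(&slab_x_counters);
                new = old & ~FROZEN_BIT;
        } while (!fake_cmpxchg(old, new));

        /* CPU A decides the list placement from what it saw at cmpxchg
         * time: inuse == 1, i.e. "not empty", so node partial list. */
        put_on_partial = INUSE(old) > 0;

        /* CPU B, __slab_free(): frees the last object with its own
         * cmpxchg, in the window before CPU A takes the node list_lock. */
        do {
                old = atomic_load(&slab_x_counters);
                new = old - (1u << INUSE_SHIFT);
        } while (!fake_cmpxchg(old, new));

        /* CPU A, stage three: takes list_lock and acts on its stale view. */
        printf("CPU A puts slab X on the partial list: %s, but slab X inuse is now %u\n",
               put_on_partial ? "yes" : "no",
               INUSE(atomic_load(&slab_x_counters)));

        return 0;
}

The gap between the two do/while loops above plays the role of the window
between a successful slab_update_freelist() and taking the node list_lock
in stage three.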