References: <20241202184154.19321-1-ryncsn@gmail.com> <20241202184154.19321-5-ryncsn@gmail.com>
From: Kairui Song
Date: Thu, 5 Dec 2024 01:58:33 +0800
Subject: Re: [PATCH 4/4] mm, swap_cgroup: remove global swap cgroup lock
To: Yosry Ahmed, "Paul E. McKenney"
Cc: linux-mm@kvack.org, Andrew Morton, Chris Li, Hugh Dickins, "Huang, Ying", Roman Gushchin, Shakeel Butt, Johannes Weiner, Barry Song, Michal Hocko, linux-kernel@vger.kernel.org

On Wed, Dec 4, 2024 at 3:18 AM Yosry Ahmed wrote:
>
> On Tue, Dec 3, 2024 at 10:20 AM Kairui Song wrote:
> >
> > On Tue, Dec 3, 2024 at 4:36 AM Yosry Ahmed wrote:
> > >
> > > On Mon, Dec 2, 2024 at 11:28 AM Yosry Ahmed wrote:
> > > >
> > > > On Mon, Dec 2, 2024 at 10:42 AM Kairui Song wrote:
> > > > >
> > > > > From: Kairui Song
> > > > >
> > > > > commit e9e58a4ec3b1 ("memcg: avoid use cmpxchg in swap cgroup maintainance")
> > > > > replaced the cmpxchg/xchg with a global irq spinlock because some archs
> > > > > doesn't support 2 bytes cmpxchg/xchg. Clearly this won't scale well.
> > > > >
> > > > > And as commented in swap_cgroup.c, this lock is not needed for map
> > > > > synchronization.
> > > > >
> > > > > Emulation of 2 bytes cmpxchg/xchg with atomic isn't hard, so implement
> > > > > it to get rid of this lock.
> > > > >
> > > > > Testing using 64G brd and build with build kernel with make -j96 in 1.5G
> > > > > memory cgroup using 4k folios showed below improvement (10 test run):
> > > > >
> > > > > Before this series:
> > > > > Sys time: 10730.08 (stdev 49.030728)
> > > > > Real time: 171.03 (stdev 0.850355)
> > > > >
> > > > > After this commit:
> > > > > Sys time: 9612.24 (stdev 66.310789), -10.42%
> > > > > Real time: 159.78 (stdev 0.577193), -6.57%
> > > > >
> > > > > With 64k folios and 2G memcg:
> > > > > Before this series:
> > > > > Sys time: 7626.77 (stdev 43.545517)
> > > > > Real time: 136.22 (stdev 1.265544)
> > > > >
> > > > > After this commit:
> > > > > Sys time: 6936.03 (stdev 39.996280), -9.06%
> > > > > Real time: 129.65 (stdev 0.880039), -4.82%
> > > > >
> > > > > Sequential swapout of 8G 4k zero folios (24 test run):
> > > > > Before this series:
> > > > > 5461409.12 us (stdev 183957.827084)
> > > > >
> > > > > After this commit:
> > > > > 5420447.26 us (stdev 196419.240317)
> > > > >
> > > > > Sequential swapin of 8G 4k zero folios (24 test run):
> > > > > Before this series:
> > > > > 19736958.916667 us (stdev 189027.246676)
> > > > >
> > > > > After this commit:
> > > > > 19662182.629630 us (stdev 172717.640614)
> > > > >
> > > > > Performance is better or at least not worse for all tests above.
> > > > >
> > > > > Signed-off-by: Kairui Song
> > > > > ---
> > > > > mm/swap_cgroup.c | 56 +++++++++++++++++++++++++++++++++++-------------
> > > > > 1 file changed, 41 insertions(+), 15 deletions(-)
> > > > >
> > > > > diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c
> > > > > index a76afdc3666a..028f5e6be3f0 100644
> > > > > --- a/mm/swap_cgroup.c
> > > > > +++ b/mm/swap_cgroup.c
> > > > > @@ -5,6 +5,15 @@
> > > > >
> > > > > #include /* depends on mm.h include */
> > > > >
> > > > > +#define ID_PER_UNIT (sizeof(atomic_t) / sizeof(unsigned short))
> > > > > +struct swap_cgroup_unit {
> > > > > +	union {
> > > > > +		int raw;
> > > > > +		atomic_t val;
> > > > > +		unsigned short __id[ID_PER_UNIT];
> > > > > +	};
> > > > > +};
> > > >
> > > > This doubles the size of the per-entry data, right?
> > >
> > > Oh we don't, we just store 2 ids in an int instead of storing each id
> > > individually. But the question below still stands, can't we just use
> > > cmpxchg() directly on the id?
> >
> > Hi Yosry,
> >
> > Last time I checked the xchg status some archs still don't support
> > xchg for 2 bytes, I just found things may have changed slightly but it
> > seems at least parisc still doesn't support that. And looking at the
> > code some arches still don't support cmpxchg of 2 bytes today (And I
> > just dropped cmpxchg helper for swap_cgroup so that should be OK). RCU
> > just dropped one-byte cmpxchg emulation 2 months ago in d4e287d7caff
> > so that area is changing. Lacking such support is exactly the reason
> > why there was a global lock previously, so I think the safe move is
> > just to emulate the operation manually for now?
>
> +Paul E. McKenney
>
> If there's already work to support 2-byte cmpxchg() I'd rather wait
> for that. Alternatively, if it's not too difficult, we should
> generalize this emulation to something like cmpxchg_emu_u8() and add
> the missing arch support. It doesn't feel right to have our own custom
> 2-byte cmpxchg() emulation here.

Actually, what we need here is a 2-byte xchg, not cmpxchg. I'm not
sure whether any arch is still missing support for that. Is there a
plan to support it on all archs?
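
For illustration only, the kind of emulation discussed above (a 2-byte
xchg built on a wider atomic) can be sketched in userspace C roughly as
follows. This is a minimal sketch, not the code from the patch: it
assumes C11 <stdatomic.h> in place of the kernel's atomic_t, it selects
the id by shifting instead of going through the union shown in the diff,
and struct id_unit, id_xchg() and id_read() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these come from the patch. */
#define IDS_PER_UNIT (sizeof(uint32_t) / sizeof(uint16_t))
#define ID_BITS 16u
#define ID_MASK 0xffffu

struct id_unit {
	_Atomic uint32_t val;	/* two 16-bit ids packed into one 32-bit word */
};

/*
 * Atomically replace the id stored in 'slot' with 'new_id' and return
 * the previous id. The other id sharing the word is left untouched.
 */
static uint16_t id_xchg(struct id_unit *unit, unsigned int slot,
			uint16_t new_id)
{
	unsigned int shift = slot * ID_BITS;
	uint32_t old, updated;

	assert(slot < IDS_PER_UNIT);

	old = atomic_load_explicit(&unit->val, memory_order_relaxed);
	do {
		/* Clear the target slot, then merge in the new id. */
		updated = (old & ~((uint32_t)ID_MASK << shift)) |
			  ((uint32_t)new_id << shift);
		/* On failure 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&unit->val, &old, updated));

	return (uint16_t)((old >> shift) & ID_MASK);
}

/* Read the id stored in 'slot' without modifying it. */
static uint16_t id_read(struct id_unit *unit, unsigned int slot)
{
	assert(slot < IDS_PER_UNIT);
	return (uint16_t)((atomic_load(&unit->val) >> (slot * ID_BITS)) &
			  ID_MASK);
}
```

With this layout a caller would pick the unit as index / IDS_PER_UNIT
and the slot as index % IDS_PER_UNIT, so two neighbouring ids share one
32-bit word and an update to one of them retries only if the word
changed underneath it.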