From: Kairui Song
Date: Wed, 4 Dec 2024 02:20:11 +0800
Subject: Re: [PATCH 4/4] mm, swap_cgroup: remove global swap cgroup lock
To: Yosry Ahmed
Cc: linux-mm@kvack.org, Andrew Morton, Chris Li, Hugh Dickins, "Huang, Ying", Roman Gushchin, Shakeel Butt, Johannes Weiner, Barry Song, Michal Hocko, linux-kernel@vger.kernel.org
References: <20241202184154.19321-1-ryncsn@gmail.com> <20241202184154.19321-5-ryncsn@gmail.com>

On Tue, Dec 3, 2024 at 4:36 AM Yosry Ahmed wrote:
>
> On Mon, Dec 2, 2024 at 11:28 AM Yosry Ahmed wrote:
> >
> > On Mon, Dec 2, 2024 at 10:42 AM Kairui Song wrote:
> > >
> > > From: Kairui Song
> > >
> > > commit e9e58a4ec3b1 ("memcg: avoid use cmpxchg in swap cgroup maintainance")
> > > replaced the cmpxchg/xchg with a global irq
> > > spinlock because some archs
> > > doesn't support 2 bytes cmpxchg/xchg. Clearly this won't scale well.
> > >
> > > And as commented in swap_cgroup.c, this lock is not needed for map
> > > synchronization.
> > >
> > > Emulation of 2 bytes cmpxchg/xchg with atomic isn't hard, so implement
> > > it to get rid of this lock.
> > >
> > > Testing using 64G brd and build with build kernel with make -j96 in 1.5G
> > > memory cgroup using 4k folios showed below improvement (10 test run):
> > >
> > > Before this series:
> > > Sys time: 10730.08 (stdev 49.030728)
> > > Real time: 171.03 (stdev 0.850355)
> > >
> > > After this commit:
> > > Sys time: 9612.24 (stdev 66.310789), -10.42%
> > > Real time: 159.78 (stdev 0.577193), -6.57%
> > >
> > > With 64k folios and 2G memcg:
> > > Before this series:
> > > Sys time: 7626.77 (stdev 43.545517)
> > > Real time: 136.22 (stdev 1.265544)
> > >
> > > After this commit:
> > > Sys time: 6936.03 (stdev 39.996280), -9.06%
> > > Real time: 129.65 (stdev 0.880039), -4.82%
> > >
> > > Sequential swapout of 8G 4k zero folios (24 test run):
> > > Before this series:
> > > 5461409.12 us (stdev 183957.827084)
> > >
> > > After this commit:
> > > 5420447.26 us (stdev 196419.240317)
> > >
> > > Sequential swapin of 8G 4k zero folios (24 test run):
> > > Before this series:
> > > 19736958.916667 us (stdev 189027.246676)
> > >
> > > After this commit:
> > > 19662182.629630 us (stdev 172717.640614)
> > >
> > > Performance is better or at least not worse for all tests above.
> > >
> > > Signed-off-by: Kairui Song
> > > ---
> > >  mm/swap_cgroup.c | 56 +++++++++++++++++++++++++++++++++++-------------
> > >  1 file changed, 41 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c
> > > index a76afdc3666a..028f5e6be3f0 100644
> > > --- a/mm/swap_cgroup.c
> > > +++ b/mm/swap_cgroup.c
> > > @@ -5,6 +5,15 @@
> > >
> > >  #include /* depends on mm.h include */
> > >
> > > +#define ID_PER_UNIT (sizeof(atomic_t) / sizeof(unsigned short))
> > > +struct swap_cgroup_unit {
> > > +	union {
> > > +		int raw;
> > > +		atomic_t val;
> > > +		unsigned short __id[ID_PER_UNIT];
> > > +	};
> > > +};
> >
> > This doubles the size of the per-entry data, right?
>
> Oh we don't, we just store 2 ids in an int instead of storing each id
> individually. But the question below still stands, can't we just use
> cmpxchg() directly on the id?

Hi Yosry,

Last time I checked, some archs still didn't support 2-byte xchg. I
just found that things may have changed slightly, but at least parisc
still doesn't seem to support it. And looking at the code, some archs
still don't support 2-byte cmpxchg today either (I just dropped the
cmpxchg helper for swap_cgroup, so that part should be OK). RCU dropped
its one-byte cmpxchg emulation just 2 months ago in d4e287d7caff, so
that area is still changing.

The lack of such support is exactly why there was a global lock
previously, so I think the safe move is to emulate the operation
manually for now?

> > >
> > Why do we need this? I thought cmpxchg() supports multiple sizes and
> > will already do the emulation for us.
>
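For illustration, the kind of emulation discussed above (two 16-bit ids
packed into one 32-bit atomic word, with a 2-byte xchg built from a
full-word compare-and-swap loop) could be sketched in user-space C11
roughly as below. This is not the code from the patch: struct id_unit,
id_shift(), id_read() and id_xchg() are made-up names, <stdatomic.h>
stands in for the kernel's atomic_t helpers, and shifts are used
instead of the union so the sketch does not depend on endianness.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the identifiers used in the patch. */
#define IDS_PER_UNIT (sizeof(uint32_t) / sizeof(uint16_t))	/* 2 ids per word */

struct id_unit {
	_Atomic uint32_t val;	/* IDS_PER_UNIT packed 16-bit ids */
};

/* Bit offset of slot idx inside its 32-bit unit. */
static unsigned int id_shift(size_t idx)
{
	return (unsigned int)(idx % IDS_PER_UNIT) * 16;
}

/* Read one 16-bit id without disturbing its neighbour. */
static uint16_t id_read(struct id_unit *map, size_t idx)
{
	uint32_t word = atomic_load(&map[idx / IDS_PER_UNIT].val);

	return (uint16_t)(word >> id_shift(idx));
}

/*
 * Emulated 2-byte xchg: store new_id in slot idx and return the old id.
 * Only full-word atomics are used, so no native 2-byte xchg/cmpxchg
 * is required from the architecture.
 */
static uint16_t id_xchg(struct id_unit *map, size_t idx, uint16_t new_id)
{
	struct id_unit *unit = &map[idx / IDS_PER_UNIT];
	unsigned int shift = id_shift(idx);
	uint32_t mask = (uint32_t)0xffff << shift;
	uint32_t old = atomic_load(&unit->val);
	uint32_t new;

	do {
		/* Replace only our 16 bits, keep the neighbouring id as read. */
		new = (old & ~mask) | ((uint32_t)new_id << shift);
		/* On failure, old is refreshed with the current word and we retry. */
	} while (!atomic_compare_exchange_weak(&unit->val, &old, new));

	return (uint16_t)(old >> shift);
}
```

With map[] zero-initialised, id_xchg(map, 0, 7) returns 0 and a later
id_xchg(map, 0, 42) returns 7, while the id in slot 1 of the same word
is left untouched. A concurrent update of the neighbouring id only
makes the compare-and-swap fail and retry, which is why no extra lock
is needed in this sketch.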