From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B159E74ACA for ; Tue, 3 Dec 2024 19:18:20 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 95A966B0082; Tue, 3 Dec 2024 14:18:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 90A026B0085; Tue, 3 Dec 2024 14:18:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7D1316B0088; Tue, 3 Dec 2024 14:18:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 5F6346B0082 for ; Tue, 3 Dec 2024 14:18:19 -0500 (EST) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id E92661C753C for ; Tue, 3 Dec 2024 19:18:18 +0000 (UTC) X-FDA: 82854608088.08.51F20AD Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) by imf11.hostedemail.com (Postfix) with ESMTP id 2682E40019 for ; Tue, 3 Dec 2024 19:18:03 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=eItwtbuW; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf11.hostedemail.com: domain of yosryahmed@google.com designates 209.85.219.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733253491; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Mp9MC83gmpclNo8eqEK0RSI0aec4QJvEhb/iElkWT0s=; b=h2G473tr9qVmzCLCn5iSUSqamxDEvqQqa9OUvV1ZhdAKKT/dW3gO2xTZFd8hrixq8cBeiN SBb5XwMmir8QPbGtvqo/RU2NnRXm5u27eUVZxCEBboJltL3PGUsflSJC5jhHIpFXyC/woo jkCS5DwomIP9tn+Vi7APh9Os/Gta1+A= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733253491; a=rsa-sha256; cv=none; b=kcDZh5RssIqxo0aMolvSXBZAknCHW5gPxiXFy0guFw8g9dpGGsIQYU4jwkCO0OYg0wm0o1 z3AnydHztNoeqxu4gmlye7Vfs9D5SEDL3MyrhUOi2VODebcogG3dwZif8MYGDZXvdfpMH6 C0ZuQ2DpebHCUt+ByEIDpzpAmoSHncA= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=eItwtbuW; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf11.hostedemail.com: domain of yosryahmed@google.com designates 209.85.219.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-6d87f00f786so54081846d6.1 for ; Tue, 03 Dec 2024 11:18:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733253496; x=1733858296; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Mp9MC83gmpclNo8eqEK0RSI0aec4QJvEhb/iElkWT0s=; b=eItwtbuWT6b3rsdbEeY+LRx70/OpISw+P3ApLf90KfLvi4wfgp6zuK8AiTl4+tauJw e1dzWCc5qtxGl+6mT+oYNougMUQFRikX+9fijEludvvFK0TbUKuDrD79Mfd9dto62A1v 4/C0SQ6AELtm10yNZYy1bW7Hq4nPZaQky+mEVuvKhuIpG9IXkOIu4mGiAt7U87aTNvBv aCZmXH63+IxbVbVeH1HwAJ4yv6PhMDd00IjvsQiI5XL9+pxuDWF9v0jNmhiAt0LRD10E FtuVAhJFylh68pnXoyBDVAVao3vyHKQImNHMjZJkCLVfredgJP/zjFCvT1yj3E+yCFcw 6QdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733253496; x=1733858296; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Mp9MC83gmpclNo8eqEK0RSI0aec4QJvEhb/iElkWT0s=; b=uQxCOZY5dtT4te3WN76tgIARw5tFFK6/9LyNbk28SAiHHgncc7g0rM/Mfd57882REr PEBTx3sdeMTP7g1mcLmrhHdRYqpWtzqJLMm7Wcvpk9DctjFR6oduOUJMRtp6PyJ23WuV ql+Hoq7b73QoUGtqUDzXOdfUobVoNsYGHZb5UENlOWzZ6+ubYrX+TqG1Dfd+e4TBLMWI rKUlP85TGR2DFAQrFftphaE9vEPlnwHpomQEquvy6Z8pLNsQbWvZZV29HLBQjLAF8pP+ GbwT/4G+bqg0Sja+8+rF1jS5aL0YobjdTNbQSgsnpcgaGl9EpaEgChxwybhX/0Kzqa1G nOYA== X-Gm-Message-State: AOJu0YwqQVeChq+iLVa1lG0gzOksvKnTPvF8i28awJxdi69skrLPlCPI 0k/SDOV/PVT3Idh1akloy3VZxCUtOUm4oa0sB/FwaS61Zmu37iEe7gdk5lPoq8tZvVsbVwEOEdd FxiWBtPqvKyYjoFRdvWVW3oHAPNc4Vo12OTT6 X-Gm-Gg: ASbGncv3WEiu0t+Tekg0PqBT4Fhw1Nvf3CH42zZzX0d89aGmC9XRw3vGOTbwr8TdttQ ZW6PUmL8gHMIPg3JA4ilGojyS5MUE X-Google-Smtp-Source: AGHT+IHeqHeI58UNUW3pQpETjpS6gPoh+s1SmxZW8+KLFGgf3rZ45txlFKhIVGecAd+lNNIHJGjSIXeRs1z9KmIzspo= X-Received: by 2002:a05:6214:62c:b0:6d8:9e16:d083 with SMTP id 6a1803df08f44-6d8b72f1a17mr50671206d6.3.1733253495966; Tue, 03 Dec 2024 11:18:15 -0800 (PST) MIME-Version: 1.0 References: <20241202184154.19321-1-ryncsn@gmail.com> <20241202184154.19321-5-ryncsn@gmail.com> In-Reply-To: From: Yosry Ahmed Date: Tue, 3 Dec 2024 11:17:39 -0800 Message-ID: Subject: Re: [PATCH 4/4] mm, swap_cgroup: remove global swap cgroup lock To: Kairui Song Cc: linux-mm@kvack.org, Andrew Morton , Chris Li , Hugh Dickins , "Huang, Ying" , Roman Gushchin , Shakeel Butt , Johannes Weiner , Barry Song , Michal Hocko , linux-kernel@vger.kernel.org, "Paul E. McKenney" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: 6gk18n1gyh81iu6xmf4an9up8mqtcjbu X-Rspamd-Queue-Id: 2682E40019 X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1733253483-666807 X-HE-Meta: U2FsdGVkX19cYjsQKiO0t1LucjNyNDUMubg4SujJypwCx/q6mL7CD4v/iSMYyzRePTBQ+ymxfxB88H1/9Ctznu/1oFN0u6eEo3xJe/4LAzggYp+HOpuHqE2uo2f6FLTCutVsC2SxCfj6Ll9d/JCkk2kb2yPrHMzU452JARQjnIp9RHtkdBQOtOuzoNNOIGlPPn3377kMFm4WC5RqUvALE5krI0XPttGfZymU9KzrpV6RvHQFkFErjCWoXdKu4YYvpTeYLQ3C5PSe6RPp8I/K9EVCISYfc61UOQ83lQHPYUQJtO4YCaY1EPiLToSfDniUfW6712nNpePY0SiWCs6HZ1WgrWelB+kyN4Ga6tv27PEJO7lM7dY05NPdePsAHWmA1FAm3U1LfGIUQNMacjLb55lVfPViqCpibnx2egpHq/IMA+uy//8r7ezJwHAc7fUujkp/eCG7ueQrCb4bXNoh6gJsUQcKPtdnBEuxJvJ24Dm1EFUzxlZdx8iX9n2ucsvQhSC9Z92CfNvdUjU7O3GCs8tsYNOu5a1vMZfM58QjFk79yoqOOsYfk+s20RglSS1ZUVF4lXCPDsiHxLFOB1NDwXho1fdzObiu+7fomQ4k0qEkukVUrEpwv8+03uEzX/DiHJncQOfOI5IZD419y7azfD4bGoe25ru1Xq2/ILKHmB04nUuBQm6SIOQ1yJfZWyb/g+eA0dh1mscK0y+pw8PUdSj72LJ+kDCIB8u/X4977QCvyActmIrRhDiDNcExeh2BeT5klMz/tjIrVy+4MsIRz1vtfeaFjUjrhbBBke1M4bQLKlmycPNwUeFf9dopNKf2OuteFHLavvpJAF2exeskzho+nkgHOK6moIy+3jITLx/yOuu1GRM+Nis1LOQjC40M9lnNVMogHHejUqaYiHD7Yg0/Aoi3f3UqoGpItAQW95D9Hj58A6dHYOe/LkpO0pI3Qcf/1rEpYQS03YUZhTI uJ6c3Zos XeB7ogaA6AQTCmJK5TBsOzz8iWY6C2nA2Yo/+PU//4IahuD+pAU1akr/GspRlaNJsZLyInW54HbqwdZnkRlUydzkYmCp9GZEJdybANF5VVHU+5PLp7ACCixofkvhgjX0kvwLuHD3Zqml2kPTJ8M9AgYI+FwlP7Ogi81Yywv7gRp9smS0/hCpUUf7mo+eu7cmtZGFn988CLkrnNpKdR+T8GM2b5IhltNcg5RzMTdzTg8poWkNlE0+zLXxmV2N0qCeWBBW4wd9/aPZM/Gk+DsygAo/KpO5cf0Drcrg+jKspyxOE7OlC/4ttPCXeiQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Dec 3, 2024 at 10:20=E2=80=AFAM Kairui Song wrot= e: > > On Tue, Dec 3, 2024 at 4:36=E2=80=AFAM Yosry Ahmed wrote: > > > > On Mon, Dec 2, 2024 at 11:28=E2=80=AFAM Yosry Ahmed wrote: > > > > > > On Mon, Dec 2, 2024 at 10:42=E2=80=AFAM Kairui Song wrote: > > > > > > > > From: Kairui Song > > > > > > > > commit e9e58a4ec3b1 ("memcg: avoid use cmpxchg in swap cgroup maint= ainance") > > > > replaced the cmpxchg/xchg with a global irq spinlock because some a= rchs > > > > doesn't support 2 bytes cmpxchg/xchg. Clearly this won't scale well= . > > > > > > > > And as commented in swap_cgroup.c, this lock is not needed for map > > > > synchronization. > > > > > > > > Emulation of 2 bytes cmpxchg/xchg with atomic isn't hard, so implem= ent > > > > it to get rid of this lock. > > > > > > > > Testing using 64G brd and build with build kernel with make -j96 in= 1.5G > > > > memory cgroup using 4k folios showed below improvement (10 test run= ): > > > > > > > > Before this series: > > > > Sys time: 10730.08 (stdev 49.030728) > > > > Real time: 171.03 (stdev 0.850355) > > > > > > > > After this commit: > > > > Sys time: 9612.24 (stdev 66.310789), -10.42% > > > > Real time: 159.78 (stdev 0.577193), -6.57% > > > > > > > > With 64k folios and 2G memcg: > > > > Before this series: > > > > Sys time: 7626.77 (stdev 43.545517) > > > > Real time: 136.22 (stdev 1.265544) > > > > > > > > After this commit: > > > > Sys time: 6936.03 (stdev 39.996280), -9.06% > > > > Real time: 129.65 (stdev 0.880039), -4.82% > > > > > > > > Sequential swapout of 8G 4k zero folios (24 test run): > > > > Before this series: > > > > 5461409.12 us (stdev 183957.827084) > > > > > > > > After this commit: > > > > 5420447.26 us (stdev 196419.240317) > > > > > > > > Sequential swapin of 8G 4k zero folios (24 test run): > > > > Before this series: > > > > 19736958.916667 us (stdev 189027.246676) > > > > > > > > After this commit: > > > > 19662182.629630 us (stdev 172717.640614) > > > > > > > > Performance is better or at least not worse for all tests above. > > > > > > > > Signed-off-by: Kairui Song > > > > --- > > > > mm/swap_cgroup.c | 56 +++++++++++++++++++++++++++++++++++---------= ---- > > > > 1 file changed, 41 insertions(+), 15 deletions(-) > > > > > > > > diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c > > > > index a76afdc3666a..028f5e6be3f0 100644 > > > > --- a/mm/swap_cgroup.c > > > > +++ b/mm/swap_cgroup.c > > > > @@ -5,6 +5,15 @@ > > > > > > > > #include /* depends on mm.h include */ > > > > > > > > +#define ID_PER_UNIT (sizeof(atomic_t) / sizeof(unsigned short)) > > > > +struct swap_cgroup_unit { > > > > + union { > > > > + int raw; > > > > + atomic_t val; > > > > + unsigned short __id[ID_PER_UNIT]; > > > > + }; > > > > +}; > > > > > > This doubles the size of the per-entry data, right? > > > > Oh we don't, we just store 2 ids in an int instead of storing each id > > individually. But the question below still stands, can't we just use > > cmpxchg() directly on the id? > > Hi Yosry, > > Last time I checked the xchg status some archs still don't support > xchg for 2 bytes, I just found things may have changed slightly but it > seems at least parisc still doesn't support that. And looking at the > code some arches still don't support cmpxchg of 2 bytes today (And I > just dropped cmpxchg helper for swap_cgroup so that should be OK). RCU > just dropped one-byte cmpxchg emulation 2 months ago in d4e287d7caff > so that area is changing. Lacking such support is exactly the reason > why there was a global lock previously, so I think the safe move is > just to emulate the operation manually for now? +Paul E. McKenney If there's already work to support 2-byte cmpxchg() I'd rather wait for that. Alternatively, if it's not too difficult, we should generalize this emulation to something like cmpxchg_emu_u8() and add the missing arch support. It doesn't feel right to have our own custom 2-byte cmpxchg() emulation here. > > > > > > > > > Why do we need this? I thought cmpxchg() supports multiple sizes and > > > will already do the emulation for us. > >