From: Shakeel Butt
Date: Wed, 17 Jan 2024 14:56:11 -0800
Subject: Re: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again
To: Roman Gushchin
Cc: Linus Torvalds, Josh Poimboeuf, Vlastimil Babka, Jeff Layton,
    Chuck Lever, Johannes Weiner, Michal Hocko,
    linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin,
    Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina,
    cgroups@vger.kernel.org, linux-mm@kvack.org
References: <6667b799702e1815bd4e4f7744eddbc0bd042bb7.camel@kernel.org>
    <20240117193915.urwueineol7p4hg7@treble>

On Wed, Jan 17, 2024 at 2:20 PM Roman Gushchin wrote:
>
> On Wed, Jan 17, 2024 at 01:02:19PM -0800, Shakeel Butt wrote:
> > On Wed, Jan 17, 2024 at 12:21 PM Linus Torvalds
> > wrote:
> > >
> > > On Wed, 17 Jan 2024 at 11:39, Josh Poimboeuf wrote:
> > > >
> > > > That's a good point.
> > > > If the microbenchmark isn't likely to be even
> > > > remotely realistic, maybe we should just revert the revert until if/when
> > > > somebody shows a real world impact.
> > > >
> > > > Linus, any objections to that?
> > >
> > > We use SLAB_ACCOUNT for much more common allocations like queued
> > > signals, so I would tend to agree with Jeff that it's probably just
> > > some not very interesting microbenchmark that shows any file locking
> > > effects from SLAB_ALLOC, not any real use.
> > >
> > > That said, those benchmarks do matter. It's very easy to say "not
> > > relevant in the big picture" and then the end result is that
> > > everything is a bit of a pig.
> > >
> > > And the regression was absolutely *ENORMOUS*. We're not talking "a few
> > > percent". We're talking a 33% regression that caused the revert:
> > >
> > > https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/
> > >
> > > I wish our SLAB_ACCOUNT wasn't such a pig. Rather than account every
> > > single allocation, it would be much nicer to account at a bigger
> > > granularity, possibly by having per-thread counters first before
> > > falling back to the obj_cgroup_charge. Whatever.
> > >
> > > It's kind of stupid to have a benchmark that just allocates and
> > > deallocates a file lock in quick succession spend lots of time
> > > incrementing and decrementing cgroup charges for that repeated
> > > alloc/free.
> > >
> > > However, that problem with SLAB_ACCOUNT is not the fault of file
> > > locking, but more of a slab issue.
> > >
> > > End result: I think we should bring in Vlastimil and whoever else is
> > > doing SLAB_ACCOUNT things, and have them look at that side.
> > >
> > > And then just enable SLAB_ACCOUNT for file locks. But very much look
> > > at silly costs in SLAB_ACCOUNT first, at least for trivial
> > > "alloc/free" patterns..
> > >
> > > Vlastimil? Who would be the best person to look at that SLAB_ACCOUNT
> > > thing?
> > > See commit 3754707bcc3e (Revert "memcg: enable accounting for
> > > file lock caches") for the history here.
> > >
> >
> > Roman last looked into optimizing this code path. I suspect
> > mod_objcg_state() to be more costly than obj_cgroup_charge(). I will
> > try to measure this path and see if I can improve it.
>
> It's roughly an equal split between mod_objcg_state() and obj_cgroup_charge().
> And each is comparable (by order of magnitude) to the slab allocation cost
> itself. On the free() path a significant cost comes simple from reading
> the objcg pointer (it's usually a cache miss).
>
> So I don't see how we can make it really cheap (say, less than 5% overhead)
> without caching pre-accounted objects.

Maybe this is what we want. Now that we are down to just SLUB, such
caching of pre-accounted objects could live in the SLUB layer, and we
can decide whether to make it a per-kmem-cache opt-in or always on.

>
> I thought about merging of charge and stats handling paths, which _maybe_ can
> shave off another 20-30%, but there still will be a double-digit% accounting
> overhead.
>
> I'm curious to hear other ideas and suggestions.
>
> Thanks!
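
To make the "caching of pre-accounted objects" idea concrete, here is a rough user-space toy model. It is only a sketch under loud assumptions: ToyCgroup, ToyCache, stash_max and run_alloc_free_loop are all made-up names for illustration, none of this is SLUB or memcg code, and it says nothing about real costs. It only counts how often a tight alloc/free loop has to touch the charge counter when freed objects are allowed to stay charged.

```python
class ToyCgroup:
    """Stand-in for a cgroup charge counter (hypothetical, not a kernel API)."""

    def __init__(self):
        self.charged = 0      # bytes currently charged
        self.slow_calls = 0   # number of charge/uncharge operations

    def charge(self, nbytes):
        self.charged += nbytes
        self.slow_calls += 1

    def uncharge(self, nbytes):
        self.charged -= nbytes
        self.slow_calls += 1


class ToyCache:
    """Object cache that keeps up to stash_max freed objects charged."""

    def __init__(self, cgroup, obj_size, stash_max):
        self.cgroup = cgroup
        self.obj_size = obj_size
        self.stash_max = stash_max  # 0 means: account every alloc/free
        self.stash = []             # freed objects that are still charged

    def alloc(self):
        if self.stash:
            # Fast path: reuse an object whose charge was never dropped.
            return self.stash.pop()
        self.cgroup.charge(self.obj_size)
        return bytearray(self.obj_size)

    def free(self, obj):
        if len(self.stash) < self.stash_max:
            # Keep the object and its charge for the next alloc.
            self.stash.append(obj)
            return
        self.cgroup.uncharge(self.obj_size)

    def drain(self):
        # Give back the charge held by stashed objects.
        while self.stash:
            self.stash.pop()
            self.cgroup.uncharge(self.obj_size)


def run_alloc_free_loop(stash_max, iterations=1000):
    """Return (charge/uncharge calls, bytes still charged) after the loop."""
    cg = ToyCgroup()
    cache = ToyCache(cg, obj_size=64, stash_max=stash_max)
    for _ in range(iterations):
        cache.free(cache.alloc())
    cache.drain()
    return cg.slow_calls, cg.charged
```

In this model, stash_max=0 touches the counter twice per iteration, while any stash_max >= 1 touches it once for the first alloc and once in drain(). The obvious price is that stashed objects stay charged to the cgroup until something drains them, which is presumably part of what the opt-in vs. always-on decision would have to weigh.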