From: Shakeel Butt
To: Yosry Ahmed
Cc: Jesper Dangaard Brouer, tj@kernel.org, cgroups@vger.kernel.org,
 hannes@cmpxchg.org, lizefan.x@bytedance.com, longman@redhat.com,
 kernel-team@cloudflare.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Fri, 19 Jul 2024 16:01:59 -0700
Subject: Re: [PATCH V7 1/2] cgroup/rstat: Avoid thundering herd problem by
 kswapd across NUMA nodes
References: <172070450139.2992819.13210624094367257881.stgit@firesoul>
 <100caebf-c11c-45c9-b864-d8562e2a5ac5@kernel.org>

On Thu, Jul 18, 2024 at 08:11:32PM GMT, Yosry Ahmed wrote:
> On Thu, Jul 18, 2024 at 5:41 PM Shakeel Butt wrote:
> >
> > Hi Jesper,
> >
> > On Wed, Jul 17, 2024 at 06:36:28PM GMT, Jesper Dangaard Brouer wrote:
> > >
> > [...]
> > >
> > >
> > > Looking at the production numbers for the time the lock is held for level 0:
> > >
> > > @locked_time_level[0]:
> > > [4M, 8M)     623 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@               |
> > > [8M, 16M)    860 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > > [16M, 32M)   295 |@@@@@@@@@@@@@@@@@                                   |
> > > [32M, 64M)   275 |@@@@@@@@@@@@@@@@                                    |
> > >
> >
> > Is it possible to get the above histogram for other levels as well? I
> > know this is 12 numa node machine, how many total CPUs are there?
> >
> > > The time is in nanosec, so M corresponds to ms (milliseconds).
> > >
> > > With 36 flushes per second (as shown earlier) this is a flush every
> > > 27.7ms. It is not unreasonable (from above data) that the flush time
> > > also spend 27ms, which means that we spend a full CPU second flushing.
> > > That is spending too much time flushing.
> >
> > One idea to further reduce this time is more fine grained flush
> > skipping. At the moment we either skip the whole flush or not. How
> > about we make this decision per-cpu? We already have per-cpu updates
> > data and if it is less than MEMCG_CHARGE_BATCH, skip flush on that cpu.
>
> Good idea.
>
> I think we would need a per-subsystem callback to decide whether we
> want to flush the cgroup or not. This needs to happen in the core
> rstat flushing code (not the memcg flushing code), as we need to make
> sure we do not remove the cgroup from the per-cpu updated tree if we
> don't flush it.

Unless we have a per-subsystem updated tree, I don't think a
per-subsystem callback would work: we would end up flushing everything
if any one subsystem wants a flush. Anyway, we can discuss this once we
have data showing that it really helps.

>
> More generally, I think we should be able to have a "force" flush API
> that skips all optimizations and ensures that a flush occurs. I think
> this will be needed in the cgroup_rstat_exit() path, where stats of a
> cgroup being freed must be propagated to its parent, no matter how
> insignificant they may be, to avoid inconsistencies.

Agreed.
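
To make the per-cpu skipping and "force" flush ideas above a bit more
concrete, here is a minimal userspace sketch in C. It is a toy model, not
kernel code and not a proposed patch: every name in it (cgroup_sketch,
percpu_stat_sketch, flush_sketch, NR_CPUS_SKETCH, CHARGE_BATCH_SKETCH) is
hypothetical, and the threshold of 64 is only a placeholder standing in for
MEMCG_CHARGE_BATCH. It models a single subsystem, so it does not address the
per-subsystem updated tree question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH      4   /* hypothetical CPU count for the toy model */
#define CHARGE_BATCH_SKETCH 64  /* placeholder for MEMCG_CHARGE_BATCH (assumed value) */

/* Per-CPU state: updates recorded on this CPU but not yet flushed. */
struct percpu_stat_sketch {
	long pending;
};

/* One cgroup: per-CPU deltas plus the aggregated value readers see. */
struct cgroup_sketch {
	struct percpu_stat_sketch cpu[NR_CPUS_SKETCH];
	long total;
};

/*
 * Fold per-CPU deltas into cg->total.
 *
 * Without 'force', a CPU whose pending delta is smaller in magnitude than
 * the batch threshold is skipped and keeps its delta, so a later flush can
 * still pick it up (the analogue of leaving the cgroup on the per-cpu
 * updated tree). With 'force', every CPU is flushed regardless of size,
 * which is what an exit path would need.
 *
 * Returns the number of CPUs that were flushed.
 */
static int flush_sketch(struct cgroup_sketch *cg, bool force)
{
	int flushed = 0;

	assert(cg != NULL);

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		long delta = cg->cpu[cpu].pending;

		if (!force && labs(delta) < CHARGE_BATCH_SKETCH)
			continue;	/* too small to be worth flushing now */

		cg->total += delta;
		cg->cpu[cpu].pending = 0;
		flushed++;
	}

	return flushed;
}
```

In this model a regular flush only touches CPUs with enough pending
updates, and a forced flush afterwards still propagates the small leftovers,
so nothing is lost when the cgroup goes away.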