From: Yosry Ahmed
Date: Wed, 17 Apr 2024 19:21:33 -0700
Subject: Re: [PATCH v1 3/3] cgroup/rstat: introduce ratelimited rstat flushing
To: Jesper Dangaard Brouer
Cc: tj@kernel.org, hannes@cmpxchg.org, lizefan.x@bytedance.com,
 cgroups@vger.kernel.org, longman@redhat.com, netdev@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, shakeel.butt@linux.dev,
 kernel-team@cloudflare.com, Arnaldo Carvalho de Melo,
 Sebastian Andrzej Siewior, mhocko@kernel.org
References: <171328983017.3930751.9484082608778623495.stgit@firesoul>
 <171328990014.3930751.10674097155895405137.stgit@firesoul>
In-Reply-To: <171328990014.3930751.10674097155895405137.stgit@firesoul>

On Tue, Apr 16, 2024 at 10:51 AM Jesper Dangaard Brouer wrote:
>
> This patch aims to reduce userspace-triggered pressure on the global
> cgroup_rstat_lock by introducing a mechanism to limit how often reading
> stat files causes cgroup rstat flushing.
>
> In the memory cgroup subsystem, memcg_vmstats_needs_flush() combined with
> mem_cgroup_flush_stats_ratelimited() already limits pressure on the
> global lock (cgroup_rstat_lock).
> As a result, reading memory-related stat
> files (such as memory.stat, memory.numa_stat, zswap.current) is already
> a less userspace-triggerable issue.
>
> However, other userspace users of cgroup_rstat_flush(), such as when
> reading io.stat (blk-cgroup.c) and cpu.stat, lack a similar system to
> limit pressure on the global lock. Furthermore, userspace can easily
> trigger this issue by reading those stat files.
>
> Typically, normal userspace stats tools (e.g., cadvisor, nomad, systemd)
> spawn threads that read io.stat, cpu.stat, and memory.stat (even from the
> same cgroup) without realizing that on the kernel side, they share the
> same global lock. This limitation also helps prevent malicious userspace
> applications from harming the kernel by reading these stat files in a
> tight loop.
>
> To address this, the patch introduces cgroup_rstat_flush_ratelimited(),
> similar to memcg's mem_cgroup_flush_stats_ratelimited().
>
> Flushing occurs per cgroup (even though the lock remains global) a
> variable named rstat_flush_last_time is introduced to track when a given
> cgroup was last flushed. This variable, which contains the jiffies of the
> flush, shares properties and a cache line with rstat_flush_next and is
> updated simultaneously.
>
> For cpu.stat, we need to acquire the lock (via cgroup_rstat_flush_hold)
> because other data is read under the lock, but we skip the expensive
> flushing if it occurred recently.
>
> Regarding io.stat, there is an opportunity outside the lock to skip the
> flush, but inside the lock, we must recheck to handle races.
>
> Signed-off-by: Jesper Dangaard Brouer

As I mentioned in another thread, I really don't like time-based
rate-limiting [1]. Would it be possible to generalize the
magnitude-based rate-limiting instead? Have something like
memcg_vmstats_needs_flush() in the core rstat code?

Also, why do we keep the memcg time rate-limiting with this patch? Is
it because we use a much larger window there (2s)?
Having two layers of time-based rate-limiting is not ideal imo.

[1] https://lore.kernel.org/lkml/CAJD7tkYnSRwJTpXxSnGgo-i3-OdD7cdT-e3_S_yf7dSknPoRKw@mail.gmail.com/
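
To make the difference between the two gating policies discussed above
concrete, here is a minimal userspace sketch in C. It is not kernel code
and not the patch's implementation: struct stat_node, record_update(),
both flush_*() helpers, and both constants are hypothetical names and
values picked for illustration. The last_flush_tick field is only loosely
modeled on the rstat_flush_last_time variable from the quoted
description, and the magnitude check borrows nothing more than the
general idea of a pending-updates threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct stat_node {
	uint64_t last_flush_tick; /* tick of the last real flush */
	int64_t pending;          /* magnitude of updates since that flush */
	unsigned int flushes;     /* number of real flushes performed */
};

#define FLUSH_INTERVAL_TICKS 50 /* assumed window, illustrative only */
#define FLUSH_THRESHOLD 64      /* assumed magnitude cutoff, illustrative only */

/* Stand-in for the expensive flush: fold pending updates, stamp the time. */
static void do_flush(struct stat_node *n, uint64_t now)
{
	n->pending = 0;
	n->last_flush_tick = now;
	n->flushes++;
}

/* Writers accumulate the absolute size of each update. */
void record_update(struct stat_node *n, int64_t delta)
{
	n->pending += delta < 0 ? -delta : delta;
}

/*
 * Time-based gate: skip whenever a flush happened within the window,
 * no matter how much has accumulated since then.
 */
bool flush_time_ratelimited(struct stat_node *n, uint64_t now)
{
	if (now - n->last_flush_tick < FLUSH_INTERVAL_TICKS)
		return false;
	do_flush(n, now);
	return true;
}

/*
 * Magnitude-based gate: skip only while the accumulated updates are
 * small enough that the reader would not see a meaningful error.
 */
bool flush_magnitude_ratelimited(struct stat_node *n, uint64_t now)
{
	if (n->pending <= FLUSH_THRESHOLD)
		return false;
	do_flush(n, now);
	return true;
}
```

In this model, a reader arriving 10 ticks after a flush is turned away by
the time-based gate even with 1000 units of updates pending, so it sees
stale numbers. The magnitude-based gate turns a reader away only while
the pending total stays at or below the threshold, which bounds the
error rather than the age of the data.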