From: Shakeel Butt
Date: Tue, 15 Aug 2023 18:14:30 -0700
Subject: Re: [PATCH] mm: memcg: provide accurate stats for userspace reads
To: Yosry Ahmed
Cc: Tejun Heo, Michal Hocko, Johannes Weiner, Roman Gushchin,
    Andrew Morton, Muchun Song, cgroups@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Ivan Babrou

On Tue, Aug 15, 2023 at 5:29 PM Yosry Ahmed wrote:
>
[...]
> >
> > I thought we already reached the decision on how to proceed here. Let
> > me summarize what I think we should do:
> >
> > 1. Completely remove the sync flush from stat files read from userspace.
> > 2. Provide a separate way/interface to explicitly flush stats for
> > users who want more accurate stats and can pay the cost. This is
> > similar to the stat_refresh interface.
> > 3. Keep the 2 sec periodic stats flusher.
>
> I think this solution is suboptimal to be honest, I think we can do better.
>
> With recent improvements to spinlocks/mutexes, and flushers becoming
> sleepable, I think a better solution would be to remove unified
> flushing and let everyone only flush the subtree they care about. Sync
> flushing becomes much better (unless you're flushing root ofc), and
> concurrent flushing wouldn't cause too many problems (ideally no
> thundering herd, and rstat lock can be dropped at cpu boundaries in
> cgroup_rstat_flush_locked()).
>
> If we do this, stat reads can be much faster as Ivan demonstrated with
> his patch that only flushes the cgroup being read, and we do not
> sacrifice accuracy as we never skip flushing. We also do not need a
> separate interface for explicit refresh.
>
> In all cases, we need to keep the 2 sec periodic flusher. What we need
> to figure out if we remove unified flushing is:
>
> 1. Handling stats_flush_threshold.
> 2. Handling flush_next_time.
>
> Both of these are global now, and will need to be adapted to
> non-unified non-global flushing.

The only thing we disagree on is whether to (1) completely remove the
sync flush and add an explicit flush interface, or (2) keep doing the
sync flush of the subtree.

To me (1) seems more optimal, particularly for the server use-case where
a node controller reads the stats of root as well as cgroups of a couple
of top levels (we actually do this internally). Doing the flush once
explicitly and then reading the stats for all such cgroups seems better
to me.
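
To make the comparison a bit more concrete, below is a toy userspace
model in Python. It is not kernel code and not how rstat is implemented.
The Cgroup class, the dirty flag, and the "visited cgroups" cost metric
are all assumptions made up for illustration; real flushing tracks
updates per cpu and its cost does not reduce to a node count. The sketch
only counts how many flushes each option issues and how many dirty
cgroups those flushes walk when a controller reads root plus the top
level.

```python
from dataclasses import dataclass, field


@dataclass
class Cgroup:
    # Hypothetical stand-in for a cgroup; not the kernel's struct cgroup.
    name: str
    children: list = field(default_factory=list)
    dirty: bool = True  # pending stat updates not yet aggregated

    def walk(self):
        """Yield this cgroup and every descendant."""
        yield self
        for child in self.children:
            yield from child.walk()


def flush(cg):
    """Flush the subtree rooted at cg; return the number of dirty cgroups visited."""
    visited = 0
    for node in cg.walk():
        if node.dirty:
            node.dirty = False
            visited += 1
    return visited


def mark_all_dirty(root):
    for node in root.walk():
        node.dirty = True


def explicit_flush_then_read(root):
    """Option (1): one explicit flush of root; the reads that follow do not flush."""
    mark_all_dirty(root)
    return 1, flush(root)


def sync_flush_per_read(root, targets, updates_between_reads):
    """Option (2): every read flushes the subtree of the cgroup being read."""
    mark_all_dirty(root)
    flushes = visited = 0
    for cg in targets:
        flushes += 1
        visited += flush(cg)
        if updates_between_reads:
            # Pessimistic assumption: workloads touch every cgroup again.
            mark_all_dirty(root)
    return flushes, visited


def build_tree():
    """Root with two top-level cgroups, each holding three leaves (9 total)."""
    a = Cgroup("a", [Cgroup(f"a{i}") for i in range(3)])
    b = Cgroup("b", [Cgroup(f"b{i}") for i in range(3)])
    root = Cgroup("root", [a, b])
    return root, [root, a, b]


if __name__ == "__main__":
    root, targets = build_tree()
    print(explicit_flush_then_read(root))             # (1, 9)
    print(sync_flush_per_read(root, targets, False))  # (3, 9)
    print(sync_flush_per_read(root, targets, True))   # (3, 17)
```

In this model the two options walk the same number of cgroups when
nothing changes between reads, and option (2) walks more when workloads
keep updating stats in between. The model says nothing about lock
contention or about how stale the values read under option (1) are,
which is the other side of the trade-off.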