From: Shakeel Butt
Date: Thu, 14 Sep 2023 10:36:28 -0700
Subject: Re: [PATCH 3/3] mm: memcg: optimize stats flushing for latency and accuracy
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long, kernel-team@cloudflare.com, Wei Xu, Greg Thelen, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20230913073846.1528938-1-yosryahmed@google.com> <20230913073846.1528938-4-yosryahmed@google.com>
In-Reply-To: <20230913073846.1528938-4-yosryahmed@google.com>

On Wed, Sep 13, 2023 at 12:38 AM Yosry Ahmed wrote:
>
> Stats flushing for memcg currently follows the following rules:
> - Always flush the entire memcg hierarchy (i.e. flush the root).
> - Only one flusher is allowed at a time. If someone else tries to flush
>   concurrently, they skip and return immediately.
> - A periodic flusher flushes all the stats every 2 seconds.
>
> The reason this approach is followed is because all flushes are
> serialized by a global rstat spinlock. On the memcg side, flushing is
> invoked from userspace reads as well as in-kernel flushers (e.g.
> reclaim, refault, etc). This approach aims to avoid serializing all
> flushers on the global lock, which can cause a significant performance
> hit under high concurrency.
>
> This approach has the following problems:
> - Occasionally a userspace read of the stats of a non-root cgroup will
>   be too expensive as it has to flush the entire hierarchy [1].

This is a real-world workload exhibiting the issue, which is good.

> - Sometimes the stats accuracy are compromised if there is an ongoing
>   flush, and we skip and return before the subtree of interest is
>   actually flushed. This is more visible when reading stats from
>   userspace, but can also affect in-kernel flushers.

Please provide similar data/justification for the above. In addition:

1. How stale or delayed are the stats you have observed on real-world
   workloads?

2. What is acceptable staleness in the stats for your use-case?

3. What is your use-case?

4. Does your use-case care about staleness of all the stats in
   memory.stat or only some specific stats?

5. If only some specific stats in memory.stat matter, does it make sense
   to decouple them from rstat and just pay the price up front to
   maintain them accurately?

Most importantly, please please please be concise in your responses.

I know I am going back on some of the previous agreements, but this
whole locking back-and-forth has made me question the original
motivation.

thanks,
Shakeel