Date: Thu, 14 Sep 2023 22:58:44 +0000
References: <20230913073846.1528938-1-yosryahmed@google.com> <20230913073846.1528938-4-yosryahmed@google.com>
Message-ID: <20230914225844.woz7mke6vnmwijh7@google.com>
Subject: Re: [PATCH 3/3] mm: memcg: optimize stats flushing for latency and accuracy
From: Shakeel Butt
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long, kernel-team@cloudflare.com, Wei Xu, Greg Thelen, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org

On Thu, Sep 14, 2023 at 10:56:52AM -0700, Yosry Ahmed wrote:
[...]
> >
> > 1. How much delayed/stale stats have you observed on real world workload?
>
> I am not really sure.
> We don't have a wide deployment of kernels with
> rstat yet. These are problems observed in testing and/or concerns
> expressed by our userspace team.
>

Why is sleep(2) not good enough for the tests?

> I am trying to solve this now because any problems that result from
> this staleness will be very hard to debug and link back to stale
> stats.
>

I think you first need to show whether this (2 sec stale stats) is
really a problem.

> >
> > 2. What is acceptable staleness in the stats for your use-case?
>
> Again, unfortunately I am not sure, but right now it can be O(seconds)
> which is not acceptable as we have workloads querying the stats every
> 1s (and sometimes more frequently).
>

It is 2 seconds in most cases, and if it is higher, the system is
already in bad shape. O(seconds) seems more dramatic. So, why is 2
seconds of staleness not acceptable? Is 1 second acceptable? Or 500
msec? Let's look at the use-cases below.

> >
> > 3. What is your use-case?
>
> A few use cases we have that may be affected by this:
> - System overhead: calculations using memory.usage and some stats from
> memory.stat. If one of them is fresh and the other one isn't we have
> an inconsistent view of the system.
> - Userspace OOM killing: We use some stats in memory.stat to gauge the
> amount of memory that will be freed by killing a task as sometimes
> memory.usage includes shared resources that wouldn't be freed anyway.
> - Proactive reclaim: we read memory.stat in a proactive reclaim
> feedback loop, stale stats may cause us to mistakenly think reclaim is
> ineffective and prematurely stop.
>

I don't see why userspace OOM killing and proactive reclaim need
subsecond accuracy. Please explain. The same goes for system overhead,
though I can see the complication of having two different sources for
stats. Can you provide the formula for system overhead? I am wondering
why you need to read stats from memory.stat files. Why would the
memory.current of top-level cgroups and /proc/meminfo not be enough?
Something like:

  Overhead = MemTotal - MemFree - SumOfTopCgroups(memory.current)

> >
> > I know I am going back on some of the previous agreements but this
> > whole locking back and forth has made in question the original
> > motivation.
>
> That's okay. Taking a step back, having flushing being indeterministic

I would say at most 2 seconds stale instead of indeterministic.

> in this way is a time bomb in my opinion. Note that this also affects
> in-kernel flushers like reclaim or dirty isolation

Fix the in-kernel flushers separately. Also, the problem Cloudflare is
facing does not need to be tied to this.
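
For illustration, the overhead formula suggested above could be computed
from userspace roughly as follows. This is only a sketch under stated
assumptions, not something taken from the patch series: it assumes a
cgroup v2 hierarchy (memory.current reported in bytes), treats the "kB"
values in /proc/meminfo as multiples of 1024 bytes, and the helper names
parse_meminfo, sum_top_cgroups and system_overhead are hypothetical.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: bytes}.

    Values carrying a "kB" suffix are scaled by 1024; unit-less
    counters (e.g. HugePages_Total) are kept as-is.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def sum_top_cgroups(cgroup_root):
    """Sum memory.current (bytes) over the direct children of cgroup_root."""
    total = 0
    for entry in os.scandir(cgroup_root):
        if not entry.is_dir():
            continue
        path = os.path.join(entry.path, "memory.current")
        try:
            with open(path) as f:
                total += int(f.read().strip())
        except (OSError, ValueError):
            # No memory controller here, or the cgroup vanished mid-scan.
            continue
    return total


def system_overhead(meminfo_text, cgroup_root):
    """Overhead = MemTotal - MemFree - SumOfTopCgroups(memory.current)."""
    mem = parse_meminfo(meminfo_text)
    return mem["MemTotal"] - mem["MemFree"] - sum_top_cgroups(cgroup_root)
```

On a typical cgroup v2 system this would be called with the contents of
/proc/meminfo and a root of /sys/fs/cgroup (the mount point is an
assumption). Note that it reads only memory.current and /proc/meminfo,
so it does not depend on memory.stat at all.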