From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0541C3DA6E for ; Thu, 28 Dec 2023 15:14:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 239888D0015; Thu, 28 Dec 2023 10:14:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1E9A08D0012; Thu, 28 Dec 2023 10:14:04 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0B1F78D0015; Thu, 28 Dec 2023 10:14:04 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id F03A18D0012 for ; Thu, 28 Dec 2023 10:14:03 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id C7448A1D20 for ; Thu, 28 Dec 2023 15:14:03 +0000 (UTC) X-FDA: 81616572366.27.4CC4776 Received: from mail-ej1-f51.google.com (mail-ej1-f51.google.com [209.85.218.51]) by imf12.hostedemail.com (Postfix) with ESMTP id F347B40027 for ; Thu, 28 Dec 2023 15:14:01 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="eoJx3/xd"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf12.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1703776442; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Ahnv5X4nnAsn+nuTs2Z+3DG61SWJqLXeCYhMNSbZQT0=; b=P/BRmqE5tT2FYJMkwV0THxnCAFPSdyvCznttKwA0ggSlAgn782jNdV88qbFW4Xf/B7wU93 W4JQiRMahWHffJyyXSMA0unwjaqxDPXBwCw+nAuhegEeF7JRLth8MgECa00apKOnS5iC8G UIsDpdNeFbi8crHHvvDOpBgx7D/UNE4= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="eoJx3/xd"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf12.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.51 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1703776442; a=rsa-sha256; cv=none; b=URgs6wIlo6N8fJaroOa/v2lij6EL+Wezn06+0SUagc1L+o8an7bPZS8waguDR0GETUa3kO JepK+I0+mrm/HgA7Z3E7PqEkMLbPtMwBxN1OXaJy5QbonUuR+O4u0/i77poshj2vhFWS0r QK2dwoVQ9z2BogTzoEflmrKuULaUkIQ= Received: by mail-ej1-f51.google.com with SMTP id a640c23a62f3a-a2335d81693so1061202666b.0 for ; Thu, 28 Dec 2023 07:14:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1703776440; x=1704381240; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Ahnv5X4nnAsn+nuTs2Z+3DG61SWJqLXeCYhMNSbZQT0=; b=eoJx3/xdkjAoNs2//+9r0TIOVY401CWtv9OwMI56CyEPN89AWPySR2Rq0ZcDjJtld5 v+CcXrZOC3PSB3cvhPD4V5egS8+FnQnG7US8rkTaBvWk8VuE9BaftJ/0tauRTe9reedy mXbdAztQxtBI+73SqrHrToeKoO4XrVyN/wNLFh3wW+MVIMKGy3Su5nt5M6I6UcpSsCj2 PeWotbqWUNYEJJ2bkYPCJ9Y0NW70Bh58DlpOLKlaukGHU4eWaFhu1VNCWOwmConsOZYU TmOop+7R/D9NDENvOcr9fd/zahEzrVXSfxH1S/ndUTpsQ2jUWzbZqsV8eZVH42jTP2j2 n9Gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703776440; x=1704381240; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Ahnv5X4nnAsn+nuTs2Z+3DG61SWJqLXeCYhMNSbZQT0=; b=V4XGi8hfWApie+17/yZNU4l5kd4mlfdpCATfrqve84CpJ7k3NWxEFbnLbBLEtz7WwB S/8SMSwA0T4aP36xHrEHTIBp4mFNefArIv6eI7d94Nt2oaYZsyzJnJLaB3Zobec/zDQX m/54TbsLxvn0BrXmNvRNEAT1VdXLWFC5XHgHeqNoMvk7NZoAoAvYWXv9TT1ZfhRh6ZVi Dm+jI8kK2vEeRMlgqOPwzun28ku+y+BiZ6cn1VSEDwJg7AXMJ4a7Up2a49FOsLj54jvi 9i2qBmxP66Vx4sHfExKNhOZOGacIfJN8eHYOA1PM0ctMPXEVKBQlRCelZwkFYlIQgFw3 X71w== X-Gm-Message-State: AOJu0YxR0xMDyXk8903iQTWt/2RPoJ2z5cZpzGosjWjWdSVCXJqQhBMy moYQpPozMMgkWkvHI2OYOupUySLXyFukkuFHWJsuWHNGs8Xh X-Google-Smtp-Source: AGHT+IEFHeLXb2+I4clZFSkOkMfyCaJGc/cfoOl0KPxpF81N+THYiqTo1FMe57UY8J4bOsW3G4BuIidohqBwB3Y4v94= X-Received: by 2002:a17:906:2dd:b0:a23:1b07:5c1b with SMTP id 29-20020a17090602dd00b00a231b075c1bmr10625948ejk.10.1703776440320; Thu, 28 Dec 2023 07:14:00 -0800 (PST) MIME-Version: 1.0 References: <20231228073055.4046430-1-shakeelb@google.com> In-Reply-To: <20231228073055.4046430-1-shakeelb@google.com> From: Yosry Ahmed Date: Thu, 28 Dec 2023 07:13:23 -0800 Message-ID: Subject: Re: [PATCH] mm: ratelimit stat flush from workingset shrinker To: Shakeel Butt Cc: Andrew Morton , Johannes Weiner , Yu Zhao , linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: F347B40027 X-Stat-Signature: k67a5w3f5pj7tggcepnea7dpukqj95ni X-HE-Tag: 1703776441-915692 X-HE-Meta: U2FsdGVkX1+Qk3TwqEo9T7qb2C4VhPKC+Fe1eMVi6vBKVCKZxf/4r6CPGJ4/uudJJiko6QtFHsfkhDTIXDeolC4mgMWU8qNDLGJZLw4p4BLPYy9CujJhPSCYyt2/VNX/W4T9ZCW8Y0XmE7jePnjM2BDeIo0cGUHfFwO5x3QBoE6Jq6PBzrOD/hck6gvxyEZFV3rhXx8m3FjTOpyUWW5vI3p3GVVdXeg5z8qoU9WKMKWUnFee3UV8vNPcPkvaRSgJNl1KsAvAmb3J1UbfjHehGPSQYVvV1OzltwhZlOsnY/vvVpIoh356FrFKt/oA1V4MEulfp72NEtxEyXiDeARceU2q9W9gDC+wFavVVmm0KK6ToujaDXSew3hiW4/wzik1Ur6OOdAzzKRV1w+UyxmZVjkBQlg8IYR8KN+p3DuOmQ/kyTPus+sPWBwYNteNffFZUhZiVxBq1S8MWRrkA52Tf21F3DXOVCMpGIx8UREiEt7z2FBrXRzg2wkGdaszwGnpv6A2sZ+UH4YPbrzxhRFD5r/z22VgaH+GF7c7OCTCI8WMgTzDLva+cdA0cF5MWhHMzE7m379HBDTNtIegYVSj6evOt7wFYUdYh0+r9EOlhNaVgHwy73cea5sLQ3rMqF8VH07OX86xkEJ70u4PFo6ekrIbQOXj13xiLgbv5hMmcz55rjrmxiUqyyeE7Ae5FKRewq9arMUakrjX7vc3Bx8R7IUIVNnBnmBBrKVuwwBujALSEZcv8Fcu0gEK4/+3u2fu9S6d1OOHTUjOpacObhfI0QSMeVf/UsTW22p0/bamOeeZkZBPcFsjuEG3qE1HUy3qlCCCJyOjoT5Zn9tkAFr/XceNj7Fyl3ejKivN0HxGBBP6nmeHdKkMNElmAc8yA8HSmmcvESlM1vI2qeO9qKEYYLYbGbHb+3QJSNzTRR0sDMWCwYHmMCst61vyZu1GlrgnPyKkl2eYFnF1khU/pst +59kVzya UT0pzPGdGiZHcJrWuGD2UqMNDc8fSlCkZJUDCuoQEPgaZhduxftCKg8/fBZ+ad6cqMK9t/UyIEI+P1un8YJOrso+1b2TousL2zXph65kXx5ib2Ht27u3ykg7GTXvLGr3hvOB9TL/0T7aH4FNN4i68HkjxxF8A9NTNMcDbckety4XypEiwlNtmkHbbBCjM+xnHgWeF0/qqJqd4rpe5i4g9iyMcFxN0q/ZzVcTpghvfb7pMbqyAlQG5oM95VQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Dec 27, 2023 at 11:31=E2=80=AFPM Shakeel Butt = wrote: > > One of our internal workload regressed on newer upstream kernel and on > further investigation, it seems like the cause is the always synchronous > rstat flush in the count_shadow_nodes() added by the commit f82e6bf9bb9b > ("mm: memcg: use rstat for non-hierarchical stats"). On further > inspection it seems like we don't really need accurate stats in this > function as it was already approximating the amount of appropriate > shadow entried to keep for maintaining the refault information. Since s/entried/entries > there is already 2 sec periodic rstat flush, we don't need exact stats > here. Let's ratelimit the rstat flush in this code path. Is the regression observed even with commit 7d7ef0a4686a ("mm: memcg: restore subtree stats flushing")? I think the answer is yes based on internal discussions, but this really surprises me. Commit f82e6bf9bb9b removed the percpu loop in lruvec_page_state_local(), and added a flush call. With 7d7ef0a4686a, the flush call is only effective if there are pending updates in the cgroup subtree exceeding MEMCG_CHARGE_BATCH * num_online_cpus(). IOW, we should only be doing work when actually needed, whereas before we used to have multiple percpu loops in count_shadow_nodes() regardless of pending updates. It seems like the cgroup subtree is very active that we continuously need to flush in count_shadow_nodes()? If that's the case, do we still think it's okay not to flush when we know there are pending updates? I don't have enough background about the workingset heuristics to judge this. I am not objecting to this change, I am just trying to understand what's happening. Thanks! > > Fixes: f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats") > Signed-off-by: Shakeel Butt > --- > mm/workingset.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/workingset.c b/mm/workingset.c > index 2a2a34234df9..226012974328 100644 > --- a/mm/workingset.c > +++ b/mm/workingset.c > @@ -680,7 +680,7 @@ static unsigned long count_shadow_nodes(struct shrink= er *shrinker, > struct lruvec *lruvec; > int i; > > - mem_cgroup_flush_stats(sc->memcg); > + mem_cgroup_flush_stats_ratelimited(sc->memcg); > lruvec =3D mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid= )); > for (pages =3D 0, i =3D 0; i < NR_LRU_LISTS; i++) > pages +=3D lruvec_page_state_local(lruvec, > -- > 2.43.0.472.g3155946c3a-goog >