Date: Fri, 7 Jun 2024 07:32:46 -0700
From: Shakeel Butt
To: Jesper Dangaard Brouer
Cc: stable@vger.kernel.org, yosryahmed@google.com, tj@kernel.org,
    hannes@cmpxchg.org, lizefan.x@bytedance.com, cgroups@vger.kernel.org,
    longman@redhat.com, linux-mm@kvack.org, kernel-team@cloudflare.com
Subject: Re: [PATCH 6.6.y] mm: ratelimit stat flush from workingset shrinker
References: <171776806121.384105.7980809581420394573.stgit@firesoul>
In-Reply-To: <171776806121.384105.7980809581420394573.stgit@firesoul>

On Fri, Jun 07, 2024 at 03:48:06PM GMT, Jesper Dangaard Brouer wrote:
> From: Shakeel Butt
>
> commit d4a5b369ad6d8aae552752ff438dddde653a72ec upstream.
>
> One of our workloads (Postgres 14 + sysbench OLTP) regressed on newer
> upstream kernel and on further investigation, it seems like the cause is
> the always synchronous rstat flush in the count_shadow_nodes() added by
> the commit f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical
> stats"). On further inspection it seems like we don't really need
> accurate stats in this function as it was already approximating the amount
> of appropriate shadow entries to keep for maintaining the refault
> information. Since there is already 2 sec periodic rstat flush, we don't
> need exact stats here. Let's ratelimit the rstat flush in this code path.
>
> Link: https://lkml.kernel.org/r/20231228073055.4046430-1-shakeelb@google.com
> Fixes: f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats")
> Signed-off-by: Shakeel Butt
> Cc: Johannes Weiner
> Cc: Yosry Ahmed
> Cc: Yu Zhao
> Cc: Michal Hocko
> Cc: Roman Gushchin
> Cc: Muchun Song
> Signed-off-by: Andrew Morton
> Signed-off-by: Jesper Dangaard Brouer
>
> ---
> On production with kernel v6.6 we are observing issues with excessive
> cgroup rstat flushing due to the extra call to mem_cgroup_flush_stats()
> in count_shadow_nodes() introduced in commit f82e6bf9bb9b ("mm: memcg:
> use rstat for non-hierarchical stats") that commit is part of v6.6.
> We request backport of commit d4a5b369ad6d ("mm: ratelimit stat flush
> from workingset shrinker") as it have a fixes tag for this commit.
>
> IMHO it is worth explaining call path that makes count_shadow_nodes()
> cause excessive cgroup rstat flushing calls. Function shrink_node()
> calls mem_cgroup_flush_stats() on its own first, and then invokes
> shrink_node_memcgs(). Function shrink_node_memcgs() iterates over
> cgroups via mem_cgroup_iter() for each calling shrink_slab(). The
> shrink_slab() calls do_shrink_slab() that via shrinker->count_objects()
> invoke count_shadow_nodes(), and count_shadow_nodes() does
> a mem_cgroup_flush_stats() call, that seems unnecessary.
>

Actually, in production at Meta we have also replaced
mem_cgroup_flush_stats() in shrink_node() with
mem_cgroup_flush_stats_ratelimited(), because the unconditional call was
causing too much flushing. We have not observed any issues since that
change. I will propose that patch upstream as well.
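
To make the cost in the call path above concrete, here is a small
user-space model. It is only a sketch and not the kernel code: struct
stat_flusher, flusher_init(), flush_stats(), flush_stats_ratelimited(),
reclaim_pass() and FLUSH_PERIOD_TICKS are all hypothetical names, and the
"skip unless a flush is overdue" policy is an assumption about how a
ratelimited flush might behave. It counts how many expensive flushes a
single reclaim pass over many cgroups triggers with and without
ratelimiting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: ticks are milliseconds, period mirrors the 2 sec flush. */
#define FLUSH_PERIOD_TICKS 2000u

struct stat_flusher {
	uint64_t next_flush_deadline;	/* after this tick a flush is overdue */
	unsigned long flush_count;	/* number of expensive flushes performed */
};

static void flusher_init(struct stat_flusher *f, uint64_t now)
{
	f->next_flush_deadline = now + FLUSH_PERIOD_TICKS;
	f->flush_count = 0;
}

/* Stand-in for the expensive, always synchronous flush. */
static void flush_stats(struct stat_flusher *f, uint64_t now)
{
	f->flush_count++;
	f->next_flush_deadline = now + FLUSH_PERIOD_TICKS;
}

/* Flush only if the last flush is older than the period; otherwise skip. */
static bool flush_stats_ratelimited(struct stat_flusher *f, uint64_t now)
{
	if (now <= f->next_flush_deadline)
		return false;
	flush_stats(f, now);
	return true;
}

/*
 * Models one reclaim pass: the shrinker's count callback runs once per
 * cgroup and asks for fresh stats each time. Returns the number of
 * flushes this pass performed.
 */
static unsigned long reclaim_pass(struct stat_flusher *f, uint64_t now,
				  unsigned int nr_cgroups, bool ratelimited)
{
	unsigned long before = f->flush_count;
	unsigned int i;

	for (i = 0; i < nr_cgroups; i++) {
		if (ratelimited)
			flush_stats_ratelimited(f, now);
		else
			flush_stats(f, now);
	}
	return f->flush_count - before;
}
```

In this model, a pass over 1000 cgroups performs 1000 flushes without
ratelimiting and at most one with it, which is the effect the backport is
after. The stats seen by the count callback can then be up to one period
stale, which the commit message argues is acceptable here.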