Date: Mon, 24 Jun 2024 10:02:44 -0700
From: Shakeel Butt
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Jesper Dangaard Brouer, Yu Zhao, Muchun Song, Facebook Kernel Team, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] memcg: use ratelimited stats flush in the reclaim
References: <20240615081257.3945587-1-shakeel.butt@linux.dev>

On Mon, Jun 24, 2024 at
05:57:51AM GMT, Yosry Ahmed wrote:
> > > and I will explain why below. I know it may be a necessary
> > > evil, but I would like us to make sure there is no other option before
> > > going forward with this.
> >
> > Instead of necessary evil, I would call it a pragmatic approach i.e.
> > resolve the ongoing pain with good enough solution and work on long term
> > solution later.
>
> It seems like there are a few ideas for solutions that may address
> longer-term concerns, let's make sure we try those out first before we
> fall back to the short-term mitigation.
>

Why? More specifically, why try out other things before this patch? Both
can be done in parallel. This patch has been running in production at
Meta for several weeks without issues. Also, I don't see how merging this
would affect our work on long-term solutions.

[...]
>
> Thanks for explaining this in such detail. It does make me feel
> better, but keep in mind that the above heuristics may change in the
> future and become more sensitive to stale stats, and very likely no
> one will remember that we decided that stale stats are fine
> previously.
>

When was the last time this heuristic changed? This heuristic was
introduced in 2008 for anon pages and extended to file pages in 2016. In
2019 the ratio enforcement at 'reclaim root' was introduced. I am pretty
sure we will improve the whole rstat flushing thing within a year or so :P

> >
> > For the cache trim mode, inactive file LRU size is read and the kernel
> > scales it down based on the reclaim iteration (file >> sc->priority) and
> > only checks if it is zero or not. Again precise information is not
> > needed.
>
> It sounds like it is possible that we enter the cache trim mode when
> we shouldn't if the stats are stale. Couldn't this lead to
> over-reclaiming file memory?
>

Can you explain how this over-reclaiming of file memory will happen?

[...]
> > >
> > > - Try to figure out if one (or a few) update paths are regressing all
> > >   flushers.
> > >   If one specific stat or stats update path is causing most of
> > >   the updates, we can try to fix that instead. Especially if it's a
> > >   counter that is continuously being increased and decreased (so the net
> > >   change is not as high as we think).
> >
> > This is actually a good point. I remember Jesper telling that MEMCG_KMEM
> > might be the one with most updates. I can try to collect from Meta fleet
> > what is the cause of most updates.
>
> Let's also wait and see what comes out of this. It would be
> interesting if we can fix this on the update side instead.
>

Yes, it would be interesting, but I don't see any reason to wait for it.

> > >
> > > At the end of the day, all of the above may not work, and we may have
> > > to live with just using the ratelimited approach. But I *really* hope
> > > we could actually go the other way. Fix things on a more fundamental
> > > level and eventually drop the ratelimited variants completely.
> > >
> > > Just my 2c. Sorry for the long email :)
> >
> > Please note that this is not some user API which cannot be changed
> > later. We can change and dissect however we want. My only point is not to
> > wait for the perfect solution and have some intermediate and good enough
> > solution.
>
> I agree that we shouldn't wait for a perfect solution, but it also
> seems like there are a few easy-ish solutions that we can discover
> first (Jesper's patch, investigating update paths, etc). If none of
> those pan out, we can fall back to the ratelimited flush, ideally with
> a plan on next steps for a longer-term solution.

I think I already explained why there is no need to wait. One thing we
should agree on is that this is a hard problem and will need multiple
iterations to come up with a solution which is acceptable for most. Until
then I don't see any reason to block mitigations that reduce pain.

thanks,
Shakeel