From: Shakeel Butt
To: Yosry Ahmed
Cc: Jesper Dangaard Brouer, tj@kernel.org, cgroups@vger.kernel.org,
 hannes@cmpxchg.org, lizefan.x@bytedance.com, longman@redhat.com,
 kernel-team@cloudflare.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2] cgroup/rstat: Avoid thundering herd problem by kswapd across NUMA nodes
Date: Mon, 24 Jun 2024 12:29:00 -0700
References: <171923011608.1500238.3591002573732683639.stgit@firesoul>

On Mon, Jun 24, 2024 at 10:40:48AM GMT, Yosry Ahmed wrote:
> On Mon, Jun 24, 2024 at 10:32 AM Shakeel Butt wrote:
> >
> > On Mon, Jun 24, 2024 at 05:46:05AM GMT, Yosry Ahmed wrote:
> > > On Mon, Jun 24, 2024 at 4:55 AM Jesper Dangaard Brouer wrote:
> > >
> > [...]
> > > I am assuming this supersedes your other patch titled "[PATCH RFC]
> > > cgroup/rstat: avoid thundering herd problem on root cgrp", so I will
> > > only respond here.
> > >
> > > I have two comments:
> > > - There is no reason why this should be limited to the root cgroup. We
> > > can keep track of the cgroup being flushed, and use
> > > cgroup_is_descendant() to find out if the cgroup we want to flush is a
> > > descendant of it. We can use a pointer and cmpxchg primitives instead
> > > of the atomic here IIUC.
> > >
> > > - More importantly, I am not a fan of skipping the flush if there is
> > > an ongoing one. For all we know, the ongoing flush could have just
> > > started and the stats have not been flushed yet. This is another
> > > example of non-deterministic behavior that could be difficult to
> > > debug.
> >
> > Even with the flush, there will almost always be per-cpu updates which
> > will be missed. This cannot be fixed unless we block the stats updaters
> > as well (which is not going to happen). So, we are already ok with this
> > level of non-determinism. Why would skipping the flush be worse? One may
> > argue 'time window is smaller' but this still does not cap the amount of
> > updates. So, unless there is concrete data that skipping the flush
> > is detrimental to the users of stats, I don't see an issue in the
> > presence of the periodic flusher.
>
> As you mentioned, the updates that happen during the flush are
> unavoidable anyway, and the window is small. On the other hand, we
> should be able to maintain the current behavior that at least all the
> stat updates that happened *before* the call to cgroup_rstat_flush()
> are flushed after the call.
>
> The main concern here is that the stats read *after* an event occurs
> should reflect the system state at that time. For example, a proactive
> reclaimer reading the stats after writing to memory.reclaim should
> observe the system state after the reclaim operation happened.

What about the in-kernel users like kswapd? I don't see any before or
after events for the in-kernel users.

>
> Please see [1] for more details about why this is important, which was
> the rationale for removing stats_flush_ongoing in the first place.
>
> [1]https://lore.kernel.org/lkml/20231129032154.3710765-6-yosryahmed@google.com/
>
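
For readers following the thread, the first quoted comment (track the
cgroup being flushed with a pointer and cmpxchg, then test ancestry with
cgroup_is_descendant()) could be sketched in userspace C roughly as below.
This is an illustration under stated assumptions, not the kernel code or
the patch under review: struct node, node_is_descendant(), flush_begin(),
flush_end(), and the flush_action values are hypothetical names, and C11
atomics stand in for the kernel's cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup: only the parent link matters. */
struct node {
	struct node *parent;
};

/* Userspace analogue of cgroup_is_descendant(): true if n is anc or below it. */
static bool node_is_descendant(const struct node *n, const struct node *anc)
{
	for (; n; n = n->parent)
		if (n == anc)
			return true;
	return false;
}

/* Subtree currently being flushed, or NULL when no flush is in progress. */
static _Atomic(struct node *) ongoing_flusher;

enum flush_action {
	FLUSH_SELF,      /* we claimed the slot and must do the flush */
	FLUSH_COVERED,   /* an ongoing flush of an ancestor (or of us) covers us */
	FLUSH_UNRELATED  /* an ongoing flush exists but does not cover us */
};

static enum flush_action flush_begin(struct node *target)
{
	struct node *expected = NULL;

	assert(target != NULL);

	/* Compare-and-exchange: claim the slot only if nobody holds it. */
	if (atomic_compare_exchange_strong(&ongoing_flusher, &expected, target))
		return FLUSH_SELF;

	/*
	 * On failure, expected holds the flusher seen at that instant. It may
	 * finish right after this check; whether a covered caller then skips
	 * or waits is exactly the policy question debated in this thread.
	 */
	return node_is_descendant(target, expected) ? FLUSH_COVERED
						    : FLUSH_UNRELATED;
}

static void flush_end(struct node *target)
{
	struct node *expected = target;

	/* Release the slot only if we are still the recorded flusher. */
	atomic_compare_exchange_strong(&ongoing_flusher, &expected, NULL);
}
```

The sketch only classifies the caller's situation. It deliberately leaves
open what a FLUSH_COVERED caller does next (return immediately, or wait
for the ongoing flush to complete), since that is the point of
disagreement above.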