Date: Thu, 18 Aug 2022 12:04:46 +0200
From: Michal Koutný
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, David Hildenbrand, Yosry Ahmed, Greg Thelen, Andrew Morton, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] Revert "memcg: cleanup racy sum avoidance code"
Message-ID: <20220818100446.GA789@blackbody.suse.cz>
References: <20220817172139.3141101-1-shakeelb@google.com>
In-Reply-To: <20220817172139.3141101-1-shakeelb@google.com>

On Wed, Aug 17, 2022 at 05:21:39PM +0000, Shakeel Butt wrote:
> $ grep "sock " /mnt/memory/job/memory.stat
> sock 253952
> total_sock 18446744073708724224
>
> Re-run after couple of seconds
>
> $ grep "sock " /mnt/memory/job/memory.stat
> sock 253952
> total_sock 53248
>
> For now we are only seeing this issue on large machines (256 CPUs) and
> only with 'sock' stat. I think the networking stack increase the stat on
> one cpu and decrease it on another cpu much more often. So, this
> negative sock is due to rstat flusher flushing the stats on the CPU that
> has seen the decrement of sock but missed the CPU that has increments. A
> typical race condition.

This theory adds up :-) (Provided the numbers.)

> For easy stable backport, revert is the most simple solution.

Sounds reasonable.
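
To put a number on the first sample: 18446744073708724224 is what a negative signed 64-bit sum looks like when it is printed as unsigned. Below is a minimal sketch in Python, not kernel code; the helper names `as_signed64` and `clamp_to_zero` are made up for illustration. It reinterprets the reported value and applies the round-to-zero clamp:

```python
U64 = 1 << 64  # number of distinct 64-bit values


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    value %= U64
    return value - U64 if value >= (1 << 63) else value


def clamp_to_zero(value: int) -> int:
    """Round a transiently negative sum up to zero (illustrative helper)."""
    return max(value, 0)


reported = 18446744073708724224       # total_sock from the first grep
signed = as_signed64(reported)
print(signed)                         # -827392
print(clamp_to_zero(signed))          # 0
print(as_signed64(53248))             # 53248, the second sample is unaffected
```

So the first reading is 827392 bytes below zero, which would be 202 pages if one assumes 4 KiB pages.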

> For long term solution, I am thinking of two directions. First is just
> reduce the race window by optimizing the rstat flusher. Second is if
> the reader sees a negative stat value, force flush and restart the
> stat collection. Basically retry but limited.

Or just stick with the revert, since it already reduces the observed error
by rounding to zero in a simple way. (Or, if the imprecision were worth
extra storage, use two-stage flushing to accumulate (cpus x cgroups) and
assign in two steps.)

Thanks,
Michal
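
For comparison, here is a toy model of the quoted "retry but limited" direction. It is only a sketch under simplifying assumptions: a single counter, no concurrency, and hypothetical names (`ToyStat`, `read_bounded_retry`) that do not correspond to any kernel function. It reproduces the partial-flush state described in the thread and shows a reader that forces a full flush a bounded number of times before falling back to rounding to zero:

```python
from typing import List


class ToyStat:
    """Hypothetical stand-in for one per-CPU-batched counter (not kernel code)."""

    def __init__(self, nr_cpus: int) -> None:
        self.pending: List[int] = [0] * nr_cpus  # per-CPU deltas not yet flushed
        self.total = 0                           # flushed, reader-visible sum

    def add(self, cpu: int, delta: int) -> None:
        self.pending[cpu] += delta

    def flush_cpu(self, cpu: int) -> None:
        self.total += self.pending[cpu]
        self.pending[cpu] = 0

    def flush_all(self) -> None:
        for cpu in range(len(self.pending)):
            self.flush_cpu(cpu)


def read_bounded_retry(stat: ToyStat, max_retries: int = 1) -> int:
    """Read the sum; on a negative value force a flush, at most max_retries times."""
    value = stat.total
    for _ in range(max_retries):
        if value >= 0:
            break
        stat.flush_all()
        value = stat.total
    return max(value, 0)  # still round to zero if the retries ran out


stat = ToyStat(nr_cpus=2)
stat.add(0, 150)    # increment seen on CPU 0
stat.add(1, -100)   # decrement seen on CPU 1
stat.flush_cpu(1)   # the flusher has visited only CPU 1 so far
print(stat.total)                # -100
print(read_bounded_retry(stat))  # 50
```

With `max_retries=0` the reader degenerates to the plain clamp, which is the behaviour the revert gives without any extra flushing cost.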