Date: Mon, 24 Jul 2023 11:31:04 -0700
From: Andrew Morton
To: Yosry Ahmed
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: memcg: use rstat for non-hierarchical stats
Message-Id: <20230724113104.6994bd471fb926ceeaf46707@linux-foundation.org>
In-Reply-To: <20230719174613.3062124-1-yosryahmed@google.com>
References: <20230719174613.3062124-1-yosryahmed@google.com>

On Wed, 19 Jul 2023 17:46:13 +0000 Yosry Ahmed wrote:

>
> Currently, memcg uses rstat to maintain hierarchical stats. The rstat
> framework keeps track of which cgroups have updates on which cpus.
>
> For non-hierarchical stats, as memcg moved to rstat, they are no longer
> readily available as counters. Instead, the percpu counters for a given
> stat need to be summed to get the non-hierarchical stat value. This
> causes a performance regression when reading non-hierarchical stats on
> kernels where memcg moved to using rstat. This is especially visible
> when reading memory.stat on cgroup v1. There are also some code paths
> internal to the kernel that read such non-hierarchical stats.
>
> It is inefficient to iterate and sum counters in all cpus when the rstat
> framework knows exactly when a percpu counter has an update. Instead,
> maintain cpu-aggregated non-hierarchical counters for each stat. During
> an rstat flush, keep those updated as well. When reading
> non-hierarchical stats, we no longer need to iterate cpus, we just need
> to read the maintained counters, similar to hierarchical stats.
>
> A caveat is that we now need a stats flush before reading
> local/non-hierarchical stats through {memcg/lruvec}_page_state_local()
> or memcg_events_local(), where we previously only needed a flush to
> read hierarchical stats. Most contexts reading non-hierarchical stats
> are already doing a flush; add a flush to the only missing context in
> count_shadow_nodes().
>
> With this patch, reading memory.stat from 1000 memcgs is 3x faster on a
> machine with 256 cpus on cgroup v1:
>  # for i in $(seq 1000); do mkdir /sys/fs/cgroup/memory/cg$i; done
>  # time cat /sys/fs/cgroup/memory/cg*/memory.stat > /dev/null
>  real    0m0.125s
>  user    0m0.005s
>  sys     0m0.120s
>
> After:
>  real    0m0.032s
>  user    0m0.005s
>  sys     0m0.027s
>

I'll queue this for some testing, pending reviewer input, please.
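
For anyone skimming the thread, the trade-off described in the changelog can be modelled in a few lines of userspace C. This is only a rough sketch and not the memcg code: struct stat_model, stat_add(), stat_flush(), read_local_sum() and read_local() are hypothetical names, NCPUS is an arbitrary value, and a plain flag array stands in for rstat's tracking of which cpus have updates.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4 /* arbitrary cpu count for this model (assumption) */

/* Userspace model of one stat in one cgroup. All names are hypothetical. */
struct stat_model {
	long percpu[NCPUS];      /* cumulative per-cpu counter */
	long percpu_prev[NCPUS]; /* value of percpu[] at the last flush */
	bool updated[NCPUS];     /* stand-in for "this cpu has pending updates" */
	long local;              /* cpu-aggregated non-hierarchical counter */
};

/* Update path: touch only the per-cpu counter and mark the cpu as updated. */
void stat_add(struct stat_model *s, int cpu, long val)
{
	assert(cpu >= 0 && cpu < NCPUS);
	s->percpu[cpu] += val;
	s->updated[cpu] = true;
}

/* Read by summation: every read walks all cpus. */
long read_local_sum(const struct stat_model *s)
{
	long sum = 0;

	for (int cpu = 0; cpu < NCPUS; cpu++)
		sum += s->percpu[cpu];
	return sum;
}

/* Flush: fold the delta of each updated cpu into the aggregated counter. */
void stat_flush(struct stat_model *s)
{
	for (int cpu = 0; cpu < NCPUS; cpu++) {
		if (!s->updated[cpu])
			continue;
		s->local += s->percpu[cpu] - s->percpu_prev[cpu];
		s->percpu_prev[cpu] = s->percpu[cpu];
		s->updated[cpu] = false;
	}
}

/* Read of the aggregated counter: only as fresh as the last flush. */
long read_local(const struct stat_model *s)
{
	return s->local;
}
```

In this model read_local() returns a stale value until stat_flush() has run, which is the caveat the changelog mentions: callers that used to sum per-cpu counters directly need a flush before reading. The model's flush still scans a flag array for simplicity; the point is only that reads consume an already aggregated counter.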