From: Yosry Ahmed
Date: Tue, 25 Jul 2023 19:20:02 -0700
Subject: Re: [PATCH v2] mm: memcg: use rstat for non-hierarchical stats
To: Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Andrew Morton, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
References: <20230726002904.655377-1-yosryahmed@google.com> <20230726002904.655377-2-yosryahmed@google.com>

On Tue, Jul 25, 2023 at 7:15 PM Roman Gushchin wrote:
>
> On Wed, Jul 26, 2023 at 12:29:04AM +0000, Yosry Ahmed wrote:
> > Currently, memcg uses rstat to maintain hierarchical stats. Counters are
> > maintained for hierarchical stats at each memcg. Rstat tracks which
> > cgroups have updates on which cpus to keep those counters fresh on the
> > read-side.
> >
> > For non-hierarchical stats, we do not maintain counters. Instead, the
>                                                  global?

Do you mean "we do not maintain global counters"? I think "global" is
confusing, because it can be thought of as all cpus or as including
the subtree (as opposed to local for non-hierarchical stats).

> > percpu counters for a given stat need to be summed to get the
> > non-hierarchical stat value. The original implementation did the same.
> > At some point before rstat, non-hierarchical counters were introduced by
> > commit a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
> > memory.stat reporting"). However, those counters were updated on the
> > performance critical write-side, which caused regressions, so they were
> > later removed by commit 815744d75152 ("mm: memcontrol: don't batch
> > updates of local VM stats and events"). See [1] for more detailed
> > history.
> >
> > Kernel versions in between a983b5ebee57 & 815744d75152 (a year and a
> > half) enjoyed cheap reads of non-hierarchical stats, specifically on
> > cgroup v1. When moving to more recent kernels, a performance regression
> > for reading non-hierarchical stats is observed.
> >
> > Now that we have rstat, we know exactly which percpu counters have
> > updates for each stat. We can maintain non-hierarchical counters again,
> > making reads much more efficient, without affecting the performance
> > critical write-side. Hence, add non-hierarchical (i.e local) counters
> > for the stats, and extend rstat flushing to keep those up-to-date.
> >
> > A caveat is that we now a stats flush before reading
>                           need?

Ah yes. I am hoping Andrew can amend this but I am happy to send a v3
as well.

> > local/non-hierarchical stats through {memcg/lruvec}_page_state_local()
> > or memcg_events_local(), where we previously only needed a flush to
> > read hierarchical stats. Most contexts reading non-hierarchical stats
> > are already doing a flush, add a flush to the only missing context in
> > count_shadow_nodes().
> >
> > With this patch, reading memory.stat from 1000 memcgs is 3x faster on a
> > machine with 256 cpus on cgroup v1:
> >  # for i in $(seq 1000); do mkdir /sys/fs/cgroup/memory/cg$i; done
> >  # time cat /dev/cgroup/memory/cg*/memory.stat > /dev/null
> >  real    0m0.125s
> >  user    0m0.005s
> >  sys     0m0.120s
> >
> > After:
> >  real    0m0.032s
> >  user    0m0.005s
> >  sys     0m0.027s
> >
> > [1]https://lore.kernel.org/lkml/20230725201811.GA1231514@cmpxchg.org/
> >
> > Signed-off-by: Yosry Ahmed
> > Acked-by: Johannes Weiner
>
> Acked-by: Roman Gushchin

Thanks!

>
> Thank you!
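
For readers following the thread, the idea in the quoted changelog can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the patch itself: struct cg, NR_CPUS, cg_mod_state(),
cg_read_local_slow() and cg_flush() are hypothetical names, there is no
locking, and the real rstat code tracks updated cgroups per cpu in a tree
rather than with a flag array. It shows the two read strategies: summing
every percpu counter on each read, versus folding only the updated percpu
deltas into a cached local counter (and the hierarchical counters of all
ancestors) at flush time.

```c
#define NR_CPUS 4 /* hypothetical small CPU count, for illustration only */

/* Simplified stand-in for a memcg; every name here is illustrative. */
struct cg {
	struct cg *parent;
	long percpu[NR_CPUS];      /* write side: per-CPU running totals */
	long percpu_prev[NR_CPUS]; /* totals already folded in by a flush */
	int updated[NR_CPUS];      /* rstat-like "this CPU has updates" flag */
	long local;                /* non-hierarchical: this cgroup only */
	long hier;                 /* hierarchical: this cgroup + descendants */
};

/* Write side: touches only per-CPU data, so it stays cheap. */
static void cg_mod_state(struct cg *cg, int cpu, long val)
{
	cg->percpu[cpu] += val;
	cg->updated[cpu] = 1;
}

/* Read side without a local counter: sum every CPU on every read. */
static long cg_read_local_slow(const struct cg *cg)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += cg->percpu[cpu];
	return sum;
}

/*
 * Flush: visit only CPUs flagged as updated, fold each delta into the
 * local counter, and add it to the hierarchical counter of this cgroup
 * and of every ancestor.
 */
static void cg_flush(struct cg *cg)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!cg->updated[cpu])
			continue;

		long delta = cg->percpu[cpu] - cg->percpu_prev[cpu];

		cg->percpu_prev[cpu] = cg->percpu[cpu];
		cg->updated[cpu] = 0;
		cg->local += delta;
		for (struct cg *p = cg; p; p = p->parent)
			p->hier += delta;
	}
}
```

In this sketch, a reader that wants the local value calls cg_flush() first
and then reads cg->local, which mirrors the caveat in the changelog that a
flush is needed before reading local stats. For example, adding 3 and 4 on
two cpus of a child and 5 on one cpu of its parent, then flushing both,
leaves the child with local 7 and the parent with local 5 and hierarchical
12.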