References: <20230719174613.3062124-1-yosryahmed@google.com> <20230724113104.6994bd471fb926ceeaf46707@linux-foundation.org>
In-Reply-To: <20230724113104.6994bd471fb926ceeaf46707@linux-foundation.org>
From: Yosry Ahmed
Date: Mon, 24 Jul 2023 11:34:13 -0700
Subject: Re: [PATCH] mm: memcg: use rstat for non-hierarchical stats
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org

On Mon, Jul 24, 2023 at 11:31 AM Andrew Morton wrote:
>
> On Wed, 19 Jul 2023 17:46:13 +0000 Yosry Ahmed wrote:
>
> > Currently, memcg uses rstat to maintain hierarchical stats. The rstat
> > framework keeps track of which cgroups have updates on which cpus.
> >
> > For non-hierarchical stats, as memcg moved to rstat, they are no longer
> > readily available as counters. Instead, the percpu counters for a given
> > stat need to be summed to get the non-hierarchical stat value.
> > This causes a performance regression when reading non-hierarchical stats on
> > kernels where memcg moved to using rstat. This is especially visible
> > when reading memory.stat on cgroup v1. There are also some code paths
> > internal to the kernel that read such non-hierarchical stats.
> >
> > It is inefficient to iterate and sum counters in all cpus when the rstat
> > framework knows exactly when a percpu counter has an update. Instead,
> > maintain cpu-aggregated non-hierarchical counters for each stat. During
> > an rstat flush, keep those updated as well. When reading
> > non-hierarchical stats, we no longer need to iterate cpus; we just need
> > to read the maintained counters, similar to hierarchical stats.
> >
> > A caveat is that we now need a stats flush before reading
> > local/non-hierarchical stats through {memcg/lruvec}_page_state_local()
> > or memcg_events_local(), where we previously only needed a flush to
> > read hierarchical stats. Most contexts reading non-hierarchical stats
> > are already doing a flush; add a flush to the only missing context in
> > count_shadow_nodes().
> >
> > With this patch, reading memory.stat from 1000 memcgs is 3x faster on a
> > machine with 256 cpus on cgroup v1:
> > # for i in $(seq 1000); do mkdir /sys/fs/cgroup/memory/cg$i; done
> > # time cat /dev/cgroup/memory/cg*/memory.stat > /dev/null
> > real 0m0.125s
> > user 0m0.005s
> > sys 0m0.120s
> >
> > After:
> > real 0m0.032s
> > user 0m0.005s
> > sys 0m0.027s
> >
>
> I'll queue this for some testing, pending reviewer input, please.

Thanks Andrew! I am doing extra testing of my own as well. Will report
back if anything interesting pops up.
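
For anyone skimming the thread, the idea in the quoted commit message can be
modelled roughly in user-space C. This is only an illustrative sketch under
stated assumptions, not the kernel code: struct stat_counter, stat_mod(),
stat_flush(), read_local_slow() and read_local_fast() are hypothetical names,
NR_CPUS is shrunk to 4, and a plain boolean array stands in for rstat's
tracking of which cpus have updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical; the quoted test machine had 256 cpus */

/* Hypothetical, simplified stand-in for a single memcg stat. */
struct stat_counter {
	long percpu[NR_CPUS];      /* cumulative per-cpu counts */
	long percpu_prev[NR_CPUS]; /* per-cpu value folded in at the last flush */
	bool cpu_updated[NR_CPUS]; /* stand-in for rstat's update tracking */
	long local;                /* cpu-aggregated non-hierarchical value */
	long hierarchical;         /* local value plus flushed descendants */
	long pending;              /* deltas pushed up by children, not yet flushed */
	struct stat_counter *parent;
};

/* Update path: touch only this cpu's slot and mark the cpu as updated. */
static void stat_mod(struct stat_counter *c, int cpu, long val)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->percpu[cpu] += val;
	c->cpu_updated[cpu] = true;
}

/* Read path without a maintained counter: walk every cpu on each read. */
static long read_local_slow(const struct stat_counter *c)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->percpu[cpu];
	return sum;
}

/*
 * Flush: visit only the cpus that reported updates, fold their deltas into
 * both the local and the hierarchical counter, then hand the total delta to
 * the parent. Children must be flushed before their parents.
 */
static void stat_flush(struct stat_counter *c)
{
	long delta = c->pending; /* from children: counts hierarchically only */

	c->pending = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		long d;

		if (!c->cpu_updated[cpu])
			continue;
		d = c->percpu[cpu] - c->percpu_prev[cpu];
		c->percpu_prev[cpu] = c->percpu[cpu];
		c->cpu_updated[cpu] = false;
		c->local += d;
		delta += d;
	}
	c->hierarchical += delta;
	if (c->parent)
		c->parent->pending += delta;
}

/* Read path with a maintained counter: O(1), but only as fresh as the last flush. */
static long read_local_fast(const struct stat_counter *c)
{
	return c->local;
}
```

In this model, read_local_fast() returns a stale value until stat_flush() has
run on the counter, which mirrors the caveat in the commit message about
needing a flush before reading local stats. The per-read cost no longer depends
on NR_CPUS; the per-cpu walk happens once per flush and skips cpus that have no
updates.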