From: Yosry Ahmed
Date: Wed, 2 Aug 2023 15:02:55 -0700
Subject: Re: [PATCH v3] mm: memcg: use rstat for non-hierarchical stats
To: Michal Hocko
Cc: Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song,
    Andrew Morton, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org
References: <20230726153223.821757-1-yosryahmed@google.com>
    <20230726153223.821757-2-yosryahmed@google.com>

On Wed, Aug 2, 2023 at 1:11 AM Yosry Ahmed wrote:
>
> On Wed, Aug 2, 2023 at 12:40 AM Michal Hocko wrote:
> >
> > On Tue 01-08-23 10:29:39, Yosry Ahmed wrote:
> > > On Tue, Aug 1, 2023 at 9:39 AM Yosry Ahmed wrote:
> > [...]
> > > > > Have you measured any potential regression for cgroup v2 which collects
> > > > > all this data without ever using it (AFAICS)?
> > > >
> > > > I did not. I did not expect noticeable regressions given that all the
> > > > extra work is done during flushing, which should mostly be done by the
> > > > asynchronous worker, but can also happen in the stats reading context.
> > > > Let me run the same script on cgroup v2 just in case and report back.
> > >
> > > A few runs on mm-unstable with this patch:
> > >
> > > # time cat /sys/fs/cgroup/cg*/memory.stat > /dev/null
> >
> > Is this really representative test to make? I would have expected the
> > overhead would be mostly in mem_cgroup_css_rstat_flush (if it is visible
> > at all of course). This would be more likely visible in all cpus busy
> > situation (you can try heavy parallel kernel build from tmpfs for
> > example).
>
>
> I see. You are more worried about asynchronous flushing eating cpu
> time rather than the synchronous flushing being slower. In fact, my
> test is actually not representative at all because probably most of
> the cgroups either do not have updates or the asynchronous flusher got
> to them first.
>
> Let me try a workload that is more parallel & cpu intensive and report
> back. I am thinking of parallel reclaim/refault loops since both
> reclaim and refault paths invoke stat updates and stat flushing.
>

I am back with more data. So I wrote a small reclaim/refault stress
test that creates (NR_CPUS * 2) cgroups, assigns them limits, runs a
worker process in each cgroup that allocates tmpfs memory equal to
quadruple the limit (to invoke reclaim) continuously, and then reads
back the entire file (to invoke refaults). All workers are run in
parallel, and zram is used as a swapping backend. Both reclaim and
refault have conditional stats flushing.
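
To make the sizing of the test described above concrete, here is a
minimal sketch of the arithmetic only. It creates no cgroups and does
not touch swap. The function name sizing and its arguments are
hypothetical, chosen for illustration; the default of 50 MB per worker
mirrors TEST_MB in the attached script, and 112 is the cpu count of the
test machine.

```shell
#!/bin/bash
# Sizing sketch only: prints the derived numbers, changes nothing on the system.
# "sizing" is an illustrative helper, not part of the attached script.
sizing() {
  local nr_cpus=$1
  local test_mb=${2:-50}                        # per-worker tmpfs file size in MB
  local nr_cgroups=$(( nr_cpus * 2 ))           # two cgroups per cpu
  local limit_bytes=$(( (test_mb << 20) / 4 ))  # memory.max is a quarter of the file size
  local total_mb=$(( test_mb * nr_cgroups ))    # zram disksize covers every worker's file
  echo "cgroups=$nr_cgroups limit_bytes=$limit_bytes total_mb=$total_mb"
}

sizing 112
```

Because each file is four times its cgroup's memory.max, every worker
has to push most of its file out to zram while writing it and fault it
back in while reading it.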

I ran this on a machine with 112 cpus, once on mm-unstable, and once on
mm-unstable with this patch reverted. The script is attached.

(1) A few runs without this patch:

# time ./stress_reclaim_refault.sh
real    0m9.949s
user    0m0.496s
sys     14m44.974s

# time ./stress_reclaim_refault.sh
real    0m10.049s
user    0m0.486s
sys     14m55.791s

# time ./stress_reclaim_refault.sh
real    0m9.984s
user    0m0.481s
sys     14m53.841s

(2) A few runs with this patch:

# time ./stress_reclaim_refault.sh
real    0m9.885s
user    0m0.486s
sys     14m48.753s

# time ./stress_reclaim_refault.sh
real    0m9.903s
user    0m0.495s
sys     14m48.339s

# time ./stress_reclaim_refault.sh
real    0m9.861s
user    0m0.507s
sys     14m49.317s

I do not see any regressions from this patch. There is actually a very
slight improvement. If I had to guess, it may be because we avoid the
percpu loop in count_shadow_nodes() when calling
lruvec_page_state_local(), but I could not prove this using perf; it is
probably in the noise.

Let me know if the testing is satisfactory for you. I can send an
updated commit log accordingly with a summary of this conversation.
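
For comparing the two sets of runs above, a minimal sketch that
averages the sys times and prints the relative change. The helper names
to_seconds and mean_seconds are hypothetical and not part of the
attached script; the inputs are the sys values reported above.

```shell
#!/bin/bash
export LC_ALL=C  # keep "." as the decimal separator in awk

# Convert a time(1)-style value such as "14m44.974s" to seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the arithmetic mean, in seconds, of the given time(1)-style values.
mean_seconds() {
  local t
  for t in "$@"; do to_seconds "$t"; done |
    awk '{ sum += $1 } END { printf "%.3f\n", sum / NR }'
}

sys_without=$(mean_seconds 14m44.974s 14m55.791s 14m53.841s)
sys_with=$(mean_seconds 14m48.753s 14m48.339s 14m49.317s)

echo "mean sys without patch: ${sys_without}s"
echo "mean sys with patch:    ${sys_with}s"
awk -v a="$sys_without" -v b="$sys_with" \
  'BEGIN { printf "relative change: %+.2f%%\n", (b - a) / a * 100 }'
```

On these numbers the mean sys time drops by roughly 0.3% with the
patch. With three runs per configuration, that is too small to
distinguish from run-to-run variation.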

> > --
> > Michal Hocko
> > SUSE Labs

[-- Attachment: stress_reclaim_refault.sh (text/x-sh) --]

#!/bin/bash

NR_CPUS=$(getconf _NPROCESSORS_ONLN)
NR_CGROUPS=$(( NR_CPUS * 2 ))
TEST_MB=50
TOTAL_MB=$((TEST_MB * NR_CGROUPS))
TMPFS=$(mktemp -d)
ROOT="/sys/fs/cgroup/"
ZRAM_DEV="/mnt/devtmpfs/zram0"

cleanup() {
  umount $TMPFS
  rm -rf $TMPFS
  for i in $(seq $NR_CGROUPS); do
    cgroup="$ROOT/cg$i"
    rmdir $cgroup
  done
  swapoff $ZRAM_DEV
  echo 1 > "/sys/block/zram0/reset"
}
trap cleanup INT QUIT EXIT

# Setup zram
echo $((TOTAL_MB << 20)) > "/sys/block/zram0/disksize"
mkswap $ZRAM_DEV
swapon $ZRAM_DEV
echo "Setup zram done"

# Create cgroups, set limits
echo "+memory" > "$ROOT/cgroup.subtree_control"
for i in $(seq $NR_CGROUPS); do
  cgroup="$ROOT/cg$i"
  mkdir $cgroup
  echo $(( (TEST_MB << 20) / 4)) > "$cgroup/memory.max"
done
echo "Setup cgroups done"

# Start workers to allocate tmpfs memory
mount -t tmpfs none $TMPFS
for i in $(seq $NR_CGROUPS); do
  cgroup="$ROOT/cg$i"
  f="$TMPFS/tmp$i"
  (echo 0 > "$cgroup/cgroup.procs" &&
    dd if=/dev/zero of=$f bs=1M count=$TEST_MB status=none &&
    cat $f > /dev/null)&
done

# Wait for workers
wait