Date: Thu, 8 Jun 2023 17:37:00 +0000
In-Reply-To: <20230608111408.s2minsenlcjow7q3@quack3>
References: <20221024052841.3291983-1-shakeelb@google.com> <20230608111408.s2minsenlcjow7q3@quack3>
Message-ID: <20230608173700.wafw5tyw52gwoicu@google.com>
Subject: Re: [PATCH] mm: convert mm's rss stats into percpu_counter
From: Shakeel Butt
To: Jan Kara
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@suse.cz, vbabka@suse.cz, regressions@lists.linux.dev, Yu Ma
Content-Type: text/plain; charset="us-ascii"

On Thu, Jun 08, 2023 at 01:14:08PM +0200, Jan Kara wrote:
[...]
>
> Somewhat late to the game but our performance testing grid has noticed this
> commit causes a performance regression on shell-heavy workloads. For
> example running 'make test' in git sources on our test machine with 192
> CPUs takes about 4% longer, system time is increased by about 9%:
>
>                     before (9cd6ffa6025)    after (f1a7941243c1)
> Amean     User       471.12 *   0.30%*      481.77 *  -1.96%*
> Amean     System     244.47 *   0.90%*      269.13 *  -9.09%*
> Amean     Elapsed    709.22 *   0.45%*      742.27 *  -4.19%*
> Amean     CPU        100.00 (   0.20%)      101.00 *  -0.80%*
>
> Essentially this workload spawns in sequence a lot of short-lived tasks and
> the task startup + teardown cost is what this patch increases. To
> demonstrate this more clearly, I've written trivial (and somewhat stupid)
> benchmark shell_bench.sh:
>
> for (( i = 0; i < 20000; i++ )); do
> 	/bin/true
> done
>
> And when run like:
>
> numactl -C 1 ./shell_bench.sh
>
> (I've forced physical CPU binding to avoid task migrating over the machine
> and cpu frequency scaling interfering which makes the numbers much more
> noisy) I get the following elapsed times:
>
>           9cd6ffa6025    f1a7941243c1
> Avg       6.807429       7.631571
> Stddev    0.021797       0.016483
>
> So some 12% regression in elapsed time. Just to be sure I've verified that
> per-cpu allocator patch [1] does not improve these numbers in any
> significant way.
>
> Where do we go from here? I think in principle the problem could be fixed
> by being clever and when the task has only a single thread, we don't bother
> with allocating pcpu counter (and summing it at the end) and just account
> directly in mm_struct. When the second thread is spawned, we bite the
> bullet, allocate pcpu counter and start with more scalable accounting.
> These shortlived tasks in shell workloads or similar don't spawn any
> threads so this should fix the regression. But this is obviously easier
> said than done...
>

Thanks Jan for the report.
I wanted to improve the percpu allocation to eliminate this regression, as
it was reported by the Intel test bot as well. However, your suggestion seems
targeted and reasonable as well. At the moment I am travelling, so I am not
sure when I will get to this. Do you want to take a stab at it, or do you
want me to do it? Also, how urgent and sensitive is this regression for you?

thanks,
Shakeel
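
For readers following the thread, the suggestion quoted above (account directly while the task has a single thread, switch to a per-CPU counter once a second thread is spawned) can be modelled roughly as below. This is only a userspace sketch in Python, not kernel code: the class `LazyRssCounter`, its methods, and the `nr_cpus` parameter are hypothetical names made up for illustration, and the sketch ignores the locking and ordering problems that a real upgrade path in the kernel would have to solve.

```python
class LazyRssCounter:
    """Toy model: one plain integer until upgrade(), per-CPU slots afterwards."""

    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self.count = 0        # direct accounting; stands in for a field in mm_struct
        self.percpu = None    # per-CPU slots, allocated only on upgrade

    def add(self, cpu, delta):
        if self.percpu is None:
            # Single-threaded phase: no per-CPU memory is touched at all.
            self.count += delta
        else:
            # Multi-threaded phase: each CPU updates only its own slot.
            self.percpu[cpu] += delta

    def upgrade(self):
        """Hypothetical hook, assumed to run when the second thread is spawned."""
        if self.percpu is None:
            # The allocation cost is paid here, not at task startup.
            self.percpu = [0] * self.nr_cpus

    def read(self):
        total = self.count
        if self.percpu is not None:
            # Summing every slot is the cost short-lived tasks would avoid.
            total += sum(self.percpu)
        return total


rss = LazyRssCounter(nr_cpus=4)
rss.add(cpu=0, delta=10)   # single-threaded: goes straight into rss.count
rss.upgrade()              # second thread appears
rss.add(cpu=3, delta=-2)   # now lands in the slot for CPU 3
print(rss.read())          # prints 8
```

In this model a task that never calls `upgrade()` pays neither for allocating the slots nor for summing them at the end, which is the property the suggestion relies on for shell-style workloads.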