Date: Thu, 8 Jun 2023 12:10:15 -0700
From: Dennis Zhou
To: Shakeel Butt
Cc: Jan Kara, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@suse.cz, vbabka@suse.cz, regressions@lists.linux.dev, Yu Ma
Subject: Re: [PATCH] mm: convert mm's rss stats into percpu_counter
References: <20221024052841.3291983-1-shakeelb@google.com> <20230608111408.s2minsenlcjow7q3@quack3> <20230608173700.wafw5tyw52gwoicu@google.com>
In-Reply-To: <20230608173700.wafw5tyw52gwoicu@google.com>

Hi Shakeel and Jan,

On Thu, Jun 08, 2023 at 05:37:00PM +0000, Shakeel Butt wrote:
> On Thu, Jun 08, 2023 at 01:14:08PM +0200, Jan Kara wrote:
> [...]
> >
> > Somewhat late to the game but our performance testing grid has noticed this
> > commit causes a performance regression on shell-heavy workloads.
> > For example running 'make test' in git sources on our test machine with 192
> > CPUs takes about 4% longer, system time is increased by about 9%:
> >
> >                    before (9cd6ffa6025)    after (f1a7941243c1)
> > Amean  User        471.12 *   0.30%*       481.77 *  -1.96%*
> > Amean  System      244.47 *   0.90%*       269.13 *  -9.09%*
> > Amean  Elapsed     709.22 *   0.45%*       742.27 *  -4.19%*
> > Amean  CPU         100.00 (   0.20%)       101.00 *  -0.80%*
> >
> > Essentially this workload spawns in sequence a lot of short-lived tasks and
> > the task startup + teardown cost is what this patch increases. To
> > demonstrate this more clearly, I've written trivial (and somewhat stupid)
> > benchmark shell_bench.sh:
> >
> > for (( i = 0; i < 20000; i++ )); do
> >         /bin/true
> > done
> >
> > And when run like:
> >
> > numactl -C 1 ./shell_bench.sh
> >
> > (I've forced physical CPU binding to avoid task migrating over the machine
> > and cpu frequency scaling interfering which makes the numbers much more
> > noisy) I get the following elapsed times:
> >
> >            9cd6ffa6025    f1a7941243c1
> > Avg        6.807429       7.631571
> > Stddev     0.021797       0.016483
> >
> > So some 12% regression in elapsed time. Just to be sure I've verified that
> > per-cpu allocator patch [1] does not improve these numbers in any
> > significant way.
> >
> > Where do we go from here? I think in principle the problem could be fixed
> > by being clever and when the task has only a single thread, we don't bother
> > with allocating pcpu counter (and summing it at the end) and just account
> > directly in mm_struct. When the second thread is spawned, we bite the
> > bullet, allocate pcpu counter and start with more scalable accounting.
> > These shortlived tasks in shell workloads or similar don't spawn any
> > threads so this should fix the regression. But this is obviously easier
> > said than done...
> >
>
> Thanks Jan for the report. I wanted to improve the percpu allocation to
> eliminate this regression as it was reported by intel test bot as well.
> However your suggestion seems targeted and reasonable as well. At
> the moment I am travelling, so not sure when I will get to this. Do you
> want to take a stab at it or you want me to do it? Also how urgent and
> sensitive this regression is for you?
>
> thanks,
> Shakeel
>

I _think_ I could probably spin you a percpu_alloc_bulk() series in a couple
days for percpu_counters. Let me try and find some time, unless you had
something different in mind.

Thanks,
Dennis
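
For reference, the single-thread idea Jan describes in the quoted text can be
modelled in user space roughly as below. This is only a sketch under stated
assumptions, not kernel code and not anyone's proposed patch: the names
rss_counter, rss_init, rss_upgrade, rss_add, rss_sum, rss_destroy and the
fixed NR_CPUS are all hypothetical, and the model ignores locking and the race
between the upgrade and concurrent updates, which is likely where the "easier
said than done" part lives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4 /* hypothetical fixed CPU count for this model */

/* Hypothetical model of an rss counter; not the kernel's percpu_counter. */
struct rss_counter {
	long count; /* plain counter, the only storage while single-threaded */
	long *pcpu; /* per-CPU deltas; NULL until a second thread appears */
};

static void rss_init(struct rss_counter *c)
{
	c->count = 0;
	c->pcpu = NULL; /* no per-CPU allocation on the fast startup path */
}

/* Model of "second thread spawned": pay for the per-CPU slots only now. */
static bool rss_upgrade(struct rss_counter *c)
{
	if (c->pcpu)
		return true;
	c->pcpu = calloc(NR_CPUS, sizeof(*c->pcpu));
	return c->pcpu != NULL;
}

static void rss_add(struct rss_counter *c, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (c->pcpu)
		c->pcpu[cpu] += delta; /* scalable path after the upgrade */
	else
		c->count += delta;     /* direct accounting before it */
}

/* Total is the pre-upgrade base plus all per-CPU deltas, if any. */
static long rss_sum(const struct rss_counter *c)
{
	long sum = c->count;

	if (c->pcpu)
		for (int i = 0; i < NR_CPUS; i++)
			sum += c->pcpu[i];
	return sum;
}

static void rss_destroy(struct rss_counter *c)
{
	free(c->pcpu); /* free(NULL) is a no-op for never-upgraded counters */
	c->pcpu = NULL;
}
```

In this model a task that never spawns a second thread only ever touches
`count`, so setup and teardown involve no allocation and no summing over CPUs.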