Date: Wed, 23 Aug 2023 23:28:43 -0700
From: Dennis Zhou
To: Mateusz Guzik
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, shakeelb@google.com,
	vegard.nossum@oracle.com, linux-mm@kvack.org
Subject: Re: [PATCH v3 2/2] kernel/fork: group allocation/free of per-cpu counters for mm struct
References: <20230823050609.2228718-1-mjguzik@gmail.com> <20230823050609.2228718-3-mjguzik@gmail.com>
In-Reply-To: <20230823050609.2228718-3-mjguzik@gmail.com>

On Wed, Aug 23, 2023 at 07:06:09AM +0200, Mateusz Guzik wrote:
> A trivial execve scalability test which tries to be very friendly
> (statically linked binaries, all separate) is predominantly bottlenecked
> by back-to-back per-cpu counter allocations which serialize on global
> locks.
>
> Ease the pain by allocating and freeing them in one go.
>
> Bench can be found here:
> http://apollo.backplane.com/DFlyMisc/doexec.c
>
> $ cc -static -O2 -o static-doexec doexec.c
> $ ./static-doexec $(nproc)
>
> Even at a very modest scale of 26 cores (ops/s):
> before: 133543.63
> after: 186061.81 (+39%)
>
> While with the patch these allocations remain a significant problem,
> the primary bottleneck shifts to page release handling.
>
> Signed-off-by: Mateusz Guzik

Same message as for 1/2. I'm happy with this, just a minor reflow. I'll
take this for-6.6 unless there are other comments / objections to that.

I'll run a few tests myself too tomorrow just for validation.

Reviewed-by: Dennis Zhou

Thanks,
Dennis

> ---
>  kernel/fork.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index d2e12b6d2b18..4f0ada33457e 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -909,8 +909,6 @@ static void cleanup_lazy_tlbs(struct mm_struct *mm)
>   */
>  void __mmdrop(struct mm_struct *mm)
>  {
> -	int i;
> -
>  	BUG_ON(mm == &init_mm);
>  	WARN_ON_ONCE(mm == current->mm);
>
> @@ -925,9 +923,8 @@ void __mmdrop(struct mm_struct *mm)
>  	put_user_ns(mm->user_ns);
>  	mm_pasid_drop(mm);
>  	mm_destroy_cid(mm);
> +	percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
>
> -	for (i = 0; i < NR_MM_COUNTERS; i++)
> -		percpu_counter_destroy(&mm->rss_stat[i]);
>  	free_mm(mm);
>  }
>  EXPORT_SYMBOL_GPL(__mmdrop);
> @@ -1252,8 +1249,6 @@ static void mm_init_uprobes_state(struct mm_struct *mm)
>  static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>  	struct user_namespace *user_ns)
>  {
> -	int i;
> -
>  	mt_init_flags(&mm->mm_mt, MM_MT_FLAGS);
>  	mt_set_external_lock(&mm->mm_mt, &mm->mmap_lock);
>  	atomic_set(&mm->mm_users, 1);
> @@ -1301,17 +1296,14 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>  	if (mm_alloc_cid(mm))
>  		goto fail_cid;
>
> -	for (i = 0; i < NR_MM_COUNTERS; i++)
> -		if (percpu_counter_init(&mm->rss_stat[i], 0, GFP_KERNEL_ACCOUNT))
> -			goto fail_pcpu;
> +	if (percpu_counter_init_many(mm->rss_stat, 0, GFP_KERNEL_ACCOUNT, NR_MM_COUNTERS))
> +		goto fail_pcpu;
>
>  	mm->user_ns = get_user_ns(user_ns);
>  	lru_gen_init_mm(mm);
>  	return mm;
>
>  fail_pcpu:
> -	while (i > 0)
> -		percpu_counter_destroy(&mm->rss_stat[--i]);
>  	mm_destroy_cid(mm);
>  fail_cid:
>  	destroy_context(mm);
> --
> 2.41.0
>
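
For anyone following along who wants the "in one go" idea from the commit
message in isolation, here is a minimal userspace sketch. It is not the
kernel code and makes no claim about how the percpu_counter helpers are
implemented. Everything in it is hypothetical and chosen for illustration:
pool_alloc(), pool_free(), pool_lock_hits, struct counter,
counter_init_many(), counter_destroy_many(), and the values of NR_COUNTERS
and NR_SLOTS. The only point is that one grouped allocation takes the
global lock once, where a per-counter loop would take it once per counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NR_COUNTERS 4	/* stand-in for NR_MM_COUNTERS; the value is assumed */
#define NR_SLOTS 8	/* stand-in for a per-CPU slot count; also assumed */

struct counter {
	long *slots;	/* this counter's slice of the shared allocation */
};

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long pool_lock_hits;	/* times the global lock was taken */

/* Hypothetical allocator: every call serializes on one global lock. */
static long *pool_alloc(size_t nr)
{
	long *p;

	pthread_mutex_lock(&pool_lock);
	pool_lock_hits++;
	p = calloc(nr, sizeof(*p));
	pthread_mutex_unlock(&pool_lock);
	return p;
}

/* Hypothetical free path: same global lock as the allocator. */
static void pool_free(long *p)
{
	pthread_mutex_lock(&pool_lock);
	pool_lock_hits++;
	free(p);
	pthread_mutex_unlock(&pool_lock);
}

/* One allocation covers all counters; each counter gets a slice of it. */
static int counter_init_many(struct counter *c, size_t nr)
{
	long *base = pool_alloc(nr * NR_SLOTS);
	size_t i;

	if (!base)
		return -1;
	for (i = 0; i < nr; i++)
		c[i].slots = base + i * NR_SLOTS;
	return 0;
}

/* c[0].slots is the start of the shared allocation, so free it once. */
static void counter_destroy_many(struct counter *c, size_t nr)
{
	size_t i;

	assert(nr > 0);
	pool_free(c[0].slots);
	for (i = 0; i < nr; i++)
		c[i].slots = NULL;
}
```

With this layout a failed init has nothing to unwind per counter, which
matches the shape of the fail_pcpu path in the diff: either the whole group
was allocated or none of it was.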