Date: Fri, 25 Aug 2023 08:14:19 -0700
From: Dennis Zhou
To: Mateusz Guzik
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
	akpm@linux-foundation.org, shakeelb@google.com,
	vegard.nossum@oracle.com, linux-mm@kvack.org
Subject: Re: [PATCH v3 0/2] execve scalability issues, part 1
In-Reply-To: <20230823050609.2228718-1-mjguzik@gmail.com>

Hello,

On Wed, Aug 23, 2023 at 07:06:07AM +0200, Mateusz Guzik wrote:
> To start I figured I'm going to bench about as friendly case as it gets
> -- statically linked *separate* binaries all doing execve in a loop.
>
> I borrowed the bench from here:
> http://apollo.backplane.com/DFlyMisc/doexec.c
>
> $ cc -static -O2 -o static-doexec doexec.c
> $ ./static-doexec $(nproc)
>
> It prints a result every second.
>
> My test box is temporarily only 26 cores and even at this scale I run
> into massive lock contention stemming from back-to-back calls to
> percpu_counter_init (and _destroy later).
>
> While not a panacea, one simple thing to do here is to batch these ops.
> Since the term "batching" is already used in the file, I decided to
> refer to it as "grouping" instead.
>
> Even if this code could be patched to dodge these counters, I would
> argue a high-traffic alloc/free consumer is only a matter of time so it
> makes sense to facilitate it.
>
> With the fix I get an ok win, to quote from the commit:
> > Even at a very modest scale of 26 cores (ops/s):
> > before: 133543.63
> > after:  186061.81 (+39%)
>
> While with the patch these allocations remain a significant problem,
> the primary bottleneck shifts to:
>
>     __pv_queued_spin_lock_slowpath+1
>     _raw_spin_lock_irqsave+57
>     folio_lruvec_lock_irqsave+91
>     release_pages+590
>     tlb_batch_pages_flush+61
>     tlb_finish_mmu+101
>     exit_mmap+327
>     __mmput+61
>     begin_new_exec+1245
>     load_elf_binary+712
>     bprm_execve+644
>     do_execveat_common.isra.0+429
>     __x64_sys_execve+50
>     do_syscall_64+46
>     entry_SYSCALL_64_after_hwframe+110
>
> I intend to do more work on the area to mostly sort it out, but I would
> not mind if someone else took the hammer to folio. :)
>
> With this out of the way I'll be looking at some form of caching to
> eliminate these allocs as a problem.
>
> v3:
> - fix !CONFIG_SMP build
> - drop the backtrace from fork commit message
>
> v2:
> - force bigger alignment on alloc
> - rename "counters" to "nr_counters" and pass prior to lock key
> - drop {}'s for single-statement loops
>
>
> Mateusz Guzik (2):
>   pcpcntr: add group allocation/free
>   fork: group allocation of per-cpu counters for mm struct
>
>  include/linux/percpu_counter.h | 39 ++++++++++++++++++----
>  kernel/fork.c                  | 14 ++------
>  lib/percpu_counter.c           | 61 +++++++++++++++++++++++-----------
>  3 files changed, 77 insertions(+), 37 deletions(-)
>
> --
> 2.41.0
>

I've applied both to percpu#for-6.6.

Thanks,
Dennis
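
For readers following the thread: the quoted cover letter describes
grouping the per-counter init/destroy calls so that the contended work
is done once per group instead of once per counter. Below is a minimal
userspace sketch of that idea. It is not the kernel patch. struct
counter, counter_init(), counter_init_group(), counter_destroy_group(),
NR_CPUS and the lock_acquisitions tally are all hypothetical names
chosen for illustration, and a pthread mutex stands in for whatever
locking the kernel side actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NR_CPUS 4 /* illustrative value, not taken from the thread */

struct counter {
	long *slots;          /* one slot per simulated CPU */
	struct counter *next; /* linkage on the global registry */
};

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static struct counter *registry;
static unsigned long lock_acquisitions; /* instrumentation only */

static void registry_add_locked(struct counter *c)
{
	c->next = registry;
	registry = c;
}

static void registry_del_locked(struct counter *c)
{
	struct counter **pp = &registry;

	while (*pp && *pp != c)
		pp = &(*pp)->next;
	if (*pp)
		*pp = c->next;
}

/* Ungrouped: one allocation and one lock round trip per counter. */
int counter_init(struct counter *c)
{
	c->slots = calloc(NR_CPUS, sizeof(*c->slots));
	if (!c->slots)
		return -1;
	pthread_mutex_lock(&registry_lock);
	lock_acquisitions++;
	registry_add_locked(c);
	pthread_mutex_unlock(&registry_lock);
	return 0;
}

/* Grouped: one allocation and one lock round trip for nr counters. */
int counter_init_group(struct counter *c, size_t nr)
{
	long *slots;
	size_t i;

	assert(nr > 0);
	slots = calloc(nr * NR_CPUS, sizeof(*slots));
	if (!slots)
		return -1;
	pthread_mutex_lock(&registry_lock);
	lock_acquisitions++;
	for (i = 0; i < nr; i++) {
		c[i].slots = slots + i * NR_CPUS;
		registry_add_locked(&c[i]);
	}
	pthread_mutex_unlock(&registry_lock);
	return 0;
}

/* Grouped teardown: unlink all nr counters under one lock hold. */
void counter_destroy_group(struct counter *c, size_t nr)
{
	size_t i;

	pthread_mutex_lock(&registry_lock);
	lock_acquisitions++;
	for (i = 0; i < nr; i++)
		registry_del_locked(&c[i]);
	pthread_mutex_unlock(&registry_lock);
	free(c[0].slots); /* the group shares a single allocation */
}
```

In this model, initializing four counters one by one takes the lock
four times, while a single counter_init_group() call on the same four
counters takes it once. The count of four is arbitrary.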