Date: Mon, 21 Aug 2023 23:39:51 +0200
From: Mateusz Guzik
To: Dennis Zhou
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com, akpm@linux-foundation.org, shakeelb@google.com, linux-mm@kvack.org
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
Message-ID: <20230821213951.bx3yyqh7omdvpyae@f>
References: <20230821202829.2163744-1-mjguzik@gmail.com>

On Mon, Aug 21, 2023 at 02:07:28PM -0700, Dennis Zhou wrote:
> On Mon, Aug 21, 2023 at 10:28:27PM +0200, Mateusz Guzik wrote:
> > With this out of the way I'll be looking at some form of caching to
> > eliminate these allocs as a problem.
> >
>
> I'm not against caching, this is just my first thought. Caching will
> have an impact on the backing pages of percpu.
> All it takes is 1 allocation on a page for the current allocator to
> pin n pages of memory. A few years ago percpu depopulation was
> implemented so that limits the amount of resident backing pages.
>

I'm painfully aware.

> Maybe the right thing to do is preallocate pools of common sized
> allocations so that way they can be recycled such that we don't have to
> think too hard about fragmentation that can occur if we populate these
> pools over time?
>

This is what I was going to suggest :)

FreeBSD has a per-cpu allocator which pretends to be the same as the
slab allocator, except handing out per-cpu bufs. So far it has sizes 4,
8, 16, 32 and 64 and you can act as if you are mallocing in that size.

Scales perfectly fine of course since it caches objs per-CPU, but there
is some waste and I have 0 idea how it compares to what Linux is doing
on that front.

I stress though that even if you were to carve out certain sizes, a
global lock to handle ops will still kill scalability.

Perhaps granularity better than global, but less than per-CPU would be a
sweet spot for scalability vs memory waste.

That said...

> Also as you've pointed out, it wasn't just the percpu allocation being
> the bottleneck, but percpu_counter's global lock too for hotplug
> support. I'm hazarding a guess most use cases of percpu might have
> additional locking requirements too such as percpu_counter.
>

True Fix(tm) is a longer story.

Maybe let's sort out this patchset first, whichever way. :)

> Thanks,
> Dennis
>
> > Thoughts?
> >
> > Mateusz Guzik (2):
> >   pcpcntr: add group allocation/free
> >   fork: group allocation of per-cpu counters for mm struct
> >
> >  include/linux/percpu_counter.h | 19 ++++++++---
> >  kernel/fork.c                  | 13 ++------
> >  lib/percpu_counter.c           | 61 ++++++++++++++++++++++++----------
> >  3 files changed, 60 insertions(+), 33 deletions(-)
> >
> > --
> > 2.39.2
> >
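
For illustration only, the "pools of common sized allocations" idea
discussed in the reply above could be sketched in userspace C roughly
as follows. This is neither FreeBSD's nor Linux's code. The size
classes (4, 8, 16, 32, 64) come from the mail; every name here
(NR_CPUS_SKETCH, size_to_class, pcpu_obj_alloc, pcpu_obj_free,
pcpu_obj_ptr) is made up for the example. The free lists are global
and unlocked, so this shows only the round-up and recycling part, not
the locking granularity question raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count */
#define NR_CLASSES     5

/* Size classes mentioned in the mail. */
static const size_t class_size[NR_CLASSES] = { 4, 8, 16, 32, 64 };

/* One object: backing memory with one class-sized slot per CPU. */
struct pcpu_obj {
	struct pcpu_obj *next;	/* free-list linkage */
	int cls;		/* index into class_size[] */
	unsigned char *base;	/* NR_CPUS_SKETCH * class_size[cls] bytes */
};

/* One free list per class; global and unlocked (single-threaded sketch). */
static struct pcpu_obj *free_list[NR_CLASSES];

/* Return the smallest class that fits size, or -1 if none does. */
static int size_to_class(size_t size)
{
	if (size == 0)
		return -1;
	for (int i = 0; i < NR_CLASSES; i++)
		if (size <= class_size[i])
			return i;
	return -1;
}

/* Hand out an object, recycling a previously freed one when possible. */
static struct pcpu_obj *pcpu_obj_alloc(size_t size)
{
	int c = size_to_class(size);
	struct pcpu_obj *obj;

	if (c < 0)
		return NULL;
	obj = free_list[c];
	if (obj) {
		/* Recycled objects keep their old contents. */
		free_list[c] = obj->next;
		obj->next = NULL;
		return obj;
	}
	obj = malloc(sizeof(*obj));
	if (!obj)
		return NULL;
	obj->base = calloc(NR_CPUS_SKETCH, class_size[c]);
	if (!obj->base) {
		free(obj);
		return NULL;
	}
	obj->cls = c;
	obj->next = NULL;
	return obj;
}

/* Return an object to its class free list instead of releasing memory. */
static void pcpu_obj_free(struct pcpu_obj *obj)
{
	assert(obj && obj->cls >= 0 && obj->cls < NR_CLASSES);
	obj->next = free_list[obj->cls];
	free_list[obj->cls] = obj;
}

/* Address of the slot belonging to a given CPU. */
static void *pcpu_obj_ptr(struct pcpu_obj *obj, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return obj->base + (size_t)cpu * class_size[obj->cls];
}
```

In this sketch a 24-byte request rounds up to the 32-byte class, which
is the "some waste" part, and freeing then allocating in the same
class returns the same object without touching the underlying
allocator. Taking the lock per class or per group of CPUs instead of
globally would be one way to explore the middle ground between global
and per-CPU.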