From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 24 May 2023 07:43:38 +1000
From: Dave Chinner <david@fromorbit.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Uladzislau Rezki, linux-mm@kvack.org, Andrew Morton, LKML,
	Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox,
	"Liam R. Howlett", "Paul E. McKenney", Joel Fernandes,
	Oleksiy Avramchenko, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/9] Mitigate a vmap lock contention
References: <20230522110849.2921-1-urezki@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Wed, May 24, 2023 at 03:04:28AM +0900, Hyeonggon Yoo wrote:
> On Tue, May 23, 2023 at 05:12:30PM +0200, Uladzislau Rezki wrote:
> > > > 2. Motivation.
> > > >
> > > > - The vmap code is not scalled to number of CPUs and this should be fixed;
> > > > - XFS folk has complained several times that vmalloc might be contented on
> > > >   their workloads:
> > > >
> > > > commit 8dc9384b7d75012856b02ff44c37566a55fc2abf
> > > > Author: Dave Chinner
> > > > Date:   Tue Jan 4 17:22:18 2022 -0800
> > > >
> > > >     xfs: reduce kvmalloc overhead for CIL shadow buffers
> > > >
> > > >     Oh, let me count the ways that the kvmalloc API sucks dog eggs.
> > > >
> > > >     The problem is when we are logging lots of large objects, we hit
> > > >     kvmalloc really damn hard with costly order allocations, and
> > > >     behaviour utterly sucks:
> > >
> > > based on the commit I guess xfs should use vmalloc/kvmalloc is because
> > > it allocates large buffers, how large could it be?
> > >
> > They use kvmalloc(). When the page allocator is not able to serve a
> > request they fallback to vmalloc. At least what i see, the sizes are:
> >
> > from 73728 up to 1048576, i.e. 18 pages up to 256 pages.
> >
> > > > 3. Test
> > > >
> > > > On my: AMD Ryzen Threadripper 3970X 32-Core Processor, i have below figures:
> > > >
> > > >        1-page        1-page-this-patch
> > > >   1    0.576131  vs  0.555889
> > > >   2    2.68376   vs  1.07895
> > > >   3    4.26502   vs  1.01739
> > > >   4    6.04306   vs  1.28924
> > > >   5    8.04786   vs  1.57616
> > > >   6    9.38844   vs  1.78142
> > > >
> > > >  29   20.06      vs  3.59869
> > > >  30   20.4353    vs  3.6991
> > > >  31   20.9082    vs  3.73028
> > > >  32   21.0865    vs  3.82904
> > > >
> > > > 1..32 - is a number of jobs. The results are in usec and is a vmalloc()/vfree()
> > > > pair throughput.
> > > I would be more interested in real numbers than synthetic benchmarks,
> > > Maybe XFS folks could help performing profiling similar to commit 8dc9384b7d750
> > > with and without this patchset?
> >
> > I added Dave Chinner to this thread.
>
> Oh, I missed that, and it would be better to [+Cc linux-xfs]
>
> > But. The contention exists.
>
> I think "theoretically can be contended" doesn't necessarily mean it's actually
> contended in the real world.

Did you not read the commit message for the XFS commit documented
above? vmalloc lock contention most certainly does exist in the real
world and the profiles in commit 8dc9384b7d75 ("xfs: reduce kvmalloc
overhead for CIL shadow buffers") document it clearly.

> Also I find it difficult to imagine vmalloc being highly contended because it was
> historically considered slow and thus discouraged when performance is important.

Read the above XFS commit. We use vmalloc in critical high performance
fast paths that cannot tolerate high order memory allocation failure.
XFS runs this fast path millions of times a second, and will call into
vmalloc() several hundred thousand times a second with machine wide
concurrency under certain types of workloads.

> IOW vmalloc would not be contended when allocation size is small because we have
> kmalloc/buddy API, and therefore I wonder which workloads are allocating very large
> buffers and at the same time allocating very frequently, thus performance-sensitive.
>
> I am not against this series, but wondering which workloads would benefit ;)

Yup, you need to read the XFS commit message. If you understand what is
in that commit message, then you wouldn't be doubting that vmalloc
contention is real and that it is used in high performance fast paths
that are traversed millions of times a second....

> > Apart of that per-cpu-KVA allocator can go away if we make it generic instead.
>
> Not sure I understand your point, can you elaborate please?
> And I would like to ask some side questions:
>
> 1. Is vm_[un]map_ram() API still worth with this patchset?

XFS also uses this interface for mapping multi-page buffers in the XFS
buffer cache. These are the items that also require the high order
costly kvmalloc allocations in the transaction commit path when they
are modified.

So, yes, we need these mapping interfaces to scale just as well as
vmalloc itself....

> 2. How does this patchset deals with 32-bit machines where
> vmalloc address space is limited?

From the XFS side, we just don't care about 32 bit machines at all.
XFS is aimed at server and HPC environments which have been entirely
64 bit for a long, long time now...

-Dave.
-- 
Dave Chinner
david@fromorbit.com