Date: Wed, 10 Dec 2025 14:28:37 -0800
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
To: Ryan Roberts
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Uladzislau Rezki,
    Andrew Morton
Subject: Re: [PATCH] mm/vmalloc: request large order pages from buddy allocator
In-Reply-To: <66919a28-bc81-49c9-b68f-dd7c73395a0d@arm.com>
References: <20251021194455.33351-2-vishal.moola@gmail.com>
 <66919a28-bc81-49c9-b68f-dd7c73395a0d@arm.com>

On Wed, Dec 10, 2025 at 01:21:22PM +0000, Ryan Roberts wrote:
> Hi Vishal,
>
>
> On 21/10/2025 20:44, Vishal Moola (Oracle) wrote:
> > Sometimes, vm_area_alloc_pages() will want many pages from the buddy
> > allocator. Rather than making requests to the buddy allocator for at
> > most 100 pages at a time, we can eagerly request large order pages a
> > smaller number of times.
> >
> > We still split the large order pages down to order-0 as the rest of the
> > vmalloc code (and some callers) depend on it. We still defer to the bulk
> > allocator and fallback path in case of order-0 pages or failure.
> >
> > Running 1000 iterations of allocations on a small 4GB system finds:
> >
> > 1000 2mb allocations:
> >      [Baseline]             [This patch]
> > real    46.310s         real    0m34.582
> > user    0.001s          user    0.006s
> > sys     46.058s         sys     0m34.365s
> >
> > 10000 200kb allocations:
> >      [Baseline]             [This patch]
> > real    56.104s         real    0m43.696
> > user    0.001s          user    0.003s
> > sys     55.375s         sys     0m42.995s
>
> I'm seeing some big vmalloc micro benchmark regressions on arm64, for which
> bisect is pointing to this patch. Ulad had similar findings/concerns[1].

Tl;dr: the numbers you are seeing are expected given how the test module is
currently written.

> The tests are all originally from the vmalloc_test module. Note that (R)
> indicates a statistically significant regression and (I) indicates a
> statistically significant improvement.
>
> p is the number of pages in the allocation, h is huge. So it looks like the
> regressions are all coming from the non-huge case, where we want to split to
> order-0.
>
> +---------------------------------+----------------------------------------------------------+------------+------------------------+
> | Benchmark                       | Result Class                                             | 6-18-0     | 6-18-0-gc2f2b01b74be   |
> +=================================+==========================================================+============+========================+
> | micromm/vmalloc                 | fix_align_alloc_test: p:1, h:0, l:500000 (usec)          | 514126.58  | (R) -42.20%            |
> |                                 | fix_size_alloc_test: p:1, h:0, l:500000 (usec)           | 320458.33  | -0.02%                 |
> |                                 | fix_size_alloc_test: p:4, h:0, l:500000 (usec)           | 399680.33  | (R) -23.43%            |
> |                                 | fix_size_alloc_test: p:16, h:0, l:500000 (usec)          | 788723.25  | (R) -23.66%            |
> |                                 | fix_size_alloc_test: p:16, h:1, l:500000 (usec)          | 979839.58  | -1.05%                 |
> |                                 | fix_size_alloc_test: p:64, h:0, l:100000 (usec)          | 481454.58  | (R) -23.99%            |
> |                                 | fix_size_alloc_test: p:64, h:1, l:100000 (usec)          | 615924.00  | (I) 2.56%              |
> |                                 | fix_size_alloc_test: p:256, h:0, l:100000 (usec)         | 1799224.08 | (R) -23.28%            |
> |                                 | fix_size_alloc_test: p:256, h:1, l:100000 (usec)         | 2313859.25 | (I) 3.43%              |
> |                                 | fix_size_alloc_test: p:512, h:0, l:100000 (usec)         | 3541904.75 | (R) -23.86%            |
> |                                 | fix_size_alloc_test: p:512, h:1, l:100000 (usec)         | 3597577.25 | (R) -2.97%             |
> |                                 | full_fit_alloc_test: p:1, h:0, l:500000 (usec)           | 487021.83  | (I) 4.95%              |
> |                                 | kvfree_rcu_1_arg_vmalloc_test: p:1, h:0, l:500000 (usec) | 344466.33  | -0.65%                 |
> |                                 | kvfree_rcu_2_arg_vmalloc_test: p:1, h:0, l:500000 (usec) | 342484.25  | -1.58%                 |
> |                                 | long_busy_list_alloc_test: p:1, h:0, l:500000 (usec)     | 4034901.17 | (R) -25.35%            |
> |                                 | pcpu_alloc_test: p:1, h:0, l:500000 (usec)               | 195973.42  | 0.57%                  |
> |                                 | random_size_align_alloc_test: p:1, h:0, l:500000 (usec)  | 643489.33  | (R) -47.63%            |
> |                                 | random_size_alloc_test: p:1, h:0, l:500000 (usec)        | 2029261.33 | (R) -27.88%            |
> |                                 | vm_map_ram_test: p:1, h:0, l:500000 (usec)               | 83557.08   | -0.22%                 |
> +---------------------------------+----------------------------------------------------------+------------+------------------------+
>
> I have a couple of thoughts from looking at the patch:
>
> - Perhaps split_page() is the bulk of the cost? Previously for this case we
> were allocating order-0 so there was no split to do. For h=1, split would
> have already been called so that would explain why no regression for that
> case?

For h=1, this patch shouldn't change anything (as long as nr_pages <
arch_vmap_{pte,pmd}_supported_shift). This is why you don't see regressions
in those cases.
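For reference, the idea on the non-huge path is "allocate a high-order block,
split_page() it, hand out the order-0 pieces". A rough sketch is below; it is
illustrative only, not the actual mm/vmalloc.c code, and the helper name, the
max_order parameter and the clamping details are made up:

/*
 * Illustrative only: grab one high-order block from the buddy allocator
 * and split it into the order-0 pages the rest of vmalloc expects.
 * Returns the number of pages placed into pages[], or 0 on failure so
 * the caller can fall back to the bulk/order-0 path.
 */
static unsigned int alloc_and_split(gfp_t gfp, unsigned int max_order,
				    unsigned int nr_remaining,
				    struct page **pages)
{
	unsigned int order = get_order((unsigned long)nr_remaining << PAGE_SHIFT);
	struct page *page;
	unsigned int i;

	if (order > max_order)
		order = max_order;
	/* Don't hand back more pages than the caller asked for. */
	while (order && (1U << order) > nr_remaining)
		order--;

	page = alloc_pages(gfp, order);
	if (!page)
		return 0;

	/* Break the block into 1 << order independent order-0 pages. */
	split_page(page, order);
	for (i = 0; i < (1U << order); i++)
		pages[i] = page + i;

	return 1U << order;
}

The point is that a 2mb area becomes a handful of buddy calls plus one split
each, instead of repeated bulk-allocator trips of at most 100 order-0 pages
at a time.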
> - I guess we are bypassing the pcpu cache? Could this be having an effect? Dev
> (cc'ed) did some similar investigation a while back and saw increased vmalloc
> latencies when bypassing pcpu cache.

I'd say this is more a case of the test module targeting the pcpu cache. The
module allocates then frees one allocation at a time, which promotes reusing
pcpu pages. [1] has some numbers after modifying the test so that all the
allocations are made before any are freed (a rough sketch of the two patterns
follows at the end of this mail).

> - Philosophically is allocating physically contiguous memory when it is not
> strictly needed the right thing to do? Large physically contiguous blocks are
> a scarce resource so we don't want to waste them. Although I guess it could
> be argued that this actually preserves the contiguous blocks because the
> lifetime of all the pages is tied together.

This was the primary incentive for this patch :)

> Anyway, I doubt this is the reason for the slow down, since those benchmarks
> are not under memory pressure.
>
> Anyway, it would be good to resolve the performance regressions if we can.

Imo, the appropriate way to address these is to modify the test module as
seen in [1].

[1] https://lore.kernel.org/linux-mm/aPJ6lLf24TfW_1n7@milan/
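To make the alloc/free-pattern point concrete, the difference between what
the test module does today and the modification measured in [1] is roughly
the following. This is an illustrative sketch, not the actual test module
code; NR_ITERS and ALLOC_SIZE are arbitrary:

#include <linux/vmalloc.h>

#define NR_ITERS	1000
#define ALLOC_SIZE	(2UL << 20)	/* 2mb, as in the benchmark above */

/*
 * Current pattern: each allocation is freed before the next one is made,
 * so the just-freed pages are immediately available for reuse.
 */
static void alloc_free_interleaved(void)
{
	int i;

	for (i = 0; i < NR_ITERS; i++) {
		void *p = vmalloc(ALLOC_SIZE);

		vfree(p);
	}
}

/*
 * Modified pattern from [1]: all allocations are made before any are freed,
 * so every iteration has to go back to the page allocator for fresh pages.
 */
static void alloc_all_then_free(void)
{
	static void *ptrs[NR_ITERS];
	int i;

	for (i = 0; i < NR_ITERS; i++)
		ptrs[i] = vmalloc(ALLOC_SIZE);

	for (i = 0; i < NR_ITERS; i++)
		vfree(ptrs[i]);
}

In the interleaved pattern every vfree() hands pages straight back for the
next vmalloc(), so the free-path caching dominates the measurement rather
than the allocation path this patch changes.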