Date: Wed, 24 May 2023 10:30:46 +0900
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Dave Chinner
Cc: Uladzislau Rezki, linux-mm@kvack.org, Andrew Morton, LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox, "Liam R . Howlett", "Paul E . McKenney", Joel Fernandes, Oleksiy Avramchenko, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/9] Mitigate a vmap lock contention
References: <20230522110849.2921-1-urezki@gmail.com>

On Wed, May 24, 2023 at 07:43:38AM +1000, Dave Chinner wrote:
> On Wed, May 24, 2023 at 03:04:28AM +0900, Hyeonggon Yoo wrote:
> > On Tue, May 23, 2023 at 05:12:30PM +0200, Uladzislau Rezki wrote:
> > > > > 2. Motivation.
> > > > >
> > > > > - The vmap code does not scale with the number of CPUs and this should be fixed;
> > > > > - XFS folk have complained several times that vmalloc might be contended on
> > > > >   their workloads:
> > > > >
> > > > > commit 8dc9384b7d75012856b02ff44c37566a55fc2abf
> > > > > Author: Dave Chinner
> > > > > Date:   Tue Jan 4 17:22:18 2022 -0800
> > > > >
> > > > >     xfs: reduce kvmalloc overhead for CIL shadow buffers
> > > > >
> > > > >     Oh, let me count the ways that the kvmalloc API sucks dog eggs.
> > > > >
> > > > >     The problem is when we are logging lots of large objects, we hit
> > > > >     kvmalloc really damn hard with costly order allocations, and
> > > > >     behaviour utterly sucks:
> > > >
> > > > Based on the commit, I guess XFS uses vmalloc/kvmalloc because
> > > > it allocates large buffers; how large could they be?
> > > >
> > > They use kvmalloc(). When the page allocator is not able to serve a
> > > request they fall back to vmalloc.
> > > At least from what I see, the sizes are:
> > >
> > > from 73728 up to 1048576, i.e. 18 pages up to 256 pages.
> > >
> > > > > 3. Test
> > > > >
> > > > > On my AMD Ryzen Threadripper 3970X 32-Core Processor, I have the figures below:
> > > > >
> > > > >     1-page          1-page-this-patch
> > > > > 1   0.576131   vs   0.555889
> > > > > 2   2.68376    vs   1.07895
> > > > > 3   4.26502    vs   1.01739
> > > > > 4   6.04306    vs   1.28924
> > > > > 5   8.04786    vs   1.57616
> > > > > 6   9.38844    vs   1.78142
> > > > >
> > > > >
> > > > > 29  20.06      vs   3.59869
> > > > > 30  20.4353    vs   3.6991
> > > > > 31  20.9082    vs   3.73028
> > > > > 32  21.0865    vs   3.82904
> > > > >
> > > > > 1..32 is the number of jobs. The results are in usec and represent
> > > > > vmalloc()/vfree() pair throughput.
> > > >
> > > > I would be more interested in real numbers than synthetic benchmarks.
> > > > Maybe XFS folks could help with profiling similar to commit 8dc9384b7d750,
> > > > with and without this patchset?
> > > >
> > > I added Dave Chinner to this thread.
> >
> > Oh, I missed that, and it would be better to [+Cc linux-xfs]
> >
> > > But. The contention exists.
> >
> > I think "theoretically can be contended" doesn't necessarily mean it's actually
> > contended in the real world.
>
> Did you not read the commit message for the XFS commit documented
> above? vmalloc lock contention most certainly does exist in the
> real world and the profiles in commit 8dc9384b7d75 ("xfs: reduce
> kvmalloc overhead for CIL shadow buffers") document it clearly.
>
> > Also I find it difficult to imagine vmalloc being highly contended because it was
> > historically considered slow and thus discouraged when performance is important.
>
> Read the above XFS commit.
>
> We use vmalloc in critical high performance fast paths that cannot
> tolerate high order memory allocation failure.
> XFS runs this fast path millions of times a second, and will call into
> vmalloc() several hundred thousand times a second with machine wide
> concurrency under certain types of workloads.
>
> > IOW vmalloc would not be contended when allocation size is small because we have
> > kmalloc/buddy API, and therefore I wonder which workloads are allocating very large
> > buffers and at the same time allocating very frequently, thus performance-sensitive.
> >
> > I am not against this series, but wondering which workloads would benefit ;)
>
> Yup, you need to read the XFS commit message. If you understand what
> is in that commit message, then you wouldn't be doubting that
> vmalloc contention is real and that it is used in high performance
> fast paths that are traversed millions of times a second....

Oh, I read the commit, but it seems its point slipped my mind while I was
reading it. Sorry for such a dumb question; now I get it, and thank you so much.
In any case, I didn't mean to offend. I should have read and thought more
before asking.

> > > Apart from that, the per-cpu-KVA allocator can go away if we make it generic instead.
> >
> > Not sure I understand your point, can you elaborate please?
> >
> > And I would like to ask some side questions:
> >
> > 1. Is the vm_[un]map_ram() API still worth keeping with this patchset?
>
> XFS also uses this interface for mapping multi-page buffers in the
> XFS buffer cache. These are the items that also require the high
> order costly kvmalloc allocations in the transaction commit path
> when they are modified.
>
> So, yes, we need these mapping interfaces to scale just as well as
> vmalloc itself....

I mean, even before this series, vm_[un]map_ram() caches vmap_blocks
per CPU, but it has a limit on the size that can be cached per CPU.

But now that vmap() itself becomes scalable with this series, I wonder
whether vm_[un]map_ram() is still worth keeping. Why not replace it with
v[un]map()?

> > 2. How does this patchset deal with 32-bit machines where
> >    vmalloc address space is limited?
>
> From the XFS side, we just don't care about 32 bit machines at all.
> XFS is aimed at server and HPC environments which have been entirely
> 64 bit for a long, long time now...

But Linux still supports 32-bit machines and is not going to drop
support for them anytime soon, so I think there should be at least a way to
disable this feature.

Thanks!

-- 
Hyeonggon Yoo

Doing kernel stuff as a hobby
Undergraduate | Chungnam National University
Dept. Computer Science & Engineering
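
As a side note on the numbers quoted in this thread, the sketch below redoes the arithmetic. It assumes a 4 KiB page size, which the thread does not state, and `pages_for` and `speedup` are hypothetical helper names used only for illustration.

```python
# Back-of-the-envelope check of figures quoted in the thread.
# ASSUMPTION: 4 KiB pages; the thread does not state the page size.
PAGE_SIZE = 4096


def pages_for(size_bytes: int) -> int:
    """Number of pages needed to hold size_bytes, rounded up."""
    return (size_bytes + PAGE_SIZE - 1) // PAGE_SIZE


def speedup(before_usec: float, after_usec: float) -> float:
    """Ratio of the time without the patch to the time with it."""
    return before_usec / after_usec


if __name__ == "__main__":
    # Allocation size range reported for the XFS kvmalloc() calls.
    print(pages_for(73728), pages_for(1048576))
    # 32-job row of the quoted benchmark table.
    print(round(speedup(21.0865, 3.82904), 2))
```

With these inputs the page counts come out as 18 and 256, matching the range given above, and the ratio for the 32-job row is roughly 5.5.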