From: Nico Pache
Date: Thu, 24 Apr 2025 12:56:31 -0600
Subject: Re: [PATCH v2 00/17] khugepaged: Asynchronous mTHP collapse
To: Mitchell Augustin
Cc: akpm@linux-foundation.org,
 20250211152341.3431089327c5e0ec6ba6064d@linux-foundation.org,
 21cnbao@gmail.com, aneesh.kumar@kernel.org, anshuman.khandual@arm.com,
 apopple@nvidia.com, baohua@kernel.org, catalin.marinas@arm.com,
 cl@gentwo.org, dave.hansen@linux.intel.com, david@redhat.com,
 dev.jain@arm.com, haowenchao22@gmail.com, hughd@google.com,
 ioworker0@gmail.com, jack@suse.cz, jglisse@google.com, John Hubbard,
 kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mhocko@suse.com, Peter Xu, ryan.roberts@arm.com,
 srivatsa@csail.mit.edu, surenb@google.com, vbabka@suse.cz,
 vishal.moola@gmail.com, wangkefeng.wang@huawei.com, will@kernel.org,
 willy@infradead.org, yang@os.amperecomputing.com,
 zhengqi.arch@bytedance.com, Zi Yan, zokeefe@google.com, Jacob Martin,
 Vanda Hendrychová

On Thu, Apr 24, 2025 at 12:18 PM Mitchell Augustin wrote:
>
> Hello,
>
> I realize this is an older version of the series, but @Vanda
> Hendrychová and I started on a
> benchmark effort of this version prior
> to the most recent revision's introduction and wanted to provide our
> results as feedback for this discussion.
>
> For context, my team and I previously identified that some of the
> benchmarks outlined in this phoronix benchmark suite [0] perform more
> poorly with thp=madvise than thp=always - so I suspected that the
> THP=defer and khugepaged collapse functionality outlined in this
> article [6] might yield performance in between madvise and always for
> the following benchmarks from that suite:
> - GraphicsMagick (all tests), which were substantially improved when
> switching from thp=madvise to thp=always
> - 7-Zip Compression rating, which was substantially improved when
> switching from thp=madvise to thp=always
> - Compilation time tests, which were slightly improved when switching
> from thp=madvise to thp=always
>
> There were more benchmarks in this suite, but these three were the
> ones we had previously identified as being significantly impacted by
> the thp setting, and thus are the primary focus of our results.
>
> To analyze this, we ran the benchmarks outlined in this article on the
> upstream 6.14 kernel with the following configurations:
> - linux v6.14 thp=defer-v1: Transparent Huge Pages: defer
> - linux v6.14 thp=defer-v2: Transparent Huge Pages: defer
> - linux v6.14 thp=always: Transparent Huge Pages: always
> - linux v6.14 thp=never: Transparent Huge Pages: never
> - linux v6.14 thp=madvise: Transparent Huge Pages: madvise
>
> "defer-v1" refers to the thp collapse implementation by Nico Pache
> [3], and "defer-v2" refers to the implementation in this thread [4].
> Both use defer as implemented by series [5].
>
>
> Ultimately, we did observe that some of the GraphicsMagick tests
> performed marginally better with Nico Pache's khugepaged collapse
> implementation and thp=defer than with just thp=madvise, which aligns
> a bit with my theory - however, these improvements unfortunately did
> not appear to be statistically significant and gained only marginal
> ground in the performance gap between thp=madvise and thp=always in
> our workloads of interest.
>
> Results for other benchmarks in this set also did not show any
> conclusive performance gains from mTHP=defer (however I was not
> expecting those to change significantly with this series, since they
> weren’t heavily impacted by thp settings in my prior tests).
>
> I can't speak for the impact of this series on other workloads - I
> just wanted to share results for the ones we were aware of and
> interested in.

Hi Mitchell,

Thank you very much for both testing and sharing the results! I'm glad
no major regressions were noted, and that in some cases performance was
marginally better.

Another good set of workloads to test for defer would be latency
tests. THP=always can increase page-fault (PF) latencies, while "defer"
should eliminate that penalty, in the hope of regaining some of the THP
benefits after the khugepaged collapse.

I wanted to note one thing: with the default of max_ptes_none=511 and
no mTHP sizes configured, the khugepaged series (both mine and Dev's)
should have very little impact. This is a good test of the defer
feature, and it confirms that neither Dev nor I regressed the legacy
PMD khugepaged case; however, it is not a good test of the actual mTHP
collapsing.

If you plan on testing the mTHP changes for performance changes, I
would suggest enabling all the mTHP orders and setting max_ptes_none=0
(Dev's series requires 0 or 511 for mTHP collapse to work). Given this
is a new feature, it may be hard to find something to compare it to,
other than each other's series.
Enabling defer during these tests has the added benefit of pushing
everything to khugepaged and really stressing its mTHP collapse
performance.

Once again, thank you for taking the time to test these features :)

-- Nico

>
> Full results from our tests on the DGX A100 [1] and Lenovo SR670v2 [2]
> are linked below.
>
> [0]: https://www.phoronix.com/review/linux-os-ampereone/5
> [1]: https://pastebin.ubuntu.com/p/SDSSj8cr6k/
> [2]: https://pastebin.ubuntu.com/p/nqbWxyC33d/
> [3]: https://lwn.net/ml/all/20250211003028.213461-1-npache@redhat.com
> [4]: https://lwn.net/ml/all/20250211111326.14295-1-dev.jain@arm.com
> [5]: https://lwn.net/ml/all/20250211004054.222931-1-npache@redhat.com
> [6]: https://lwn.net/Articles/1009039/
> --
> Mitchell Augustin
> Software Engineer - Ubuntu Partner Engineering
>
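
For reference, the mTHP test setup suggested in the reply (all mTHP
orders enabled, max_ptes_none=0) could be scripted roughly as follows.
This is only a sketch under stated assumptions: the helper name
configure_mthp_test is made up for illustration, writing "always" to
each per-size "enabled" file is one reading of "enabling all the mTHP
orders", and the optional top-level "defer" value only exists on
kernels carrying the proposed defer series, not on an unpatched
upstream kernel. On a real system it needs root.

```python
from pathlib import Path
from typing import List, Optional

# Standard sysfs location of the THP controls.
THP_ROOT = Path("/sys/kernel/mm/transparent_hugepage")


def configure_mthp_test(root: Path = THP_ROOT,
                        max_ptes_none: int = 0,
                        per_size_mode: str = "always",
                        top_level_mode: Optional[str] = None) -> List[Path]:
    """Hypothetical helper: enable every mTHP size and set max_ptes_none.

    Returns the sysfs files that were written, in order.
    """
    written = []

    # Each supported mTHP size exposes hugepages-<size>kB/enabled.
    # Assumption: "enabling" an order means writing "always" to it.
    for knob in sorted(root.glob("hugepages-*kB/enabled")):
        knob.write_text(per_size_mode)
        written.append(knob)

    # khugepaged's tolerance for empty PTEs in a collapse candidate.
    ptes = root / "khugepaged" / "max_ptes_none"
    ptes.write_text(str(max_ptes_none))
    written.append(ptes)

    # Assumption: "defer" is only accepted with the defer series
    # applied, so the top-level mode is left alone unless requested.
    if top_level_mode is not None:
        top = root / "enabled"
        top.write_text(top_level_mode)
        written.append(top)

    return written
```

Calling configure_mthp_test(top_level_mode="defer") on a patched test
kernel would apply all three settings in one step; the root parameter
exists so the helper can be exercised against a scratch directory
before touching a real machine.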