From: Yu Zhao
Date: Tue, 4 Jul 2023 01:11:13 -0600
Subject: Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory
To: "Yin, Fengwei", Ryan Roberts
Cc: Andrew Morton, Matthew Wilcox, "Kirill A. Shutemov", David Hildenbrand, Catalin Marinas, Will Deacon, Anshuman Khandual, Yang Shi, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20230703135330.1865927-1-ryan.roberts@arm.com> <69aada71-0b3f-e928-6413-742fe7926576@intel.com>
In-Reply-To: <69aada71-0b3f-e928-6413-742fe7926576@intel.com>

On Tue, Jul 4, 2023 at 12:22 AM Yin, Fengwei wrote:
>
> On 7/4/2023 10:18 AM, Yu Zhao wrote:
> > On Mon, Jul 3, 2023 at 7:53 AM Ryan Roberts wrote:
> >>
> >> Hi All,
> >>
> >> This is v2 of a series to implement variable order, large folios for anonymous
> >> memory. The objective of this is to improve performance by allocating larger
> >> chunks of memory during anonymous page faults. See [1] for background.
> >
> > Thanks for the quick response!
> >
> >> I've significantly reworked and simplified the patch set based on comments from
> >> Yu Zhao (thanks for all your feedback!). I've also renamed the feature to
> >> VARIABLE_THP, on Yu's advice.
> >>
> >> The last patch is for arm64 to explicitly override the default
> >> arch_wants_pte_order() and is intended as an example. If this series is accepted
> >> I suggest taking the first 4 patches through the mm tree and the arm64 change
> >> could be handled through the arm64 tree separately. Neither has any build
> >> dependency on the other.
> >>
> >> The one area where I haven't followed Yu's advice is in the determination of the
> >> size of folio to use. It was suggested that I have a single preferred large
> >> order, and if it doesn't fit in the VMA (due to exceeding VMA bounds, or there
> >> being existing overlapping populated PTEs, etc) then fallback immediately to
> >> order-0. It turned out that this approach caused a performance regression in the
> >> Speedometer benchmark.
> >
> > I suppose it's regression against the v1, not the unpatched kernel.
> From the performance data Ryan shared, it's against unpatched kernel:
>
> Speedometer 2.0:
>
> | kernel                         |   runs_per_min |
> |:-------------------------------|---------------:|
> | baseline-4k                    |           0.0% |
> | anonfolio-lkml-v1              |           0.7% |
> | anonfolio-lkml-v2-simple-order |          -0.9% |
> | anonfolio-lkml-v2              |           0.5% |

I see. Thanks.

A couple of questions:
1. Do we have a stddev?
2. Do we have a theory why it regressed?
Assuming no bugs, I don't see how a real regression could happen -- falling back to order-0 isn't different from the original behavior. Ryan, could you `perf record` and `cat /proc/vmstat` and share them?
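
For comparing the `/proc/vmstat` output requested above across a benchmark run, a minimal sketch might look like the following. It is not from the thread: the function names `parse_vmstat` and `vmstat_delta` are hypothetical, and it assumes only that each line of `/proc/vmstat` has the form "name value" with a non-negative integer value.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def vmstat_delta(before, after):
    """Return {name: after - before} for every counter whose value changed.

    Both arguments are the raw text of a /proc/vmstat snapshot. A counter
    missing from the "before" snapshot is treated as starting at 0.
    """
    b = parse_vmstat(before)
    a = parse_vmstat(after)
    return {
        name: value - b.get(name, 0)
        for name, value in a.items()
        if value != b.get(name, 0)
    }
```

One way to use it: save `cat /proc/vmstat` to a file before and after the Speedometer run on each kernel, pass the two files' contents to `vmstat_delta`, and compare the resulting deltas between the baseline and the simple-order kernel.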