From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <41549336-e1a5-9929-f3a2-5a2252837679@redhat.com>
Date: Wed, 17 May 2023 15:58:57 +0200
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat
To: Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao,
 "Yin, Fengwei"
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC v2 PATCH 00/17] variable-order, large folios for anonymous memory
In-Reply-To: <6857912b-4afd-7fb5-b11b-ebe0e32298c2@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>
 <13969045-4e47-ae5d-73f4-dad40fe631be@arm.com>
 <568b5b73-f0e9-c385-f628-93e45825fb7b@redhat.com>
 <6857912b-4afd-7fb5-b11b-ebe0e32298c2@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 26.04.23 12:41, Ryan Roberts wrote:
> Hi David,
>
> On 17/04/2023 16:44, David Hildenbrand wrote:
>
>>>>>> So what should be safe is replacing all sub-pages of a folio that are
>>>>>> marked "maybe shared" by a new folio under PT lock. However, I wonder
>>>>>> if it's really worth the complexity.
>>>>>> For THP we were happy so far to *not* optimize this, implying that
>>>>>> maybe we shouldn't worry about optimizing the fork() case for now that
>>>>>> heavily.
>>>>>
>>>>> I don't have the exact numbers to hand, but I'm pretty sure I remember
>>>>> enabling large copies was contributing a measurable amount to the
>>>>> performance improvement. (Certainly, the zero-page copy case is
>>>>> definitely a big contributor.) I don't have access to the HW at the
>>>>> moment but can rerun later with and without to double check.
>>>>
>>>> In which test exactly? Some micro-benchmark?
>>>
>>> The kernel compile benchmark that I quoted numbers for in the cover
>>> letter. I have some trace points (not part of the submitted series) that
>>> tell me how many mappings of each order we get for each code path. I'm
>>> pretty sure I remember all of these 4 code paths contributing
>>> non-negligible amounts.
>>
>> Interesting! It would be great to see if there is an actual difference
>> after patch #10 was applied without the other COW replacement.
>
> Sorry about the delay. I now have some numbers for this...

Ditto, I'm swamped :) Thanks for running these benchmarks! As LSF/MM reminded
me again of this topic ...

> I rearranged the patch order so that all the "utility" stuff (new rmap
> functions, etc.) comes first (1, 2, 3, 4, 5, 8, 9, 11, 12, 13), followed by
> a couple of general improvements (7, 17), which should be dormant until we
> have the final patches, then finally (6, 10, 14, 15), which implement large
> anon folios for the allocate, reuse, copy-non-zero and copy-zero paths
> respectively. I've dropped patch 16 and fixed the copy-exclusive bug you
> spotted (by ensuring we never replace an exclusive page).
>
> I've measured performance at the following locations in the patch set:
>
> - baseline: none of my patches applied
> - utility: has utility and general improvement patches applied
> - alloc: utility + 6
> - reuse: utility + 6 + 10
> - copy: utility + 6 + 10 + 14
> - zero-alloc: utility + 6 + 10 + 14 + 15
>
> The test is `make defconfig && time make -jN Image` for a clean checkout of
> v6.3-rc3. The first result is thrown away, and the next 3 are kept. I saw
> some per-boot variance (probably down to kaslr, etc.), so I have booted each
> kernel 7 times for a total of 3x7=21 samples per kernel. Then I've taken
> the mean:
>
> jobs=8:
>
> | label      |   real |   user |   kernel |
> |:-----------|-------:|-------:|---------:|
> | baseline   |   0.0% |   0.0% |     0.0% |
> | utility    |  -2.7% |  -2.8% |    -3.1% |
> | alloc      |  -6.0% |  -2.3% |   -24.1% |
> | reuse      |  -9.5% |  -5.8% |   -28.5% |
> | copy       | -10.6% |  -6.9% |   -29.4% |
> | zero-alloc |  -9.2% |  -5.1% |   -29.8% |
>
> jobs=160:
>
> | label      |   real |   user |   kernel |
> |:-----------|-------:|-------:|---------:|
> | baseline   |   0.0% |   0.0% |     0.0% |
> | utility    |  -1.8% |  -0.0% |    -7.7% |
> | alloc      |  -6.0% |   1.8% |   -20.9% |
> | reuse      |  -7.8% |  -1.6% |   -24.1% |
> | copy       |  -7.8% |  -2.5% |   -26.8% |
> | zero-alloc |  -7.7% |   1.5% |   -29.4% |
>
> So it looks like patch 10 (reuse) is making a difference, but copy and
> zero-alloc are not adding a huge amount, as you hypothesized. Personally I
> would prefer not to drop those patches though, as they will all help towards
> utilization of contiguous PTEs on arm64, which is the second part of the
> change that I'm now working on.

Yes, pretty much what I expected :) I can only suggest to

(1) Make the initial support as simple and minimal as possible. That means,
    strip anything that is not absolutely required. That is, exclude *at
    least* copy and zero-alloc. We can always add selected optimizations on
    top later.
    You'll do yourself a favor to get as much review coverage as possible,
    faster review for inclusion, and fewer chances for nasty BUGs.

(2) Keep the COW logic simple. We've had too many issues in that area for my
    taste already. As 09854ba94c6a ("mm: do_wp_page() simplification") from
    Linus puts it: "Simplify, simplify, simplify.". If it doesn't add
    significant benefit, rather keep it simple.

> For the final config ("zero-alloc") I also collected stats on how many
> operations each of the 4 paths was performing, using ftrace and histograms
> ("pnr" is the number of pages allocated/reused/copied, and "fnr" is the
> number of pages in the source folio):
>
> do_anonymous_page:
>
> { pnr:  1 } hitcount: 2749722
> { pnr:  4 } hitcount:  387832
> { pnr:  8 } hitcount:  409628
> { pnr: 16 } hitcount: 4296115
>
> pages: 76315914
> faults: 7843297
> pages per fault: 9.7
>
> wp_page_reuse (anon):
>
> { pnr:  1, fnr:  1 } hitcount: 47887
> { pnr:  3, fnr:  4 } hitcount:     2
> { pnr:  4, fnr:  4 } hitcount:  6131
> { pnr:  6, fnr:  8 } hitcount:     1
> { pnr:  7, fnr:  8 } hitcount:    10
> { pnr:  8, fnr:  8 } hitcount:  3794
> { pnr:  1, fnr: 16 } hitcount:    36
> { pnr:  2, fnr: 16 } hitcount:    23
> { pnr:  3, fnr: 16 } hitcount:     5
> { pnr:  4, fnr: 16 } hitcount:     9
> { pnr:  5, fnr: 16 } hitcount:     8
> { pnr:  6, fnr: 16 } hitcount:     9
> { pnr:  7, fnr: 16 } hitcount:     3
> { pnr:  8, fnr: 16 } hitcount:    24
> { pnr:  9, fnr: 16 } hitcount:     2
> { pnr: 10, fnr: 16 } hitcount:     1
> { pnr: 11, fnr: 16 } hitcount:     9
> { pnr: 12, fnr: 16 } hitcount:     2
> { pnr: 13, fnr: 16 } hitcount:    27
> { pnr: 14, fnr: 16 } hitcount:     2
> { pnr: 15, fnr: 16 } hitcount:    54
> { pnr: 16, fnr: 16 } hitcount:  6673
>
> pages: 211393
> faults: 64712
> pages per fault: 3.3
>
> wp_page_copy (anon):
>
> { pnr:  1, fnr:  1 } hitcount: 81242
> { pnr:  4, fnr:  4 } hitcount:  5974
> { pnr:  1, fnr:  8 } hitcount:     1
> { pnr:  4, fnr:  8 } hitcount:     1
> { pnr:  8, fnr:  8 } hitcount: 12933
> { pnr:  1, fnr: 16 } hitcount:    19
> { pnr:  4, fnr: 16 } hitcount:     3
> { pnr:  8, fnr: 16 } hitcount:     7
> { pnr: 16, fnr: 16 } hitcount:  4106
>
> pages: 274390
> faults: 104286
> pages per fault: 2.6
>
> wp_page_copy (zero):
>
> { pnr:  1 } hitcount: 178699
> { pnr:  4 } hitcount:  14498
> { pnr:  8 } hitcount:  23644
> { pnr: 16 } hitcount: 257940
>
> pages: 4552883
> faults: 474781
> pages per fault: 9.6

I'll have to set aside more time to digest these values :)

--
Thanks,

David / dhildenb
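The "pages per fault" summaries in the quoted histograms follow directly from the buckets: pages is sum(pnr * hitcount), faults is sum(hitcount). A minimal sketch reproducing the do_anonymous_page figures (bucket values copied verbatim from the histogram above):

```python
# Buckets from the do_anonymous_page histogram: pnr -> hitcount.
buckets = {1: 2749722, 4: 387832, 8: 409628, 16: 4296115}

faults = sum(buckets.values())                            # one hit == one fault
pages = sum(pnr * hits for pnr, hits in buckets.items())  # pages touched total

print(pages, faults, round(pages / faults, 1))            # → 76315914 7843297 9.7
```

The same computation applied to the other three histograms reproduces the quoted 3.3, 2.6, and 9.6 pages-per-fault figures.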