References: <20220524234531.1949-1-peterx@redhat.com>
In-Reply-To: <20220524234531.1949-1-peterx@redhat.com>
From: Max Filippov
Date: Thu, 26 May 2022 22:39:15 -0700
Subject: Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types
To: Peter Xu
Cc: LKML, Linux Memory Management List, Richard Henderson,
 David Hildenbrand, Matt Turner, Albert Ou, Michal Simek, Russell King,
 Ivan Kokshaysky, linux-riscv, Alexander Gordeev, Dave Hansen, Jonas Bonn,
 Will Deacon, "James E . J . Bottomley", "H . Peter Anvin",
 Andrea Arcangeli, openrisc@lists.librecores.org, linux-s390, Ingo Molnar,
 "open list:M68K ARCHITECTURE", Palmer Dabbelt, Heiko Carstens,
 Chris Zankel, Peter Zijlstra, Alistair Popple, linux-csky@vger.kernel.org,
 "open list:QUALCOMM HEXAGON...", Vlastimil Babka, Thomas Gleixner,
 "open list:SPARC + UltraSPAR...", Christian Borntraeger, Stafford Horne,
 Michael Ellerman, "maintainer:X86 ARCHITECTURE...", Thomas Bogendoerfer,
 Paul Mackerras, linux-arm-kernel@lists.infradead.org, Sven Schnelle,
 Benjamin Herrenschmidt, "open list:TENSILICA XTENSA PORT (xtensa)",
 Nicholas Piggin, "open list:SUPERH", Vasily Gorbik, Borislav Petkov,
 linux-mips@vger.kernel.org, Helge Deller, Vineet Gupta, Al Viro,
 Paul Walmsley, Johannes Weiner, Anton Ivanov, Catalin Marinas,
 linux-um@lists.infradead.org, "open list:ALPHA PORT", Johannes Berg,
 "open list:IA64 (Itanium) PL...", Geert Uytterhoeven, Dinh Nguyen, Guo Ren,
 linux-snps-arc@lists.infradead.org, Hugh Dickins, Rich Felker,
 Andy Lutomirski, Richard Weinberger, linuxppc-dev@lists.ozlabs.org,
 Brian Cain, Yoshinori Sato, Andrew Morton, Stefan Kristiansson,
 "open list:PARISC ARCHITECTURE", "David S . Miller"

On Tue, May 24, 2022 at 4:45 PM Peter Xu wrote:
>
> I observed that for each of the shared file-backed page faults, we're very
> likely to retry one more time for the 1st write fault upon no page. It's
> because we'll need to release the mmap lock for dirty rate limit purpose
> with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
>
> Then after that throttling we return VM_FAULT_RETRY.
>
> We did that probably because VM_FAULT_RETRY is the only way we can return
> to the fault handler at that time telling it we've released the mmap lock.
>
> However that's not ideal because it's very likely the fault does not need
> to be retried at all since the pgtable was well installed before the
> throttling, so the next continuous fault (including taking mmap read lock,
> walk the pgtable, etc.) could be in most cases unnecessary.
>
> It's not only slowing down page faults for shared file-backed, but also add
> more mmap lock contention which is in most cases not needed at all.
>
> To observe this, one could try to write to some shmem page and look at
> "pgfault" value in /proc/vmstat, then we should expect 2 counts for each
> shmem write simply because we retried, and vm event "pgfault" will capture
> that.
>
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
> show that we've completed the whole fault and released the lock. It's also
> a hint that we should very possibly not need another fault immediately on
> this page because we've just completed it.
>
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
>
> Before: 650.980 ms (+-1.94%)
> After: 569.396 ms (+-1.38%)
>
> I believe it could help more than that.
>
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault), the rest changes in the page fault
> handlers should be relatively straightforward.
>
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
>
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
> not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
> them as-is.
>
> Signed-off-by: Peter Xu
> ---
>
> v3:
> - Rebase to akpm/mm-unstable
> - Copy arch maintainers
> ---

For xtensa:
Acked-by: Max Filippov

-- 
Thanks.
-- Max