Date: Fri, 26 Apr 2024 16:28:46 +0100
From: Matthew Wilcox
To: Suren Baghdasaryan
Cc: Peter Xu, "Liam R. Howlett", linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Lokesh Gidra, Alistair Popple
Subject: Re: [PATCH] mm: Always sanity check anon_vma first for per-vma locks
References: <20240410170621.2011171-1-peterx@redhat.com> <20240411171319.almhz23xulg4f7op@revolver>

On Fri, Apr 26, 2024 at 08:07:45AM -0700, Suren Baghdasaryan wrote:
> On Fri, Apr 26, 2024 at 7:00 AM Matthew Wilcox wrote:
> > Intel's 0day got back to me with data and it's ridiculously good.
> > Headline figure: over 3x throughput improvement with vm-scalability
> > https://lore.kernel.org/all/202404261055.c5e24608-oliver.sang@intel.com/
> >
> > I can't see why it's that good. It shouldn't be that good. I'm
> > seeing big numbers here:
> >
> > 4366 ± 2% +565.6% 29061 perf-stat.overall.cycles-between-cache-misses
> >
> > and the code being deleted is only checking vma->vm_ops and
> > vma->anon_vma. Surely that cache line is referenced so frequently
> > during pagefault that deleting a reference here will make no difference
> > at all?
>
> That indeed looks overly good. Sorry, I didn't have a chance to run
> the benchmarks on my side yet because of the ongoing Android bootcamp
> this week.

No problem. Darn work getting in the way of having fun ;-)

> > I still don't understand why we have to take the mmap_sem less often.
> > Is there perhaps a VMA for which we have a NULL vm_ops, but don't set
> > an anon_vma on a page fault?
>
> I think the only path in either do_anonymous_page() or
> do_huge_pmd_anonymous_page() that skips calling anon_vma_prepare() is
> the "Use the zero-page for reads" here:
> https://elixir.bootlin.com/linux/latest/source/mm/memory.c#L4265. I
> didn't look into this particular benchmark yet but will try it out
> once I have some time to benchmark your change.

Yes, Liam and I had just brainstormed that as being a plausible
explanation too. I don't know how frequent it is to use anon memory
read-only. Presumably it must happen often enough that we've bothered
to implement the zero-page optimisation.
But probably not nearly as often as this benchmark makes it happen ;-)
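
For anyone who wants to poke at the read-only case from userspace, here
is a minimal sketch. It assumes the workload in question boils down to
mapping private anonymous memory and only ever reading it;
read_only_touch() is a made-up helper name, not something from the
kernel tree or from vm-scalability, and whether each fault really ends
up on the zero page depends on kernel configuration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of private anonymous memory and
 * read one byte from every page without ever writing to the mapping.
 * Each first touch is a read fault on an anonymous VMA, which is the
 * case the thread suspects is served from the zero page.
 *
 * Returns the sum of the bytes read (0 for a fresh anonymous mapping,
 * which always reads back as zeroes), or -1 on failure.
 */
long read_only_touch(size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *map;
	volatile const unsigned char *p;
	long sum = 0;
	size_t off;

	if (page <= 0 || len == 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* volatile so the compiler cannot optimise the loads away */
	p = map;
	for (off = 0; off < len; off += (size_t)page)
		sum += p[off];

	munmap(map, len);
	return sum;
}
```

Calling this repeatedly from several threads with a large len should
produce a steady stream of read faults on anonymous VMAs, which may be
roughly the pattern the benchmark is exercising if the zero-page theory
holds.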