Date: Wed, 23 Apr 2025 20:28:28 -0300
From: Jason Gunthorpe
To: jane.chu@oracle.com
Cc: logane@deltatee.com, hch@lst.de, gregkh@linuxfoundation.org,
    willy@infradead.org, kch@nvidia.com, axboe@kernel.dk,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-pci@vger.kernel.org, linux-nvme@lists.infradead.org,
    linux-block@vger.kernel.org
Subject: Re: Report: Performance regression from ib_umem_get on zone device pages
Message-ID: <20250423232828.GV1213339@ziepe.ca>

On Wed, Apr 23, 2025 at 12:21:15PM -0700, jane.chu@oracle.com wrote:

> So this looks like a case of CPU cache thrashing, but I don't know to fix
> it. Could someone help address the issue? I'd be happy to help verifying.

I don't know that we can even really fix it if that is the cause..

But it seems suspect, if you are only doing 2M at a time per CPU core
then that is only 512 struct pages or 32k of data.
The GUP process will have touched all of that if device-dax is not
creating folios. So why did it fall out of the cache?

If it is creating folios then maybe we can improve things by
recovering the folios before adding the pages.

Or is something weird going on like the device-dax is using 1G folios
and all of these pins and checks are sharing and bouncing the same
struct page cache lines?

Can the device-dax implement memfd_pin_folios()?

> The flow of a single test run:
>   1. reserve virtual address space for (61440 * 2MB) via mmap with PROT_NONE
>      and MAP_ANONYMOUS | MAP_NORESERVE| MAP_PRIVATE
>   2. mmap ((61440 * 2MB) / 12) from each of the 12 device-dax to the
>      reserved virtual address space sequentially to form a continual VA
>      space

Like is there any chance that each of these 61440 VMAs is a single
2MB folio from device-dax, or could it be?

IIRC device-dax could not use folios until 6.15 so I'm assuming it is
not folios even if it is a pmd mapping?

Jason
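
For anyone checking the numbers in this thread, here is a rough Python sketch of the arithmetic. It assumes 4 KiB base pages, a 64-byte struct page (the usual x86-64 layout, not stated above), and that "2MB" means 2 MiB. The function names are made up for illustration and are not kernel APIs.

```python
# Assumptions (not stated in the thread): 4 KiB base pages, 64-byte
# struct page, and "2MB" meaning 2 MiB.
PAGE_SIZE = 4 * 1024
STRUCT_PAGE_SIZE = 64
CHUNK = 2 * 1024 * 1024   # one 2MB mapping
NR_CHUNKS = 61440         # from the quoted test description
NR_DEVICES = 12           # device-dax instances in the quoted test


def struct_page_footprint(nbytes, page_size=PAGE_SIZE,
                          struct_page_size=STRUCT_PAGE_SIZE):
    """Return (struct page count, bytes of struct page metadata) for nbytes."""
    nr_pages = nbytes // page_size
    return nr_pages, nr_pages * struct_page_size


def va_layout(nr_chunks=NR_CHUNKS, nr_devices=NR_DEVICES, chunk=CHUNK):
    """Return (device index, offset into the reserved VA, length) per device."""
    per_device = (nr_chunks * chunk) // nr_devices
    return [(dev, dev * per_device, per_device) for dev in range(nr_devices)]


if __name__ == "__main__":
    pages, meta = struct_page_footprint(CHUNK)
    print(f"one 2MB chunk: {pages} struct pages, {meta // 1024} KiB of metadata")

    for dev, offset, length in va_layout():
        print(f"dax{dev}: offset {offset >> 30} GiB, length {length >> 30} GiB")
```

Under those assumptions this gives 512 struct pages and 32 KiB of metadata per 2MB chunk, and each of the 12 devices covers a 10 GiB slice (5120 chunks) of the 120 GiB reservation.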