Date: Wed, 31 Jul 2024 14:36:47 +0300
From: "Kirill A. Shutemov"
To: Thomas Gleixner, Shivank Garg
Cc: ardb@kernel.org, bp@alien8.de, brijesh.singh@amd.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, hpa@zytor.com, jan.kiszka@siemens.com,
 jgross@suse.com, kbingham@kernel.org, linux-doc@vger.kernel.org,
 linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 luto@kernel.org, michael.roth@amd.com, mingo@redhat.com,
 peterz@infradead.org, rick.p.edgecombe@intel.com, sandipan.das@amd.com,
 thomas.lendacky@amd.com, x86@kernel.org
Subject: Re: [PATCH 0/3] x86: Make 5-level paging support unconditional for x86-64
References: <80734605-1926-4ac7-9c63-006fe3ea6b6a@amd.com> <87wml16hye.ffs@tglx>
In-Reply-To: <87wml16hye.ffs@tglx>

On Wed, Jul 31, 2024 at 11:15:05AM +0200, Thomas Gleixner wrote:
> On Wed, Jul 31 2024 at 14:27, Shivank Garg wrote:
> > lmbench:lat_pagefault: Metric- page-fault time (us) - Lower is better
> >               4-Level PT              5-Level PT              % Change
> > THP-never     Mean:0.4068             Mean:0.4294             5.56
> >               95% CI:0.4057-0.4078    95% CI:0.4287-0.4302
> >
> > THP-Always    Mean: 0.4061            Mean: 0.4288            5.59
> >               95% CI: 0.4051-0.4071   95% CI: 0.4281-0.4295
> >
> > Inference:
> > 5-level page table shows increase in page-fault latency but it does
> > not significantly impact other benchmarks.
>
> 5% regression on lmbench is a NONO.

Yeah, that's a biggy.

In our testing (on Intel HW) we didn't see any significant difference
between 4- and 5-level paging. But we were focused on TLB fill latency,
both on bare metal and in VMs.

Maybe something is wrong in the fault path?

It requires a closer look.

Shivank, could you share how you run lat_pagefault? What file size? How
parallel do you run it?

It would also be nice to get perf traces. Maybe it is purely a SW issue.

> 5-level page tables add a cost in every hardware page table walk. That's
> a matter of fact and there is absolutely no reason to inflict this cost
> on everyone.
>
> The solution to this is to make the 5-level mechanics smarter by evaluating
> whether the machine has enough memory to require 5-level tables and
> select the depth at boot time.

Let's understand the reason first.

The risk with your proposal is that 5-level paging will not get any
testing and rot over time.

I would like to keep it on, if possible.

-- 
Kiryl Shutsemau / Kirill A. Shutemov
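
For reference, the "% Change" column in the quoted lat_pagefault table can be
reproduced from the reported means. The snippet below is only a sanity check
of that arithmetic, not part of the benchmark or of the patch series; the
helper name percent_change is made up for illustration, and the numbers are
copied from the quoted table.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Mean page-fault latencies (us) from the quoted lmbench table:
# (4-level PT, 5-level PT). Lower is better.
results = {
    "THP-never": (0.4068, 0.4294),
    "THP-Always": (0.4061, 0.4288),
}

for config, (four_level, five_level) in results.items():
    change = percent_change(four_level, five_level)
    print(f"{config}: {change:.2f}% higher latency with 5-level PT")
```

Both rows come out at roughly 5.6%, matching the table. The reported 95%
confidence intervals for 4-level and 5-level do not overlap in either row, so
the difference is unlikely to be run-to-run noise, though this says nothing
about whether the cause lies in hardware page walks or in the software fault
path.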