From: Dave Hansen <dave.hansen@intel.com>
Date: Tue, 11 Feb 2025 06:22:27 -0800
Subject: Re: [PATCH v4 29/30] x86/mm, mm/vmalloc: Defer flush_tlb_kernel_range() targeting NOHZ_FULL CPUs
To: Valentin Schneider, Jann Horn
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
 virtualization@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 loongarch@lists.linux.dev, linux-riscv@lists.infradead.org,
 linux-perf-users@vger.kernel.org, xen-devel@lists.xenproject.org,
 kvm@vger.kernel.org, linux-arch@vger.kernel.org, rcu@vger.kernel.org,
 linux-hardening@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
 bcm-kernel-feedback-list@broadcom.com, Juergen Gross, Ajay Kaher,
 Alexey Makhalov, Russell King, Catalin Marinas, Will Deacon, Huacai Chen,
 WANG Xuerui, Paul Walmsley, Palmer Dabbelt, Albert Ou, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 Peter Zijlstra, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, "Liang, Kan",
 Boris Ostrovsky, Josh Poimboeuf, Pawan Gupta, Sean Christopherson,
 Paolo Bonzini, Andy Lutomirski, Arnd Bergmann, Frederic Weisbecker,
 "Paul E. McKenney", Jason Baron, Steven Rostedt, Ard Biesheuvel,
 Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
 Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang, Juri Lelli,
 Clark Williams, Yair Podemsky, Tomas Glozar, Vincent Guittot,
 Dietmar Eggemann, Ben Segall, Mel Gorman, Kees Cook, Andrew Morton,
 Christoph Hellwig, Shuah Khan, Sami Tolvanen, Miguel Ojeda, Alice Ryhl,
 "Mike Rapoport (Microsoft)", Samuel Holland, Rong Xu,
 Nicolas Saenz Julienne, Geert Uytterhoeven, Yosry Ahmed,
 "Kirill A. Shutemov", "Masami Hiramatsu (Google)", Jinghao Jia,
 Luis Chamberlain, Randy Dunlap, Tiezhu Yang
Message-ID: <352317e3-c7dc-43b4-b4cb-9644489318d0@intel.com>
References: <20250114175143.81438-1-vschneid@redhat.com> <20250114175143.81438-30-vschneid@redhat.com>

On 2/11/25 05:33, Valentin Schneider wrote:
>> 2. It's wrong to assume that TLB entries are only populated for
>> addresses you access - thanks to speculative execution, you have to
>> assume that the CPU might be populating random TLB entries all over
>> the place.
>
> Gotta love speculation.
> Now it is supposed to be limited to genuinely
> accessible data & code, right? Say theoretically we have a full TLBi as
> literally the last thing before doing the return-to-userspace, speculation
> should be limited to executing maybe bits of the return-from-userspace
> code?

In practice, it's mostly limited like that.

Architecturally, there are no promises from the CPU. It is within its
rights to cache anything from the page tables at any time. If it's in
the CR3 tree, it's fair game.

> Furthermore, I would hope that once a CPU is executing in userspace, it's
> not going to populate the TLB with kernel address translations - AIUI the
> whole vulnerability mitigation debacle was about preventing this sort of
> thing.

Nope, unfortunately. There are two big exceptions to this. First,
"implicit supervisor-mode accesses". There are structures for which the
CPU gets a virtual address and accesses it even while userspace is
running. The LDT and GDT are the most obvious examples, but there are
some less ubiquitous ones like the buffers for PEBS events.

Second, remember that user versus supervisor is determined *BY* the page
tables. Before Linear Address Space Separation (LASS), all virtual
memory accesses walk the page tables, even userspace accesses to kernel
addresses. The User/Supervisor bit is *in* the page tables, of course.

A userspace access to a kernel address results in a page walk and the
CPU is completely free to cache all or part of that page walk. A
Meltdown-style _speculative_ userspace access to kernel memory won't
generate a fault either. It won't leak data like it used to, of course,
but it can still walk the page tables. That's one reason LASS is needed.
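
To make the second point concrete, here is a toy userspace C model. It
is not kernel code and not a description of how the hardware is
implemented. The struct and function names are invented for
illustration, and a single 'pte' value stands in for the whole
multi-level walk. The only architectural facts it leans on are that U/S
is bit 2 of a page-table entry and that, in 64-bit mode, LASS classifies
a linear address by bit 63.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry bits. */
#define PTE_PRESENT (1ULL << 0)
#define PTE_USER    (1ULL << 2)	/* U/S: set means user-accessible */

static_assert(PTE_USER == 4, "U/S is bit 2 of an x86 page-table entry");

/* Hypothetical result type for this toy model. */
struct access_result {
	bool walked;	/* page tables were read, so they may now be cached */
	bool allowed;	/* the access would be permitted */
};

/* In 64-bit mode, LASS treats bit 63 set as a supervisor address. */
static bool is_supervisor_address(uint64_t va)
{
	return (va >> 63) & 1;
}

/*
 * Toy model of a user-mode data access to 'va'.
 *
 * Simplification (assumption): 'pte' is the final entry only. Real
 * hardware checks U/S at every level of the walk.
 */
static struct access_result user_access(uint64_t va, uint64_t pte, bool lass)
{
	struct access_result r = { .walked = false, .allowed = false };

	/* With LASS the address alone is enough to reject the access. */
	if (lass && is_supervisor_address(va))
		return r;

	/* Without LASS, the only way to learn U/S is to read the entry. */
	r.walked = true;
	r.allowed = (pte & PTE_PRESENT) && (pte & PTE_USER);
	return r;
}
```

Calling user_access() on a kernel address such as 0xffffffff81000000
with lass=false reports a walk followed by a denied access. The same
call with lass=true reports no walk at all, which is the difference
that matters for what can end up cached.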