From: Axel Rasmussen
Date: Thu, 16 May 2024 13:28:04 -0700
Subject: Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors
To: Borislav Petkov
Cc: Oscar Salvador, Andrew Morton, Andy Lutomirski, "Aneesh Kumar K.V",
 Christophe Leroy, Dave Hansen, David Hildenbrand, "H. Peter Anvin",
 Helge Deller, Ingo Molnar, "James E.J. Bottomley", John Hubbard,
 Liu Shixin, "Matthew Wilcox (Oracle)", Michael Ellerman, Muchun Song,
 "Naveen N. Rao", Nicholas Piggin, Peter Xu, Peter Zijlstra,
 Suren Baghdasaryan, Thomas Gleixner, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-parisc@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, x86@kernel.org
References: <20240510182926.763131-1-axelrasmussen@google.com>
 <20240510182926.763131-2-axelrasmussen@google.com>
 <20240515104142.GBZkSRZsa3cxJ3DKVy@fat_crate.local>
 <20240515183222.GCZkT_tvEffgYtah4T@fat_crate.local>
 <20240515201831.GDZkUYlybfejSh79ix@fat_crate.local>
In-Reply-To: <20240515201831.GDZkUYlybfejSh79ix@fat_crate.local>

On Wed, May 15, 2024 at 1:19 PM Borislav Petkov
wrote:
>
> On Wed, May 15, 2024 at 12:19:16PM -0700, Axel Rasmussen wrote:
> > An unprivileged process can allocate a VMA, use the userfaultfd API to
> > install one of these PTE markers, and then register a no-op SIGBUS
> > handler. Now it can access that address in a tight loop,
>
> Maybe the userfaultfd should not allow this, I dunno. You made me look
> at this thing and to me it all sounds weird. One thread does page fault
> handling for the other and that helps with live migration somehow. OMG,
> whaaat?
>
> Maybe I don't understand it and probably never will...
>
> But, for example, membarrier used to do a stupid thing of allowing one
> thread to hammer another with an IPI storm. Bad bad idea. So it got
> fixed.
>
> All I'm saying is, if unprivileged processes can do crap, they should be
> prevented from doing crap. Like ratelimiting the pagefaults or whatnot.
>
> One of the recovery action strategies from memory poison is, well, you
> kill the process. If you can detect the hammering process which
> installed that page marker, you kill it. Problem solved.
>
> But again, this userfaultfd thing sounds really weird so I could very
> well be way wrong.
>
> > Even in a non-contrived / non-malicious case, use of this API could
> > have similar effects. If nothing else, the log message can be
> > confusing to administrators: they state that an MCE occurred, whereas
> > with the simulated poison API, this is not the case; it isn't a "real"
> > MCE / hardware error.
>
> Yeah, I read that part in
>
> Documentation/admin-guide/mm/userfaultfd.rst
>
> Simulated poison huh? Another WTF.
>
> > In the KVM use case, the host can't just allocate a new page, because
> > it doesn't know what the guest might have had stored there. Best we
>
> Ok, let's think of real hw poison.
>
> When doing the recovery, you don't care what's stored there because as
> far as the hardware is concerned, if you consume that poison the *whole*
> machine might go down.
>
> So you lose the page. Plain and simple. And the guest can go visit the
> bureau of complaints and grievances.
>
> Still better than killing the guest or even the whole host with other
> guests running on it.
>
> > can do is propagate the poison into the guest, and let the guest OS
> > deal with it as it sees fit, and mark the page poisoned on the host.
>
> You mark the page as poison on the host and you yank it from under the
> guest. That physical frame is gone and the faster all the actors
> involved understand that, the better.
>
> > I don't disagree the guest *shouldn't* reaccess it in this case. :)
> > But if it did, it should get another poison event just as you say.
>
> Yes, it shouldn't. Look at memory_failure(). This will kill whole
> processes if it has to, depending on what the page is used for.
>
> > And, live migration between physical hosts should be transparent to
> > the guest. So if the guest gets a poison, and then we live migrate it,
>
> So if I were to design this, I'd do it this way:
>
> 0. guest gets hw poison injected
>
> 1. it runs memory_failure() and it kills the processes using the page.
>
> 2. page is marked poisoned on the host so no other guest gets it.
>
> That's it. No second accesses whatsoever. At least this is how it works
> on baremetal.

I agree with almost all of the above. But one point is, I don't think
we can trust the guest to be reasonable. :) Public cloud provider
customers might run some OS other than Linux, or an old / buggy
kernel, or one with out-of-tree patches which make it do who knows
what. There can also be users who are actively malicious.

Some customers may try to do fancy "poison recovery" where they can
avoid killing the in-guest process when a poison event occurs. These
implementations can be buggy :) and unintentionally reaccess.

>
> This hw poisoning emulation is just silly and unnecessary.
>
> But again, I probably am missing some aspects. It all just sounded
> really weird to me that's why I thought I should ask what's behind all
> that.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette