From: Andrew Zaborowski
Date: Sat, 10 Aug 2024 03:20:10 +0200
Subject: Re: [RESEND][PATCH 1/3] x86: Add task_struct flag to force SIGBUS on MCE
To: Borislav Petkov, "linux-edac@vger.kernel.org", "linux-mm@kvack.org", Eric Biederman, "x86@kernel.org", Tony, Andrew Zaborowski

Borislav Petkov wrote:
> So instead of the process getting killed, you want to return SIGBUS
> because, "hey caller, your process encountered an MCE while being
> attempted to be executed"?

The tests could be changed to expect the SIGSEGV but in this case it
seemed that the test was good and the kernel was misbehaving. One of
the authors of the MCE handling code confirmed that.

>
> > Qemu relies on the SIGBUS logic but the execve and rseq
> > cases cannot be recovered from, the main benefit of sending the
> > correct signal is perhaps information to the user.
>
> You will have that info in the logs - we're usually very loud when we
> get an MCE...

True, though that's hard to link to a specific process crash. It's
also hard to extract the page address in the process's address space
from that, although I don't think there's a current use case.

>
> > If this cannot be fixed then optimally it should be documented.
>
> I'm not convinced at all that jumping through hoops you're doing, is
> worth the effort.

That could be, again this could be fixed in the documentation instead.

>
> > As for "all that code", the memory failure handling code is of certain
> > size and this is a comparatively tiny fix for a tiny issue.
>
> No, I didn't say anything about the memory failure code - it is about

I was replying to your comment about the size of the change.

> supporting that obscure use case and the additional logic you're adding
> to the #MC handler which looks like a real mess already and us having to
> support that use case indefinitely.

Supporting something generally includes supporting the common and the
obscure cases. From the user's point of view the kernel has been
committed to supporting these scenarios indefinitely or until the
deprecation of the SIGBUS-on-memory-error logic, and simply has a bug.

>
> So why does it matter if a process which is being executed and gets an
> MCE beyond the point of no return absolutely needs to return SIGBUS vs
> it getting killed and you still get an MCE logged on the machine, in
> either case?

A SIGSEGV strongly implies a problem with the program being run, not a
specific instance of it.
A SIGBUS may not be the program's fault, as in this case. In these
tests the workload was simply relaunched on a SIGBUS, which sounded
fair to me. A qemu VM could similarly be restarted on an unrecoverable
MCE in a page that belongs not to the VM but to qemu itself.

Best regards
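
For illustration only, the relaunch-on-SIGBUS policy described above could be sketched roughly as follows. This is not the actual test harness from the thread: the function name `run_with_relaunch` and the `max_restarts` limit are hypothetical, and the sketch assumes a POSIX system where Python's `subprocess` reports death by signal N as a return code of -N.

```python
import signal
import subprocess
import sys


def run_with_relaunch(argv, max_restarts=3):
    """Run argv, relaunching it only when it was killed by SIGBUS.

    Returns (returncode, restarts). On POSIX, subprocess reports a child
    terminated by signal N as returncode -N.
    """
    restarts = 0
    while True:
        rc = subprocess.run(argv).returncode
        # SIGBUS is treated as "possibly not the program's fault"
        # (e.g. a memory error), so this instance is retried.
        if rc == -signal.SIGBUS and restarts < max_restarts:
            restarts += 1
            continue
        # Anything else, including SIGSEGV, is reported to the caller
        # as-is, since it more likely points at the program itself.
        return rc, restarts


if __name__ == "__main__":
    code, n = run_with_relaunch(sys.argv[1:] or [sys.executable, "-c", "pass"])
    print(f"exit status {code} after {n} restart(s)")
```

The point of the sketch is the asymmetry: a supervisor of this kind can only make the right decision if the kernel delivers SIGBUS rather than SIGSEGV for the memory-error case.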