Date: Sat, 25 Nov 2023 13:10:59 +0100
From: Borislav Petkov
To: Shuai Xue
Cc: rafael@kernel.org, wangkefeng.wang@huawei.com, tanxiaofei@huawei.com,
	mawupeng1@huawei.com, tony.luck@intel.com, linmiaohe@huawei.com,
	naoya.horiguchi@nec.com, james.morse@arm.com,
	gregkh@linuxfoundation.org, will@kernel.org, jarkko@kernel.org,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	linux-edac@vger.kernel.org, acpica-devel@lists.linuxfoundation.org,
	stable@vger.kernel.org, x86@kernel.org, justin.he@arm.com,
	ardb@kernel.org, ying.huang@intel.com, ashish.kalra@amd.com,
	baolin.wang@linux.alibaba.com, tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com, lenb@kernel.org, hpa@zytor.com,
	robert.moore@intel.com, lvying6@huawei.com, xiexiuqi@huawei.com,
	zhuo.song@linux.alibaba.com
Subject: Re: [PATCH v9 0/2] ACPI: APEI: handle synchronous errors in task work with proper si_code
Message-ID: <20231125121059.GAZWHkU27odMLns7TZ@fat_crate.local>
References: <20221027042445.60108-1-xueshuai@linux.alibaba.com>
	<20231007072818.58951-1-xueshuai@linux.alibaba.com>
	<20231123150710.GEZV9qnkWMBWrggGc1@fat_crate.local>
	<9e92e600-86a4-4456-9de4-b597854b107c@linux.alibaba.com>
In-Reply-To: <9e92e600-86a4-4456-9de4-b597854b107c@linux.alibaba.com>

On Sat, Nov 25, 2023 at 02:44:52PM +0800, Shuai Xue wrote:
> - an AR error consumed by current process is deferred to handle in a
>   dedicated kernel thread, but memory_failure() assumes that it runs in the
>   current context

On x86? ARM?

Please point to the exact code flow.

> - another page fault is not unnecessary, we can send sigbus to current
>   process in the first Synchronous External Abort SEA on arm64 (analogy
>   Machine Check Exception on x86)

I have no clue what that means. What page fault?

> I just give an example that the user space process *really* relies on the
> si_code of signal to handle hardware errors

No, don't give examples.

Explain what the exact problem is you're seeing, in your use case, point
to the code and then state how you think it should be fixed and why.

Right now your text is "all over the place" and I have no clue what you
even want.

> The SIGBUS si_codes defined in include/uapi/asm-generic/siginfo.h says:
>
>     /* hardware memory error consumed on a machine check: action required */
>     #define BUS_MCEERR_AR 4
>     /* hardware memory error detected in process but not consumed: action optional*/
>     #define BUS_MCEERR_AO 5
>
> When a synchronous error is consumed by Guest, the kernel should send a
> signal with BUS_MCEERR_AR instead of BUS_MCEERR_AO.

Can you drop this "synchronous" bla and concentrate on the error
*severity*?

I think you want to say that there are some types of errors for which
error handling needs to happen immediately and for some reason that
doesn't happen.

Which errors are those? Types?

Why do you need them to be handled immediately?

> Exactly.

No, not exactly. Why is it ok to do that? What are the implications of
this?

Is immediate killing the right decision?

Is this ok for *every* possible kernel running out there - not only for
your use case?

And so on and so on...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette