From: Jiaqi Yan
Date: Wed, 18 Jan 2023 09:31:30 -0800
Subject: Re: [PATCH v1 0/3] Introduce per NUMA node memory error statistics
To: "Luck, Tony"
Cc: "HORIGUCHI NAOYA(堀口 直也)", "Hsiao, Duen-wen", "rientjes@google.com", "linux-mm@kvack.org", "shy828301@gmail.com", "akpm@linux-foundation.org", "wangkefeng.wang@huawei.com"
References: <20230116193902.1315236-1-jiaqiyan@google.com> <20230117091859.GD3428106@hori.linux.bs1.fc.nec.co.jp>

On Tue, Jan 17, 2023 at 10:34 AM Luck, Tony wrote:
>
> > For SRAO signaled via **machine check exception**, my reading of the
> > current x86 MCE code is this:
> ...
> > 3) therefore, do_machine_check just skips kill_me_now or
> > kill_me_maybe, and directly goto out:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/mce/core.c#n1539
>
> That does appear to be what we do. But it looks like a regression from older
> behavior. An SRAO machine check *ought* to call memory_failure() without
> the MF_ACTION_REQUIRED bit set in flags.
>
> -Tony
>

Oh, maybe SRAO signaled via MCE calls memory_failure() with these async
code paths?

1. __mc_scan_banks => mce_log => mce_gen_pool_add + irq_work_queue(mce_irq_work)

2. mce_irq_work_cb => mce_schedule_work => schedule_work(&mce_work)

3. mce_work => mce_gen_pool_process =>
   blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce) =>
   mce_uc_nb => uc_decode_notifier => memory_failure
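
To make the deferral in the three steps above easier to follow, here is a small user-space sketch of the hand-off. It is not kernel code: every sim_* name is a hypothetical stand-in I made up for illustration, the fixed-size array only imitates the gen pool, and the two pending flags only imitate irq_work and the workqueue. The action-optional check and the flags value of 0 in the notifier stand-in are my assumptions (0 simply means MF_ACTION_REQUIRED is not set), not something confirmed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins only; none of the sim_* names exist in the kernel. */
#define SIM_POOL_SIZE 8

struct sim_mce {
	unsigned long pfn;     /* page frame that reported the error */
	bool action_optional;  /* true for an SRAO-style record */
};

static struct sim_mce sim_pool[SIM_POOL_SIZE]; /* imitates the MCE gen pool */
static size_t sim_pool_count;
static bool sim_irq_work_pending;  /* imitates irq_work_queue() */
static bool sim_work_pending;      /* imitates schedule_work() */

static int sim_memory_failure_calls;
static unsigned long sim_last_pfn;
static int sim_last_flags = -1;

/* Stand-in for memory_failure(): only records how it was called. */
static int sim_memory_failure(unsigned long pfn, int flags)
{
	sim_memory_failure_calls++;
	sim_last_pfn = pfn;
	sim_last_flags = flags;
	printf("memory_failure(pfn=%#lx, flags=%d)\n", pfn, flags);
	return 0;
}

/* End of step 3: the notifier forwards action-optional records (assumed). */
static void sim_uc_decode_notifier(const struct sim_mce *m)
{
	if (!m->action_optional)
		return;
	sim_memory_failure(m->pfn, 0); /* assumed: no MF_ACTION_REQUIRED */
}

/* Step 1: exception context only stores the record and requests irq work. */
static bool sim_mce_log(const struct sim_mce *m)
{
	if (sim_pool_count == SIM_POOL_SIZE)
		return false; /* pool full, record dropped */
	sim_pool[sim_pool_count++] = *m;
	sim_irq_work_pending = true;
	return true;
}

/* Step 2: the irq work callback only schedules process-context work. */
static void sim_irq_work_cb(void)
{
	if (!sim_irq_work_pending)
		return;
	sim_irq_work_pending = false;
	sim_work_pending = true;
}

/* Step 3: process context drains the pool through the notifier. */
static void sim_mce_work(void)
{
	if (!sim_work_pending)
		return;
	sim_work_pending = false;
	for (size_t i = 0; i < sim_pool_count; i++)
		sim_uc_decode_notifier(&sim_pool[i]);
	sim_pool_count = 0;
}
```

Calling sim_mce_log(), then sim_irq_work_cb(), then sim_mce_work() in that order walks one record through all three stages. The stand-in for memory_failure() runs only in the last stage, which is the point of the question above: the synchronous exit from do_machine_check would not show the call even if it happens later.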