Date: Wed, 22 Sep 2021 12:37:43 -0700
From: "Luck, Tony"
To: Yang Shi
Cc: naoya.horiguchi@nec.com, osalvador@suse.de, tdmackey@twitter.com, david@redhat.com, willy@infradead.org, akpm@linux-foundation.org, corbet@lwn.net, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [v2 PATCH 3/3] mm: hwpoison: dump page for unhandlable page
References: <20210819054116.266126-1-shy828301@gmail.com> <20210819054116.266126-3-shy828301@gmail.com>
In-Reply-To: <20210819054116.266126-3-shy828301@gmail.com>

On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> Currently just very simple message is shown for unhandlable page, e.g.
> non-LRU page, like:
> soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
>
> It is not very helpful for further debug, calling dump_page() could show
> more useful information.

Looks like your code already caught something. An error injection test
may have injected into a shared library. Though I'm not sure that the
refcount/mapcount in the dump agrees with that diagnosis from the author
of this test.

Here's what appeared on the console:

[ 4817.622254] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xcef2747
[ 4817.638651] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4817.646860] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
[ 4818.025515] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8 0000000000000000
[ 4818.272435] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 4818.280658] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.313606] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.321804] page dumped because: hwpoison: unhandlable page
[ 4818.564802] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.573043] Memory failure: 0xcef2747: recovery action for unknown page: Ignored
[ 4818.595837] Memory failure: 0xcef2747: already hardware poisoned
[ 4818.603245] Memory failure: 0xcef2747: Sending SIGBUS to multichase:67460 due to hardware memory corruption
[ 4818.614297] Memory failure: 0xcef2747: already hardware poisoned

-Tony
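
The dump_page() output in the console log is interleaved with repeated "mce:" lines, which makes the refcount/mapcount line easy to miss. A minimal sketch of pulling the two apart, assuming the "[ timestamp] message" console format shown in the log; split_console() and parse_page_line() are hypothetical helper names for illustration, not part of any kernel tooling:

```python
import re

# Matches a console line such as "[ 4817.622254] message text".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")


def split_console(lines, noise_prefix="mce: "):
    """Split console lines into (kept, noise) lists of (timestamp, message).

    Lines whose message starts with noise_prefix go to noise; lines that
    do not look like "[ timestamp] message" are skipped.
    """
    kept, noise = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        (noise if msg.startswith(noise_prefix) else kept).append((ts, msg))
    return kept, noise


def parse_page_line(msg):
    """Turn "page:... refcount:1 mapcount:0 ..." into a dict of strings."""
    return dict(tok.split(":", 1) for tok in msg.split())


# A few lines copied from the console output in this message.
console = [
    "[ 4817.622254] mce: Uncorrected hardware memory error in user-access at cef2747000",
    "[ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xcef2747",
    "[ 4817.638651] mce: Uncorrected hardware memory error in user-access at cef2747000",
    "[ 4817.646860] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)",
    "[ 4818.321804] page dumped because: hwpoison: unhandlable page",
]

kept, noise = split_console(console)
fields = parse_page_line(kept[0][1])

for ts, msg in kept:
    print(f"[{ts:11.6f}] {msg}")
print("refcount =", fields["refcount"], "mapcount =", fields["mapcount"])
```

This only regroups text that is already in the log; it does not say anything about why the page was unhandlable.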