Subject: Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
To: Tomohiro Misono
CC: Andrew Morton, Naoya Horiguchi
From: Miaohe Lin
Date: Tue, 26 Nov 2024 11:09:55 +0800
References: <20241121045504.2233544-1-misono.tomohiro@fujitsu.com>
In-Reply-To: <20241121045504.2233544-1-misono.tomohiro@fujitsu.com>

On 2024/11/21 12:55, Tomohiro Misono wrote:
> commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats

Sorry for the late reply, I've been swamped recently.

> to sysfs") introduces per NUMA memory error stats which show
> breakdown of HardwareCorrupted of /proc/meminfo in
> /sys/devices/system/node/nodeX/memory_failure.

Thanks for your patch.

>
> However, HardwareCorrupted also counts soft-offline pages. So, add
> soft-offline stats in mf_stats too to represent more accurate status.

Adding soft-offline stats makes sense to me.

>
> This updates total count as:
>   total = recovered + ignored + failed + delayed + soft_offline
>
> Test example:
> 1) # grep HardwareCorrupted /proc/meminfo
>      HardwareCorrupted:     0 kB
> 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> 3) # grep HardwareCorrupted /proc/meminfo
>      HardwareCorrupted:     4 kB
>    # grep -r "" /sys/devices/system/node/node0/memory_failure
>    /sys/devices/system/node/node0/memory_failure/total:1
>    /sys/devices/system/node/node0/memory_failure/soft_offline:1
>    /sys/devices/system/node/node0/memory_failure/recovered:0
>    /sys/devices/system/node/node0/memory_failure/ignored:0
>    /sys/devices/system/node/node0/memory_failure/failed:0
>    /sys/devices/system/node/node0/memory_failure/delayed:0
>
> Signed-off-by: Tomohiro Misono
> ---
> Hello
>
> This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> mf_result is a right approach. Also, maybe is it better to move
> update_per_node_mf_stats() into num_poisoned_pages_inc()?
>
> I omitted some cleanups and sysfs doc update in this version to
> highlight changes. I'd appreciate any suggestions.
>
> Regards,
> Tomohiro Misono
>
>  include/linux/mm.h     | 2 ++
>  include/linux/mmzone.h | 4 +++-
>  mm/memory-failure.c    | 9 +++++++++
>  3 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5d6cd523c7c0..7f93f6883760 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3991,6 +3991,8 @@ enum mf_result {
>  	MF_FAILED,	/* Error: handling failed */
>  	MF_DELAYED,	/* Will be handled later */
>  	MF_RECOVERED,	/* Successfully recovered */
> +
> +	MF_RES_SOFT_OFFLINE, /* Soft-offline */

It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result'
is used to record the result of the memory failure handler, so a
soft-offline value does not seem to belong in it.

Thanks.
.
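
To make the objection to extending 'mf_result' concrete, here is a minimal
user-space sketch of one possible alternative: leave the enum as it is and
account soft-offline through its own helper, while keeping the
total = recovered + ignored + failed + delayed + soft_offline relation from
the patch description. This is only an illustration under assumptions, not
kernel code and not a proposed patch. The struct and the model_*() names are
hypothetical, and MF_IGNORED is assumed from the 'ignored' counter since it
is not visible in the quoted hunk.

```c
#include <assert.h>

/* Results of the memory failure handler; deliberately left unchanged. */
enum mf_result {
	MF_IGNORED,	/* assumed from the 'ignored' counter */
	MF_FAILED,	/* Error: handling failed */
	MF_DELAYED,	/* Will be handled later */
	MF_RECOVERED,	/* Successfully recovered */
};

/* Hypothetical user-space stand-in for the per-node counters. */
struct mf_stats_model {
	unsigned long total;
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long recovered;
	unsigned long soft_offline;
};

/* Account one result reported by the memory failure handler. */
void model_account_mf(struct mf_stats_model *s, enum mf_result res)
{
	switch (res) {
	case MF_IGNORED:
		s->ignored++;
		break;
	case MF_FAILED:
		s->failed++;
		break;
	case MF_DELAYED:
		s->delayed++;
		break;
	case MF_RECOVERED:
		s->recovered++;
		break;
	default:
		/* Unknown value: flag it and do not touch the counters. */
		assert(!"unknown mf_result");
		return;
	}
	s->total++;
}

/* Account one soft-offlined page without going through mf_result. */
void model_account_soft_offline(struct mf_stats_model *s)
{
	s->soft_offline++;
	s->total++;
}

/* Check the relation stated in the patch description. */
int model_total_is_consistent(const struct mf_stats_model *s)
{
	return s->total == s->recovered + s->ignored + s->failed +
			   s->delayed + s->soft_offline;
}
```

Starting from zeroed counters, one call to model_account_soft_offline()
gives total:1 and soft_offline:1 with the other counters at 0, which matches
the sysfs output in the test example above. Whether a separate helper, or
moving the update into num_poisoned_pages_inc() as asked in the RFC notes,
is the better fit is left open here.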