From: Jiaqi Yan
Date: Tue, 26 Nov 2024 23:06:51 -0800
Subject: Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
To: "Tomohiro Misono (Fujitsu)"
Cc: Miaohe Lin, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", Andrew Morton, Naoya Horiguchi
References: <20241121045504.2233544-1-misono.tomohiro@fujitsu.com>

On Tue, Nov 26, 2024 at 6:32 PM Tomohiro Misono (Fujitsu) wrote:
>
> > On 2024/11/21 12:55, Tomohiro Misono wrote:
> > > commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
> >
> > Sorry for late, I've been swamped recently.
>
> Hi,
> Thanks for your comments.
>
> >
> > > to sysfs") introduces per NUMA memory error stats which show
> > > breakdown of HardwareCorrupted of /proc/meminfo in
> > > /sys/devices/system/node/nodeX/memory_failure.
> >
> > Thanks for your patch.
> >
> > >
> > > However, HardwareCorrupted also counts soft-offline pages. So, add
> > > soft-offline stats in mf_stats too to represent more accurate status.
> >
> > Adding soft-offline stats makes sense to me.
>
> Thanks for confirming.

Agreed with Miaohe.

>
> >
> > >
> > > This updates total count as:
> > >   total = recovered + ignored + failed + delayed + soft_offline
> > >
> > > Test example:
> > > 1) # grep HardwareCorrupted /proc/meminfo
> > >      HardwareCorrupted:     0 kB
> > > 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> > > 3) # grep HardwareCorrupted /proc/meminfo
> > >      HardwareCorrupted:     4 kB
> > >    # grep -r "" /sys/devices/system/node/node0/memory_failure
> > >    /sys/devices/system/node/node0/memory_failure/total:1
> > >    /sys/devices/system/node/node0/memory_failure/soft_offline:1
> > >    /sys/devices/system/node/node0/memory_failure/recovered:0
> > >    /sys/devices/system/node/node0/memory_failure/ignored:0
> > >    /sys/devices/system/node/node0/memory_failure/failed:0
> > >    /sys/devices/system/node/node0/memory_failure/delayed:0
> > >
> > > Signed-off-by: Tomohiro Misono
> > > ---
> > > Hello
> > >
> > > This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> > > mf_result is a right approach. Also, maybe is it better to move
> > > update_per_node_mf_stats() into num_poisoned_pages_inc()?
> > >
> > > I omitted some cleanups and sysfs doc update in this version to
> > > highlight changes. I'd appreciate any suggestions.
> > >
> > > Regards,
> > > Tomohiro Misono
> > >
> > >  include/linux/mm.h     | 2 ++
> > >  include/linux/mmzone.h | 4 +++-
> > >  mm/memory-failure.c    | 9 +++++++++
> > >  3 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 5d6cd523c7c0..7f93f6883760 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -3991,6 +3991,8 @@ enum mf_result {
> > >       MF_FAILED,      /* Error: handling failed */
> > >       MF_DELAYED,     /* Will be handled later */
> > >       MF_RECOVERED,   /* Successfully recovered */
> > > +
> > > +     MF_RES_SOFT_OFFLINE, /* Soft-offline */
> >
> > It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
> > the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
>
> Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
> Or, do you prefer adding another ENUM type instead of this?

I think SOFT_OFFLINE-ed is one of the outcomes of a successful recovery,
and the other one is HARD_OFFLINE-ed. So how about making a separate
sub-enum for MF_RECOVERED? Something like:

enum mf_recovered_result {
  MF_RECOVERED_SOFT_OFFLINE,
  MF_RECOVERED_HARD_OFFLINE,
};

And
1. total = recovered + ignored + failed + delayed
2. recovered = soft_offline + hard_offline

>
> ```
> static void update_per_node_mf_stats(unsigned long pfn,
> -                                    enum mf_result result)
> +                                    enum mf_result result, bool is_soft_offline)
>  {
>         int nid = MAX_NUMNODES;
>         struct memory_failure_stats *mf_stats = NULL;
> @@ -1299,6 +1299,12 @@ static void update_per_node_mf_stats(unsigned long pfn,
>         }
>
>         mf_stats = &NODE_DATA(nid)->mf_stats;
> +       if (is_soft_offline) {
> +               ++mf_stats->soft_offlined;
> +               ++mf_stats->total;
> +               return;
> +       }
> +
>         switch (result) {
>         case MF_IGNORED:
>                 ++mf_stats->ignored;
> ```
>
> Regards,
> Tomohiro Misono
>
> >
> >
> > Thanks.
> > .
>
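
To make the two relations above concrete, here is a rough userspace sketch of the accounting, not kernel code and not a tested patch. The struct mf_stats_sketch and the helpers account() and check_invariants() are hypothetical names used only for illustration; the real struct memory_failure_stats may be laid out differently. The enum values follow the ones quoted in this thread.

```c
#include <assert.h>

/* Mirrors the mf_result values quoted in this thread (illustration only). */
enum mf_result {
	MF_IGNORED,	/* Error: cannot be handled */
	MF_FAILED,	/* Error: handling failed */
	MF_DELAYED,	/* Will be handled later */
	MF_RECOVERED,	/* Successfully recovered */
};

/* Sub-result suggested above; only consulted when result == MF_RECOVERED. */
enum mf_recovered_result {
	MF_RECOVERED_SOFT_OFFLINE,
	MF_RECOVERED_HARD_OFFLINE,
};

/* Hypothetical counter block, assumed for this sketch. */
struct mf_stats_sketch {
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long recovered;
	unsigned long soft_offline;	/* subset of recovered */
	unsigned long hard_offline;	/* subset of recovered */
	unsigned long total;
};

/* Bump exactly one top-level counter, plus one sub-counter if recovered. */
static void account(struct mf_stats_sketch *s, enum mf_result result,
		    enum mf_recovered_result how)
{
	switch (result) {
	case MF_IGNORED:
		++s->ignored;
		break;
	case MF_FAILED:
		++s->failed;
		break;
	case MF_DELAYED:
		++s->delayed;
		break;
	case MF_RECOVERED:
		++s->recovered;
		if (how == MF_RECOVERED_SOFT_OFFLINE)
			++s->soft_offline;
		else
			++s->hard_offline;
		break;
	}
	++s->total;
}

/* The two relations from this mail, checked as assertions. */
static void check_invariants(const struct mf_stats_sketch *s)
{
	assert(s->total == s->recovered + s->ignored + s->failed + s->delayed);
	assert(s->recovered == s->soft_offline + s->hard_offline);
}
```

With this shape, soft_offline never has to be added to total separately, because every soft-offlined page is already counted once under recovered.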