From: Jiaqi Yan <jiaqiyan@google.com>
Date: Mon, 16 Jan 2023 13:52:08 -0800
Subject: Re: [PATCH v1 0/3] Introduce per NUMA node memory error statistics
To: Andrew Morton <akpm@linux-foundation.org>
Cc: tony.luck@intel.com, naoya.horiguchi@nec.com, duenwen@google.com,
 rientjes@google.com, linux-mm@kvack.org, shy828301@gmail.com,
 wangkefeng.wang@huawei.com
References: <20230116193902.1315236-1-jiaqiyan@google.com>
 <20230116121305.a9412c9a54d086c83cbfc71f@linux-foundation.org>
In-Reply-To: <20230116121305.a9412c9a54d086c83cbfc71f@linux-foundation.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000eefb4f05f2689891"

--000000000000eefb4f05f2689891
Content-Type: text/plain; charset="UTF-8"

On Mon, Jan 16, 2023 at 12:13 PM Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 16 Jan 2023 19:38:59 +0000 Jiaqi Yan <jiaqiyan@google.com> wrote:
>
> > Background
> > ==========
> > In the RFC for Kernel Support of Memory Error Detection [1], one advantage
> > of software-based scanning over hardware patrol scrubber is the ability
> > to make statistics visible to system administrators. The statistics
> > include 2 categories:
> > * Memory error statistics, for example, how many memory errors are
> >   encountered, how many of them are recovered by the kernel. Note these
> >   memory errors are non-fatal to kernel: during the machine check
> >   exception (MCE) handling kernel already classified MCE's severity to
> >   be unnecessary to panic (but either action required or optional).
> > * Scanner statistics, for example how many times the scanner has fully
> >   scanned a NUMA node, how many errors are first detected by the scanner.
> >
> > The memory error statistics are useful to userspace and actually not
> > specific to scanner detected memory errors, and are the focus of this RFC.
>
> I assume this is a leftover and this is no longer "RFC".
>
> I'd normally sit back and await reviewer input, but this series is
> simple, so I'll slurp it up so we get some testing while that review is
> ongoing.

Ah, yes, my typo; my intent was PATCH.
I did test the patches on several test hosts I have, but more testing is
always better. Thanks, Andrew!

--000000000000eefb4f05f2689891
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--000000000000eefb4f05f2689891--