From: Jiaqi Yan <jiaqiyan@google.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
"Hsiao, Duen-wen" <duenwen@google.com>,
"rientjes@google.com" <rientjes@google.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"shy828301@gmail.com" <shy828301@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"wangkefeng.wang@huawei.com" <wangkefeng.wang@huawei.com>
Subject: Re: [PATCH v1 0/3] Introduce per NUMA node memory error statistics
Date: Wed, 18 Jan 2023 09:31:30 -0800
Message-ID: <CACw3F51bLSBM4MdLeFVG8c9d4COcj=3TTrjO4VDaQU-iNUVe9g@mail.gmail.com>
In-Reply-To: <IA1PR11MB607660CEAFCF549553ACE40CFCC69@IA1PR11MB6076.namprd11.prod.outlook.com>
On Tue, Jan 17, 2023 at 10:34 AM Luck, Tony <tony.luck@intel.com> wrote:
>
> > For SRAO signaled via **machine check exception**, my reading of the
> > current x86 MCE code is this:
> ...
> > 3) therefore, do_machine_check just skips kill_me_now or
> > kill_me_maybe, and directly goto out:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/mce/core.c#n1539
>
> That does appear to be what we do. But it looks like a regression from older
> behavior. An SRAO machine check *ought* to call memory_failure() without
> the MF_ACTION_REQUIRED bit set in flags.
>
> -Tony
>
Oh, maybe an SRAO signaled via MCE still reaches memory_failure(), just
asynchronously, through this code path?
1. __mc_scan_banks => mce_log => mce_gen_pool_add + irq_work_queue(mce_irq_work)
2. mce_irq_work_cb => mce_schedule_work => schedule_work(&mce_work)
3. mce_work => mce_gen_pool_process
   => blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce)
   => mce_uc_nb => uc_decode_notifier => memory_failure
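
To make the deferral in that path concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: every model_* name is made up for illustration, the pool size and page shift are arbitrary, and passing flags == 0 (no MF_ACTION_REQUIRED) is an assumption taken from Tony's description of what an SRAO ought to do. It only shows that the record is queued in exception context and that the memory_failure() stand-in runs later, from the deferred work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: all model_* names are hypothetical, not kernel symbols. */
struct model_mce {
	unsigned long addr;	/* physical address reported with the error */
};

#define MODEL_POOL_SIZE		8	/* arbitrary capacity for the sketch */
#define MODEL_PAGE_SHIFT	12	/* assumes 4 KiB pages */

static struct model_mce model_pool[MODEL_POOL_SIZE];
static size_t model_pool_len;
static int model_work_pending;

/* Bookkeeping so a caller can observe what the stand-in was called with. */
static int model_mf_calls;
static unsigned long model_last_pfn;
static int model_last_flags = -1;

/* Stand-in for memory_failure(): records its arguments and reports them. */
static int model_memory_failure(unsigned long pfn, int flags)
{
	model_mf_calls++;
	model_last_pfn = pfn;
	model_last_flags = flags;
	printf("memory_failure(pfn=%#lx, flags=%d)\n", pfn, flags);
	return 0;
}

/* Stand-in for the notifier at the end of step 3. */
static void model_uc_decode_notifier(const struct model_mce *m)
{
	/* flags == 0 is an assumption: no MF_ACTION_REQUIRED for SRAO. */
	model_memory_failure(m->addr >> MODEL_PAGE_SHIFT, 0);
}

/* Step 1: exception context only queues the record and requests work. */
static int model_mce_log(const struct model_mce *m)
{
	if (model_pool_len == MODEL_POOL_SIZE)
		return -1;	/* pool full, record dropped */
	model_pool[model_pool_len++] = *m;
	model_work_pending = 1;
	return 0;
}

/* Steps 2-3: later, the deferred work drains the pool through the notifier. */
static void model_run_deferred_work(void)
{
	if (!model_work_pending)
		return;
	assert(model_pool_len <= MODEL_POOL_SIZE);
	for (size_t i = 0; i < model_pool_len; i++)
		model_uc_decode_notifier(&model_pool[i]);
	model_pool_len = 0;
	model_work_pending = 0;
}
```

Calling model_mce_log() on a record and then model_run_deferred_work() invokes the stand-in exactly once, and only in the second call. That is the point of the question above: do_machine_check can skip kill_me_now/kill_me_maybe and memory_failure() may still run afterwards.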