linux-mm.kvack.org archive mirror
From: Xie Yuanbin <xieyuanbin1@huawei.com>
To: <david@redhat.com>, <dave.hansen@intel.com>, <bp@alien8.de>,
	<tglx@linutronix.de>, <mingo@redhat.com>,
	<dave.hansen@linux.intel.com>, <hpa@zytor.com>,
	<akpm@linux-foundation.org>, <lorenzo.stoakes@oracle.com>,
	<Liam.Howlett@oracle.com>, <vbabka@suse.cz>, <rppt@kernel.org>,
	<surenb@google.com>, <mhocko@suse.com>, <linmiaohe@huawei.com>,
	<nao.horiguchi@gmail.com>, <luto@kernel.org>,
	<peterz@infradead.org>, <tony.luck@intel.com>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <linux-edac@vger.kernel.org>,
	<will@kernel.org>, <liaohua4@huawei.com>, <lilinjie8@huawei.com>,
	Xie Yuanbin <xieyuanbin1@huawei.com>
Subject: [PATCH v2 0/2] x86/mm: support memory-failure on 32-bits with SPARSEMEM
Date: Tue, 4 Nov 2025 15:23:04 +0800
Message-ID: <20251104072306.100738-1-xieyuanbin1@huawei.com>

Memory bit flips are among the most common hardware errors in the server
and embedded fields. Many hardware components have memory verification
mechanisms, such as ECC. When an error is detected, some hardware or
architectures report it to software (OS/BIOS), for example through the
MCE (Machine Check Exception) on x86.

Common errors include CE (Correctable Errors) and UE (Uncorrectable
Errors). When the kernel receives memory error information and the
memory-failure feature is enabled, it can handle the error without
rebooting. For example, the kernel can attempt to offline the affected
memory by migrating it or killing the process that uses it. This feature
is therefore widely used in servers and embedded systems.

In older kernel versions, memory-failure could not be enabled with
x86_32 && SPARSEMEM because there were not enough page flag bits. This
issue has since been resolved, so this series allows SPARSEMEM and
memory-failure to be enabled together on x86_32.

By the way, due to increased demand, DRAM prices have recently
skyrocketed, making memory-failure potentially even more valuable in the
coming years.

v1 -> v2: https://lore.kernel.org/20251103033536.52234-1-xieyuanbin1@huawei.com
  - Describe the purpose of these patches in the cover letter.

  - Correct the description of historical changes to page flags.

  - Move the memory-failure tracing code from ras_event.h to
    memory-failure.h

Xie Yuanbin (2):
  x86/mm: support memory-failure on 32-bits with SPARSEMEM
  mm/memory-failure: remove the selection of RAS

 arch/x86/Kconfig                      |  3 -
 include/ras/ras_event.h               | 86 ------------------------
 include/trace/events/memory-failure.h | 97 +++++++++++++++++++++++++++
 mm/Kconfig                            |  1 -
 mm/memory-failure.c                   |  5 +-
 5 files changed, 101 insertions(+), 91 deletions(-)
 create mode 100644 include/trace/events/memory-failure.h

-- 
2.51.0


