[PATCH v12 0/2] ACPI: Add support for ACPI RAS2 feature table

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v12 0/2] ACPI: Add support for ACPI RAS2 feature table
@ 2025-09-02 17:30 shiju.jose
  2025-09-02 17:30 ` [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
  2025-09-02 17:30 ` [PATCH v12 2/2] ras: mem: Add memory " shiju.jose
  0 siblings, 2 replies; 20+ messages in thread
From: shiju.jose @ 2025-09-02 17:30 UTC (permalink / raw)
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add support for ACPI RAS2 feature table (RAS2) defined in the
ACPI 6.5 specification, section 5.2.21 and RAS2 HW based memory
scrubbing feature.

ACPI RAS2 patches were part of the EDAC series [1].

The code is based on linux.git v6.17-rc4 [2].

1. https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
2. https://github.com/torvalds/linux.git

Changes
=======
v11 -> v12:
1. Modified logic for finding the lowest contiguous phy memory addr range for
NUMA domain using node_start_pfn() and node_spanned_pages() according to the
feedback from Mike Rapoport in v11.
https://lore.kernel.org/all/aKsIlFTkBsAF5sqD@kernel.org/

2. Rebase to 6.17-rc4.

v10 -> v11:
1. Simplified code by removing workarounds previously added to support
   non-compliant case of single PCC channel shared across all proximity
   domains (which is no longer required). 
   https://lore.kernel.org/all/f5b28977-0b80-4c39-929b-cf02ab1efb97@os.amperecomputing.com/

2. Fix for the comments from Borislav (Thanks).
   https://lore.kernel.org/all/20250811152805.GQaJoMBecC4DSDtTAu@fat_crate.local/

3. Rebase to 6.17-rc1.

v9 -> v10:
1. Use pcc_chan->shmem instead of 
   acpi_os_ioremap(pcc_chan->shmem_base_addr,...) as it was
   acpi_os_ioremap internally by the PCC driver to pcc_chan->shmem.

2. Changes required for the Ampere Computing system where uses a single
   PCC channel for RAS2 memory features across all NUMA domains. Based on the
   requirements from by Daniel on V9
   https://lore.kernel.org/all/547ed8fb-d6b7-4b6b-a38b-bf13223971b1@os.amperecomputing.com/
   and discussion with Jonathan.
2.1 Add node_to_range lookup facility to numa_memblks. This is to retrieve the lowest
    physical continuous memory range of the memory associated with a NUMA domain.
2.2. Set requested addr range to the memory region's base addr and size
   while send RAS2 cmd GET_PATROL_PARAMETER 
   in functions ras2_update_patrol_scrub_params_cache() &
   ras2_get_patrol_scrub_running().
2.3. Split struct ras2_mem_ctx into struct ras2_mem_ctx_hdr and struct ras2_pxm_domain
   to support cases, uses a single PCC channel for RAS2 scrubbers across all NUMA
   domains and PCC channel per RAS2 scrub instance. Provided ACPI spec define single
   memory scrub per NUMA domain.
2.4. EDAC feature sysfs folder for RAS2 changed from "acpi_ras_memX" to  "acpi_ras_mem_idX"
   because memory scrub instances across all NUMA domains would present under
   "acpi_ras_mem_id0" when a system uses a single PCC channel for RAS2 scrubbers across
   all NUMA domains etc.
2.5. Removed Acked-by: Rafael from patch [2], because of the several above changes from v9.

v8 -> v9:
1. Added following changes for feedback from Yazen.
 1.1 In ras2_check_pcc_chan(..) function
    - u32 variables moved to the same line.
    - Updated error log for readw_relaxed_poll_timeout()
    - Added error log for if (status & PCC_STATUS_ERROR), error condition.
    - Removed an impossible condition check.
  1.2. Added guard for ras2_pc_list_lock in ras2_get_pcc_subspace().

2. Rebased to linux.git v6.16-rc2 [2].

v7 -> v8:
1. Rebased to linux.git v6.16-rc1 [2].

v6 -> v7:
1. Fix for the issue reported by Daniel,
   In ras2_check_pcc_chan(), add read, clear and check RAS2 set_cap_status outside
   if (status & PCC_STATUS_ERROR) check. 
   https://lore.kernel.org/all/51bcb52c-4132-4daf-8903-29b121c485a1@os.amperecomputing.com/

v5 -> v6:
1. Fix for the issue reported by Daniel, in start scrubbing with correct addr and size
   after firmware return INVALID DATA error for scrub request with invalid addr or size.
   https://lore.kernel.org/all/8cdf7885-31b3-4308-8a7c-f4e427486429@os.amperecomputing.com/

v4 -> v5:
1. Fix for the build warnings reported by kernel test robot.
   https://patchwork.kernel.org/project/linux-edac/patch/20250423163511.1412-3-shiju.jose@huawei.com/
2. Removed patch "ACPI: ACPI 6.5: RAS2: Rename RAS2 table structure and field names"
   from the series as the patch was merged to linux-pm.git : branch linux-next
3. Rebased to ras.git: edac-for-next branch merged with linux-pm.git : linux-next branch.

v3 -> v4:
1.  Changes for feedbacks from Yazen on v3.
    https://lore.kernel.org/all/20250415210504.GA854098@yaz-khff2.amd.com/

v2 -> v3:
1. Rename RAS2 table structure and field names in 
   include/acpi/actbl2.h limited to only necessary
   for RAS2 scrub feature.
2. Changes for feedbacks from Jonathan on v2.
3. Daniel reported a known behaviour: when readback 'size' attribute after
   setting in, returns 0 before starting scrubbing via 'addr' attribute.
   Changes added to fix this.
4. Daniel reported that firmware cannot update status of demand scrubbing
   via the 'Actual Address Range (OUTPUT)', thus add workaround in the
   kernel to update sysfs 'addr' attribute with the status of demand
   scrubbing.
5. Optimized logic in ras2_check_pcc_chan() function
   (patch - ACPI:RAS2: Add ACPI RAS2 driver).
6. Add PCC channel lock to struct ras2_pcc_subspace and change
   lock in ras2_mem_ctx as a pointer to pcc channel lock to make sure
   writing to PCC subspace shared memory is protected from race conditions.

v1 -> v2:
1.  Changes for feedbacks from Borislav.
    - Shorten ACPI RAS2 structures and variables names.
    - Shorten some of the other variables in the RAS2 drivers.
    - Fixed few CamelCases.

2.  Changes for feedbacks from Yazen.
    - Added newline after number of '}' and return statements.
    - Changed return type for "ras2_add_aux_device() to 'int'.
    - Deleted a duplication of acpi_get_table("RAS2",...) in the ras2_acpi_parse_table().
    - Add "FW_WARN" to few error logs in the ras2_acpi_parse_table().
    - Rename ras2_acpi_init() to acpi_ras2_init() and modified to call acpi_ras2_init()
      function from the acpi_init().
    - Moved scrub related variables from the struct ras2_mem_ctx from  patch
      "ACPI:RAS2: Add ACPI RAS2 driver" to "ras: mem: Add memory ACPI RAS2 driver".

Shiju Jose (2):
  ACPI:RAS2: Add ACPI RAS2 driver
  ras: mem: Add memory ACPI RAS2 driver

 Documentation/edac/scrub.rst |  73 ++++++
 drivers/acpi/Kconfig         |  12 +
 drivers/acpi/Makefile        |   1 +
 drivers/acpi/bus.c           |   3 +
 drivers/acpi/ras2.c          | 389 ++++++++++++++++++++++++++++++++
 drivers/ras/Kconfig          |  11 +
 drivers/ras/Makefile         |   1 +
 drivers/ras/acpi_ras2.c      | 424 +++++++++++++++++++++++++++++++++++
 include/acpi/ras2.h          |  77 +++++++
 9 files changed, 991 insertions(+)
 create mode 100644 drivers/acpi/ras2.c
 create mode 100644 drivers/ras/acpi_ras2.c
 create mode 100644 include/acpi/ras2.h

-- 
2.43.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-02 17:30 [PATCH v12 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
@ 2025-09-02 17:30 ` shiju.jose
  2025-09-09 16:24   ` Yazen Ghannam
  2025-09-10 19:27   ` Borislav Petkov
  2025-09-02 17:30 ` [PATCH v12 2/2] ras: mem: Add memory " shiju.jose
  1 sibling, 2 replies; 20+ messages in thread
From: shiju.jose @ 2025-09-02 17:30 UTC (permalink / raw)
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add support for ACPI RAS2 feature table (RAS2) defined in the
ACPI 6.5 Specification, section 5.2.21.
Driver defines RAS2 Init, which extracts the RAS2 table and driver
adds auxiliary device for each memory feature which binds to the
RAS2 memory driver.

Driver uses PCC mailbox to communicate with the ACPI HW and the
driver adds OSPM interfaces to send RAS2 commands.

According to the ACPI specification rev 6.5, section 5.2.21.1.1
RAS2 Platform Communication Channel Descriptor, “RAS2 supports multiple
PCC channels, where a channel is dedicated to a given component
instance.” Thus, RAS2 driver has been implemented to support only
systems that comply with the specification, i.e. a dedicated PCC channel
per system component instance for communication.

ACPI specification rev 6.5, section 5.2.21.1.1 Table 5.80: RAS2 Platform
Communication Channel Descriptor format, defines Field: Instance,
Identifier for the system component instance that the RAS feature is
associated with.

Section 5.2.21.2.1 Hardware-based Memory Scrubbing describes as
The platform can use this feature to expose controls and capabilities
associated with hardware-based memory scrub engines. Modern scalable
platforms have complex memory systems with a multitude of memory
controllers that are in turn associated with NUMA domains. It is also
common for RAS errors related to memory to be associated with NUMA
domains, where the NUMA domain functions as a FRU identifier. This
feature thus provides memory scrubbing at a NUMA domain granularity.
The following are supported:
1. Independent memory scrubbing controls for each NUMA domain, identified
using its proximity domain.
2. Provision for background (patrol) scrubbing of the entire memory
system, as well as on-demand scrubbing for a specific region of memory.

Thus, the RAS2 driver requires the lowest contiguous physical memory range
of the memory associated with a NUMA domain when communicating with the
firmware for memory-related features such as scrubbing. The driver uses
the component instance ID, as defined in Table 5.80, to retrieve the
lowest contiguous physical memory address range within the NUMA node
and store it in the struct ras2_context to expose the address range to
the userspace and for the communication with the firmware.

Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/acpi/Kconfig  |  12 ++
 drivers/acpi/Makefile |   1 +
 drivers/acpi/bus.c    |   3 +
 drivers/acpi/ras2.c   | 389 ++++++++++++++++++++++++++++++++++++++++++
 include/acpi/ras2.h   |  63 +++++++
 5 files changed, 468 insertions(+)
 create mode 100644 drivers/acpi/ras2.c
 create mode 100644 include/acpi/ras2.h

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index b594780a57d7..db21bf5a39c7 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
 	  If your platform does not support CPPC in firmware,
 	  leave this option disabled.
 
+config ACPI_RAS2
+	bool "ACPI RAS2 driver"
+	select AUXILIARY_BUS
+	select MAILBOX
+	select PCC
+	select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS
+	help
+	  The driver adds support for ACPI RAS2 feature table (extracts RAS2
+	  table from OS system table) and OSPM interfaces to send RAS2
+	  commands via PCC mailbox subspace. Driver adds platform device for
+	  the RAS2 memory features which binds to the RAS2 memory driver.
+
 config ACPI_PROCESSOR
 	tristate "Processor"
 	depends on X86 || ARM64 || LOONGARCH || RISCV
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index d1b0affb844f..abfec6745724 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
 obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
+obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
 obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
 obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
 obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index a984ccd4a2a0..b02ceb2837c6 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -31,6 +31,7 @@
 #include <acpi/apei.h>
 #include <linux/suspend.h>
 #include <linux/prmt.h>
+#include <acpi/ras2.h>
 
 #include "internal.h"
 
@@ -1474,6 +1475,8 @@ static int __init acpi_init(void)
 	acpi_debugger_init();
 	acpi_setup_sb_notify_handler();
 	acpi_viot_init();
+	acpi_ras2_init();
+
 	return 0;
 }
 
diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c
new file mode 100644
index 000000000000..a17eab9eecf1
--- /dev/null
+++ b/drivers/acpi/ras2.c
@@ -0,0 +1,389 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Implementation of ACPI RAS2 driver.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21
+ *
+ * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
+ * get the PCC channel subspace for communicating with the ACPI compliant
+ * HW platform which supports ACPI RAS2. Driver adds auxiliary devices
+ * for each RAS2 memory feature which binds to the memory ACPI RAS2 driver.
+ */
+
+#define pr_fmt(fmt) "ACPI RAS2: " fmt
+
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/iopoll.h>
+#include <linux/ktime.h>
+#include <acpi/pcc.h>
+#include <acpi/ras2.h>
+
+/**
+ * struct ras2_pcc_subspace - Data structure for PCC communication
+ * @mbox_client:	struct mbox_client object
+ * @pcc_chan:		Pointer to struct pcc_mbox_chan
+ * @comm_addr:		Pointer to RAS2 PCC shared memory region
+ * @elem:		List for registered RAS2 PCC channel subspaces
+ * @pcc_lock:		PCC lock to provide mutually exclusive access
+ *			to PCC channel subspace
+ * @deadline_us:	Poll PCC status register timeout in micro secs
+ *			for PCC command complete
+ * @pcc_mpar:		Maximum Periodic Access Rate(MPAR) for PCC channel
+ * @pcc_mrtt:		Minimum Request Turnaround Time(MRTT) in micro secs
+ *			OS must wait after completion of a PCC command before
+ *			issue next command
+ * @last_cmd_cmpl_time:	completion time of last PCC command
+ * @last_mpar_reset:	Time of last MPAR count reset
+ * @mpar_count:		MPAR count
+ * @pcc_id:		Identifier of the RAS2 platform communication channel
+ * @last_cmd:		Last PCC command
+ * @pcc_chnl_acq:	Status of PCC channel acquired
+ */
+struct ras2_pcc_subspace {
+	struct mbox_client		mbox_client;
+	struct pcc_mbox_chan		*pcc_chan;
+	struct acpi_ras2_shmem __iomem	*comm_addr;
+	struct list_head		elem;
+	struct mutex			pcc_lock;
+	unsigned int			deadline_us;
+	unsigned int			pcc_mpar;
+	unsigned int			pcc_mrtt;
+	ktime_t				last_cmd_cmpl_time;
+	ktime_t				last_mpar_reset;
+	int				mpar_count;
+	int				pcc_id;
+	u16				last_cmd;
+	bool				pcc_chnl_acq;
+};
+
+/*
+ * Arbitrary retries for PCC commands because the remote processor
+ * could be much slower to reply. Keeping it high enough to cover
+ * emulators where the processors run painfully slow.
+ */
+#define RAS2_NUM_RETRIES 600ULL
+
+#define RAS2_FEAT_TYPE_MEMORY 0x00
+
+static int ras2_report_cap_error(u32 cap_status)
+{
+	switch (cap_status) {
+	case ACPI_RAS2_NOT_VALID:
+	case ACPI_RAS2_NOT_SUPPORTED:
+		return -EPERM;
+	case ACPI_RAS2_BUSY:
+		return -EBUSY;
+	case ACPI_RAS2_FAILED:
+	case ACPI_RAS2_ABORTED:
+	case ACPI_RAS2_INVALID_DATA:
+		return -EINVAL;
+	default: /* 0 or other, Success */
+		return 0;
+	}
+}
+
+static int ras2_check_pcc_chan(struct ras2_pcc_subspace *pcc_subspace)
+{
+	struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace->comm_addr;
+	u32 cap_status, rc;
+	u16 status;
+
+	/*
+	 * As per ACPI spec, the PCC space will be initialized by
+	 * platform and should have set the command completion bit when
+	 * PCC can be used by OSPM.
+	 *
+	 * Poll PCC status register every 3us(delay_us) for maximum of
+	 * deadline_us(timeout_us) until PCC command complete bit is set(cond).
+	 */
+	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
+					status & PCC_STATUS_CMD_COMPLETE, 3,
+					pcc_subspace->deadline_us);
+	if (rc) {
+		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
+			pcc_subspace->pcc_id, rc);
+		return rc;
+	}
+
+	if (status & PCC_STATUS_ERROR) {
+		pr_warn("Error in executing last command=%d for pcc_id=%d\n",
+			pcc_subspace->last_cmd, pcc_subspace->pcc_id);
+		status &= ~PCC_STATUS_ERROR;
+		writew_relaxed(status, &gen_comm_base->status);
+		return -EIO;
+	}
+
+	cap_status = readw_relaxed(&gen_comm_base->set_caps_status);
+	writew_relaxed(0x0, &gen_comm_base->set_caps_status);
+	return ras2_report_cap_error(cap_status);
+}
+
+/**
+ * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
+ * @ras2_ctx:	pointer to the RAS2 context structure
+ * @cmd:	command to send
+ *
+ * Returns: 0 on success, an error otherwise
+ */
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd)
+{
+	struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
+	struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace->comm_addr;
+	struct mbox_chan *pcc_channel;
+	unsigned int time_delta;
+	int rc;
+
+	rc = ras2_check_pcc_chan(pcc_subspace);
+	if (rc < 0)
+		return rc;
+
+	pcc_channel = pcc_subspace->pcc_chan->mchan;
+
+	/*
+	 * Handle the Minimum Request Turnaround Time(MRTT).
+	 * "The minimum amount of time that OSPM must wait after the completion
+	 * of a command before issuing the next command, in microseconds."
+	 */
+	if (pcc_subspace->pcc_mrtt) {
+		time_delta = ktime_us_delta(ktime_get(),
+					    pcc_subspace->last_cmd_cmpl_time);
+		if (pcc_subspace->pcc_mrtt > time_delta)
+			udelay(pcc_subspace->pcc_mrtt - time_delta);
+	}
+
+	/*
+	 * Handle the non-zero Maximum Periodic Access Rate(MPAR).
+	 * "The maximum number of periodic requests that the subspace channel can
+	 * support, reported in commands per minute. 0 indicates no limitation."
+	 *
+	 * This parameter should be ideally zero or large enough so that it can
+	 * handle maximum number of requests that all the cores in the system can
+	 * collectively generate. If it is not, we will follow the spec and just
+	 * not send the request to the platform after hitting the MPAR limit in
+	 * any 60s window.
+	 */
+	if (pcc_subspace->pcc_mpar) {
+		if (pcc_subspace->mpar_count == 0) {
+			time_delta = ktime_ms_delta(ktime_get(),
+						    pcc_subspace->last_mpar_reset);
+			if (time_delta < 60 * MSEC_PER_SEC) {
+				dev_dbg(ras2_ctx->dev,
+					"PCC cmd not sent due to MPAR limit");
+				return -EIO;
+			}
+			pcc_subspace->last_mpar_reset = ktime_get();
+			pcc_subspace->mpar_count = pcc_subspace->pcc_mpar;
+		}
+		pcc_subspace->mpar_count--;
+	}
+
+	/* Write to the shared comm region */
+	writew_relaxed(cmd, &gen_comm_base->command);
+
+	/* Flip CMD COMPLETE bit */
+	writew_relaxed(0, &gen_comm_base->status);
+
+	/* Ring doorbell */
+	rc = mbox_send_message(pcc_channel, &cmd);
+	if (rc < 0) {
+		dev_warn(ras2_ctx->dev,
+			 "Err sending PCC mbox message. cmd:%d, rc:%d\n",
+			 cmd, rc);
+		return rc;
+	}
+
+	pcc_subspace->last_cmd = cmd;
+
+	/*
+	 * If Minimum Request Turnaround Time is non-zero, we need
+	 * to record the completion time of both READ and WRITE
+	 * command for proper handling of MRTT, so we need to check
+	 * for pcc_mrtt in addition to CMD_READ.
+	 */
+	if (cmd == PCC_CMD_EXEC_RAS2 || pcc_subspace->pcc_mrtt) {
+		rc = ras2_check_pcc_chan(pcc_subspace);
+		if (pcc_subspace->pcc_mrtt)
+			pcc_subspace->last_cmd_cmpl_time = ktime_get();
+	}
+
+	if (pcc_channel->mbox->txdone_irq)
+		mbox_chan_txdone(pcc_channel, rc);
+	else
+		mbox_client_txdone(pcc_channel, rc);
+
+	return rc < 0 ? rc : 0;
+}
+EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
+
+static void ras2_list_pcc_release(struct ras2_pcc_subspace *pcc_subspace)
+{
+	pcc_mbox_free_channel(pcc_subspace->pcc_chan);
+	kfree(pcc_subspace);
+}
+
+static int ras2_register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, int pcc_id)
+{
+	struct ras2_pcc_subspace *pcc_subspace;
+	struct pcc_mbox_chan *pcc_chan;
+	struct mbox_client *mbox_cl;
+
+	if (pcc_id < 0)
+		return -EINVAL;
+
+	pcc_subspace = kzalloc(sizeof(*pcc_subspace), GFP_KERNEL);
+	if (!pcc_subspace)
+		return -ENOMEM;
+
+	mbox_cl			= &pcc_subspace->mbox_client;
+	mbox_cl->knows_txdone	= true;
+
+	pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_id);
+	if (IS_ERR(pcc_chan)) {
+		kfree(pcc_subspace);
+		return PTR_ERR(pcc_chan);
+	}
+
+	pcc_subspace->pcc_id		= pcc_id;
+	pcc_subspace->pcc_chan		= pcc_chan;
+	pcc_subspace->comm_addr		= pcc_chan->shmem;
+	pcc_subspace->deadline_us	= RAS2_NUM_RETRIES * pcc_chan->latency;
+	pcc_subspace->pcc_mrtt		= pcc_chan->min_turnaround_time;
+	pcc_subspace->pcc_mpar		= pcc_chan->max_access_rate;
+	pcc_subspace->mbox_client.knows_txdone	= true;
+	pcc_subspace->pcc_chnl_acq	= true;
+
+	ras2_ctx->pcc_subspace	= pcc_subspace;
+	ras2_ctx->comm_addr	= pcc_subspace->comm_addr;
+	ras2_ctx->dev		= pcc_chan->mchan->mbox->dev;
+
+	mutex_init(&pcc_subspace->pcc_lock);
+	ras2_ctx->pcc_lock	= &pcc_subspace->pcc_lock;
+
+	return 0;
+}
+
+static DEFINE_IDA(ras2_ida);
+static void ras2_release(struct device *device)
+{
+	struct auxiliary_device *auxdev = to_auxiliary_dev(device);
+	struct ras2_mem_ctx *ras2_ctx =
+		container_of(auxdev, struct ras2_mem_ctx, adev);
+
+	ida_free(&ras2_ida, auxdev->id);
+	ras2_list_pcc_release(ras2_ctx->pcc_subspace);
+	kfree(ras2_ctx);
+}
+
+static int ras2_add_aux_device(char *name, int channel, u32 pxm_inst)
+{
+	unsigned long start_pfn, size_pfn;
+	struct ras2_mem_ctx *ras2_ctx;
+	int id, rc;
+
+	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
+	if (!ras2_ctx)
+		return -ENOMEM;
+
+	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);
+	/*
+	 * Retrieve the lowest contiguous physical memory address range within
+	 * the NUMA node.
+	 */
+	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
+	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
+	if (!size_pfn) {
+		pr_debug("Failed to find phy addr range for NUMA node(%u)\n",
+			 pxm_inst);
+		goto ctx_free;
+	}
+	ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
+	ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
+
+	rc = ras2_register_pcc_channel(ras2_ctx, channel);
+	if (rc < 0) {
+		pr_debug("Failed to register pcc channel rc=%d\n", rc);
+		goto ctx_free;
+	}
+
+	id = ida_alloc(&ras2_ida, GFP_KERNEL);
+	if (id < 0) {
+		rc = id;
+		goto ctx_free;
+	}
+
+	ras2_ctx->adev.id		= id;
+	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
+	ras2_ctx->adev.dev.release	= ras2_release;
+	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
+
+	rc = auxiliary_device_init(&ras2_ctx->adev);
+	if (rc)
+		goto ida_free;
+
+	rc = auxiliary_device_add(&ras2_ctx->adev);
+	if (rc) {
+		auxiliary_device_uninit(&ras2_ctx->adev);
+		return rc;
+	}
+
+	return 0;
+
+ida_free:
+	ida_free(&ras2_ida, id);
+ctx_free:
+	kfree(ras2_ctx);
+
+	return rc;
+}
+
+static int acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)
+{
+	struct acpi_ras2_pcc_desc *pcc_desc_list;
+	int rc;
+	u16 i;
+
+	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
+		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too short #1)\n");
+		return -EINVAL;
+	}
+
+	if (!ras2_tab->num_pcc_descs) {
+		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
+		return -EINVAL;
+	}
+
+	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
+	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
+		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
+			continue;
+
+		rc = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME,
+					 pcc_desc_list->channel_id,
+					 pcc_desc_list->instance);
+		if (rc)
+			pr_warn("Failed to add RAS2 auxiliary device rc=%d\n", rc);
+	}
+
+	return 0;
+}
+
+void __init acpi_ras2_init(void)
+{
+	struct acpi_table_ras2 *ras2_tab;
+	acpi_status status;
+
+	status = acpi_get_table(ACPI_SIG_RAS2, 0,
+				(struct acpi_table_header **)&ras2_tab);
+	if (ACPI_FAILURE(status)) {
+		pr_err("Failed to get table, %s\n", acpi_format_exception(status));
+		return;
+	}
+
+	if (acpi_ras2_parse(ras2_tab))
+		pr_err("Failed to parse RAS2 table\n");
+
+	acpi_put_table((struct acpi_table_header *)ras2_tab);
+}
diff --git a/include/acpi/ras2.h b/include/acpi/ras2.h
new file mode 100644
index 000000000000..cb053b5f37e7
--- /dev/null
+++ b/include/acpi/ras2.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * ACPI RAS2 driver header file
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited
+ */
+
+#ifndef _ACPI_RAS2_H
+#define _ACPI_RAS2_H
+
+#include <linux/acpi.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+struct device;
+
+/*
+ * ACPI spec 6.5 Table 5.82: PCC command codes used by
+ * RAS2 platform communication channel.
+ */
+#define PCC_CMD_EXEC_RAS2 0x01
+
+#define RAS2_AUX_DEV_NAME "ras2"
+#define RAS2_MEM_DEV_ID_NAME "acpi_ras2_mem"
+
+/**
+ * struct ras2_mem_ctx - Context for RAS2 memory features
+ * @adev:		Auxiliary device object
+ * @comm_addr:		Pointer to RAS2 PCC shared memory region
+ * @dev:		Pointer to device backing struct mbox_controller for PCC
+ * @pcc_subspace:	Pointer to local data structure for PCC communication
+ * @pcc_lock:		Pointer to PCC lock to provide mutually exclusive access
+ *			to PCC channel subspace
+ * @sys_comp_nid:	Node ID of the system component that the RAS feature
+ *			is associated with. See ACPI spec 6.5 Table 5.80: RAS2
+ *			Platform Communication Channel Descriptor format,
+ *			Field: Instance
+ * @mem_base_addr:	Base of the lowest physical continuous memory range
+ *			of the memory associated with the NUMA domain
+ * @mem_size		Size of the lowest physical continuous memory range
+ *			of the memory associated with the NUMA domain
+ */
+struct ras2_mem_ctx {
+	struct auxiliary_device		adev;
+	struct acpi_ras2_shmem __iomem	*comm_addr;
+	struct device			*dev;
+	void				*pcc_subspace;
+	struct mutex			*pcc_lock;
+	u32				sys_comp_nid;
+	u64				mem_base_addr;
+	u64				mem_size;
+};
+
+#ifdef CONFIG_ACPI_RAS2
+void __init acpi_ras2_init(void);
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd);
+#else
+static inline void acpi_ras2_init(void) { }
+#endif
+
+#endif /* _ACPI_RAS2_H */
-- 
2.43.0



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v12 2/2] ras: mem: Add memory ACPI RAS2 driver
  2025-09-02 17:30 [PATCH v12 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
  2025-09-02 17:30 ` [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
@ 2025-09-02 17:30 ` shiju.jose
  1 sibling, 0 replies; 20+ messages in thread
From: shiju.jose @ 2025-09-02 17:30 UTC (permalink / raw)
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Memory ACPI RAS2 auxiliary driver binds to the auxiliary device
add by the ACPI RAS2 table parser.

Driver uses a PCC subspace for communicating with the ACPI compliant
platform.

According to the ACPI specification rev 6.5, section 5.2.21.1.1
RAS2 Platform Communication Channel Descriptor, “RAS2 supports multiple
PCC channels, where a channel is dedicated to a given component
instance.”
Device with ACPI RAS2 scrub feature registers with EDAC device driver,
which retrieves the scrub descriptor from EDAC scrub and exposes
the scrub control attributes for RAS2 scrub instance to userspace in
/sys/bus/edac/devices/acpi_ras_memX/scrub0/.

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/scrub.rst |  73 ++++++
 drivers/ras/Kconfig          |  11 +
 drivers/ras/Makefile         |   1 +
 drivers/ras/acpi_ras2.c      | 424 +++++++++++++++++++++++++++++++++++
 include/acpi/ras2.h          |  14 ++
 5 files changed, 523 insertions(+)
 create mode 100644 drivers/ras/acpi_ras2.c

diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index 2cfa74fa1ffd..4c6ee84fb691 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -340,3 +340,76 @@ controller or platform when unexpectedly high error rates are detected.
 
 Sysfs files for scrubbing are documented in
 `Documentation/ABI/testing/sysfs-edac-ecs`
+
+3. ACPI RAS2 Hardware-based Memory Scrubbing
+
+3.1. On demand scrubbing for a specific memory region.
+
+3.1.1. Query the status of demand scrubbing
+
+Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+3.1.2. Query what is device default/current scrub cycle setting.
+
+       Applicable to both on-demand and background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+36000
+
+3.1.3. Query the range of device supported scrub cycle for a memory region.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
+
+86400
+
+3.1.4. Program scrubbing for the memory region in RAS2 device to repeat every
+43200 seconds (half a day).
+
+# echo 43200 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+3.1.5. Program address and size of the memory region to scrub
+
+Write 'size' of the memory region to scrub.
+
+# echo 0x300000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/size
+
+Write 'addr' starts demand scrubbing, please make sure other attributes are
+set prior to that.
+
+# echo 0x200000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0x200000
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+3.2. Background scrubbing the entire memory
+
+3.2.1. Query the status of background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+0
+
+3.2.2. Program background scrubbing for RAS2 device to repeat in every 21600
+seconds (quarter of a day).
+
+# echo 21600 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+3.2.3. Start 'background scrubbing'.
+
+# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index fc4f4bb94a4c..a88002f1f462 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -46,4 +46,15 @@ config RAS_FMPM
 	  Memory will be retired during boot time and run time depending on
 	  platform-specific policies.
 
+config MEM_ACPI_RAS2
+	tristate "Memory ACPI RAS2 driver"
+	depends on ACPI_RAS2
+	depends on EDAC
+	depends on EDAC_SCRUB
+	help
+	  The driver binds to the platform device added by the ACPI RAS2
+	  table parser. Use a PCC channel subspace for communicating with
+	  the ACPI compliant platform to provide control of memory scrub
+	  parameters to the user via the EDAC scrub.
+
 endif
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index 11f95d59d397..a0e6e903d6b0 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_RAS)	+= ras.o
 obj-$(CONFIG_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_RAS_CEC)	+= cec.o
+obj-$(CONFIG_MEM_ACPI_RAS2)	+= acpi_ras2.o
 
 obj-$(CONFIG_RAS_FMPM)	+= amd/fmpm.o
 obj-y			+= amd/atl/
diff --git a/drivers/ras/acpi_ras2.c b/drivers/ras/acpi_ras2.c
new file mode 100644
index 000000000000..3971653b477a
--- /dev/null
+++ b/drivers/ras/acpi_ras2.c
@@ -0,0 +1,424 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ACPI RAS2 memory driver
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ */
+
+#define pr_fmt(fmt)	"ACPI RAS2 MEMORY: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/edac.h>
+#include <linux/platform_device.h>
+#include <acpi/ras2.h>
+
+#define RAS2_SUPPORT_HW_PARTOL_SCRUB BIT(0)
+#define RAS2_TYPE_PATROL_SCRUB 0x0000
+
+#define RAS2_GET_PATROL_PARAMETERS 0x01
+#define RAS2_START_PATROL_SCRUBBER 0x02
+#define RAS2_STOP_PATROL_SCRUBBER 0x03
+
+/*
+ * RAS2 patrol scrub
+ */
+#define RAS2_PS_SC_HRS_IN_MASK GENMASK(15, 8)
+#define RAS2_PS_EN_BACKGROUND BIT(0)
+#define RAS2_PS_SC_HRS_OUT_MASK GENMASK(7, 0)
+#define RAS2_PS_MIN_SC_HRS_OUT_MASK GENMASK(15, 8)
+#define RAS2_PS_MAX_SC_HRS_OUT_MASK GENMASK(23, 16)
+#define RAS2_PS_FLAG_SCRUB_RUNNING BIT(0)
+
+#define RAS2_SCRUB_NAME_LEN 128
+#define RAS2_HOUR_IN_SECS 3600
+
+enum ras2_od_scrub_status {
+	OD_SCRUB_STS_IDLE,
+	OD_SCRUB_STS_INIT,
+	OD_SCRUB_STS_ACTIVE,
+};
+
+struct acpi_ras2_ps_shared_mem {
+	struct acpi_ras2_shmem common;
+	struct acpi_ras2_patrol_scrub_param params;
+};
+
+#define TO_ACPI_RAS2_PS_SHMEM(_addr) \
+	container_of(_addr, struct acpi_ras2_ps_shared_mem, common)
+
+static int ras2_is_patrol_scrub_support(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_shmem __iomem *common = (void *)ras2_ctx->comm_addr;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	common->set_caps[0] = 0;
+
+	return common->features[0] & RAS2_SUPPORT_HW_PARTOL_SCRUB;
+}
+
+static int ras2_update_patrol_scrub_params_cache(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	int ret;
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.command = RAS2_GET_PATROL_PARAMETERS;
+	ps_sm->params.req_addr_range[0] = ras2_ctx->mem_base_addr;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->mem_size;
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	ras2_ctx->min_scrub_cycle = FIELD_GET(RAS2_PS_MIN_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	ras2_ctx->max_scrub_cycle = FIELD_GET(RAS2_PS_MAX_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	ras2_ctx->scrub_cycle_hrs = FIELD_GET(RAS2_PS_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	if (ras2_ctx->bg_scrub) {
+		ras2_ctx->base = 0;
+		ras2_ctx->size = 0;
+		ras2_ctx->od_scrub_sts = OD_SCRUB_STS_IDLE;
+		return 0;
+	}
+
+	if  (ps_sm->params.flags & RAS2_PS_FLAG_SCRUB_RUNNING) {
+		ras2_ctx->base = ps_sm->params.actl_addr_range[0];
+		ras2_ctx->size = ps_sm->params.actl_addr_range[1];
+	} else if (ras2_ctx->od_scrub_sts != OD_SCRUB_STS_INIT) {
+		/*
+		 * When demand scrubbing is finished driver resets actual
+		 * address range to 0 when readback. Otherwise userspace
+		 * assumes demand scrubbing is in progress.
+		 */
+		ras2_ctx->base = 0;
+		ras2_ctx->size = 0;
+		ras2_ctx->od_scrub_sts = OD_SCRUB_STS_IDLE;
+	}
+
+	return 0;
+}
+
+/* Context - PCC lock must be held */
+static int ras2_get_patrol_scrub_running(struct ras2_mem_ctx *ras2_ctx,
+					 bool *running)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	int ret;
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.command = RAS2_GET_PATROL_PARAMETERS;
+	ps_sm->params.req_addr_range[0] = ras2_ctx->mem_base_addr;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->mem_size;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	*running = ps_sm->params.flags & RAS2_PS_FLAG_SCRUB_RUNNING;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *min)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*min = ras2_ctx->min_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *max)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*max = ras2_ctx->max_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_read(struct device *dev, void *drv_data,
+				    u32 *scrub_cycle_secs)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*scrub_cycle_secs = ras2_ctx->scrub_cycle_hrs * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_write(struct device *dev, void *drv_data,
+				     u32 scrub_cycle_secs)
+{
+	u8 scrub_cycle_hrs = scrub_cycle_secs / RAS2_HOUR_IN_SECS;
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	bool running;
+	int ret;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	if (scrub_cycle_hrs < ras2_ctx->min_scrub_cycle ||
+	    scrub_cycle_hrs > ras2_ctx->max_scrub_cycle)
+		return -EINVAL;
+
+	ras2_ctx->scrub_cycle_hrs = scrub_cycle_hrs;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_addr(struct device *dev, void *drv_data, u64 *base)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	int ret;
+
+	/*
+	 * When BG scrubbing is enabled the actual address range is not valid.
+	 * Return -EBUSY now unless find out a method to retrieve actual full PA range.
+	 */
+	if (ras2_ctx->bg_scrub)
+		return -EBUSY;
+
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+
+	*base = ras2_ctx->base;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_size(struct device *dev, void *drv_data, u64 *size)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	int ret;
+
+	if (ras2_ctx->bg_scrub)
+		return -EBUSY;
+
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+
+	*size = ras2_ctx->size;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_write_addr(struct device *dev, void *drv_data, u64 base)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	bool running;
+	int ret;
+
+	if (ras2_ctx->bg_scrub)
+		return -EBUSY;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+
+	if (!ras2_ctx->size || ras2_ctx->size > ras2_ctx->mem_size ||
+	    base < ras2_ctx->mem_base_addr ||
+	    (base + ras2_ctx->size) >
+		(ras2_ctx->mem_base_addr + ras2_ctx->mem_size)) {
+		dev_warn(dev,
+			 "%s: Invalid address range, base=0x%llx size=0x%llx\n",
+			 __func__, base, ras2_ctx->size);
+		return -ERANGE;
+	}
+
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	ras2_ctx->base = base;
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_SC_HRS_IN_MASK;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_SC_HRS_IN_MASK,
+						    ras2_ctx->scrub_cycle_hrs);
+	ps_sm->params.req_addr_range[0] = ras2_ctx->base;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->size;
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_EN_BACKGROUND;
+	ps_sm->params.command = RAS2_START_PATROL_SCRUBBER;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(dev, "Failed to start demand scrubbing rc(%d)\n", ret);
+		if (ret != -EBUSY) {
+			ps_sm->params.req_addr_range[0] = 0;
+			ps_sm->params.req_addr_range[1] = 0;
+			ras2_ctx->base = 0;
+			ras2_ctx->size = 0;
+			ras2_ctx->od_scrub_sts = OD_SCRUB_STS_IDLE;
+		}
+		return ret;
+	}
+	ras2_ctx->od_scrub_sts = OD_SCRUB_STS_ACTIVE;
+
+	return ras2_update_patrol_scrub_params_cache(ras2_ctx);
+}
+
+static int ras2_hw_scrub_write_size(struct device *dev, void *drv_data, u64 size)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	bool running;
+	int ret;
+
+	if (!size) {
+		dev_warn(dev, "%s: Invalid address range size=0x%llx\n",
+			 __func__, size);
+		return -EINVAL;
+	}
+
+	if (ras2_ctx->bg_scrub)
+		return -EBUSY;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	ras2_ctx->size = size;
+	ras2_ctx->od_scrub_sts = OD_SCRUB_STS_INIT;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	bool running;
+	int ret;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (enable) {
+		if (ras2_ctx->bg_scrub || running)
+			return -EBUSY;
+		ps_sm->params.req_addr_range[0] = 0;
+		ps_sm->params.req_addr_range[1] = 0;
+		ps_sm->params.scrub_params_in &= ~RAS2_PS_SC_HRS_IN_MASK;
+		ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_SC_HRS_IN_MASK,
+							    ras2_ctx->scrub_cycle_hrs);
+		ps_sm->params.command = RAS2_START_PATROL_SCRUBBER;
+	} else {
+		if (!ras2_ctx->bg_scrub)
+			return -EPERM;
+		ps_sm->params.command = RAS2_STOP_PATROL_SCRUBBER;
+	}
+
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_EN_BACKGROUND;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_EN_BACKGROUND,
+						    enable);
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(dev, "Failed to %s background scrubbing\n",
+			str_enable_disable(enable));
+		return ret;
+	}
+
+	if (enable) {
+		ras2_ctx->bg_scrub = true;
+		/* Update the cache to account for rounding of supplied parameters and similar */
+		ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	} else {
+		ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+		ras2_ctx->bg_scrub = false;
+	}
+
+	return ret;
+}
+
+static int ras2_hw_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*enabled = ras2_ctx->bg_scrub;
+
+	return 0;
+}
+
+static const struct edac_scrub_ops ras2_scrub_ops = {
+	.read_addr = ras2_hw_scrub_read_addr,
+	.read_size = ras2_hw_scrub_read_size,
+	.write_addr = ras2_hw_scrub_write_addr,
+	.write_size = ras2_hw_scrub_write_size,
+	.get_enabled_bg = ras2_hw_scrub_get_enabled_bg,
+	.set_enabled_bg = ras2_hw_scrub_set_enabled_bg,
+	.get_min_cycle = ras2_hw_scrub_read_min_scrub_cycle,
+	.get_max_cycle = ras2_hw_scrub_read_max_scrub_cycle,
+	.get_cycle_duration = ras2_hw_scrub_cycle_read,
+	.set_cycle_duration = ras2_hw_scrub_cycle_write,
+};
+
+static int ras2_probe(struct auxiliary_device *auxdev,
+		      const struct auxiliary_device_id *id)
+{
+	struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct ras2_mem_ctx, adev);
+	struct edac_dev_feature ras_features;
+	char scrub_name[RAS2_SCRUB_NAME_LEN];
+	int ret;
+
+	if (!ras2_is_patrol_scrub_support(ras2_ctx))
+		return -EOPNOTSUPP;
+
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+
+	sprintf(scrub_name, "acpi_ras_mem%d", auxdev->id);
+
+	ras_features.ft_type	= RAS_FEAT_SCRUB;
+	ras_features.instance	= 0;
+	ras_features.scrub_ops	= &ras2_scrub_ops;
+	ras_features.ctx	= ras2_ctx;
+
+	return edac_dev_register(&auxdev->dev, scrub_name, NULL, 1,
+				 &ras_features);
+}
+
+static const struct auxiliary_device_id ras2_mem_dev_id_table[] = {
+	{ .name = RAS2_AUX_DEV_NAME "." RAS2_MEM_DEV_ID_NAME, },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(auxiliary, ras2_mem_dev_id_table);
+
+static struct auxiliary_driver ras2_mem_driver = {
+	.name = RAS2_MEM_DEV_ID_NAME,
+	.probe = ras2_probe,
+	.id_table = ras2_mem_dev_id_table,
+};
+module_auxiliary_driver(ras2_mem_driver);
+
+MODULE_IMPORT_NS("ACPI_RAS2");
+MODULE_DESCRIPTION("ACPI RAS2 memory driver");
+MODULE_LICENSE("GPL");
diff --git a/include/acpi/ras2.h b/include/acpi/ras2.h
index cb053b5f37e7..fe7b7c454ac8 100644
--- a/include/acpi/ras2.h
+++ b/include/acpi/ras2.h
@@ -41,6 +41,13 @@ struct device;
  *			of the memory associated with the NUMA domain
  * @mem_size		Size of the lowest physical continuous memory range
  *			of the memory associated with the NUMA domain
+ * @base:		Base address of the memory region to scrub
+ * @size:		Size of the memory region to scrub
+ * @scrub_cycle_hrs:	Current scrub rate in hours
+ * @min_scrub_cycle:	Minimum scrub rate supported
+ * @max_scrub_cycle:	Maximum scrub rate supported
+ * @od_scrub_sts:	Status of demand scrubbing (memory region)
+ * @bg_scrub:		Status of background patrol scrubbing
  */
 struct ras2_mem_ctx {
 	struct auxiliary_device		adev;
@@ -51,6 +58,13 @@ struct ras2_mem_ctx {
 	u32				sys_comp_nid;
 	u64				mem_base_addr;
 	u64				mem_size;
+	u64				base;
+	u64				size;
+	u8				scrub_cycle_hrs;
+	u8				min_scrub_cycle;
+	u8				max_scrub_cycle;
+	u8				od_scrub_sts;
+	bool				bg_scrub;
 };
 
 #ifdef CONFIG_ACPI_RAS2
-- 
2.43.0



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-02 17:30 ` [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
@ 2025-09-09 16:24   ` Yazen Ghannam
  2025-09-10  8:38     ` Shiju Jose
  2025-09-10 19:27   ` Borislav Petkov
  1 sibling, 1 reply; 20+ messages in thread
From: Yazen Ghannam @ 2025-09-09 16:24 UTC (permalink / raw)
  To: shiju.jose
  Cc: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, mchehab,
	jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang

On Tue, Sep 02, 2025 at 06:30:39PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>

[...]

> +static int ras2_add_aux_device(char *name, int channel, u32 pxm_inst)
> +{
> +	unsigned long start_pfn, size_pfn;
> +	struct ras2_mem_ctx *ras2_ctx;
> +	int id, rc;

'rc' is uninitialized, and LLVM gives a warning.

The issue is the "goto ctx_free" paths return 'rc' before it is set.

> +
> +	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
> +	if (!ras2_ctx)
> +		return -ENOMEM;
> +
> +	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);
> +	/*
> +	 * Retrieve the lowest contiguous physical memory address range within
> +	 * the NUMA node.
> +	 */
> +	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
> +	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
> +	if (!size_pfn) {
> +		pr_debug("Failed to find phy addr range for NUMA node(%u)\n",
> +			 pxm_inst);
> +		goto ctx_free;
> +	}
> +	ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
> +	ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
> +
> +	rc = ras2_register_pcc_channel(ras2_ctx, channel);
> +	if (rc < 0) {
> +		pr_debug("Failed to register pcc channel rc=%d\n", rc);
> +		goto ctx_free;
> +	}
> +
> +	id = ida_alloc(&ras2_ida, GFP_KERNEL);
> +	if (id < 0) {
> +		rc = id;
> +		goto ctx_free;
> +	}
> +
> +	ras2_ctx->adev.id		= id;
> +	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
> +	ras2_ctx->adev.dev.release	= ras2_release;
> +	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
> +
> +	rc = auxiliary_device_init(&ras2_ctx->adev);
> +	if (rc)
> +		goto ida_free;
> +
> +	rc = auxiliary_device_add(&ras2_ctx->adev);
> +	if (rc) {
> +		auxiliary_device_uninit(&ras2_ctx->adev);
> +		return rc;
> +	}
> +
> +	return 0;
> +
> +ida_free:
> +	ida_free(&ras2_ida, id);
> +ctx_free:
> +	kfree(ras2_ctx);
> +
> +	return rc;
> +}
> +
> +static int acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)
> +{
> +	struct acpi_ras2_pcc_desc *pcc_desc_list;
> +	int rc;
> +	u16 i;
> +
> +	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
> +		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too short #1)\n");
> +		return -EINVAL;
> +	}
> +
> +	if (!ras2_tab->num_pcc_descs) {
> +		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
> +		return -EINVAL;
> +	}
> +
> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
> +			continue;
> +
> +		rc = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME,
> +					 pcc_desc_list->channel_id,
> +					 pcc_desc_list->instance);
> +		if (rc)
> +			pr_warn("Failed to add RAS2 auxiliary device rc=%d\n", rc);
> +	}
> +
> +	return 0;

Should this return 'rc' from above? Or is the 'warning' case not a total
failure?

Thanks,
Yazen


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-09 16:24   ` Yazen Ghannam
@ 2025-09-10  8:38     ` Shiju Jose
  0 siblings, 0 replies; 20+ messages in thread
From: Shiju Jose @ 2025-09-10  8:38 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Yazen Ghannam <yazen.ghannam@amd.com>
>Sent: 09 September 2025 17:25
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: rafael@kernel.org; bp@alien8.de; akpm@linux-foundation.org;
>rppt@kernel.org; dferguson@amperecomputing.com; linux-
>edac@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-
>doc@vger.kernel.org; tony.luck@intel.com; lenb@kernel.org;
>leo.duran@amd.com; mchehab@kernel.org; Jonathan Cameron
><jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Tue, Sep 02, 2025 at 06:30:39PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>
>[...]
>
>> +static int ras2_add_aux_device(char *name, int channel, u32 pxm_inst)
>> +{
>> +	unsigned long start_pfn, size_pfn;
>> +	struct ras2_mem_ctx *ras2_ctx;
>> +	int id, rc;
>
>'rc' is uninitialized, and LLVM gives a warning.
>
>The issue is the "goto ctx_free" paths return 'rc' before it is set.

Thanks Yazen. I missed setting rc with the error code in the failure case of the
newly added code. I will fix in the next version.

>
>> +
>> +	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
>> +	if (!ras2_ctx)
>> +		return -ENOMEM;
>> +
>> +	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);
>> +	/*
>> +	 * Retrieve the lowest contiguous physical memory address range within
>> +	 * the NUMA node.
>> +	 */
>> +	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
>> +	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
>> +	if (!size_pfn) {
>> +		pr_debug("Failed to find phy addr range for NUMA node(%u)\n",
>> +			 pxm_inst);
>> +		goto ctx_free;
>> +	}
>> +	ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
>> +	ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
>> +
[...]
>
>Thanks,
>Yazen

Thanks,
Shiju


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-02 17:30 ` [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
  2025-09-09 16:24   ` Yazen Ghannam
@ 2025-09-10 19:27   ` Borislav Petkov
  2025-09-12 12:04     ` Shiju Jose
  1 sibling, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-09-10 19:27 UTC (permalink / raw)
  To: shiju.jose
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang

On Tue, Sep 02, 2025 at 06:30:39PM +0100, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
> 
> Add support for ACPI RAS2 feature table (RAS2) defined in the
> ACPI 6.5 Specification, section 5.2.21.
> Driver defines RAS2 Init, which extracts the RAS2 table and driver

What is "RAS2 Init"?

$ grep -EriIn "ras2.*init" ras2.mbox 
188:    - Rename ras2_acpi_init() to acpi_ras2_init() and modified to call acpi_ras2_init()
283:Driver defines RAS2 Init, which extracts the RAS2 table and driver
391:+   acpi_ras2_init();
410:+ * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
774:+void __init acpi_ras2_init(void)

...

I guess you mean acpi_ras2_init().

> adds auxiliary device for each memory feature which binds to the
> RAS2 memory driver.

But if you mean that, why would you explain what the patch does? That should
be visible.

What you should explaining is the "why".

> Driver uses PCC mailbox to communicate with the ACPI HW and the
> driver adds OSPM interfaces to send RAS2 commands.

More useless "what".

> According to the ACPI specification rev 6.5, section 5.2.21.1.1
> RAS2 Platform Communication Channel Descriptor, “RAS2 supports multiple
> PCC channels, where a channel is dedicated to a given component
> instance.” Thus, RAS2 driver has been implemented to support only
> systems that comply with the specification, i.e. a dedicated PCC channel
> per system component instance for communication.

Why is that paragraph here?

> ACPI specification rev 6.5, section 5.2.21.1.1 Table 5.80: RAS2 Platform
> Communication Channel Descriptor format, defines Field: Instance,
> Identifier for the system component instance that the RAS feature is
> associated with.

More word salad.
 
> Section 5.2.21.2.1 Hardware-based Memory Scrubbing describes as
> The platform can use this feature to expose controls and capabilities
> associated with hardware-based memory scrub engines. Modern scalable
> platforms have complex memory systems with a multitude of memory
> controllers that are in turn associated with NUMA domains. It is also
> common for RAS errors related to memory to be associated with NUMA
> domains, where the NUMA domain functions as a FRU identifier. This
> feature thus provides memory scrubbing at a NUMA domain granularity.

More word salad. Why is this here at all?!

> The following are supported:
> 1. Independent memory scrubbing controls for each NUMA domain, identified
> using its proximity domain.
> 2. Provision for background (patrol) scrubbing of the entire memory
> system, as well as on-demand scrubbing for a specific region of memory.
> 
> Thus, the RAS2 driver requires the lowest contiguous physical memory range
> of the memory associated with a NUMA domain when communicating with the
> firmware for memory-related features such as scrubbing. The driver uses
> the component instance ID, as defined in Table 5.80, to retrieve the
> lowest contiguous physical memory address range within the NUMA node
> and store it in the struct ras2_context to expose the address range to
> the userspace and for the communication with the firmware.

This commit message needs serious rewriting and I don't even know what it
should contain. It is a dump of random text which reads like it is trying to
tell me something about this new driver but I'm failing to grok what
exactly...

> Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
> Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
>  drivers/acpi/Kconfig  |  12 ++
>  drivers/acpi/Makefile |   1 +
>  drivers/acpi/bus.c    |   3 +
>  drivers/acpi/ras2.c   | 389 ++++++++++++++++++++++++++++++++++++++++++
>  include/acpi/ras2.h   |  63 +++++++
>  5 files changed, 468 insertions(+)
>  create mode 100644 drivers/acpi/ras2.c
>  create mode 100644 include/acpi/ras2.h
> 
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index b594780a57d7..db21bf5a39c7 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
>  	  If your platform does not support CPPC in firmware,
>  	  leave this option disabled.
>  
> +config ACPI_RAS2
> +	bool "ACPI RAS2 driver"
> +	select AUXILIARY_BUS
> +	select MAILBOX
> +	select PCC
> +	select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS

Why is this selecting crap instead of depending on the facilities it uses?

> +	help
> +	  The driver adds support for ACPI RAS2 feature table (extracts RAS2
> +	  table from OS system table) and OSPM interfaces to send RAS2
> +	  commands via PCC mailbox subspace. Driver adds platform device for
> +	  the RAS2 memory features which binds to the RAS2 memory driver.

This help text is useless and is throwing random abbreviations around like it
ain't no tomorrow.

Also, in all your text: use definite or indefinite articles because it reads
weird otherwise.

> +
>  config ACPI_PROCESSOR
>  	tristate "Processor"
>  	depends on X86 || ARM64 || LOONGARCH || RISCV
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index d1b0affb844f..abfec6745724 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
>  obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
>  obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
>  obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
> +obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
>  obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
>  obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
>  obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index a984ccd4a2a0..b02ceb2837c6 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -31,6 +31,7 @@
>  #include <acpi/apei.h>
>  #include <linux/suspend.h>
>  #include <linux/prmt.h>
> +#include <acpi/ras2.h>
>  
>  #include "internal.h"
>  
> @@ -1474,6 +1475,8 @@ static int __init acpi_init(void)
>  	acpi_debugger_init();
>  	acpi_setup_sb_notify_handler();
>  	acpi_viot_init();
> +	acpi_ras2_init();
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c
> new file mode 100644
> index 000000000000..a17eab9eecf1
> --- /dev/null
> +++ b/drivers/acpi/ras2.c
> @@ -0,0 +1,389 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Implementation of ACPI RAS2 driver.
> + *
> + * Copyright (c) 2024-2025 HiSilicon Limited.
> + *
> + * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21
> + *
> + * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
> + * get the PCC channel subspace for communicating with the ACPI compliant
> + * HW platform which supports ACPI RAS2. Driver adds auxiliary devices
> + * for each RAS2 memory feature which binds to the memory ACPI RAS2 driver.

For whom is this "text" here and what is its purpose?

It'd be more helpful if you at least explained what all those abbreviations
meant at the top here, so that people can go find out about them if needed.

> + */
> +
> +#define pr_fmt(fmt) "ACPI RAS2: " fmt
> +
> +#include <linux/delay.h>
> +#include <linux/export.h>
> +#include <linux/iopoll.h>
> +#include <linux/ktime.h>
> +#include <acpi/pcc.h>
> +#include <acpi/ras2.h>
> +
> +/**
> + * struct ras2_pcc_subspace - Data structure for PCC communication
> + * @mbox_client:	struct mbox_client object
> + * @pcc_chan:		Pointer to struct pcc_mbox_chan
> + * @comm_addr:		Pointer to RAS2 PCC shared memory region
> + * @elem:		List for registered RAS2 PCC channel subspaces
> + * @pcc_lock:		PCC lock to provide mutually exclusive access
> + *			to PCC channel subspace
> + * @deadline_us:	Poll PCC status register timeout in micro secs
> + *			for PCC command complete
> + * @pcc_mpar:		Maximum Periodic Access Rate(MPAR) for PCC channel
> + * @pcc_mrtt:		Minimum Request Turnaround Time(MRTT) in micro secs
> + *			OS must wait after completion of a PCC command before
> + *			issue next command
> + * @last_cmd_cmpl_time:	completion time of last PCC command
> + * @last_mpar_reset:	Time of last MPAR count reset
> + * @mpar_count:		MPAR count
> + * @pcc_id:		Identifier of the RAS2 platform communication channel
> + * @last_cmd:		Last PCC command
> + * @pcc_chnl_acq:	Status of PCC channel acquired
> + */
> +struct ras2_pcc_subspace {
> +	struct mbox_client		mbox_client;
> +	struct pcc_mbox_chan		*pcc_chan;
> +	struct acpi_ras2_shmem __iomem	*comm_addr;
> +	struct list_head		elem;
> +	struct mutex			pcc_lock;
> +	unsigned int			deadline_us;
> +	unsigned int			pcc_mpar;
> +	unsigned int			pcc_mrtt;
> +	ktime_t				last_cmd_cmpl_time;
> +	ktime_t				last_mpar_reset;
> +	int				mpar_count;
> +	int				pcc_id;
> +	u16				last_cmd;
> +	bool				pcc_chnl_acq;
> +};
> +
> +/*
> + * Arbitrary retries for PCC commands because the remote processor
> + * could be much slower to reply. Keeping it high enough to cover
> + * emulators where the processors run painfully slow.
> + */
> +#define RAS2_NUM_RETRIES 600ULL

Why doesn't it have "PCC" in the name then?

> +#define RAS2_FEAT_TYPE_MEMORY 0x00
> +
> +static int ras2_report_cap_error(u32 cap_status)

All static functions do not need a "ras2_" namespace.

Also, this function is not reporting anything. Function naming *is* important.

> +{
> +	switch (cap_status) {
> +	case ACPI_RAS2_NOT_VALID:
> +	case ACPI_RAS2_NOT_SUPPORTED:
> +		return -EPERM;
> +	case ACPI_RAS2_BUSY:
> +		return -EBUSY;
> +	case ACPI_RAS2_FAILED:
> +	case ACPI_RAS2_ABORTED:
> +	case ACPI_RAS2_INVALID_DATA:
> +		return -EINVAL;
> +	default: /* 0 or other, Success */

Comments go ontop, not sideways.

And this comment is useless too.

> +		return 0;
> +	}
> +}
> +
> +static int ras2_check_pcc_chan(struct ras2_pcc_subspace *pcc_subspace)

struct ras2_pcc_subspace *sspc

or so, so that you don't have insane long lines.

> +{
> +	struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace->comm_addr;
> +	u32 cap_status, rc;
> +	u16 status;
> +
> +	/*
> +	 * As per ACPI spec, the PCC space will be initialized by
> +	 * platform and should have set the command completion bit when
> +	 * PCC can be used by OSPM.
> +	 *
> +	 * Poll PCC status register every 3us(delay_us) for maximum of

delay_us?

> +	 * deadline_us(timeout_us)

timeout_us?

> until PCC command complete bit is set(cond).

cond?

> +	 */
> +	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
> +					status & PCC_STATUS_CMD_COMPLETE, 3,
> +					pcc_subspace->deadline_us);
> +	if (rc) {

This function returns an int, you store it into a u32, caller checks for rc
< 0...

What a mess. :-(

> +		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
> +			pcc_subspace->pcc_id, rc);
> +		return rc;
> +	}
> +
> +	if (status & PCC_STATUS_ERROR) {
> +		pr_warn("Error in executing last command=%d for pcc_id=%d\n",
> +			pcc_subspace->last_cmd, pcc_subspace->pcc_id);
> +		status &= ~PCC_STATUS_ERROR;
> +		writew_relaxed(status, &gen_comm_base->status);
> +		return -EIO;
> +	}
> +
> +	cap_status = readw_relaxed(&gen_comm_base->set_caps_status);
> +	writew_relaxed(0x0, &gen_comm_base->set_caps_status);
> +	return ras2_report_cap_error(cap_status);
> +}
> +
> +/**
> + * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
> + * @ras2_ctx:	pointer to the RAS2 context structure
> + * @cmd:	command to send
> + *
> + * Returns: 0 on success, an error otherwise
> + */
> +int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd)
> +{
> +	struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
> +	struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace->comm_addr;
> +	struct mbox_chan *pcc_channel;
> +	unsigned int time_delta;
> +	int rc;
> +
> +	rc = ras2_check_pcc_chan(pcc_subspace);
> +	if (rc < 0)
> +		return rc;
> +
> +	pcc_channel = pcc_subspace->pcc_chan->mchan;
> +
> +	/*
> +	 * Handle the Minimum Request Turnaround Time(MRTT).
						     ^

In all your text: put a space before the abbreviation in brackets.


> +	 * "The minimum amount of time that OSPM must wait after the completion
> +	 * of a command before issuing the next command, in microseconds."
> +	 */
> +	if (pcc_subspace->pcc_mrtt) {
> +		time_delta = ktime_us_delta(ktime_get(),
> +					    pcc_subspace->last_cmd_cmpl_time);
> +		if (pcc_subspace->pcc_mrtt > time_delta)
> +			udelay(pcc_subspace->pcc_mrtt - time_delta);
> +	}
> +
> +	/*
> +	 * Handle the non-zero Maximum Periodic Access Rate(MPAR).
> +	 * "The maximum number of periodic requests that the subspace channel can
> +	 * support, reported in commands per minute. 0 indicates no limitation."
> +	 *
> +	 * This parameter should be ideally zero or large enough so that it can
> +	 * handle maximum number of requests that all the cores in the system can
> +	 * collectively generate. If it is not, we will follow the spec and just

s/we will//

> +	 * not send the request to the platform after hitting the MPAR limit in
> +	 * any 60s window.
> +	 */
> +	if (pcc_subspace->pcc_mpar) {
> +		if (pcc_subspace->mpar_count == 0) {
> +			time_delta = ktime_ms_delta(ktime_get(),
> +						    pcc_subspace->last_mpar_reset);
> +			if (time_delta < 60 * MSEC_PER_SEC) {
> +				dev_dbg(ras2_ctx->dev,
> +					"PCC cmd not sent due to MPAR limit");

you can dump which cmd here.

> +				return -EIO;
> +			}
> +			pcc_subspace->last_mpar_reset = ktime_get();
> +			pcc_subspace->mpar_count = pcc_subspace->pcc_mpar;
> +		}
> +		pcc_subspace->mpar_count--;
> +	}
> +
> +	/* Write to the shared comm region */
> +	writew_relaxed(cmd, &gen_comm_base->command);
> +
> +	/* Flip CMD COMPLETE bit */
> +	writew_relaxed(0, &gen_comm_base->status);
> +
> +	/* Ring doorbell */
> +	rc = mbox_send_message(pcc_channel, &cmd);
> +	if (rc < 0) {
> +		dev_warn(ras2_ctx->dev,
> +			 "Err sending PCC mbox message. cmd:%d, rc:%d\n",
> +			 cmd, rc);
> +		return rc;
> +	}
> +
> +	pcc_subspace->last_cmd = cmd;
> +
> +	/*
> +	 * If Minimum Request Turnaround Time is non-zero, we need
> +	 * to record the completion time of both READ and WRITE

s/we need to//

You get the idea...

> +	 * command for proper handling of MRTT, so we need to check

s/we need to//

> +	 * for pcc_mrtt in addition to CMD_READ.
> +	 */
> +	if (cmd == PCC_CMD_EXEC_RAS2 || pcc_subspace->pcc_mrtt) {
> +		rc = ras2_check_pcc_chan(pcc_subspace);
> +		if (pcc_subspace->pcc_mrtt)
> +			pcc_subspace->last_cmd_cmpl_time = ktime_get();
> +	}
> +
> +	if (pcc_channel->mbox->txdone_irq)
> +		mbox_chan_txdone(pcc_channel, rc);
> +	else
> +		mbox_client_txdone(pcc_channel, rc);
> +
> +	return rc < 0 ? rc : 0;
> +}
> +EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
> +
> +static void ras2_list_pcc_release(struct ras2_pcc_subspace *pcc_subspace)
> +{
> +	pcc_mbox_free_channel(pcc_subspace->pcc_chan);
> +	kfree(pcc_subspace);
> +}
> +
> +static int ras2_register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, int pcc_id)
> +{
> +	struct ras2_pcc_subspace *pcc_subspace;
> +	struct pcc_mbox_chan *pcc_chan;
> +	struct mbox_client *mbox_cl;
> +
> +	if (pcc_id < 0)
> +		return -EINVAL;
> +
> +	pcc_subspace = kzalloc(sizeof(*pcc_subspace), GFP_KERNEL);
> +	if (!pcc_subspace)
> +		return -ENOMEM;
> +
> +	mbox_cl			= &pcc_subspace->mbox_client;
> +	mbox_cl->knows_txdone	= true;
> +
> +	pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_id);
> +	if (IS_ERR(pcc_chan)) {
> +		kfree(pcc_subspace);
> +		return PTR_ERR(pcc_chan);
> +	}
> +
> +	pcc_subspace->pcc_id		= pcc_id;
> +	pcc_subspace->pcc_chan		= pcc_chan;
> +	pcc_subspace->comm_addr		= pcc_chan->shmem;
> +	pcc_subspace->deadline_us	= RAS2_NUM_RETRIES * pcc_chan->latency;
> +	pcc_subspace->pcc_mrtt		= pcc_chan->min_turnaround_time;
> +	pcc_subspace->pcc_mpar		= pcc_chan->max_access_rate;
> +	pcc_subspace->mbox_client.knows_txdone	= true;
> +	pcc_subspace->pcc_chnl_acq	= true;
> +
> +	ras2_ctx->pcc_subspace	= pcc_subspace;
> +	ras2_ctx->comm_addr	= pcc_subspace->comm_addr;
> +	ras2_ctx->dev		= pcc_chan->mchan->mbox->dev;
> +
> +	mutex_init(&pcc_subspace->pcc_lock);
> +	ras2_ctx->pcc_lock	= &pcc_subspace->pcc_lock;
> +
> +	return 0;
> +}
> +
> +static DEFINE_IDA(ras2_ida);
> +static void ras2_release(struct device *device)
> +{
> +	struct auxiliary_device *auxdev = to_auxiliary_dev(device);
> +	struct ras2_mem_ctx *ras2_ctx =
> +		container_of(auxdev, struct ras2_mem_ctx, adev);
> +
> +	ida_free(&ras2_ida, auxdev->id);
> +	ras2_list_pcc_release(ras2_ctx->pcc_subspace);
> +	kfree(ras2_ctx);
> +}
> +
> +static int ras2_add_aux_device(char *name, int channel, u32 pxm_inst)
> +{
> +	unsigned long start_pfn, size_pfn;
> +	struct ras2_mem_ctx *ras2_ctx;
> +	int id, rc;
> +
> +	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
> +	if (!ras2_ctx)
> +		return -ENOMEM;
> +
> +	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);

<---- newline here.

> +	/*
> +	 * Retrieve the lowest contiguous physical memory address range within
> +	 * the NUMA node.
> +	 */

Why is this requirement here?

The commit message is trying to explain it but I'm still none the wiser.

> +	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
> +	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
> +	if (!size_pfn) {
> +		pr_debug("Failed to find phy addr range for NUMA node(%u)\n",
> +			 pxm_inst);
> +		goto ctx_free;
> +	}
> +	ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
> +	ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
> +
> +	rc = ras2_register_pcc_channel(ras2_ctx, channel);
> +	if (rc < 0) {
> +		pr_debug("Failed to register pcc channel rc=%d\n", rc);
> +		goto ctx_free;
> +	}
> +
> +	id = ida_alloc(&ras2_ida, GFP_KERNEL);
> +	if (id < 0) {
> +		rc = id;
> +		goto ctx_free;

You just leaked pcc_subspace which ras2_register_pcc_channel() allocated.

> +	}
> +
> +	ras2_ctx->adev.id		= id;
> +	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
> +	ras2_ctx->adev.dev.release	= ras2_release;
> +	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
> +
> +	rc = auxiliary_device_init(&ras2_ctx->adev);
> +	if (rc)
> +		goto ida_free;
> +
> +	rc = auxiliary_device_add(&ras2_ctx->adev);
> +	if (rc) {
> +		auxiliary_device_uninit(&ras2_ctx->adev);
> +		return rc;

And this leaked everything you allocated above.

> +	}
> +
> +	return 0;
> +
> +ida_free:
> +	ida_free(&ras2_ida, id);
> +ctx_free:
> +	kfree(ras2_ctx);
> +
> +	return rc;
> +}
> +
> +static int acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)
> +{
> +	struct acpi_ras2_pcc_desc *pcc_desc_list;
> +	int rc;
> +	u16 i;
> +
> +	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
> +		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too short #1)\n");

So dump the sizes to make the error message more helpful.

> +		return -EINVAL;
> +	}
> +
> +	if (!ras2_tab->num_pcc_descs) {
> +		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
> +		return -EINVAL;
> +	}
> +
> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
> +			continue;
> +
> +		rc = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME,
> +					 pcc_desc_list->channel_id,
> +					 pcc_desc_list->instance);
> +		if (rc)
> +			pr_warn("Failed to add RAS2 auxiliary device rc=%d\n", rc);

What happens with the aux devices you created successfully here? Unwind?

> +	}
> +
> +	return 0;
> +}
> +
> +void __init acpi_ras2_init(void)
> +{
> +	struct acpi_table_ras2 *ras2_tab;
> +	acpi_status status;
> +
> +	status = acpi_get_table(ACPI_SIG_RAS2, 0,
> +				(struct acpi_table_header **)&ras2_tab);
> +	if (ACPI_FAILURE(status)) {
> +		pr_err("Failed to get table, %s\n", acpi_format_exception(status));
> +		return;
> +	}
> +
> +	if (acpi_ras2_parse(ras2_tab))
> +		pr_err("Failed to parse RAS2 table\n");

No need for that print if you issue warns above.

> +
> +	acpi_put_table((struct acpi_table_header *)ras2_tab);
> +}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-10 19:27   ` Borislav Petkov
@ 2025-09-12 12:04     ` Shiju Jose
  2025-09-12 14:11       ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Shiju Jose @ 2025-09-12 12:04 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 10 September 2025 20:27
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: rafael@kernel.org; akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; mchehab@kernel.org; Jonathan Cameron
><jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Tue, Sep 02, 2025 at 06:30:39PM +0100, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add support for ACPI RAS2 feature table (RAS2) defined in the ACPI 6.5
>> Specification, section 5.2.21.
>> Driver defines RAS2 Init, which extracts the RAS2 table and driver
>

Thanks for reviewing  and for the feedback.

>What is "RAS2 Init"?
>
>$ grep -EriIn "ras2.*init" ras2.mbox
>188:    - Rename ras2_acpi_init() to acpi_ras2_init() and modified to call
>acpi_ras2_init()
>283:Driver defines RAS2 Init, which extracts the RAS2 table and driver
>391:+   acpi_ras2_init();
>410:+ * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
>774:+void __init acpi_ras2_init(void)
>
>...
>
>I guess you mean acpi_ras2_init().

Yes. 

>
>> adds auxiliary device for each memory feature which binds to the
>> RAS2 memory driver.
>
>But if you mean that, why would you explain what the patch does? That should
>be visible.
>
>What you should explaining is the "why".
Will do.  I will update the patch header.

>
>> Driver uses PCC mailbox to communicate with the ACPI HW and the driver
[...]
>>
>> Thus, the RAS2 driver requires the lowest contiguous physical memory
>> range of the memory associated with a NUMA domain when communicating
>> with the firmware for memory-related features such as scrubbing. The
>> driver uses the component instance ID, as defined in Table 5.80, to
>> retrieve the lowest contiguous physical memory address range within
>> the NUMA node and store it in the struct ras2_context to expose the
>> address range to the userspace and for the communication with the firmware.
>
>This commit message needs serious rewriting and I don't even know what it
>should contain. It is a dump of random text which reads like it is trying to tell me
>something about this new driver but I'm failing to grok what exactly...

I will move these description to the code where it retrieve physical memory range
for the NUMA domain. 
>
>> Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
>> Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>>  drivers/acpi/Kconfig  |  12 ++
>>  drivers/acpi/Makefile |   1 +
>>  drivers/acpi/bus.c    |   3 +
>>  drivers/acpi/ras2.c   | 389 ++++++++++++++++++++++++++++++++++++++++++
>>  include/acpi/ras2.h   |  63 +++++++
>>  5 files changed, 468 insertions(+)
>>  create mode 100644 drivers/acpi/ras2.c  create mode 100644
>> include/acpi/ras2.h
>>
>> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index
>> b594780a57d7..db21bf5a39c7 100644
>> --- a/drivers/acpi/Kconfig
>> +++ b/drivers/acpi/Kconfig
>> @@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
>>  	  If your platform does not support CPPC in firmware,
>>  	  leave this option disabled.
>>
>> +config ACPI_RAS2
>> +	bool "ACPI RAS2 driver"
>> +	select AUXILIARY_BUS
>> +	select MAILBOX
>> +	select PCC
>> +	select NUMA_KEEP_MEMINFO if NUMA_MEMBLKS
>
>Why is this selecting crap instead of depending on the facilities it uses?
Ok. 
>
>> +	help
>> +	  The driver adds support for ACPI RAS2 feature table (extracts RAS2
>> +	  table from OS system table) and OSPM interfaces to send RAS2
>> +	  commands via PCC mailbox subspace. Driver adds platform device for
>> +	  the RAS2 memory features which binds to the RAS2 memory driver.
>
>This help text is useless and is throwing random abbreviations around like it ain't
>no tomorrow.
>
>Also, in all your text: use definite or indefinite articles because it reads weird
>otherwise.
I will change.
>
>> +
[...]
>>
>> diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c new file mode
>> 100644 index 000000000000..a17eab9eecf1
>> --- /dev/null
>> +++ b/drivers/acpi/ras2.c
>> @@ -0,0 +1,389 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Implementation of ACPI RAS2 driver.
>> + *
>> + * Copyright (c) 2024-2025 HiSilicon Limited.
>> + *
>> + * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21
>> + *
>> + * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table
>> +and
>> + * get the PCC channel subspace for communicating with the ACPI
>> +compliant
>> + * HW platform which supports ACPI RAS2. Driver adds auxiliary
>> +devices
>> + * for each RAS2 memory feature which binds to the memory ACPI RAS2
>driver.
>
>For whom is this "text" here and what is its purpose?
>
>It'd be more helpful if you at least explained what all those abbreviations meant
>at the top here, so that people can go find out about them if needed.
Will do.
>
>> + */
>> +
>> +#define pr_fmt(fmt) "ACPI RAS2: " fmt
[...]
>> + * Arbitrary retries for PCC commands because the remote processor
>> + * could be much slower to reply. Keeping it high enough to cover
>> + * emulators where the processors run painfully slow.
>> + */
>> +#define RAS2_NUM_RETRIES 600ULL
>
>Why doesn't it have "PCC" in the name then?
Will change.

>
>> +#define RAS2_FEAT_TYPE_MEMORY 0x00
>> +
>> +static int ras2_report_cap_error(u32 cap_status)
>
>All static functions do not need a "ras2_" namespace.
>
>Also, this function is not reporting anything. Function naming *is* important.
Will change.

>
>> +{
>> +	switch (cap_status) {
>> +	case ACPI_RAS2_NOT_VALID:
>> +	case ACPI_RAS2_NOT_SUPPORTED:
>> +		return -EPERM;
>> +	case ACPI_RAS2_BUSY:
>> +		return -EBUSY;
>> +	case ACPI_RAS2_FAILED:
>> +	case ACPI_RAS2_ABORTED:
>> +	case ACPI_RAS2_INVALID_DATA:
>> +		return -EINVAL;
>> +	default: /* 0 or other, Success */
>
>Comments go ontop, not sideways.
>
>And this comment is useless too.
Ok.

>
>> +		return 0;
>> +	}
>> +}
>> +
>> +static int ras2_check_pcc_chan(struct ras2_pcc_subspace
>> +*pcc_subspace)
>
>struct ras2_pcc_subspace *sspc
>
>or so, so that you don't have insane long lines.
>
>> +{
>> +	struct acpi_ras2_shmem __iomem *gen_comm_base = pcc_subspace-
>>comm_addr;
>> +	u32 cap_status, rc;
>> +	u16 status;
>> +
>> +	/*
>> +	 * As per ACPI spec, the PCC space will be initialized by
>> +	 * platform and should have set the command completion bit when
>> +	 * PCC can be used by OSPM.
>> +	 *
>> +	 * Poll PCC status register every 3us(delay_us) for maximum of
>
>delay_us?
>
>> +	 * deadline_us(timeout_us)
>
>timeout_us?
>
>> until PCC command complete bit is set(cond).
>
>cond?
Will update the comment.

>
>> +	 */
>> +	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
>> +					status &
>PCC_STATUS_CMD_COMPLETE, 3,
>> +					pcc_subspace->deadline_us);
>> +	if (rc) {
>
>This function returns an int, you store it into a u32, caller checks for rc < 0...
>
>What a mess. :-(

Will fix.
>
>> +		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
>> +			pcc_subspace->pcc_id, rc);
>> +		return rc;
>> +	}
>> +
[...]
>> +	/*
>> +	 * Handle the Minimum Request Turnaround Time(MRTT).
>						     ^
>
>In all your text: put a space before the abbreviation in brackets.
>
Will do.

>
>> +	 * "The minimum amount of time that OSPM must wait after the
>completion
[...]
><---- newline here.
Ok.
>
>> +	/*
>> +	 * Retrieve the lowest contiguous physical memory address range within
>> +	 * the NUMA node.
>> +	 */
>
>Why is this requirement here?
The physical memory address range retrieved here for the NUMA domain is used in the subsequent
patch  [PATCH v12 2/2] ras: mem: Add memory ACPI RAS2 driver,
1. to set Requested Address Range(INPUT) field of Table 5.87: Parameter Block Structure for PATROL_SCRUB
when send GET_PATROL_PARAMETERS command to the firmware, to get scrub parameters, running status,
current scrub rate etc.
2. for the validity check of the user requested memory address range to scrub. 
Also intended to expose this supported memory address range to the
userspace via EDAC scrub control interface, though it is not present now.

>
>The commit message is trying to explain it but I'm still none the wiser.
>
>> +	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
>> +	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
>> +	if (!size_pfn) {
>> +		pr_debug("Failed to find phy addr range for NUMA node(%u)\n",
>> +			 pxm_inst);
>> +		goto ctx_free;
>> +	}
>> +	ras2_ctx->mem_base_addr = __pfn_to_phys(start_pfn);
>> +	ras2_ctx->mem_size = __pfn_to_phys(size_pfn);
>> +
>> +	rc = ras2_register_pcc_channel(ras2_ctx, channel);
>> +	if (rc < 0) {
>> +		pr_debug("Failed to register pcc channel rc=%d\n", rc);
>> +		goto ctx_free;
>> +	}
>> +
>> +	id = ida_alloc(&ras2_ida, GFP_KERNEL);
>> +	if (id < 0) {
>> +		rc = id;
>> +		goto ctx_free;
>
>You just leaked pcc_subspace which ras2_register_pcc_channel() allocated.
Thanks for spotting this. The ras2_pcc_put() was removed in v11 during simplification,
but missed to call  pcc_mbox_free_channel(...) and kfree(pcc_subspace) here.
Will fix.

>
>> +	}
>> +
>> +	ras2_ctx->adev.id		= id;
>> +	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
>> +	ras2_ctx->adev.dev.release	= ras2_release;
>> +	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
>> +
>> +	rc = auxiliary_device_init(&ras2_ctx->adev);
>> +	if (rc)
>> +		goto ida_free;
>> +
>> +	rc = auxiliary_device_add(&ras2_ctx->adev);
>> +	if (rc) {
>> +		auxiliary_device_uninit(&ras2_ctx->adev);
>> +		return rc;
>
>And this leaked everything you allocated above.
auxiliary_device_uninit() called on failure would trigger  ras2_release(),  which
free the allocated memories etc.
https://elixir.bootlin.com/linux/v6.17-rc5/source/drivers/base/auxiliary.c#L316

>
>> +	}
>> +
>> +	return 0;
>> +
>> +ida_free:
>> +	ida_free(&ras2_ida, id);
>> +ctx_free:
>> +	kfree(ras2_ctx);
>> +
>> +	return rc;
>> +}
>> +
>> +static int acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab) {
>> +	struct acpi_ras2_pcc_desc *pcc_desc_list;
>> +	int rc;
>> +	u16 i;
>> +
>> +	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
>> +		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too
>short
>> +#1)\n");
>
>So dump the sizes to make the error message more helpful.
Sure.

>
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (!ras2_tab->num_pcc_descs) {
>> +		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
>> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
>> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
>> +			continue;
>> +
>> +		rc = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME,
>> +					 pcc_desc_list->channel_id,
>> +					 pcc_desc_list->instance);
>> +		if (rc)
>> +			pr_warn("Failed to add RAS2 auxiliary device rc=%d\n",
>rc);
>
>What happens with the aux devices you created successfully here? Unwind?
Please see the previous discussions on this were about allowing the successfully created
auxiliary devices to exist.
https://lore.kernel.org/all/20250415210504.GA854098@yaz-khff2.amd.com/

>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +void __init acpi_ras2_init(void)
>> +{
>> +	struct acpi_table_ras2 *ras2_tab;
>> +	acpi_status status;
>> +
>> +	status = acpi_get_table(ACPI_SIG_RAS2, 0,
>> +				(struct acpi_table_header **)&ras2_tab);
>> +	if (ACPI_FAILURE(status)) {
>> +		pr_err("Failed to get table, %s\n",
>acpi_format_exception(status));
>> +		return;
>> +	}
>> +
>> +	if (acpi_ras2_parse(ras2_tab))
>> +		pr_err("Failed to parse RAS2 table\n");
>
>No need for that print if you issue warns above.
Will remove.
>
>> +
>> +	acpi_put_table((struct acpi_table_header *)ras2_tab); }
>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette


Thanks,
Shiju


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-12 12:04     ` Shiju Jose
@ 2025-09-12 14:11       ` Borislav Petkov
  2025-09-15 11:50         ` Shiju Jose
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-09-12 14:11 UTC (permalink / raw)
  To: Shiju Jose
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Fri, Sep 12, 2025 at 12:04:57PM +0000, Shiju Jose wrote:
> >Why is this requirement here?
> The physical memory address range retrieved here for the NUMA domain is used in the subsequent
> patch  [PATCH v12 2/2] ras: mem: Add memory ACPI RAS2 driver,
> 1. to set Requested Address Range(INPUT) field of Table 5.87: Parameter Block Structure for PATROL_SCRUB
> when send GET_PATROL_PARAMETERS command to the firmware, to get scrub parameters, running status,
> current scrub rate etc.
> 2. for the validity check of the user requested memory address range to scrub. 

Again, why does it have to be *lowest* and *contiguous*?

Your answer doesn't explain that.

> Also intended to expose this supported memory address range to the
> userspace via EDAC scrub control interface, though it is not present now.

Why? To tie ourselves with even more user ABI?!

There better be a good reason and not a better design for what this is trying
to do.

> >What happens with the aux devices you created successfully here? Unwind?
> Please see the previous discussions on this were about allowing the successfully created
> auxiliary devices to exist.
> https://lore.kernel.org/all/20250415210504.GA854098@yaz-khff2.amd.com/

There's no discussion here. And nothing answers the question "why" this is ok
to do this way.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-12 14:11       ` Borislav Petkov
@ 2025-09-15 11:50         ` Shiju Jose
  2025-09-17 16:22           ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Shiju Jose @ 2025-09-15 11:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 12 September 2025 15:12
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: rafael@kernel.org; akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; mchehab@kernel.org; Jonathan Cameron
><jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Fri, Sep 12, 2025 at 12:04:57PM +0000, Shiju Jose wrote:
>> >Why is this requirement here?
>> The physical memory address range retrieved here for the NUMA domain
>> is used in the subsequent patch  [PATCH v12 2/2] ras: mem: Add memory
>> ACPI RAS2 driver, 1. to set Requested Address Range(INPUT) field of
>> Table 5.87: Parameter Block Structure for PATROL_SCRUB when send
>> GET_PATROL_PARAMETERS command to the firmware, to get scrub
>parameters, running status, current scrub rate etc.
>> 2. for the validity check of the user requested memory address range to scrub.
>
>Again, why does it have to be *lowest* and *contiguous*?
>
>Your answer doesn't explain that.
This has been added as suggested by Jonathan considering the interleaved NUMA node.
Link to the related discussion in V11:
https://lore.kernel.org/all/20250821100655.00003942@huawei.com/#t

| node 0 | node 1 | node 0 |   PA address map.
Can you give your suggestion what we should do about it?

>
>> Also intended to expose this supported memory address range to the
>> userspace via EDAC scrub control interface, though it is not present now.
>
>Why? To tie ourselves with even more user ABI?!
>
>There better be a good reason and not a better design for what this is trying to
>do.

The "address_range_base" and "address_range_size" sysfs attributes
(until the v13 of EDAC scrub interface,) which we could be used for publish this
physical address range of the memory in NUMA domain to the userspace when the demand scrubbing
is not in progress, but "address_range_base" has changed to read the status of  on-demand
scrubbing based on the feedback here.
https://lore.kernel.org/all/4ee36d03a2894606a571b37f440da36f@huawei.com/#t   
Also to present this requirement for the RAS2,  there was no method found to retrieve the
memory physical address range until recent versions as it is not supported in the RAS2.

Use case for the RAS2 to publish the supported PA range of the node memory to the userspace:
Systems with multiple NUMA node domains with the support for the demand scrubbing
exposed to the user via the EDAC scrub interface as acpi_ras_mem0/scrub, acpi_ras_mem1/scrub,
acpi_ras_mem2/scrub, ... etc.   When the userspace tool (For e.g. rasdaemon)  or an admin detects
a faulty page or faulty address, system policy may decides to scrub the corresponding memory. 
However it is required to find out the EDAC scrub instance of  the corresponding  memory in the
NUMA domain, set scrub parameters and issue the scrub request.
There are two options present,
(1) Set the scrub parameters and issue scrub request in all the EDAC scrub instances present for RAS2. The
 scrub request should fail for the invalid cases.
(2)  Locate the corresponding EDAC scrub instances for the corresponding Node memory
      by read and check against the PA range published.

I think Option (2) seems better? 
If so, can the EDAC scrub interface  be updated to include attributes for publishing the supported
PA range for the memory device to scrub?

>
>> >What happens with the aux devices you created successfully here? Unwind?
>> Please see the previous discussions on this were about allowing the
>> successfully created auxiliary devices to exist.
>> https://lore.kernel.org/all/20250415210504.GA854098@yaz-khff2.amd.com/
>
>There's no discussion here. And nothing answers the question "why" this is ok to
>do this way.

This was changed  based on the feedback from the Yazen in v3 of the series,
Copy of the Yazen's feedback from the above link: 
=============================
> +	}
> +
> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
> +			continue;
> +
> +		rc = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME, pcc_desc_list->channel_id);
> +		if (rc)
> +			return rc;

This returns error on the first failure.

What if there was a success before? Does that aux_device need to be removed?

If not, then why return failure at all? Why not just try to add all devices? Some may fail and some may succeed.
============================= 

We thought second option is a better because a successfully added aux dev for a memory device and corresponding
EDAC interface continue exist and support the scrub/a memory feature. 
We do not mind doing stop on a failure adding an aux_device and free previously crated aux devices, though
it may require some additional dynamically allocated memory space to store the successfully created aux devices
so that free them on a failure later. Hope that is acceptable?

Thanks,
Shiju  

>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-15 11:50         ` Shiju Jose
@ 2025-09-17 16:22           ` Borislav Petkov
  2025-09-17 17:36             ` Jonathan Cameron
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-09-17 16:22 UTC (permalink / raw)
  To: Shiju Jose
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, Yazen.Ghannam, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Mon, Sep 15, 2025 at 11:50:16AM +0000, Shiju Jose wrote:
> This has been added as suggested by Jonathan considering the interleaved NUMA node.
> Link to the related discussion in V11:
> https://lore.kernel.org/all/20250821100655.00003942@huawei.com/#t

Sorry, this doesn't work this way.

If something in the code is being done which is not obvious and trivial, then
the reason for it is written down in a prominent place so that it is clear to
people.

Not pointing to a discussion or some funky place on the web where someone
might've said something.

Your patch submission should contain that info and not have reviewers ask for
it.

> | node 0 | node 1 | node 0 |   PA address map.
> Can you give your suggestion what we should do about it?

I don't know what the problem is to begin with...

> I think Option (2) seems better?  If so, can the EDAC scrub interface  be
> updated to include attributes for publishing the supported PA range for the
> memory device to scrub?

The memory ranges should already be available somewhere in the NUMA/mm code or
so and for starters, we should start a scrub for all ranges and do the
single-range only when there really is a good reason for it.

Also, you don't have to expose any ranges to userspace in order to start
a scrub activity - you can simply start the scrub in the affected range
automatically.

Like I preached the last time, your aim should be to make as much of the
variables that control the scrub automatic and not expose everything to
userspace so that some userspace tool decides. The tool should simply start
the scrub and the kernel should DTRT.

> This returns error on the first failure.
> 
> What if there was a success before? Does that aux_device need to be removed?
> 
> If not, then why return failure at all? Why not just try to add all devices? Some may fail and some may succeed.
> ============================= 
> 
> We thought second option is a better because a successfully added aux dev for a memory device and corresponding
> EDAC interface continue exist and support the scrub/a memory feature. 
> We do not mind doing stop on a failure adding an aux_device and free previously crated aux devices, though
> it may require some additional dynamically allocated memory space to store the successfully created aux devices
> so that free them on a failure later. Hope that is acceptable?

So how are you going to present to people a subset of devices loaded? And what
is the point at all? 

Is there a valid use case where you can use only a subset of the devices to
even try to support such nonsense?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-17 16:22           ` Borislav Petkov
@ 2025-09-17 17:36             ` Jonathan Cameron
  2025-09-18 20:22               ` Daniel Ferguson
  2025-09-19 10:39               ` Borislav Petkov
  0 siblings, 2 replies; 20+ messages in thread
From: Jonathan Cameron @ 2025-09-17 17:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shiju Jose, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Wed, 17 Sep 2025 18:22:53 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Mon, Sep 15, 2025 at 11:50:16AM +0000, Shiju Jose wrote:
> > This has been added as suggested by Jonathan considering the interleaved NUMA node.
> > Link to the related discussion in V11:
> > https://lore.kernel.org/all/20250821100655.00003942@huawei.com/#t  
> 
> Sorry, this doesn't work this way.
> 
> If something in the code is being done which is not obvious and trivial, then
> the reason for it is written down in a prominent place so that it is clear to
> people.
> 
> Not pointing to a discussion or some funky place on the web where someone
> might've said something.
> 
> Your patch submission should contain that info and not have reviewers ask for
> it.
> 
> > | node 0 | node 1 | node 0 |   PA address map.
> > Can you give your suggestion what we should do about it?  
> 
> I don't know what the problem is to begin with...
> 
> > I think Option (2) seems better?  If so, can the EDAC scrub interface  be
> > updated to include attributes for publishing the supported PA range for the
> > memory device to scrub?  
> 
> The memory ranges should already be available somewhere in the NUMA/mm code or
> so and for starters, we should start a scrub for all ranges and do the
> single-range only when there really is a good reason for it.
> 
> Also, you don't have to expose any ranges to userspace in order to start
> a scrub activity - you can simply start the scrub in the affected range
> automatically.
> 
> Like I preached the last time, your aim should be to make as much of the
> variables that control the scrub automatic and not expose everything to
> userspace so that some userspace tool decides. The tool should simply start
> the scrub and the kernel should DTRT.

This 'first contiguous range' is an attempt to DTRT in a corner
case that is real but where there is not an obvious right thing due to spec limitations.

The problem is the ACPI specification ties a controller to a NUMA / _PXM but
then controls it via a single memory range with rules on whether that range can
include holes (that are actually covered by a different controller).  There
is no way to discover if two disconnected ranges may be scanned at once other
that trying that.

Without resolving this corner, there is no way we could come up with for the kernel
to DTRT.

Aim is to automatically establish the range that can be scrubbed. The corner
case is the hole problem.  Alternative as discussed in earlier versions of
this series was ignore holes.

The options discussed in earlier versions of this patch
=======================================================

So with Node / PA mapping (happens because people want low memory addresses
near to CPUs in different sockets - I simplified it here somewhat).

| Node 0 | Node 1 | Node 0 | Node 2|

1. Hole skipping approach
Have control parameters default to
|       Node 0             |
         | Node 1 |
                           | Node 2|
2. Present part of range that is at least not including a hole where it might
look like we were controlling memory scrub that we were not.
| Node 0 |
         | Node 1 |
                           | Node 2|

3. Just don't present anything and leave it up to general mm interfaces
   to provide what 'should' be set.
|
All 3 here, 0 size.

Whilst nice to support, I'm not seeing 'default' as a key use case and
changing scrub from a kernel driver without policy from userspace to
me would be a wrong thing to do. (I'm not sure if you are suggesting this)

The scrub should always be running pre linux (true today on all systems that I'm
aware of, but maybe not as universal as I think?).  If there is a need for
a default scrub setting on boot of Linux, then sure we can add it.

This interface all about tweaking the settings not defaults (unlike the CXL
case which does need the setting of defaults as well because of hotplug of
the devices and lack of firmware involvement in that).

We are fine with any of the options above.

This was an attempt to respond to review feedback from Daniel - it was not
something Huawei needs.

https://lore.kernel.org/all/547ed8fb-d6b7-4b6b-a38b-bf13223971b1@os.amperecomputing.com/

After a discussion of why 0, 0 defaults give an unexpected result...

"Proposed Solution:
What we propose, is to instead of zeroing out the base and size after an error,
use the full range of the current NUMA node. We believe that a superset of a
currently active scrub range can properly report all the relevant and correct
information."

The above Numa node pattern in PA space |0|1|0|2| etc is a thing
that happens on real systems so if was the best that had come up in earlier
discussion as an approximation of what Daniel asked for that should allow the
right values to be queried.

I'm not entirely sure this even matters now they have resolved the shared
PCC interface issue on their platforms.  Felt nice to provide meaningful
defaults but maybe this is a problem we don't need to solve and can go back to
just using 0, 0 until told to do something else.

Daniel, perhaps you can provide more info?

Thanks,

Jonathan

> 
> > This returns error on the first failure.
> > 
> > What if there was a success before? Does that aux_device need to be removed?
> > 
> > If not, then why return failure at all? Why not just try to add all devices? Some may fail and some may succeed.
> > ============================= 
> > 
> > We thought second option is a better because a successfully added aux dev for a memory device and corresponding
> > EDAC interface continue exist and support the scrub/a memory feature. 
> > We do not mind doing stop on a failure adding an aux_device and free previously crated aux devices, though
> > it may require some additional dynamically allocated memory space to store the successfully created aux devices
> > so that free them on a failure later. Hope that is acceptable?  
> 
> So how are you going to present to people a subset of devices loaded? And what
> is the point at all? 
> 
> Is there a valid use case where you can use only a subset of the devices to
> even try to support such nonsense?
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-17 17:36             ` Jonathan Cameron
@ 2025-09-18 20:22               ` Daniel Ferguson
  2025-09-19 10:39               ` Borislav Petkov
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Ferguson @ 2025-09-18 20:22 UTC (permalink / raw)
  To: Jonathan Cameron, Borislav Petkov
  Cc: Shiju Jose, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang, vanshikonda, danielf



On 9/17/2025 10:36 AM, Jonathan Cameron wrote:
> On Wed, 17 Sep 2025 18:22:53 +0200
> Borislav Petkov <bp@alien8.de> wrote:
> 
>> On Mon, Sep 15, 2025 at 11:50:16AM +0000, Shiju Jose wrote:
>>> This has been added as suggested by Jonathan considering the interleaved NUMA node.
>>> Link to the related discussion in V11:
>>> https://lore.kernel.org/all/20250821100655.00003942@huawei.com/#t  
>>
>> Sorry, this doesn't work this way.
>>
>> If something in the code is being done which is not obvious and trivial, then
>> the reason for it is written down in a prominent place so that it is clear to
>> people.
>>
>> Not pointing to a discussion or some funky place on the web where someone
>> might've said something.
>>
>> Your patch submission should contain that info and not have reviewers ask for
>> it.
>>
>>> | node 0 | node 1 | node 0 |   PA address map.
>>> Can you give your suggestion what we should do about it?  
>>
>> I don't know what the problem is to begin with...
>>
>>> I think Option (2) seems better?  If so, can the EDAC scrub interface  be
>>> updated to include attributes for publishing the supported PA range for the
>>> memory device to scrub?  
>>
>> The memory ranges should already be available somewhere in the NUMA/mm code or
>> so and for starters, we should start a scrub for all ranges and do the
>> single-range only when there really is a good reason for it.
>>
>> Also, you don't have to expose any ranges to userspace in order to start
>> a scrub activity - you can simply start the scrub in the affected range
>> automatically.
>>
>> Like I preached the last time, your aim should be to make as much of the
>> variables that control the scrub automatic and not expose everything to
>> userspace so that some userspace tool decides. The tool should simply start
>> the scrub and the kernel should DTRT.
> 
> This 'first contiguous range' is an attempt to DTRT in a corner
> case that is real but where there is not an obvious right thing due to spec limitations.
> 
> The problem is the ACPI specification ties a controller to a NUMA / _PXM but
> then controls it via a single memory range with rules on whether that range can
> include holes (that are actually covered by a different controller).  There
> is no way to discover if two disconnected ranges may be scanned at once other
> that trying that.
> 
> Without resolving this corner, there is no way we could come up with for the kernel
> to DTRT.
> 
> Aim is to automatically establish the range that can be scrubbed. The corner
> case is the hole problem.  Alternative as discussed in earlier versions of
> this series was ignore holes.
> 
> The options discussed in earlier versions of this patch
> =======================================================
> 
> So with Node / PA mapping (happens because people want low memory addresses
> near to CPUs in different sockets - I simplified it here somewhat).
> 
> | Node 0 | Node 1 | Node 0 | Node 2|
> 
> 1. Hole skipping approach
> Have control parameters default to
> |       Node 0             |
>          | Node 1 |
>                            | Node 2|
> 2. Present part of range that is at least not including a hole where it might
> look like we were controlling memory scrub that we were not.
> | Node 0 |
>          | Node 1 |
>                            | Node 2|
> 
> 3. Just don't present anything and leave it up to general mm interfaces
>    to provide what 'should' be set.
> |
> All 3 here, 0 size.
> 
> Whilst nice to support, I'm not seeing 'default' as a key use case and
> changing scrub from a kernel driver without policy from userspace to
> me would be a wrong thing to do. (I'm not sure if you are suggesting this)
> 
> The scrub should always be running pre linux (true today on all systems that I'm
> aware of, but maybe not as universal as I think?).  If there is a need for
> a default scrub setting on boot of Linux, then sure we can add it.
> 
> This interface all about tweaking the settings not defaults (unlike the CXL
> case which does need the setting of defaults as well because of hotplug of
> the devices and lack of firmware involvement in that).
> 
> We are fine with any of the options above.
> 
> This was an attempt to respond to review feedback from Daniel - it was not
> something Huawei needs.
> 
> https://lore.kernel.org/all/547ed8fb-d6b7-4b6b-a38b-bf13223971b1@os.amperecomputing.com/
> 
> After a discussion of why 0, 0 defaults give an unexpected result...
> 
> "Proposed Solution:
> What we propose, is to instead of zeroing out the base and size after an error,
> use the full range of the current NUMA node. We believe that a superset of a
> currently active scrub range can properly report all the relevant and correct
> information."
> 
> The above Numa node pattern in PA space |0|1|0|2| etc is a thing
> that happens on real systems so if was the best that had come up in earlier
> discussion as an approximation of what Daniel asked for that should allow the
> right values to be queried.
> 
> I'm not entirely sure this even matters now they have resolved the shared
> PCC interface issue on their platforms.  Felt nice to provide meaningful
> defaults but maybe this is a problem we don't need to solve and can go back to
> just using 0, 0 until told to do something else.
> 
> Daniel, perhaps you can provide more info?

Thanks for pointing out the limitation of using NUMA base address + size as the
Requested Address Range to determine the status of the scrubber for that NUMA
domain.

> go back to just using 0, 0 until told to do something else.

Here is why I think (0,0) is not scalable:
Not every architecture would consider 0 to be an invalid DRAM address. For a
system where 0 is a valid base address, size 0 is an invalid argument. The
firmware would be correct to return "Status - 0001b" for this Requested Address
Range. This would be compliant to the ACPI spec as defined today.


Based on the discussion so far on all the threads, here's my attempt at
summarizing the spec gap. Please add if I'm missing anything:
The OS can't determine if the component is busy in a single RAS2 scrub call if
the NUMA range has holes. To correctly determine status of the component, the OS
would have to make multiple calls to cover all the disjoint memory ranges in the
NUMA domain.


Could this gap be addressed as follows?:
The patrol scrubber can only be busy due to a previously *successful* request
from the OS. The OS can keep track of the address range it submitted for this
request. The next time we try to submit a GET_PATROL_PARAMETERS command, we use
the previously requested range to check if the scrubber is busy. Additionally,
if you know the previous START_PATROL_SCRUBBER wasn't successful, then there is
no need to call GET_PATROL_PARAMETERS to determine if the scrubber is running.

At driver init time, use the first page of the NUMA range as the Requested
Address Range to invoke GET_PATROL_PARAMETERS - to get parameters that apply to
all addresses in the NUMA domain, i.e., "Scrub Parameters" and the "Extended
Data Region". The "Actual Address Range", "Flags" and "Status" fields will
change when the "Requested Address Range" changes but not "Scrub Parameters" and
"Extended Data Region" [1].

[1] - https://github.com/tianocore/edk2/issues/11353

Thanks,
~Daniel

> 
> Thanks,
> 
> Jonathan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-17 17:36             ` Jonathan Cameron
  2025-09-18 20:22               ` Daniel Ferguson
@ 2025-09-19 10:39               ` Borislav Petkov
  2025-10-06 10:37                 ` Shiju Jose
  1 sibling, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-09-19 10:39 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Shiju Jose, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Wed, Sep 17, 2025 at 06:36:08PM +0100, Jonathan Cameron wrote:
> This 'first contiguous range' is an attempt to DTRT in a corner
> case that is real but where there is not an obvious right thing due to spec limitations.

Thanks for taking the time to expand. The gist of this needs to be in a commit
message for future reference.

HOWEVER, I'm still not clear *why* we're jumping through hoops which we
probably set up ourselves without even knowing why... at least it looks like
this from where I'm standing.

So why not start a scrub on the whole system? Why do we care?

Scrub is "cheap" in the sense that it runs in the background and is the lowest
priority and everything else overrides it.

So why design an interface only when there's a need to design one and do the
simplest thing now, for starters? Gather some experience and then imrpove it
by actually designing an interface...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-09-19 10:39               ` Borislav Petkov
@ 2025-10-06 10:37                 ` Shiju Jose
  2025-10-16 10:30                   ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Shiju Jose @ 2025-10-06 10:37 UTC (permalink / raw)
  To: Borislav Petkov, Jonathan Cameron
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, Yazen.Ghannam, mchehab, Linuxarm,
	rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
	james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
	duenwen, gthelen, wschwartz, wbs, nifan.cxl, tanxiaofei,
	Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 19 September 2025 11:40
>To: Jonathan Cameron <jonathan.cameron@huawei.com>
>Cc: Shiju Jose <shiju.jose@huawei.com>; rafael@kernel.org; akpm@linux-
>foundation.org; rppt@kernel.org; dferguson@amperecomputing.com; linux-
>edac@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-
>doc@vger.kernel.org; tony.luck@intel.com; lenb@kernel.org;
>Yazen.Ghannam@amd.com; mchehab@kernel.org; Linuxarm
><linuxarm@huawei.com>; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Wed, Sep 17, 2025 at 06:36:08PM +0100, Jonathan Cameron wrote:
>> This 'first contiguous range' is an attempt to DTRT in a corner case
>> that is real but where there is not an obvious right thing due to spec
>limitations.
>
>Thanks for taking the time to expand. The gist of this needs to be in a commit
>message for future reference.
>
>HOWEVER, I'm still not clear *why* we're jumping through hoops which we
>probably set up ourselves without even knowing why... at least it looks like this
>from where I'm standing.
>
>So why not start a scrub on the whole system? Why do we care?
>
>Scrub is "cheap" in the sense that it runs in the background and is the lowest
>priority and everything else overrides it.
>
>So why design an interface only when there's a need to design one and do the
>simplest thing now, for starters? Gather some experience and then imrpove it by
>actually designing an interface...

Hi Boris,

Sorry for the delay in replaying.

We have a prototype looking to simplify things as follows (leaving per node and
range control for future work), but before I post it (waiting for rc1) I'd like to discuss
the approach and a few open questions"

1.Scrub rate
1.1. Scrub rate is common across the NUMA node domains.
1.2. Common min scrub rate is max of min scrub rates across nodes.
1.3. Common max scrub rate is min of max scrub rates across nodes.
1.4. Scrub rate allowed to change only if NO demand and patrol
   scrubbing is in progress and should be within min and max
   range of scrub rates.

2. Demand scrubbing and Background (patrol) scrubbing
2.1. Background scrubbing request enables BG scrubbing
     on all NUMA nodes.

2.2. For, demand scrubbing request 2 options are identified,
     with (b) tried. Please suggest the right approach?
a) Enable demand scrubbing on all NUMA nodes, hope for
     the 'Requested Address Range(INPUT)' field, can use
     address set to scrub and PAGE_SIZE(or similar) for all the
     nodes. For RAS2, only 'addr' sysfs attribute is added now
     and 'size' sysfs attribute is removed.
b) Enable demand scrubbing on a NUMA node for which
     the requested address to scrub is within the PA range of
     that node.

2.3. Demand scrubbing is not allowed when background scrubbing
     is in progress.

2.4. If 2.2. (b) is chosen, should kernel allow BG
      scrubbing on rest of the nodes, when demand scrubbing on
      some node/s is in progress?

2.5 The status of the BG scrubbing exposed to the user space
    in 'enable_background' sysfs attribute.

2.6 The status of the demand scrubbing exposed to the
       user space in 'addr' sysfs attribute. However when the
       demand scrubbing is on multiple/all nodes are in progress,
       which demand scrubbing status and address in 'addr' sysfs attribute
       as status should be exposed to the user space?
a) May be the status of the first detected node with demand scrubbing
     is in progress?
b) Does not show the status at all, just fail the request if the
    demand scrubbing is already in progress on a node/all nodes?
c)  Any other suggestion?

Thanks,
Shiju

>
>Thx.
>
>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-10-06 10:37                 ` Shiju Jose
@ 2025-10-16 10:30                   ` Borislav Petkov
  2025-10-17 12:54                     ` Shiju Jose
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-10-16 10:30 UTC (permalink / raw)
  To: Shiju Jose
  Cc: Jonathan Cameron, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Mon, Oct 06, 2025 at 10:37:39AM +0000, Shiju Jose wrote:
> 1.Scrub rate
> 1.1. Scrub rate is common across the NUMA node domains.
> 1.2. Common min scrub rate is max of min scrub rates across nodes.
> 1.3. Common max scrub rate is min of max scrub rates across nodes.

And you need scrub rate to be per node because...?

Why can't it be a system-wide scrub rate?

If the use case appears which needs per-node scrub rate, then you design it
this way.

Or you already have a valid use case for it which dictates this design?

> 1.4. Scrub rate allowed to change only if NO demand and patrol
>    scrubbing is in progress

Right.

> 2. Demand scrubbing and Background (patrol) scrubbing
> 2.1. Background scrubbing request enables BG scrubbing
>      on all NUMA nodes.

Right.

> 2.2. For, demand scrubbing request 2 options are identified,
>      with (b) tried. Please suggest the right approach?
> a) Enable demand scrubbing on all NUMA nodes, hope for
>      the 'Requested Address Range(INPUT)' field, can use
>      address set to scrub and PAGE_SIZE(or similar) for all the
>      nodes.

Why do you need an address range? Why not start scrubbing and have it be
fire-and-forget?

> b) Enable demand scrubbing on a NUMA node for which
>      the requested address to scrub is within the PA range of
>      that node.
> 
> 2.3. Demand scrubbing is not allowed when background scrubbing
>      is in progress.
> 
> 2.4. If 2.2. (b) is chosen, should kernel allow BG
>       scrubbing on rest of the nodes, when demand scrubbing on
>       some node/s is in progress?

It seems like all scrubbing should be mutually-exclusive... or is there
a point in scrubbing in parallel...?

> 2.5 The status of the BG scrubbing exposed to the user space
>     in 'enable_background' sysfs attribute.
> 
> 2.6 The status of the demand scrubbing exposed to the
>        user space in 'addr' sysfs attribute. However when the
>        demand scrubbing is on multiple/all nodes are in progress,
>        which demand scrubbing status and address in 'addr' sysfs attribute
>        as status should be exposed to the user space?
> a) May be the status of the first detected node with demand scrubbing
>      is in progress?
> b) Does not show the status at all, just fail the request if the
>     demand scrubbing is already in progress on a node/all nodes?
> c)  Any other suggestion?

First we need a proper granularity defined and then everything will revolve
around it: should it be system-wide, per-node, does it need to have an address
range or can it be started and no need for any further user interaction and so
on and so on...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-10-16 10:30                   ` Borislav Petkov
@ 2025-10-17 12:54                     ` Shiju Jose
  2025-10-24 18:13                       ` Daniel Ferguson
  2025-11-03 13:19                       ` Borislav Petkov
  0 siblings, 2 replies; 20+ messages in thread
From: Shiju Jose @ 2025-10-17 12:54 UTC (permalink / raw)
  To: Borislav Petkov, Daniel Ferguson
  Cc: Jonathan Cameron, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 16 October 2025 11:31
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: Jonathan Cameron <jonathan.cameron@huawei.com>; rafael@kernel.org;
>akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; Yazen.Ghannam@amd.com;
>mchehab@kernel.org; Linuxarm <linuxarm@huawei.com>;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Mon, Oct 06, 2025 at 10:37:39AM +0000, Shiju Jose wrote:
>> 1.Scrub rate
>> 1.1. Scrub rate is common across the NUMA node domains.
>> 1.2. Common min scrub rate is max of min scrub rates across nodes.
>> 1.3. Common max scrub rate is min of max scrub rates across nodes.
>
>And you need scrub rate to be per node because...?
>
>Why can't it be a system-wide scrub rate?

ACPI spec defined RAS2 interface for scrub and scrub parameters per node .
Thus to make compatible to the spec,  kernel and firmware implementations for
RAS2 scrubbing are per node. 
For the design and prototyping your request for "start a scrub on the whole system", 
we are trying make sysfs scrub control system-wide while keeping underlying RAS2
scrubbing per node. Then these open questions arise, such as need for the 
system-wide common scrub rate across all nodes,  for the demand scrubbing should
the kernel send scrub request to only on the corresponding node or to all the nodes etc.
>
>If the use case appears which needs per-node scrub rate, then you design it this
>way.

From the ACPI spec RAS2 scrub interface perspective,  needs per-node scrub rate and other
scrub parameters. One of the use case for demand/background scrubbing in a specific node
in which frequent corrected memory errors reported to the user space and CE count exceeds the
threshold.

May be Daniel can provide more inputs for this question about use cases?
 
If you agree to keep per-node scrub rate and thus per-node scrub control in the sysfs,
then I will continue to use the original design in v12? Otherwise will try to use the new design
with common system-wide scrub control in the sysfs and underlying RAS2 scrubbing 
implementation per node.

>
>Or you already have a valid use case for it which dictates this design?
>
>> 1.4. Scrub rate allowed to change only if NO demand and patrol
>>    scrubbing is in progress
>
>Right.
>
>> 2. Demand scrubbing and Background (patrol) scrubbing 2.1. Background
>> scrubbing request enables BG scrubbing
>>      on all NUMA nodes.
>
>Right.
>
>> 2.2. For, demand scrubbing request 2 options are identified,
>>      with (b) tried. Please suggest the right approach?
>> a) Enable demand scrubbing on all NUMA nodes, hope for
>>      the 'Requested Address Range(INPUT)' field, can use
>>      address set to scrub and PAGE_SIZE(or similar) for all the
>>      nodes.
>
>Why do you need an address range? Why not start scrubbing and have it be fire-
>and-forget?
This is for demand scrubbing feature/use case where a specific address range to scrub and
OS must set the mandatory  spec defined  RAS2 table field 'Requested Address Range(INPUT)' 
while requesting the demand scrubbing in a node. Hope the firmware can ignore the request
if the requested address range to scrub is irrelevant for a node, because in this approach we have
common sysfs scrub control and kernel is requesting demand scrubbing system-wide across
all nodes.

If this approach is not correct, can we use (b) as below? providing we need to get PA range
for the nodes in the RAS2 driver  using the functions (start_pfn = node_start_pfn(nid) and 
size_pfn = node_spanned_pages(nid);)  as implemented in v12 and discussed earlier in this thread.

>
>> b) Enable demand scrubbing on a NUMA node for which
>>      the requested address to scrub is within the PA range of
>>      that node.
>>
>> 2.3. Demand scrubbing is not allowed when background scrubbing
>>      is in progress.
>>
>> 2.4. If 2.2. (b) is chosen, should kernel allow BG
>>       scrubbing on rest of the nodes, when demand scrubbing on
>>       some node/s is in progress?
>
>It seems like all scrubbing should be mutually-exclusive... or is there a point in
>scrubbing in parallel...?

Sure. Then background scrubbing will not be allowed if demand scrubbing is in progress
in a node, if the system-wide scrub control in sysfs is chosen. 
>
>> 2.5 The status of the BG scrubbing exposed to the user space
>>     in 'enable_background' sysfs attribute.
>>
>> 2.6 The status of the demand scrubbing exposed to the
>>        user space in 'addr' sysfs attribute. However when the
>>        demand scrubbing is on multiple/all nodes are in progress,
>>        which demand scrubbing status and address in 'addr' sysfs attribute
>>        as status should be exposed to the user space?
>> a) May be the status of the first detected node with demand scrubbing
>>      is in progress?
>> b) Does not show the status at all, just fail the request if the
>>     demand scrubbing is already in progress on a node/all nodes?
>> c)  Any other suggestion?
>
>First we need a proper granularity defined and then everything will revolve
>around it: should it be system-wide, per-node, does it need to have an address
>range or can it be started and no need for any further user interaction and so on
>and so on...
Sure.
>

Thanks,
Shiju

>--
>Regards/Gruss,
>    Boris.
>
>https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-10-17 12:54                     ` Shiju Jose
@ 2025-10-24 18:13                       ` Daniel Ferguson
  2025-11-03 13:19                       ` Borislav Petkov
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Ferguson @ 2025-10-24 18:13 UTC (permalink / raw)
  To: Shiju Jose, Borislav Petkov
  Cc: Jonathan Cameron, rafael, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, Yazen.Ghannam,
	mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang


>> If the use case appears which needs per-node scrub rate, then you design it this
>> way.
> 
> From the ACPI spec RAS2 scrub interface perspective,  needs per-node scrub rate and other
> scrub parameters. One of the use case for demand/background scrubbing in a specific node
> in which frequent corrected memory errors reported to the user space and CE count exceeds the
> threshold.
> 
> May be Daniel can provide more inputs for this question about use cases?

Hi, sorry for delay;

I do agree that per-node scrub rates are allowed by the spec, and it is noble to
surface this capability.
The way the driver is currently setup, if we ever allow the user to specify
scrub rate, then it would only require a change on our end.

However, at this time, we do not honor user-specified scrub rates. We only have
one rate; FAST.
And when the scrub rate is changed, we simply ignore it, for now.

Therefore, we do not have any use case that is sensitive to per-node scrub
rates, for now.

Cheers,
Daniel





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-10-17 12:54                     ` Shiju Jose
  2025-10-24 18:13                       ` Daniel Ferguson
@ 2025-11-03 13:19                       ` Borislav Petkov
  2025-11-04 12:55                         ` Shiju Jose
  1 sibling, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2025-11-03 13:19 UTC (permalink / raw)
  To: Shiju Jose
  Cc: Daniel Ferguson, Jonathan Cameron, rafael, akpm, rppt, dferguson,
	linux-edac, linux-acpi, linux-mm, linux-doc, tony.luck, lenb,
	Yazen.Ghannam, mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Fri, Oct 17, 2025 at 12:54:36PM +0000, Shiju Jose wrote:
> ACPI spec defined RAS2 interface for scrub and scrub parameters per node
> .  Thus to make compatible to the spec,  kernel and firmware implementations
> for RAS2 scrubbing are per node.

Ok, makes sense. You can have heterogeneous or whatever nodes.

> For the design and prototyping your request for "start a scrub on the whole
> system", we are trying make sysfs scrub control system-wide while keeping
> underlying RAS2 scrubbing per node.

I guess per-node does make sense...

> for the demand scrubbing should the kernel send scrub request to only on the
> corresponding node or to all the nodes etc.

Well, since scrubbing should not interfere with normal operation, you could
start it on the target where it should scrub and then do a full circle over
all memory. For example. Or do something simple and which comes "natural".

> From the ACPI spec RAS2 scrub interface perspective,  needs per-node scrub
> rate and other scrub parameters. One of the use case for demand/background
> scrubbing in a specific node in which frequent corrected memory errors
> reported to the user space and CE count exceeds the threshold.

I guess.

Or you can simply start scrubbing around the failing address. With a certain
radius. If the node thing comes more natural, sure but you can have a big fat
node and if you start scrubbing the whole thing, you will get to the actual
address you want to scrub after a long while. So the per-node thing is not
necessarily the optimal solution. Question is, what you really wanna do on an
error, as a reaction...

> If you agree to keep per-node scrub rate and thus per-node scrub control in
> the sysfs, then I will continue to use the original design in v12? Otherwise
> will try to use the new design with common system-wide scrub control in the
> sysfs and underlying RAS2 scrubbing implementation per node.

See above.

> This is for demand scrubbing feature/use case where a specific address range
> to scrub and OS must set the mandatory  spec defined  RAS2 table field
> 'Requested Address Range(INPUT)' while requesting the demand scrubbing in
> a node. Hope the firmware can ignore the request if the requested address
> range to scrub is irrelevant for a node, because in this approach we have
> common sysfs scrub control and kernel is requesting demand scrubbing
> system-wide across all nodes.
> 
> If this approach is not correct, can we use (b) as below? providing we need
> to get PA range for the nodes in the RAS2 driver  using the functions
> (start_pfn = node_start_pfn(nid) and size_pfn = node_spanned_pages(nid);)
> as implemented in v12 and discussed earlier in this thread.
> 

I'm wondering how useful that address range scrubbing would be and whether it
is worth the effort... I guess the goal here is something along those lines:
"oh, you just had an error at address X, so let's scrub [ A ... X ... B ] with
A and B having, hm, dunno, sufficient values to contain X and perhaps cover
sufficient range to catch error locality or whatnot.

But you'd need to do this only when you have a fat memory node and where you
start scrubbing at the beginning of the node range and then you'd have to wait
for a relatively long time to reach the PA X at fault...

But I have a better idea: how about you start at X - y, i.e., at an address
a bit smaller than the last reported one and then continue from there on,
reach the *end* of the node and then wraparound to the beginning until
you reach X again?

This way you don't need to supply any range and you are still "on time" when
reacting to the error with scrubbing...

Hmmm?

> Sure. Then background scrubbing will not be allowed if demand scrubbing is
> in progress in a node, if the system-wide scrub control in sysfs is chosen. 

So can the kernel interrupt background scrubbing on some node? Because then it
is easy:

You interrupt background scrubbing whenever needed with on-demand scrubbing on
that particular node...

It looks like it is starting to crystallize...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-11-03 13:19                       ` Borislav Petkov
@ 2025-11-04 12:55                         ` Shiju Jose
  2025-11-22 11:36                           ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Shiju Jose @ 2025-11-04 12:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Daniel Ferguson, Jonathan Cameron, rafael, akpm, rppt, dferguson,
	linux-edac, linux-acpi, linux-mm, linux-doc, tony.luck, lenb,
	Yazen.Ghannam, mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 03 November 2025 13:19
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: Daniel Ferguson <danielf@os.amperecomputing.com>; Jonathan Cameron
><jonathan.cameron@huawei.com>; rafael@kernel.org; akpm@linux-
>foundation.org; rppt@kernel.org; dferguson@amperecomputing.com; linux-
>edac@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-
>doc@vger.kernel.org; tony.luck@intel.com; lenb@kernel.org;
>Yazen.Ghannam@amd.com; mchehab@kernel.org; Linuxarm
><linuxarm@huawei.com>; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
>
>On Fri, Oct 17, 2025 at 12:54:36PM +0000, Shiju Jose wrote:
[...]
>
>> This is for demand scrubbing feature/use case where a specific address
>> range to scrub and OS must set the mandatory  spec defined  RAS2 table
>> field 'Requested Address Range(INPUT)' while requesting the demand
>> scrubbing in a node. Hope the firmware can ignore the request if the
>> requested address range to scrub is irrelevant for a node, because in
>> this approach we have common sysfs scrub control and kernel is
>> requesting demand scrubbing system-wide across all nodes.
>>
>> If this approach is not correct, can we use (b) as below? providing we
>> need to get PA range for the nodes in the RAS2 driver  using the
>> functions (start_pfn = node_start_pfn(nid) and size_pfn =
>> node_spanned_pages(nid);) as implemented in v12 and discussed earlier in this
>thread.
>>
>
>I'm wondering how useful that address range scrubbing would be and whether it
>is worth the effort... I guess the goal here is something along those lines:
>"oh, you just had an error at address X, so let's scrub [ A ... X ... B ] with A and B
>having, hm, dunno, sufficient values to contain X and perhaps cover sufficient
>range to catch error locality or whatnot.
>
>But you'd need to do this only when you have a fat memory node and where you
>start scrubbing at the beginning of the node range and then you'd have to wait
>for a relatively long time to reach the PA X at fault...
>
>But I have a better idea: how about you start at X - y, i.e., at an address a bit
>smaller than the last reported one and then continue from there on, reach the
>*end* of the node and then wraparound to the beginning until you reach X
>again?
>
>This way you don't need to supply any range and you are still "on time" when
>reacting to the error with scrubbing...
>
>Hmmm?
Thanks Borislav for the valuable suggestion and it make sense. Since presently we are
not sure how reaching the end of the node work on individual platforms,  can we do
this as an optimization in the next stage? and 
Can we start with basic demand scrubbing without address range control in sysfs,
but with user space set only scrub rate and enable_demand, kernel set the
node's addr range as Requested Address Range to start the demand scrubbing on
entire node, as you suggested?
>
>> Sure. Then background scrubbing will not be allowed if demand
>> scrubbing is in progress in a node, if the system-wide scrub control in sysfs is
>chosen.
>
>So can the kernel interrupt background scrubbing on some node? Because then it
>is easy:
RAS2 STOP_PATROL_SCRUBBER  command allows interrupting background scrubbing.  
>
>You interrupt background scrubbing whenever needed with on-demand
>scrubbing on that particular node...
>
Sure. Will do.
>It looks like it is starting to crystallize...
>

Thanks,
Shiju

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver
  2025-11-04 12:55                         ` Shiju Jose
@ 2025-11-22 11:36                           ` Borislav Petkov
  0 siblings, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2025-11-22 11:36 UTC (permalink / raw)
  To: Shiju Jose
  Cc: Daniel Ferguson, Jonathan Cameron, rafael, akpm, rppt, dferguson,
	linux-edac, linux-acpi, linux-mm, linux-doc, tony.luck, lenb,
	Yazen.Ghannam, mchehab, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

On Tue, Nov 04, 2025 at 12:55:48PM +0000, Shiju Jose wrote:
> Thanks Borislav for the valuable suggestion and it make sense. Since
> presently we are not sure how reaching the end of the node work on
> individual platforms,  can we do this as an optimization in the next stage?
> and Can we start with basic demand scrubbing without address range control
> in sysfs, but with user space set only scrub rate and enable_demand, kernel
> set the node's addr range as Requested Address Range to start the demand
> scrubbing on entire node, as you suggested?

Yap, I like starting simple and then delving into a more involved solution
which is dictated by real life.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-11-22 11:37 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-02 17:30 [PATCH v12 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
2025-09-02 17:30 ` [PATCH v12 1/2] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2025-09-09 16:24   ` Yazen Ghannam
2025-09-10  8:38     ` Shiju Jose
2025-09-10 19:27   ` Borislav Petkov
2025-09-12 12:04     ` Shiju Jose
2025-09-12 14:11       ` Borislav Petkov
2025-09-15 11:50         ` Shiju Jose
2025-09-17 16:22           ` Borislav Petkov
2025-09-17 17:36             ` Jonathan Cameron
2025-09-18 20:22               ` Daniel Ferguson
2025-09-19 10:39               ` Borislav Petkov
2025-10-06 10:37                 ` Shiju Jose
2025-10-16 10:30                   ` Borislav Petkov
2025-10-17 12:54                     ` Shiju Jose
2025-10-24 18:13                       ` Daniel Ferguson
2025-11-03 13:19                       ` Borislav Petkov
2025-11-04 12:55                         ` Shiju Jose
2025-11-22 11:36                           ` Borislav Petkov
2025-09-02 17:30 ` [PATCH v12 2/2] ras: mem: Add memory " shiju.jose

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox