* [PATCH v13 0/2]  ACPI: Add support for ACPI RAS2 feature table
@ 2025-11-21 18:28 shiju.jose
  2025-11-21 18:28 ` [PATCH v13 1/2] ACPI:RAS2: Add driver for the " shiju.jose
  2025-11-21 18:28 ` [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver shiju.jose
  0 siblings, 2 replies; 10+ messages in thread
From: shiju.jose @ 2025-11-21 18:28 UTC
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add support for the ACPI RAS2 feature table (RAS2), defined in the
ACPI 6.5 specification, section 5.2.21, and for the RAS2 HW-based
memory scrubbing feature.

The ACPI RAS2 patches were previously part of the EDAC series [1].

The code is based on linux.git v6.18-rc5 [2].

1. https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
2. https://github.com/torvalds/linux.git

Changes
=======
v12 -> v13:
1. Fixed the bugs reported by Borislav and made the changes he requested.
   https://lore.kernel.org/all/20250910192707.GAaMHRCxWx37XitN3t@fat_crate.local/

2. Reworked the patch header as suggested by Borislav.

3. Fixed a bug reported by Yazen.
   https://lore.kernel.org/all/20250909162434.GB11602@yaz-khff2.amd.com/

4. Changed how the 'Requested Address Range' is set for the
   GET_PATROL_PARAMETERS command to meet Daniel's requirements for the
   Ampere Computing platform.
   https://lore.kernel.org/all/7a211c5c-174c-438b-9a98-fd47b057ea4a@os.amperecomputing.com/

5. In the RAS2 driver, removed support for the scrub control attributes
   'addr' and 'size' for the time being, with the expectation that the
   firmware will do full-node demand scrubbing. These attributes may be
   enabled in the future.

6. Added the 'enable_demand' attribute to the EDAC scrub interface to
   start/stop the demand scrub; it is used for RAS2 demand scrub control.

v11 -> v12:
1. Modified the logic for finding the lowest contiguous physical memory
   address range for a NUMA domain to use node_start_pfn() and
   node_spanned_pages(), according to the feedback from Mike Rapoport on v11.
   https://lore.kernel.org/all/aKsIlFTkBsAF5sqD@kernel.org/

2. Rebase to 6.17-rc4.

v10 -> v11:
1. Simplified the code by removing workarounds previously added to support
   the non-compliant case of a single PCC channel shared across all
   proximity domains (which is no longer required).
   https://lore.kernel.org/all/f5b28977-0b80-4c39-929b-cf02ab1efb97@os.amperecomputing.com/

2. Addressed the comments from Borislav (thanks).
   https://lore.kernel.org/all/20250811152805.GQaJoMBecC4DSDtTAu@fat_crate.local/

3. Rebase to 6.17-rc1.

v9 -> v10:
1. Use pcc_chan->shmem instead of
   acpi_os_ioremap(pcc_chan->shmem_base_addr,...), as the PCC driver
   already maps the region internally to pcc_chan->shmem.
   
2. Changes required for the Ampere Computing system, which uses a single
   PCC channel for RAS2 memory features across all NUMA domains. Based on the
   requirements from Daniel on v9
   https://lore.kernel.org/all/547ed8fb-d6b7-4b6b-a38b-bf13223971b1@os.amperecomputing.com/
   and discussion with Jonathan.
2.1. Added a node_to_range lookup facility to numa_memblks. This retrieves the
   lowest physically contiguous memory range associated with a NUMA domain.
2.2. Set the requested addr range to the memory region's base addr and size
   when sending the RAS2 cmd GET_PATROL_PARAMETERS in the functions
   ras2_update_patrol_scrub_params_cache() and
   ras2_get_patrol_scrub_running().
2.3. Split struct ras2_mem_ctx into struct ras2_mem_ctx_hdr and struct
   ras2_pxm_domain to support both a single PCC channel for RAS2 scrubbers
   across all NUMA domains and a PCC channel per RAS2 scrub instance, given
   that the ACPI spec defines a single memory scrubber per NUMA domain.
2.4. Changed the EDAC feature sysfs folder for RAS2 from "acpi_ras_memX" to
   "acpi_ras_mem_idX" because memory scrub instances across all NUMA domains
   would be presented under "acpi_ras_mem_id0" when a system uses a single PCC
   channel for RAS2 scrubbers across all NUMA domains.
2.5. Removed Rafael's Acked-by from patch [2] because of the several changes
   above since v9.

v8 -> v9:
1. Added the following changes for feedback from Yazen.
   1.1. In the ras2_check_pcc_chan(..) function
        - u32 variables moved to the same line.
        - Updated error log for readw_relaxed_poll_timeout().
        - Added error log for the if (status & PCC_STATUS_ERROR) error condition.
        - Removed an impossible condition check.
   1.2. Added guard for ras2_pc_list_lock in ras2_get_pcc_subspace().

2. Rebased to linux.git v6.16-rc2 [2].

v7 -> v8:
1. Rebased to linux.git v6.16-rc1 [2].

v6 -> v7:
1. Fixed the issue reported by Daniel: in ras2_check_pcc_chan(), read, clear
   and check the RAS2 set_cap_status outside the
   if (status & PCC_STATUS_ERROR) check.
   https://lore.kernel.org/all/51bcb52c-4132-4daf-8903-29b121c485a1@os.amperecomputing.com/

v5 -> v6:
1. Fixed the issue reported by Daniel in starting scrubbing with a correct
   addr and size after the firmware returns an INVALID DATA error for a
   scrub request with an invalid addr or size.
   https://lore.kernel.org/all/8cdf7885-31b3-4308-8a7c-f4e427486429@os.amperecomputing.com/
   
v4 -> v5:
1. Fix for the build warnings reported by kernel test robot.
   https://patchwork.kernel.org/project/linux-edac/patch/20250423163511.1412-3-shiju.jose@huawei.com/
2. Removed patch "ACPI: ACPI 6.5: RAS2: Rename RAS2 table structure and field names"
   from the series as the patch was merged to linux-pm.git : branch linux-next
3. Rebased to ras.git: edac-for-next branch merged with linux-pm.git : linux-next branch.
      
v3 -> v4:
1.  Changes for feedback from Yazen on v3.
    https://lore.kernel.org/all/20250415210504.GA854098@yaz-khff2.amd.com/

v2 -> v3:
1. Renamed RAS2 table structure and field names in
   include/acpi/actbl2.h, limited to those necessary
   for the RAS2 scrub feature.
2. Changes for feedback from Jonathan on v2.
3. Daniel reported a known behaviour: reading back the 'size' attribute
   after setting it returns 0 until scrubbing is started via the 'addr'
   attribute. Added changes to fix this.
4. Daniel reported that the firmware cannot update the status of demand
   scrubbing via the 'Actual Address Range (OUTPUT)', so added a
   workaround in the kernel to update the sysfs 'addr' attribute with the
   status of demand scrubbing.
5. Optimized the logic in the ras2_check_pcc_chan() function
   (patch - ACPI:RAS2: Add ACPI RAS2 driver).
6. Added a PCC channel lock to struct ras2_pcc_subspace and changed the
   lock in ras2_mem_ctx to a pointer to the PCC channel lock, to make sure
   writes to the PCC subspace shared memory are protected from race
   conditions.

v1 -> v2:
1.  Changes for feedback from Borislav.
    - Shortened ACPI RAS2 structure and variable names.
    - Shortened some of the other variables in the RAS2 drivers.
    - Fixed a few CamelCase names.

2.  Changes for feedback from Yazen.
    - Added a newline after a number of '}' and return statements.
    - Changed the return type of ras2_add_aux_device() to 'int'.
    - Deleted a duplicate acpi_get_table("RAS2",...) call in ras2_acpi_parse_table().
    - Added "FW_WARN" to a few error logs in ras2_acpi_parse_table().
    - Renamed ras2_acpi_init() to acpi_ras2_init() and changed acpi_init() to call
      acpi_ras2_init().
    - Moved scrub related variables in struct ras2_mem_ctx from patch
      "ACPI:RAS2: Add ACPI RAS2 driver" to "ras: mem: Add memory ACPI RAS2 driver".

Shiju Jose (2):
  ACPI:RAS2: Add driver for the ACPI RAS2 feature table
  ras: mem: Add ACPI RAS2 memory driver

 Documentation/ABI/testing/sysfs-edac-scrub |  13 +-
 Documentation/edac/scrub.rst               |  58 +++
 drivers/acpi/Kconfig                       |  12 +
 drivers/acpi/Makefile                      |   1 +
 drivers/acpi/bus.c                         |   3 +
 drivers/acpi/ras2.c                        | 398 ++++++++++++++++++++
 drivers/edac/scrub.c                       |  12 +
 drivers/ras/Kconfig                        |  12 +
 drivers/ras/Makefile                       |   1 +
 drivers/ras/acpi_ras2.c                    | 403 +++++++++++++++++++++
 include/acpi/ras2.h                        |  74 ++++
 include/linux/edac.h                       |   4 +
 12 files changed, 986 insertions(+), 5 deletions(-)
 create mode 100644 drivers/acpi/ras2.c
 create mode 100644 drivers/ras/acpi_ras2.c
 create mode 100644 include/acpi/ras2.h

-- 
2.43.0




* [PATCH v13 1/2] ACPI:RAS2: Add driver for the ACPI RAS2 feature table
  2025-11-21 18:28 [PATCH v13 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
@ 2025-11-21 18:28 ` shiju.jose
  2025-11-22  5:18   ` Randy Dunlap
  2025-11-25  7:36   ` Borislav Petkov
  2025-11-21 18:28 ` [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver shiju.jose
  1 sibling, 2 replies; 10+ messages in thread
From: shiju.jose @ 2025-11-21 18:28 UTC
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

The ACPI 6.5 Specification, section 5.2.21, defines the RAS2 feature table
(RAS2). The driver adds support for the RAS2 feature table, which provides
interfaces for platform RAS features, e.g. HW-based memory scrubbing and a
logical to PA translation service. RAS2 uses a PCC channel subspace for
communicating with the ACPI-compliant HW platform.

Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
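Illustration only (not part of the commit): the MRTT and MPAR handling in
ras2_send_pcc_cmd() can be modelled in plain userspace C roughly as in the
minimal sketch below. struct pcc_throttle, mrtt_wait_us() and mpar_admit()
are hypothetical names used only for this illustration; they do not exist
in this patch or in the kernel, and time is passed in as a plain
microsecond counter instead of ktime_t.

```c
#include <stdint.h>

/* Hypothetical userspace model of the per-channel throttling state. */
struct pcc_throttle {
	uint64_t mrtt_us;	/* Minimum Request Turnaround Time, 0 = none */
	unsigned int mpar;	/* max commands per minute, 0 = no limit */
	uint64_t last_cmpl_us;	/* completion time of the last command */
	uint64_t last_reset_us;	/* start of the current MPAR window */
	unsigned int budget;	/* commands left in the current window */
};

/* Microseconds the caller still has to wait to honour MRTT. */
static uint64_t mrtt_wait_us(const struct pcc_throttle *t, uint64_t now_us)
{
	uint64_t delta = now_us - t->last_cmpl_us;

	return t->mrtt_us > delta ? t->mrtt_us - delta : 0;
}

/* Returns 1 if a command may be sent now, 0 if the MPAR budget is spent. */
static int mpar_admit(struct pcc_throttle *t, uint64_t now_us)
{
	if (!t->mpar)
		return 1;

	if (t->budget == 0) {
		/* Budget used up: refuse until the 60 s window has passed. */
		if (now_us - t->last_reset_us < 60ULL * 1000 * 1000)
			return 0;
		t->last_reset_us = now_us;
		t->budget = t->mpar;
	}
	t->budget--;

	return 1;
}
```

With mpar set to 2, two calls to mpar_admit() succeed inside one 60 s
window and a third is refused until the window expires, which corresponds
to the -EIO return path in the driver.
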
 drivers/acpi/Kconfig  |  12 ++
 drivers/acpi/Makefile |   1 +
 drivers/acpi/bus.c    |   3 +
 drivers/acpi/ras2.c   | 398 ++++++++++++++++++++++++++++++++++++++++++
 include/acpi/ras2.h   |  57 ++++++
 5 files changed, 471 insertions(+)
 create mode 100644 drivers/acpi/ras2.c
 create mode 100644 include/acpi/ras2.h

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index ca00a5dbcf75..bfa9f3f4def5 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
 	  If your platform does not support CPPC in firmware,
 	  leave this option disabled.
 
+config ACPI_RAS2
+	bool "ACPI RAS2 driver"
+	select AUXILIARY_BUS
+	select MAILBOX
+	select PCC
+	depends on NUMA_KEEP_MEMINFO
+	help
+	  This driver adds support for the RAS2 feature table, which provides
+	  interfaces for platform RAS features, e.g. HW-based memory scrubbing.
+	  If your platform does not support RAS2 in firmware, leave this
+	  option disabled.
+
 config ACPI_PROCESSOR
 	tristate "Processor"
 	depends on X86 || ARM64 || LOONGARCH || RISCV
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index d1b0affb844f..abfec6745724 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
 obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
+obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
 obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
 obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
 obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index a984ccd4a2a0..b02ceb2837c6 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -31,6 +31,7 @@
 #include <acpi/apei.h>
 #include <linux/suspend.h>
 #include <linux/prmt.h>
+#include <acpi/ras2.h>
 
 #include "internal.h"
 
@@ -1474,6 +1475,8 @@ static int __init acpi_init(void)
 	acpi_debugger_init();
 	acpi_setup_sb_notify_handler();
 	acpi_viot_init();
+	acpi_ras2_init();
+
 	return 0;
 }
 
diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c
new file mode 100644
index 000000000000..9df94d8c953c
--- /dev/null
+++ b/drivers/acpi/ras2.c
@@ -0,0 +1,398 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * ACPI RAS2 feature table driver.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ * Support for RAS2 table - ACPI 6.5 Specification, section 5.2.21, which
+ * provides interfaces for platform RAS features, for eg. HW-based memory
+ * scrubbing, and logical to PA translation service. RAS2 uses PCC channel
+ * subspace for communicating with the ACPI compliant HW platform.
+ */
+
+#define pr_fmt(fmt) "ACPI RAS2: " fmt
+
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/iopoll.h>
+#include <linux/ktime.h>
+#include <acpi/pcc.h>
+#include <acpi/ras2.h>
+
+/**
+ * struct ras2_sspcc - Data structure for PCC communication
+ * @mbox_client:	struct mbox_client object
+ * @pcc_chan:		Pointer to struct pcc_mbox_chan
+ * @comm_addr:		Pointer to RAS2 PCC shared memory region
+ * @elem:		List for registered RAS2 PCC channel subspaces
+ * @pcc_lock:		PCC lock to provide mutually exclusive access
+ *			to PCC channel subspace
+ * @deadline_us:	Poll PCC status register timeout in micro secs
+ *			for PCC command complete
+ * @pcc_mpar:		Maximum Periodic Access Rate (MPAR) for PCC channel
+ * @pcc_mrtt:		Minimum Request Turnaround Time (MRTT) in micro secs
+ *			OS must wait after completion of a PCC command before
+ *			issue next command
+ * @last_cmd_cmpl_time:	completion time of last PCC command
+ * @last_mpar_reset:	Time of last MPAR count reset
+ * @mpar_count:		MPAR count
+ * @pcc_id:		Identifier of the RAS2 platform communication channel
+ * @last_cmd:		Last PCC command
+ * @pcc_chnl_acq:	Status of PCC channel acquired
+ */
+struct ras2_sspcc {
+	struct mbox_client		mbox_client;
+	struct pcc_mbox_chan		*pcc_chan;
+	struct acpi_ras2_shmem __iomem	*comm_addr;
+	struct list_head		elem;
+	struct mutex			pcc_lock;
+	unsigned int			deadline_us;
+	unsigned int			pcc_mpar;
+	unsigned int			pcc_mrtt;
+	ktime_t				last_cmd_cmpl_time;
+	ktime_t				last_mpar_reset;
+	int				mpar_count;
+	int				pcc_id;
+	u16				last_cmd;
+	bool				pcc_chnl_acq;
+};
+
+/*
+ * Arbitrary retries for PCC commands because the remote processor
+ * could be much slower to reply. Keeping it high enough to cover
+ * emulators where the processors run painfully slow.
+ */
+#define PCC_NUM_RETRIES 600ULL
+
+#define RAS2_FEAT_TYPE_MEMORY 0x00
+
+static int decode_cap_error(u32 cap_status)
+{
+	switch (cap_status) {
+	case ACPI_RAS2_NOT_VALID:
+	case ACPI_RAS2_NOT_SUPPORTED:
+		return -EPERM;
+	case ACPI_RAS2_BUSY:
+		return -EBUSY;
+	case ACPI_RAS2_FAILED:
+	case ACPI_RAS2_ABORTED:
+	case ACPI_RAS2_INVALID_DATA:
+		return -EINVAL;
+	default:
+		return 0;
+	}
+}
+
+static int check_pcc_chan(struct ras2_sspcc *sspcc)
+{
+	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc->comm_addr;
+	u32 cap_status;
+	u16 status;
+	int rc;
+
+	/*
+	 * As per ACPI spec, the PCC space will be initialized by
+	 * platform and should have set the command completion bit when
+	 * PCC can be used by OSPM.
+	 *
+	 * Poll PCC status register every 3us for maximum of 600ULL * PCC
+	 * channel latency until PCC command complete bit is set.
+	 */
+	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
+					status & PCC_STATUS_CMD_COMPLETE, 3,
+					sspcc->deadline_us);
+	if (rc) {
+		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
+			sspcc->pcc_id, rc);
+		return rc;
+	}
+
+	if (status & PCC_STATUS_ERROR) {
+		pr_warn("Error in executing last command=%d for pcc_id=%d\n",
+			sspcc->last_cmd, sspcc->pcc_id);
+		status &= ~PCC_STATUS_ERROR;
+		writew_relaxed(status, &gen_comm_base->status);
+		return -EIO;
+	}
+
+	cap_status = readw_relaxed(&gen_comm_base->set_caps_status);
+	writew_relaxed(0x0, &gen_comm_base->set_caps_status);
+	return decode_cap_error(cap_status);
+}
+
+/**
+ * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
+ * @ras2_ctx:	pointer to the RAS2 context structure
+ * @cmd:	command to send
+ *
+ * Returns: 0 on success, an error otherwise
+ */
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd)
+{
+	struct ras2_sspcc *sspcc = ras2_ctx->sspcc;
+	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc->comm_addr;
+	struct mbox_chan *pcc_channel;
+	unsigned int time_delta;
+	int rc;
+
+	rc = check_pcc_chan(sspcc);
+	if (rc < 0)
+		return rc;
+
+	pcc_channel = sspcc->pcc_chan->mchan;
+
+	/*
+	 * Handle the Minimum Request Turnaround Time (MRTT).
+	 * "The minimum amount of time that OSPM must wait after the completion
+	 * of a command before issuing the next command, in microseconds."
+	 */
+	if (sspcc->pcc_mrtt) {
+		time_delta = ktime_us_delta(ktime_get(),
+					    sspcc->last_cmd_cmpl_time);
+		if (sspcc->pcc_mrtt > time_delta)
+			udelay(sspcc->pcc_mrtt - time_delta);
+	}
+
+	/*
+	 * Handle the non-zero Maximum Periodic Access Rate (MPAR).
+	 * "The maximum number of periodic requests that the subspace channel can
+	 * support, reported in commands per minute. 0 indicates no limitation."
+	 *
+	 * This parameter should be ideally zero or large enough so that it can
+	 * handle maximum number of requests that all the cores in the system can
+	 * collectively generate. If it is not, follow the spec and just not
+	 * send the request to the platform after hitting the MPAR limit in
+	 * any 60s window.
+	 */
+	if (sspcc->pcc_mpar) {
+		if (sspcc->mpar_count == 0) {
+			time_delta = ktime_ms_delta(ktime_get(),
+						    sspcc->last_mpar_reset);
+			if (time_delta < 60 * MSEC_PER_SEC) {
+				dev_dbg(ras2_ctx->dev,
+					"PCC cmd(%u) not sent due to MPAR limit",
+					cmd);
+				return -EIO;
+			}
+			sspcc->last_mpar_reset = ktime_get();
+			sspcc->mpar_count = sspcc->pcc_mpar;
+		}
+		sspcc->mpar_count--;
+	}
+
+	/* Write to the shared comm region */
+	writew_relaxed(cmd, &gen_comm_base->command);
+
+	/* Flip CMD COMPLETE bit */
+	writew_relaxed(0, &gen_comm_base->status);
+
+	/* Ring doorbell */
+	rc = mbox_send_message(pcc_channel, &cmd);
+	if (rc < 0) {
+		dev_warn(ras2_ctx->dev,
+			 "Err sending PCC mbox message. cmd:%d, rc:%d\n",
+			 cmd, rc);
+		return rc;
+	}
+
+	sspcc->last_cmd = cmd;
+
+	/*
+	 * If Minimum Request Turnaround Time is non-zero, need
+	 * to record the completion time of both READ and WRITE
+	 * command for proper handling of MRTT, so need to check
+	 * for pcc_mrtt in addition to PCC_CMD_EXEC_RAS2.
+	 */
+	if (cmd == PCC_CMD_EXEC_RAS2 || sspcc->pcc_mrtt) {
+		rc = check_pcc_chan(sspcc);
+		if (sspcc->pcc_mrtt)
+			sspcc->last_cmd_cmpl_time = ktime_get();
+	}
+
+	if (pcc_channel->mbox->txdone_irq)
+		mbox_chan_txdone(pcc_channel, rc);
+	else
+		mbox_client_txdone(pcc_channel, rc);
+
+	return rc < 0 ? rc : 0;
+}
+EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
+
+static int register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, int pcc_id)
+{
+	struct ras2_sspcc *sspcc;
+	struct pcc_mbox_chan *pcc_chan;
+	struct mbox_client *mbox_cl;
+
+	if (pcc_id < 0)
+		return -EINVAL;
+
+	sspcc = kzalloc(sizeof(*sspcc), GFP_KERNEL);
+	if (!sspcc)
+		return -ENOMEM;
+
+	mbox_cl			= &sspcc->mbox_client;
+	mbox_cl->knows_txdone	= true;
+
+	pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_id);
+	if (IS_ERR(pcc_chan)) {
+		kfree(sspcc);
+		return PTR_ERR(pcc_chan);
+	}
+
+	sspcc->pcc_id		= pcc_id;
+	sspcc->pcc_chan		= pcc_chan;
+	sspcc->comm_addr	= pcc_chan->shmem;
+	sspcc->deadline_us	= PCC_NUM_RETRIES * pcc_chan->latency;
+	sspcc->pcc_mrtt		= pcc_chan->min_turnaround_time;
+	sspcc->pcc_mpar		= pcc_chan->max_access_rate;
+	sspcc->mbox_client.knows_txdone	= true;
+	sspcc->pcc_chnl_acq	= true;
+
+	ras2_ctx->sspcc		= sspcc;
+	ras2_ctx->comm_addr	= sspcc->comm_addr;
+	ras2_ctx->dev		= pcc_chan->mchan->mbox->dev;
+
+	mutex_init(&sspcc->pcc_lock);
+	ras2_ctx->pcc_lock	= &sspcc->pcc_lock;
+
+	return 0;
+}
+
+static DEFINE_IDA(ras2_ida);
+static void ras2_release(struct device *device)
+{
+	struct auxiliary_device *auxdev = to_auxiliary_dev(device);
+	struct ras2_sspcc *sspcc;
+	struct ras2_mem_ctx *ras2_ctx =
+		container_of(auxdev, struct ras2_mem_ctx, adev);
+
+	ida_free(&ras2_ida, auxdev->id);
+	sspcc = ras2_ctx->sspcc;
+	pcc_mbox_free_channel(sspcc->pcc_chan);
+	kfree(sspcc);
+	kfree(ras2_ctx);
+}
+
+static struct ras2_mem_ctx *
+add_aux_device(char *name, int channel, u32 pxm_inst)
+{
+	struct ras2_mem_ctx *ras2_ctx;
+	struct ras2_sspcc *sspcc;
+	int id, rc;
+
+	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
+	if (!ras2_ctx)
+		return ERR_PTR(-ENOMEM);
+
+	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);
+
+	rc = register_pcc_channel(ras2_ctx, channel);
+	if (rc < 0) {
+		pr_debug("Failed to register pcc channel rc=%d\n", rc);
+		goto ctx_free;
+	}
+
+	id = ida_alloc(&ras2_ida, GFP_KERNEL);
+	if (id < 0) {
+		rc = id;
+		goto pcc_free;
+	}
+
+	ras2_ctx->adev.id		= id;
+	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
+	ras2_ctx->adev.dev.release	= ras2_release;
+	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
+
+	rc = auxiliary_device_init(&ras2_ctx->adev);
+	if (rc)
+		goto ida_free;
+
+	rc = auxiliary_device_add(&ras2_ctx->adev);
+	if (rc) {
+		auxiliary_device_uninit(&ras2_ctx->adev);
+		return ERR_PTR(rc);
+	}
+
+	return ras2_ctx;
+
+ida_free:
+	ida_free(&ras2_ida, id);
+pcc_free:
+	sspcc = ras2_ctx->sspcc;
+	pcc_mbox_free_channel(sspcc->pcc_chan);
+	kfree(sspcc);
+ctx_free:
+	kfree(ras2_ctx);
+
+	return ERR_PTR(rc);
+}
+
+static void acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)
+{
+	struct acpi_ras2_pcc_desc *pcc_desc_list;
+	struct ras2_mem_ctx *ras2_ctx;
+	u16 i, count;
+
+	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
+		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too short, size=%u)\n",
+			ras2_tab->header.length);
+		return;
+	}
+
+	if (!ras2_tab->num_pcc_descs) {
+		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
+		return;
+	}
+
+	struct ras2_mem_ctx **pctx_list __free(kfree) =
+		kzalloc(ras2_tab->num_pcc_descs * sizeof(*pctx_list),
+			GFP_KERNEL);
+	if (!pctx_list)
+		return;
+
+	count = 0;
+	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
+	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
+		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
+			continue;
+
+		ras2_ctx = add_aux_device(RAS2_MEM_DEV_ID_NAME,
+					  pcc_desc_list->channel_id,
+					  pcc_desc_list->instance);
+		if (IS_ERR(ras2_ctx)) {
+			pr_warn("Failed to add RAS2 auxiliary device rc=%ld\n",
+				PTR_ERR(ras2_ctx));
+			for (i = count; i > 0; i--)
+				auxiliary_device_uninit(&pctx_list[i - 1]->adev);
+			return;
+		}
+		pctx_list[count++] = ras2_ctx;
+	}
+}
+
+/**
+ * acpi_ras2_init - RAS2 driver initialization function.
+ *
+ * Extracts the ACPI RAS2 table and retrieves ID for the PCC channel subspace
+ * for communicating with the ACPI compliant HW platform. Driver adds an
+ * auxiliary device, which binds to the memory ACPI RAS2 driver, for each RAS2
+ * memory feature.
+ *
+ * Returns: none.
+ */
+void __init acpi_ras2_init(void)
+{
+	struct acpi_table_ras2 *ras2_tab;
+	acpi_status status;
+
+	status = acpi_get_table(ACPI_SIG_RAS2, 0,
+				(struct acpi_table_header **)&ras2_tab);
+	if (ACPI_FAILURE(status)) {
+		pr_err("Failed to get table, %s\n", acpi_format_exception(status));
+		return;
+	}
+
+	acpi_ras2_parse(ras2_tab);
+	acpi_put_table((struct acpi_table_header *)ras2_tab);
+}
diff --git a/include/acpi/ras2.h b/include/acpi/ras2.h
new file mode 100644
index 000000000000..10deab0b5541
--- /dev/null
+++ b/include/acpi/ras2.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * ACPI RAS2 (RAS Feature Table) methods.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited
+ */
+
+#ifndef _ACPI_RAS2_H
+#define _ACPI_RAS2_H
+
+#include <linux/acpi.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+struct device;
+
+/*
+ * ACPI spec 6.5 Table 5.82: PCC command codes used by
+ * RAS2 platform communication channel.
+ */
+#define PCC_CMD_EXEC_RAS2 0x01
+
+#define RAS2_AUX_DEV_NAME "ras2"
+#define RAS2_MEM_DEV_ID_NAME "acpi_ras2_mem"
+
+/**
+ * struct ras2_mem_ctx - Context for RAS2 memory features
+ * @adev:		Auxiliary device object
+ * @comm_addr:		Pointer to RAS2 PCC shared memory region
+ * @dev:		Pointer to device backing struct mbox_controller for PCC
+ * @sspcc:		Pointer to local data structure for PCC communication
+ * @pcc_lock:		Pointer to PCC lock to provide mutually exclusive access
+ *			to PCC channel subspace
+ * @sys_comp_nid:	Node ID of the system component that the RAS feature
+ *			is associated with. See ACPI spec 6.5 Table 5.80: RAS2
+ *			Platform Communication Channel Descriptor format,
+ *			Field: Instance
+ */
+struct ras2_mem_ctx {
+	struct auxiliary_device		adev;
+	struct acpi_ras2_shmem __iomem	*comm_addr;
+	struct device			*dev;
+	void				*sspcc;
+	struct mutex			*pcc_lock;
+	u32				sys_comp_nid;
+};
+
+#ifdef CONFIG_ACPI_RAS2
+void __init acpi_ras2_init(void);
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd);
+#else
+static inline void acpi_ras2_init(void) { }
+#endif
+
+#endif /* _ACPI_RAS2_H */
-- 
2.43.0




* [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
  2025-11-21 18:28 [PATCH v13 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
  2025-11-21 18:28 ` [PATCH v13 1/2] ACPI:RAS2: Add driver for the " shiju.jose
@ 2025-11-21 18:28 ` shiju.jose
  2025-11-22  5:18   ` Randy Dunlap
  2025-11-22  5:22   ` Randy Dunlap
  1 sibling, 2 replies; 10+ messages in thread
From: shiju.jose @ 2025-11-21 18:28 UTC
  To: rafael, bp, akpm, rppt, dferguson, linux-edac, linux-acpi,
	linux-mm, linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam,
	mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

The ACPI 6.5 Specification, section 5.2.21, defines the RAS2 feature table
(RAS2). The driver adds support for the RAS2 feature table, which provides
interfaces for platform RAS features, e.g. HW-based memory scrubbing and a
logical to PA translation service. RAS2 uses a PCC channel subspace for
communicating with the ACPI-compliant HW platform.

ACPI RAS2 auxiliary driver for the memory features binds to the auxiliary
device, which is added by the RAS2 table parser in the ACPI RAS2 driver.

Given the address range is not provided to userspace (and hence no
chance of exposing misleading values), even in the presence
of disjoint address ranges, use the start to end of the NUMA node
with the expectation that a firmware will allow that to indicate that
the full node will be scrubbed, skipping address ranges that are from
other NUMA nodes but happen to lie within this range.

The driver retrieves the PA range of the NUMA domain and uses it as the
'Requested Address Range' when sending the GET_PATROL_PARAMETERS command,
to get parameters that apply to all addresses in the NUMA domain, as well
as when sending the START_PATROL_SCRUBBER command to start demand scrubbing.

A device with the ACPI RAS2 scrub feature registers with the EDAC device
driver, which retrieves the scrub descriptor from EDAC scrub and exposes
the scrub control attributes for the RAS2 scrub instance to userspace in
/sys/bus/edac/devices/acpi_ras_memX/scrub0/.

Add 'enable_demand' attribute to the EDAC scrub interface to start/stop
the demand scrub, which is used in the RAS2 demand scrub control.

In the future, the RAS2 driver may add support for the 'addr' and 'size'
EDAC scrub-control attributes, to let the user set the address range
of the memory region to scrub.

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
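Illustration only (not part of the commit): a minimal userspace sketch of
how a node's page-frame span could be turned into the base and size used
as the 'Requested Address Range', assuming the span comes from
node_start_pfn() and node_spanned_pages() as mentioned in the cover letter
changelog. SKETCH_PAGE_SHIFT, struct addr_range and node_span_to_range()
are hypothetical names for this illustration, and 4 KiB pages are an
assumption.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical container for a physical address range. */
struct addr_range {
	uint64_t base;	/* first byte of the node span */
	uint64_t size;	/* length of the span in bytes, holes included */
};

/*
 * Convert a start PFN and a spanned page count into a byte range.
 * The span covers everything from the node's first to its last page,
 * so it may include addresses that belong to other nodes.
 */
static struct addr_range node_span_to_range(uint64_t start_pfn,
					    uint64_t spanned_pages)
{
	struct addr_range r = {
		.base = start_pfn << SKETCH_PAGE_SHIFT,
		.size = spanned_pages << SKETCH_PAGE_SHIFT,
	};

	return r;
}
```

For example, a start PFN of 0x100000 with 0x80000 spanned pages gives a
base of 0x100000000 and a size of 0x80000000 under the 4 KiB assumption.
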
 Documentation/ABI/testing/sysfs-edac-scrub |  13 +-
 Documentation/edac/scrub.rst               |  58 +++
 drivers/edac/scrub.c                       |  12 +
 drivers/ras/Kconfig                        |  12 +
 drivers/ras/Makefile                       |   1 +
 drivers/ras/acpi_ras2.c                    | 403 +++++++++++++++++++++
 include/acpi/ras2.h                        |  17 +
 include/linux/edac.h                       |   4 +
 8 files changed, 515 insertions(+), 5 deletions(-)
 create mode 100644 drivers/ras/acpi_ras2.c

diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
index ab6014743da5..3f68f63556f4 100644
--- a/Documentation/ABI/testing/sysfs-edac-scrub
+++ b/Documentation/ABI/testing/sysfs-edac-scrub
@@ -20,11 +20,7 @@ KernelVersion:	6.15
 Contact:	linux-edac@vger.kernel.org
 Description:
 		(RW) The base address of the memory region to be scrubbed
-		for on-demand scrubbing. Setting address starts scrubbing.
-		The size must be set before that.
-
-		The readback addr value is non-zero if the requested
-		on-demand scrubbing is in progress, zero otherwise.
+		for demand scrubbing.
 
 What:		/sys/bus/edac/devices/<dev-name>/scrubX/size
 Date:		March 2025
@@ -34,6 +30,13 @@ Description:
 		(RW) The size of the memory region to be scrubbed
 		(on-demand scrubbing).
 
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/enable_demand
+Date:		Jan 2026
+KernelVersion:	6.19
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Start/Stop demand scrubbing if supported.
+
 What:		/sys/bus/edac/devices/<dev-name>/scrubX/enable_background
 Date:		March 2025
 KernelVersion:	6.15
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index 2cfa74fa1ffd..737a10da224f 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -340,3 +340,61 @@ controller or platform when unexpectedly high error rates are detected.
 
 Sysfs files for scrubbing are documented in
 `Documentation/ABI/testing/sysfs-edac-ecs`
+
+3. ACPI RAS2 Hardware-based Memory Scrubbing
+
+3.1. On demand scrubbing for a specific memory region.
+
+3.1.1. Query the status of demand scrubbing
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
+
+0
+
+3.1.2. Query what is device default/current scrub cycle setting.
+
+Applicable to both demand and background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+36000
+
+3.1.3. Query the range of device supported scrub cycle for a memory region.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
+
+86400
+
+3.1.4. Program scrubbing for the memory region in RAS2 device to repeat every
+43200 seconds (half a day).
+
+# echo 43200 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+3.1.5. Start 'demand scrubbing'.
+
+When a demand scrub is started, any background scrub currently in progress
+will be stopped and then automatically restarted once the demand scrub has
+completed.
+
+# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
+
+3.2. Background scrubbing the entire memory
+
+3.2.1. Query the status of background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+0
+
+3.2.2. Program background scrubbing for RAS2 device to repeat in every 21600
+seconds (quarter of a day).
+
+# echo 21600 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+3.2.3. Start 'background scrubbing'.
+
+# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
diff --git a/drivers/edac/scrub.c b/drivers/edac/scrub.c
index f9d02af2fc3a..f3b9a2f04950 100644
--- a/drivers/edac/scrub.c
+++ b/drivers/edac/scrub.c
@@ -14,6 +14,7 @@ enum edac_scrub_attributes {
 	SCRUB_ADDRESS,
 	SCRUB_SIZE,
 	SCRUB_ENABLE_BACKGROUND,
+	SCRUB_ENABLE_DEMAND,
 	SCRUB_MIN_CYCLE_DURATION,
 	SCRUB_MAX_CYCLE_DURATION,
 	SCRUB_CUR_CYCLE_DURATION,
@@ -55,6 +56,7 @@ static ssize_t attrib##_show(struct device *ras_feat_dev,			\
 EDAC_SCRUB_ATTR_SHOW(addr, read_addr, u64, "0x%llx\n")
 EDAC_SCRUB_ATTR_SHOW(size, read_size, u64, "0x%llx\n")
 EDAC_SCRUB_ATTR_SHOW(enable_background, get_enabled_bg, bool, "%u\n")
+EDAC_SCRUB_ATTR_SHOW(enable_demand, get_enabled_od, bool, "%u\n")
 EDAC_SCRUB_ATTR_SHOW(min_cycle_duration, get_min_cycle, u32, "%u\n")
 EDAC_SCRUB_ATTR_SHOW(max_cycle_duration, get_max_cycle, u32, "%u\n")
 EDAC_SCRUB_ATTR_SHOW(current_cycle_duration, get_cycle_duration, u32, "%u\n")
@@ -84,6 +86,7 @@ static ssize_t attrib##_store(struct device *ras_feat_dev,			\
 EDAC_SCRUB_ATTR_STORE(addr, write_addr, u64, kstrtou64)
 EDAC_SCRUB_ATTR_STORE(size, write_size, u64, kstrtou64)
 EDAC_SCRUB_ATTR_STORE(enable_background, set_enabled_bg, unsigned long, kstrtoul)
+EDAC_SCRUB_ATTR_STORE(enable_demand, set_enabled_od, unsigned long, kstrtoul)
 EDAC_SCRUB_ATTR_STORE(current_cycle_duration, set_cycle_duration, unsigned long, kstrtoul)
 
 static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id)
@@ -119,6 +122,14 @@ static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int
 				return 0444;
 		}
 		break;
+	case SCRUB_ENABLE_DEMAND:
+		if (ops->get_enabled_od) {
+			if (ops->set_enabled_od)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
 	case SCRUB_MIN_CYCLE_DURATION:
 		if (ops->get_min_cycle)
 			return a->mode;
@@ -164,6 +175,7 @@ static int scrub_create_desc(struct device *scrub_dev,
 		[SCRUB_ADDRESS] = EDAC_SCRUB_ATTR_RW(addr, instance),
 		[SCRUB_SIZE] = EDAC_SCRUB_ATTR_RW(size, instance),
 		[SCRUB_ENABLE_BACKGROUND] = EDAC_SCRUB_ATTR_RW(enable_background, instance),
+		[SCRUB_ENABLE_DEMAND] = EDAC_SCRUB_ATTR_RW(enable_demand, instance),
 		[SCRUB_MIN_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance),
 		[SCRUB_MAX_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance),
 		[SCRUB_CUR_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance)
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index fc4f4bb94a4c..7e7afd2b2ba7 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -46,4 +46,16 @@ config RAS_FMPM
 	  Memory will be retired during boot time and run time depending on
 	  platform-specific policies.
 
+config MEM_ACPI_RAS2
+	tristate "Memory ACPI RAS2 driver"
+	depends on ACPI_RAS2
+	depends on EDAC
+	depends on EDAC_SCRUB
+	help
+	  The driver binds to the auxiliary device added by the ACPI RAS2
+	  feature table parser. The driver uses a PCC channel subspace to
+	  communicating with the ACPI-compliant platform and provides
+	  control of the HW-based memory scrubber parameters to the user
+	  through the EDAC scrub interface.
+
 endif
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index 11f95d59d397..a0e6e903d6b0 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_RAS)	+= ras.o
 obj-$(CONFIG_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_RAS_CEC)	+= cec.o
+obj-$(CONFIG_MEM_ACPI_RAS2)	+= acpi_ras2.o
 
 obj-$(CONFIG_RAS_FMPM)	+= amd/fmpm.o
 obj-y			+= amd/atl/
diff --git a/drivers/ras/acpi_ras2.c b/drivers/ras/acpi_ras2.c
new file mode 100644
index 000000000000..0997cccc5242
--- /dev/null
+++ b/drivers/ras/acpi_ras2.c
@@ -0,0 +1,403 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ACPI RAS2 memory driver
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ */
+
+#define pr_fmt(fmt)	"ACPI RAS2 MEMORY: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/delay.h>
+#include <linux/edac.h>
+#include <linux/kthread.h>
+#include <linux/platform_device.h>
+#include <acpi/ras2.h>
+
+#define RAS2_SUPPORT_HW_PARTOL_SCRUB BIT(0)
+#define RAS2_TYPE_PATROL_SCRUB 0x0000
+
+#define RAS2_GET_PATROL_PARAMETERS 0x01
+#define RAS2_START_PATROL_SCRUBBER 0x02
+#define RAS2_STOP_PATROL_SCRUBBER 0x03
+
+/*
+ * RAS2 patrol scrub
+ */
+#define RAS2_PS_SC_HRS_IN_MASK GENMASK(15, 8)
+#define RAS2_PS_EN_BACKGROUND BIT(0)
+#define RAS2_PS_SC_HRS_OUT_MASK GENMASK(7, 0)
+#define RAS2_PS_MIN_SC_HRS_OUT_MASK GENMASK(15, 8)
+#define RAS2_PS_MAX_SC_HRS_OUT_MASK GENMASK(23, 16)
+#define RAS2_PS_FLAG_SCRUB_RUNNING BIT(0)
+
+#define RAS2_SCRUB_NAME_LEN 128
+#define RAS2_HOUR_IN_SECS 3600
+
+struct acpi_ras2_ps_shared_mem {
+	struct acpi_ras2_shmem common;
+	struct acpi_ras2_patrol_scrub_param params;
+};
+
+#define TO_ACPI_RAS2_PS_SHMEM(_addr) \
+	container_of(_addr, struct acpi_ras2_ps_shared_mem, common)
+
+static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable);
+
+static int ras2_is_patrol_scrub_support(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_shmem __iomem *common = (void *)ras2_ctx->comm_addr;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	common->set_caps[0] = 0;
+
+	return common->features[0] & RAS2_SUPPORT_HW_PARTOL_SCRUB;
+}
+
+static int ras2_update_patrol_scrub_params_cache(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	int ret;
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.command = RAS2_GET_PATROL_PARAMETERS;
+	ps_sm->params.req_addr_range[0] = ras2_ctx->base;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->size;
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	ras2_ctx->min_scrub_cycle = FIELD_GET(RAS2_PS_MIN_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	ras2_ctx->max_scrub_cycle = FIELD_GET(RAS2_PS_MAX_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	ras2_ctx->scrub_cycle_hrs = FIELD_GET(RAS2_PS_SC_HRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	if (ras2_ctx->bg_scrub) {
+		ras2_ctx->od_scrub = false;
+		return 0;
+	}
+
+	if (ps_sm->params.flags & RAS2_PS_FLAG_SCRUB_RUNNING)
+		ras2_ctx->od_scrub = true;
+	else
+		ras2_ctx->od_scrub = false;
+
+	return 0;
+}
+
+/* Context - PCC lock must be held */
+static int ras2_get_demand_scrub_running(struct ras2_mem_ctx *ras2_ctx,
+					 bool *running)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	int ret;
+
+	if (!ras2_ctx->od_scrub) {
+		*running = false;
+		return 0;
+	}
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.command = RAS2_GET_PATROL_PARAMETERS;
+	ps_sm->params.req_addr_range[0] = ras2_ctx->base;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->size;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	*running = ps_sm->params.flags & RAS2_PS_FLAG_SCRUB_RUNNING;
+	if (!(*running))
+		ras2_ctx->od_scrub = false;
+
+	return 0;
+}
+
+static int ras2_scrub_monitor_thread(void *p)
+{
+	struct ras2_mem_ctx *ras2_ctx = (struct ras2_mem_ctx *)p;
+	bool running;
+	int ret;
+
+	while (!kthread_should_stop()) {
+		if (!ras2_ctx->reenable_bg_scrub)
+			return 0;
+
+		mutex_lock(ras2_ctx->pcc_lock);
+		ret = ras2_get_demand_scrub_running(ras2_ctx, &running);
+		mutex_unlock(ras2_ctx->pcc_lock);
+		if (ret)
+			return ret;
+
+		if (!running)
+			return ras2_hw_scrub_set_enabled_bg(ras2_ctx->dev,
+							    ras2_ctx, true);
+		msleep(1000);
+	}
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *min)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*min = ras2_ctx->min_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *max)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*max = ras2_ctx->max_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_read(struct device *dev, void *drv_data,
+				    u32 *scrub_cycle_secs)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*scrub_cycle_secs = ras2_ctx->scrub_cycle_hrs * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_write(struct device *dev, void *drv_data,
+				     u32 scrub_cycle_secs)
+{
+	u8 scrub_cycle_hrs = scrub_cycle_secs / RAS2_HOUR_IN_SECS;
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	bool running;
+	int ret;
+
+	if (ras2_ctx->bg_scrub)
+		return -EBUSY;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ret = ras2_get_demand_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	if (scrub_cycle_hrs < ras2_ctx->min_scrub_cycle ||
+	    scrub_cycle_hrs > ras2_ctx->max_scrub_cycle)
+		return -EINVAL;
+
+	ras2_ctx->scrub_cycle_hrs = scrub_cycle_hrs;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*enabled = ras2_ctx->bg_scrub;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	bool running;
+	int ret;
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ret = ras2_get_demand_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	if (enable) {
+		if (ras2_ctx->bg_scrub || running)
+			return -EBUSY;
+
+		ps_sm->params.req_addr_range[0] = 0;
+		ps_sm->params.req_addr_range[1] = 0;
+		ps_sm->params.scrub_params_in &= ~RAS2_PS_SC_HRS_IN_MASK;
+		ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_SC_HRS_IN_MASK,
+							    ras2_ctx->scrub_cycle_hrs);
+		ps_sm->params.command = RAS2_START_PATROL_SCRUBBER;
+	} else {
+		if (!ras2_ctx->bg_scrub)
+			return -EPERM;
+
+		ps_sm->params.command = RAS2_STOP_PATROL_SCRUBBER;
+	}
+
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_EN_BACKGROUND;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_EN_BACKGROUND,
+						    enable);
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(dev, "Failed to %s background scrubbing\n",
+			str_enable_disable(enable));
+		return ret;
+	}
+
+	ras2_ctx->bg_scrub = enable;
+	if (enable)
+		ras2_ctx->reenable_bg_scrub = false;
+
+	/* Update the cache to account for rounding of supplied parameters and similar */
+	return ras2_update_patrol_scrub_params_cache(ras2_ctx);
+}
+
+static int ras2_hw_scrub_get_enabled_od(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*enabled = ras2_ctx->od_scrub;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_od(struct device *dev, void *drv_data, bool enable)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		TO_ACPI_RAS2_PS_SHMEM(ras2_ctx->comm_addr);
+	struct task_struct *thrd;
+	bool running;
+	int ret;
+
+	/* Stop any background scrub currently in progress */
+	if (ras2_ctx->bg_scrub && enable) {
+		ret = ras2_hw_scrub_set_enabled_bg(dev, drv_data, false);
+		if (ret)
+			return ret;
+
+		ras2_ctx->reenable_bg_scrub = true;
+		thrd = kthread_run(ras2_scrub_monitor_thread, ras2_ctx,
+				   "ras2_scrub_nid%d", ras2_ctx->sys_comp_nid);
+		if (IS_ERR(thrd)) {
+			ras2_ctx->reenable_bg_scrub = false;
+			ras2_hw_scrub_set_enabled_bg(dev, drv_data, true);
+			return PTR_ERR(thrd);
+		}
+	}
+
+	guard(mutex)(ras2_ctx->pcc_lock);
+	ret = ras2_get_demand_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	ps_sm->common.set_caps[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_SC_HRS_IN_MASK;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PS_SC_HRS_IN_MASK,
+						    ras2_ctx->scrub_cycle_hrs);
+	ps_sm->params.req_addr_range[0] = ras2_ctx->base;
+	ps_sm->params.req_addr_range[1] = ras2_ctx->size;
+	ps_sm->params.scrub_params_in &= ~RAS2_PS_EN_BACKGROUND;
+	ps_sm->params.command = RAS2_START_PATROL_SCRUBBER;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, PCC_CMD_EXEC_RAS2);
+	if (ret) {
+		dev_err(dev, "Failed to start demand scrubbing rc(%d)\n", ret);
+		if (ret != -EBUSY) {
+			ps_sm->params.req_addr_range[0] = 0;
+			ps_sm->params.req_addr_range[1] = 0;
+			ras2_ctx->od_scrub = false;
+		}
+		return ret;
+	}
+
+	ras2_ctx->od_scrub = enable;
+
+	return ras2_update_patrol_scrub_params_cache(ras2_ctx);
+}
+
+static const struct edac_scrub_ops ras2_scrub_ops = {
+	.get_enabled_bg = ras2_hw_scrub_get_enabled_bg,
+	.set_enabled_bg = ras2_hw_scrub_set_enabled_bg,
+	.get_enabled_od = ras2_hw_scrub_get_enabled_od,
+	.set_enabled_od = ras2_hw_scrub_set_enabled_od,
+	.get_min_cycle = ras2_hw_scrub_read_min_scrub_cycle,
+	.get_max_cycle = ras2_hw_scrub_read_max_scrub_cycle,
+	.get_cycle_duration = ras2_hw_scrub_cycle_read,
+	.set_cycle_duration = ras2_hw_scrub_cycle_write,
+};
+
+static int ras2_probe(struct auxiliary_device *auxdev,
+		      const struct auxiliary_device_id *id)
+{
+	struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct ras2_mem_ctx, adev);
+	struct edac_dev_feature ras_features;
+	char scrub_name[RAS2_SCRUB_NAME_LEN];
+	unsigned long start_pfn, size_pfn;
+	int ret;
+
+	if (!ras2_is_patrol_scrub_support(ras2_ctx))
+		return -EOPNOTSUPP;
+
+	/*
+	 * Retrieve the PA range of the NUMA domain and use it as the
+	 * 'Requested Address Range', both when sending the
+	 * GET_PATROL_PARAMETERS command to get parameters that apply to
+	 * all addresses in the NUMA domain and when sending the
+	 * START_PATROL_SCRUBBER command to start demand scrubbing.
+	 */
+	start_pfn = node_start_pfn(ras2_ctx->sys_comp_nid);
+	size_pfn = node_spanned_pages(ras2_ctx->sys_comp_nid);
+	if (!size_pfn) {
+		pr_debug("Failed to find PA range of NUMA node(%u)\n",
+			 ras2_ctx->sys_comp_nid);
+		return -EPERM;
+	}
+
+	ras2_ctx->base = __pfn_to_phys(start_pfn);
+	ras2_ctx->size = __pfn_to_phys(size_pfn);
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+
+	sprintf(scrub_name, "acpi_ras_mem%d", auxdev->id);
+
+	ras_features.ft_type	= RAS_FEAT_SCRUB;
+	ras_features.instance	= 0;
+	ras_features.scrub_ops	= &ras2_scrub_ops;
+	ras_features.ctx	= ras2_ctx;
+
+	return edac_dev_register(&auxdev->dev, scrub_name, NULL, 1,
+				 &ras_features);
+}
+
+static const struct auxiliary_device_id ras2_mem_dev_id_table[] = {
+	{ .name = RAS2_AUX_DEV_NAME "." RAS2_MEM_DEV_ID_NAME, },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(auxiliary, ras2_mem_dev_id_table);
+
+static struct auxiliary_driver ras2_mem_driver = {
+	.name = RAS2_MEM_DEV_ID_NAME,
+	.probe = ras2_probe,
+	.id_table = ras2_mem_dev_id_table,
+};
+module_auxiliary_driver(ras2_mem_driver);
+
+MODULE_IMPORT_NS("ACPI_RAS2");
+MODULE_DESCRIPTION("ACPI RAS2 memory driver");
+MODULE_LICENSE("GPL");
diff --git a/include/acpi/ras2.h b/include/acpi/ras2.h
index 10deab0b5541..c0357f943bca 100644
--- a/include/acpi/ras2.h
+++ b/include/acpi/ras2.h
@@ -37,6 +37,15 @@ struct device;
  *			is associated with. See ACPI spec 6.5 Table 5.80: RAS2
  *			Platform Communication Channel Descriptor format,
  *			Field: Instance
+ * @base:		Base address of the memory region to scrub
+ * @size:		Size of the memory region to scrub
+ * @scrub_cycle_hrs:	Current scrub cycle duration, in hours
+ * @min_scrub_cycle:	Minimum supported scrub cycle duration, in hours
+ * @max_scrub_cycle:	Maximum supported scrub cycle duration, in hours
+ * @od_scrub:		Status of demand scrubbing (memory region)
+ * @bg_scrub:		Status of background patrol scrubbing
+ * @reenable_bg_scrub:	Flag indicating that background scrubbing is to be
+ *			restarted after demand scrubbing has finished
  */
 struct ras2_mem_ctx {
 	struct auxiliary_device		adev;
@@ -45,6 +54,14 @@ struct ras2_mem_ctx {
 	void				*sspcc;
 	struct mutex			*pcc_lock;
 	u32				sys_comp_nid;
+	u64				base;
+	u64				size;
+	u8				scrub_cycle_hrs;
+	u8				min_scrub_cycle;
+	u8				max_scrub_cycle;
+	bool				od_scrub;
+	bool				bg_scrub;
+	bool				reenable_bg_scrub;
 };
 
 #ifdef CONFIG_ACPI_RAS2
diff --git a/include/linux/edac.h b/include/linux/edac.h
index fa32f2aca22f..2342ff38e9d5 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -680,6 +680,8 @@ enum edac_dev_feat {
  * @write_size: set offset of the scrubbing range.
  * @get_enabled_bg: check if currently performing background scrub.
  * @set_enabled_bg: start or stop a bg-scrub.
+ * @get_enabled_od: check if currently performing demand scrub.
+ * @set_enabled_od: start or stop a demand-scrub.
  * @get_min_cycle: get minimum supported scrub cycle duration in seconds.
  * @get_max_cycle: get maximum supported scrub cycle duration in seconds.
  * @get_cycle_duration: get current scrub cycle duration in seconds.
@@ -692,6 +694,8 @@ struct edac_scrub_ops {
 	int (*write_size)(struct device *dev, void *drv_data, u64 size);
 	int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
 	int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
+	int (*get_enabled_od)(struct device *dev, void *drv_data, bool *enable);
+	int (*set_enabled_od)(struct device *dev, void *drv_data, bool enable);
 	int (*get_min_cycle)(struct device *dev, void *drv_data,  u32 *min);
 	int (*get_max_cycle)(struct device *dev, void *drv_data,  u32 *max);
 	int (*get_cycle_duration)(struct device *dev, void *drv_data, u32 *cycle);
-- 
2.43.0




* Re: [PATCH v13 1/2] ACPI:RAS2: Add driver for the ACPI RAS2 feature table
  2025-11-21 18:28 ` [PATCH v13 1/2] ACPI:RAS2: Add driver for the " shiju.jose
@ 2025-11-22  5:18   ` Randy Dunlap
  2025-11-25  7:36   ` Borislav Petkov
  1 sibling, 0 replies; 10+ messages in thread
From: Randy Dunlap @ 2025-11-22  5:18 UTC (permalink / raw)
  To: shiju.jose, rafael, bp, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, leo.duran,
	Yazen.Ghannam, mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang



On 11/21/25 10:28 AM, shiju.jose@huawei.com wrote:
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index ca00a5dbcf75..bfa9f3f4def5 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
>  	  If your platform does not support CPPC in firmware,
>  	  leave this option disabled.
>  
> +config ACPI_RAS2
> +	bool "ACPI RAS2 driver"
> +	select AUXILIARY_BUS
> +	select MAILBOX
> +	select PCC
> +	depends on NUMA_KEEP_MEMINFO
> +	help
> +	  This driver adds support for RAS2 feature table provides interfaces
> +	  for platform RAS features, for eg. HW-based memory scrubbing.

	                   features, e.g., for HW-based

> +	  If your platform does not support RAS2 in firmware, leave this
> +	  option disabled.

-- 
~Randy




* Re: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
  2025-11-21 18:28 ` [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver shiju.jose
@ 2025-11-22  5:18   ` Randy Dunlap
  2025-11-24  9:29     ` Shiju Jose
  2025-11-22  5:22   ` Randy Dunlap
  1 sibling, 1 reply; 10+ messages in thread
From: Randy Dunlap @ 2025-11-22  5:18 UTC (permalink / raw)
  To: shiju.jose, rafael, bp, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, leo.duran,
	Yazen.Ghannam, mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang



On 11/21/25 10:28 AM, shiju.jose@huawei.com wrote:
> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> index fc4f4bb94a4c..7e7afd2b2ba7 100644
> --- a/drivers/ras/Kconfig
> +++ b/drivers/ras/Kconfig
> @@ -46,4 +46,16 @@ config RAS_FMPM
>  	  Memory will be retired during boot time and run time depending on
>  	  platform-specific policies.
>  
> +config MEM_ACPI_RAS2
> +	tristate "Memory ACPI RAS2 driver"
> +	depends on ACPI_RAS2
> +	depends on EDAC
> +	depends on EDAC_SCRUB
> +	help
> +	  The driver binds to the auxiliary device added by the ACPI RAS2
> +	  feature table parser. The driver uses a PCC channel subspace to
> +	  communicating with the ACPI-compliant platform and provides

	  communicate with

> +	  control of the HW-based memory scrubber parameters to the user
> +	  through the EDAC scrub interface.

-- 
~Randy




* Re: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
  2025-11-21 18:28 ` [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver shiju.jose
  2025-11-22  5:18   ` Randy Dunlap
@ 2025-11-22  5:22   ` Randy Dunlap
  2025-11-24 10:00     ` Shiju Jose
  1 sibling, 1 reply; 10+ messages in thread
From: Randy Dunlap @ 2025-11-22  5:22 UTC (permalink / raw)
  To: shiju.jose, rafael, bp, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, leo.duran,
	Yazen.Ghannam, mchehab
  Cc: jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang



On 11/21/25 10:28 AM, shiju.jose@huawei.com wrote:
> diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
> index 2cfa74fa1ffd..737a10da224f 100644
> --- a/Documentation/edac/scrub.rst
> +++ b/Documentation/edac/scrub.rst
> @@ -340,3 +340,61 @@ controller or platform when unexpectedly high error rates are detected.
>  
>  Sysfs files for scrubbing are documented in
>  `Documentation/ABI/testing/sysfs-edac-ecs`
> +
> +3. ACPI RAS2 Hardware-based Memory Scrubbing
> +
> +3.1. On demand scrubbing for a specific memory region.
> +
> +3.1.1. Query the status of demand scrubbing
> +
> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
> +
> +0
> +
> +3.1.2. Query what is device default/current scrub cycle setting.
> +
> +Applicable to both demand and background scrubbing.
> +
> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
> +
> +36000
> +

What units (above)?

> +3.1.3. Query the range of device supported scrub cycle for a memory region.
> +
> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
> +
> +3600
> +
> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
> +
> +86400
> +

ditto.

> +3.1.4. Program scrubbing for the memory region in RAS2 device to repeat every
> +43200 seconds (half a day).
> +
> +# echo 43200 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
> +
> +3.1.5. Start 'demand scrubbing'.
> +
> +When a demand scrub is started, any background scrub currently in progress
> +will be stopped and then automatically restarted once the demand scrub has
> +completed.

Will it restart where it left off or at the beginning?

> +
> +# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
> +
> +3.2. Background scrubbing the entire memory
> +
> +3.2.1. Query the status of background scrubbing.
> +
> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
> +
> +0
> +
> +3.2.2. Program background scrubbing for RAS2 device to repeat in every 21600
> +seconds (quarter of a day).
> +
> +# echo 21600 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
> +
> +3.2.3. Start 'background scrubbing'.
> +
> +# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background

-- 
~Randy




* RE: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
  2025-11-22  5:18   ` Randy Dunlap
@ 2025-11-24  9:29     ` Shiju Jose
  0 siblings, 0 replies; 10+ messages in thread
From: Shiju Jose @ 2025-11-24  9:29 UTC (permalink / raw)
  To: Randy Dunlap, rafael, bp, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, leo.duran,
	Yazen.Ghannam, mchehab
  Cc: Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang


>-----Original Message-----
>From: Randy Dunlap <rdunlap@infradead.org>
>Sent: 22 November 2025 05:18
>To: Shiju Jose <shiju.jose@huawei.com>; rafael@kernel.org; bp@alien8.de;
>akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; mchehab@kernel.org
>Cc: Jonathan Cameron <jonathan.cameron@huawei.com>; Linuxarm
><linuxarm@huawei.com>; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
>
>
>
>On 11/21/25 10:28 AM, shiju.jose@huawei.com wrote:
>> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
>> index fc4f4bb94a4c..7e7afd2b2ba7 100644
>> --- a/drivers/ras/Kconfig
>> +++ b/drivers/ras/Kconfig
>> @@ -46,4 +46,16 @@ config RAS_FMPM
>>  	  Memory will be retired during boot time and run time depending on
>>  	  platform-specific policies.
>>
>> +config MEM_ACPI_RAS2
>> +	tristate "Memory ACPI RAS2 driver"
>> +	depends on ACPI_RAS2
>> +	depends on EDAC
>> +	depends on EDAC_SCRUB
>> +	help
>> +	  The driver binds to the auxiliary device added by the ACPI RAS2
>> +	  feature table parser. The driver uses a PCC channel subspace to
>> +	  communicating with the ACPI-compliant platform and provides
>
>	  communicate with
Thanks Randy for reviewing. I will fix.

>
>> +	  control of the HW-based memory scrubber parameters to the user
>> +	  through the EDAC scrub interface.
>
>--
>~Randy
>

Thanks,
Shiju


* RE: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
  2025-11-22  5:22   ` Randy Dunlap
@ 2025-11-24 10:00     ` Shiju Jose
  0 siblings, 0 replies; 10+ messages in thread
From: Shiju Jose @ 2025-11-24 10:00 UTC (permalink / raw)
  To: Randy Dunlap, rafael, bp, akpm, rppt, dferguson, linux-edac,
	linux-acpi, linux-mm, linux-doc, tony.luck, lenb, leo.duran,
	Yazen.Ghannam, mchehab
  Cc: Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang



>-----Original Message-----
>From: Randy Dunlap <rdunlap@infradead.org>
>Sent: 22 November 2025 05:23
>To: Shiju Jose <shiju.jose@huawei.com>; rafael@kernel.org; bp@alien8.de;
>akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; mchehab@kernel.org
>Cc: Jonathan Cameron <jonathan.cameron@huawei.com>; Linuxarm
><linuxarm@huawei.com>; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver
>
>
>
>On 11/21/25 10:28 AM, shiju.jose@huawei.com wrote:
>> diff --git a/Documentation/edac/scrub.rst
>> b/Documentation/edac/scrub.rst index 2cfa74fa1ffd..737a10da224f 100644
>> --- a/Documentation/edac/scrub.rst
>> +++ b/Documentation/edac/scrub.rst
>> @@ -340,3 +340,61 @@ controller or platform when unexpectedly high error
>rates are detected.
>>
>>  Sysfs files for scrubbing are documented in
>> `Documentation/ABI/testing/sysfs-edac-ecs`
>> +
>> +3. ACPI RAS2 Hardware-based Memory Scrubbing
>> +
>> +3.1. On demand scrubbing for a specific memory region.
>> +
>> +3.1.1. Query the status of demand scrubbing
>> +
>> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
>> +
>> +0
>> +
>> +3.1.2. Query what is device default/current scrub cycle setting.
>> +
>> +Applicable to both demand and background scrubbing.
>> +
>> +# cat
>> +/sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
>> +
>> +36000
>> +
>
>What units (above)?
In seconds.

>
>> +3.1.3. Query the range of device supported scrub cycle for a memory region.
>> +
>> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
>> +
>> +3600
>> +
>> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
>> +
>> +86400
>> +
>
>ditto.
Unit -  Seconds.
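
For illustration only, here is a minimal user-space sketch of how a duration
written in seconds could map to the whole-hour granularity the driver stores.
The helper name cycle_secs_to_hrs() and its parameters are hypothetical and
not part of the patch. Unlike the posted ras2_hw_scrub_cycle_write(), this
sketch compares against the range before narrowing to 8 bits; that ordering
is a choice made here, not something taken from the series.

```c
#include <errno.h>
#include <stdint.h>

#define HOUR_IN_SECS 3600u

/*
 * Hypothetical helper, not part of the patch: convert a cycle duration
 * given in seconds to whole hours and check it against the supported
 * range (also expressed in hours).
 * Returns 0 and stores the hour count on success, -EINVAL otherwise.
 */
static int cycle_secs_to_hrs(uint32_t secs, uint8_t min_hrs, uint8_t max_hrs,
			     uint8_t *hrs_out)
{
	uint32_t hrs = secs / HOUR_IN_SECS;	/* remainder is discarded */

	/* Compare before narrowing to 8 bits so large inputs cannot wrap. */
	if (hrs < min_hrs || hrs > max_hrs)
		return -EINVAL;

	*hrs_out = (uint8_t)hrs;
	return 0;
}
```

With the example limits from the documentation (3600 and 86400 seconds, i.e.
1 and 24 hours), 43200 maps to 12 hours, while 1800 and 90000 are rejected.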
>
>> +3.1.4. Program scrubbing for the memory region in RAS2 device to
>> +repeat every
>> +43200 seconds (half a day).
>> +
>> +# echo 43200 >
>> +/sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
>> +
>> +3.1.5. Start 'demand scrubbing'.
>> +
>> +When a demand scrub is started, any background scrub currently in
>> +progress will be stopped and then automatically restarted once the
>> +demand scrub has completed.
>
>Will it restart where it left off or at the beginning?
In this case, the kernel currently sends the 'START_PATROL_SCRUBBER' command to
restart background scrubbing, so it restarts from the beginning, unless the
firmware has some implementation to detect this case and 'resume' background
scrubbing from where it stopped. Otherwise, I think RAS2 could define new
commands to 'pause' and 'resume' scrubbing, if that makes sense, so that the
kernel could send those commands to the firmware for this case.
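
The behaviour described above can be pictured with a toy user-space model.
This is only a sketch: struct toy_scrubber, its bg_pos progress counter and
the toy_*() functions are hypothetical names invented for illustration, and
real firmware may behave differently (for example by resuming).

```c
#include <stdbool.h>

/* Toy model of a scrubber; all names and fields here are hypothetical. */
struct toy_scrubber {
	bool bg_running;	/* background scrub active */
	bool od_running;	/* demand scrub active */
	bool reenable_bg;	/* restart background scrub after demand scrub */
	unsigned int bg_pos;	/* background scrub progress, arbitrary units */
};

/* In this model, starting a background scrub always begins at position 0. */
static void toy_start_bg(struct toy_scrubber *s)
{
	s->bg_running = true;
	s->bg_pos = 0;
}

/* Starting a demand scrub stops a running background scrub first. */
static void toy_start_demand(struct toy_scrubber *s)
{
	if (s->bg_running) {
		s->bg_running = false;	/* stopped; progress is not saved */
		s->reenable_bg = true;
	}
	s->od_running = true;
}

/* One poll of the monitor: restart background scrub once demand is done. */
static void toy_monitor_poll(struct toy_scrubber *s)
{
	if (s->reenable_bg && !s->od_running) {
		s->reenable_bg = false;
		toy_start_bg(s);
	}
}
```

In this model a background scrub interrupted at any position is at position 0
again after the demand scrub ends, which corresponds to the "restarts from the
beginning" case; a pause/resume command pair would instead have to preserve
bg_pos.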
>
>> +
>> +# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_demand
>> +
>> +3.2. Background scrubbing the entire memory
>> +
>> +3.2.1. Query the status of background scrubbing.
>> +
>> +# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
>> +
>> +0
>> +
>> +3.2.2. Program background scrubbing for RAS2 device to repeat in
>> +every 21600 seconds (quarter of a day).
>> +
>> +# echo 21600 >
>> +/sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
>> +
>> +3.2.3. Start 'background scrubbing'.
>> +
>> +# echo 1 >
>> +/sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
>
>--
>~Randy

Thanks,
Shiju



* Re: [PATCH v13 1/2] ACPI:RAS2: Add driver for the ACPI RAS2 feature table
  2025-11-21 18:28 ` [PATCH v13 1/2] ACPI:RAS2: Add driver for the " shiju.jose
  2025-11-22  5:18   ` Randy Dunlap
@ 2025-11-25  7:36   ` Borislav Petkov
  2025-11-25 13:28     ` Shiju Jose
  1 sibling, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2025-11-25  7:36 UTC (permalink / raw)
  To: shiju.jose
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	jonathan.cameron, linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang

On Fri, Nov 21, 2025 at 06:28:20PM +0000, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
> 
> ACPI 6.5 Specification, section 5.2.21, defined RAS2 feature table (RAS2).
> Driver adds support for RAS2 feature table, which provides interfaces for
> platform RAS features, for eg. HW-based memory scrubbing, and logical to
> PA translation service. RAS2 uses PCC channel subspace for communicating
> with the ACPI compliant HW platform.
> 
> Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
> Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
>  drivers/acpi/Kconfig  |  12 ++
>  drivers/acpi/Makefile |   1 +
>  drivers/acpi/bus.c    |   3 +
>  drivers/acpi/ras2.c   | 398 ++++++++++++++++++++++++++++++++++++++++++
>  include/acpi/ras2.h   |  57 ++++++
>  5 files changed, 471 insertions(+)
>  create mode 100644 drivers/acpi/ras2.c
>  create mode 100644 include/acpi/ras2.h
> 
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index ca00a5dbcf75..bfa9f3f4def5 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
>  	  If your platform does not support CPPC in firmware,
>  	  leave this option disabled.
>  
> +config ACPI_RAS2
> +	bool "ACPI RAS2 driver"
> +	select AUXILIARY_BUS
> +	select MAILBOX
> +	select PCC

Why are those select instead of depend?

> +	depends on NUMA_KEEP_MEMINFO
> +	help
> +	  This driver adds support for RAS2 feature table provides interfaces
> +	  for platform RAS features, for eg. HW-based memory scrubbing.
> +	  If your platform does not support RAS2 in firmware, leave this
> +	  option disabled.

So this driver is so niche that the majority of users want to leave it
disabled. Pls explain that in the help.

>  config ACPI_PROCESSOR
>  	tristate "Processor"
>  	depends on X86 || ARM64 || LOONGARCH || RISCV
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index d1b0affb844f..abfec6745724 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
>  obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
>  obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
>  obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
> +obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
>  obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
>  obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
>  obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o

...

> +static int check_pcc_chan(struct ras2_sspcc *sspcc)
> +{
> +	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc->comm_addr;
> +	u32 cap_status;
> +	u16 status;
> +	int rc;
> +
> +	/*
> +	 * As per ACPI spec, the PCC space will be initialized by
								  ^
								 the

> +	 * platform and should have set the command completion bit when
> +	 * PCC can be used by OSPM.
> +	 *
> +	 * Poll PCC status register every 3us for maximum of 600ULL * PCC
> +	 * channel latency until PCC command complete bit is set.
> +	 */
> +	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
> +					status & PCC_STATUS_CMD_COMPLETE, 3,
> +					sspcc->deadline_us);
> +	if (rc) {
> +		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
> +			sspcc->pcc_id, rc);
> +		return rc;
> +	}
> +
> +	if (status & PCC_STATUS_ERROR) {
> +		pr_warn("Error in executing last command=%d for pcc_id=%d\n",

Commands are better printed in hex, no?

IOW, "... command: 0x%x for ..."

> +			sspcc->last_cmd, sspcc->pcc_id);
> +		status &= ~PCC_STATUS_ERROR;
> +		writew_relaxed(status, &gen_comm_base->status);
> +		return -EIO;
> +	}
> +
> +	cap_status = readw_relaxed(&gen_comm_base->set_caps_status);

Is that register read always successful or you need to handle errors here too?

> +	writew_relaxed(0x0, &gen_comm_base->set_caps_status);
> +	return decode_cap_error(cap_status);
> +}
> +
> +/**
> + * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
> + * @ras2_ctx:	pointer to the RAS2 context structure
> + * @cmd:	command to send
> + *
> + * Returns: 0 on success, an error otherwise
> + */
> +int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd)
> +{
> +	struct ras2_sspcc *sspcc = ras2_ctx->sspcc;

No check for ras2_ctx before dereffing it? Especially if this is an exported
function.

> +	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc->comm_addr;
> +	struct mbox_chan *pcc_channel;
> +	unsigned int time_delta;
> +	int rc;
> +
> +	rc = check_pcc_chan(sspcc);
> +	if (rc < 0)
> +		return rc;
> +
> +	pcc_channel = sspcc->pcc_chan->mchan;
> +
> +	/*
> +	 * Handle the Minimum Request Turnaround Time (MRTT).
> +	 * "The minimum amount of time that OSPM must wait after the completion
> +	 * of a command before issuing the next command, in microseconds."
> +	 */
> +	if (sspcc->pcc_mrtt) {
> +		time_delta = ktime_us_delta(ktime_get(),

Remove that linebreak pls. Audit your whole code for those pls.

> +					    sspcc->last_cmd_cmpl_time);
> +		if (sspcc->pcc_mrtt > time_delta)
> +			udelay(sspcc->pcc_mrtt - time_delta);
> +	}
> +
> +	/*
> +	 * Handle the non-zero Maximum Periodic Access Rate (MPAR).
> +	 * "The maximum number of periodic requests that the subspace channel can
> +	 * support, reported in commands per minute. 0 indicates no limitation."
> +	 *
> +	 * This parameter should be ideally zero or large enough so that it can
> +	 * handle maximum number of requests that all the cores in the system can
> +	 * collectively generate. If it is not, follow the spec and just not
> +	 * send the request to the platform after hitting the MPAR limit in
> +	 * any 60s window.
> +	 */
> +	if (sspcc->pcc_mpar) {
> +		if (sspcc->mpar_count == 0) {

		if (!sspcc->mpar_count) {


> +			time_delta = ktime_ms_delta(ktime_get(),
> +						    sspcc->last_mpar_reset);
> +			if (time_delta < 60 * MSEC_PER_SEC) {
> +				dev_dbg(ras2_ctx->dev,
> +					"PCC cmd(%u) not sent due to MPAR limit",
> +					cmd);
> +				return -EIO;
> +			}
> +			sspcc->last_mpar_reset = ktime_get();
> +			sspcc->mpar_count = sspcc->pcc_mpar;
> +		}
> +		sspcc->mpar_count--;
> +	}
> +
> +	/* Write to the shared comm region */
> +	writew_relaxed(cmd, &gen_comm_base->command);
> +
> +	/* Flip CMD COMPLETE bit */
> +	writew_relaxed(0, &gen_comm_base->status);
> +
> +	/* Ring doorbell */
> +	rc = mbox_send_message(pcc_channel, &cmd);
> +	if (rc < 0) {
> +		dev_warn(ras2_ctx->dev,
> +			 "Err sending PCC mbox message. cmd:%d, rc:%d\n",

Yeah, you can say "Error". It is easier for all those dmesg greppers :)

> +			 cmd, rc);
> +		return rc;
> +	}
> +
> +	sspcc->last_cmd = cmd;
> +
> +	/*
> +	 * If Minimum Request Turnaround Time is non-zero, need
> +	 * to record the completion time of both READ and WRITE
> +	 * command for proper handling of MRTT, so need to check
> +	 * for pcc_mrtt in addition to PCC_CMD_EXEC_RAS2.

	 * If Minimum Request Turnaround Time is non-zero, need to record the
	 * completion time of both READ and WRITE command for proper handling
	 * of MRTT, so need to check for pcc_mrtt in addition to
	 * PCC_CMD_EXEC_RAS2.

Looks properly formatted to me.

> +	 */
> +	if (cmd == PCC_CMD_EXEC_RAS2 || sspcc->pcc_mrtt) {
> +		rc = check_pcc_chan(sspcc);
> +		if (sspcc->pcc_mrtt)
> +			sspcc->last_cmd_cmpl_time = ktime_get();
> +	}
> +
> +	if (pcc_channel->mbox->txdone_irq)
> +		mbox_chan_txdone(pcc_channel, rc);
> +	else
> +		mbox_client_txdone(pcc_channel, rc);
> +
> +	return rc < 0 ? rc : 0;

So you mean simply
	
	return rc;

no? rc can be 0 too so what's the point of the ternary expression?

And what's the logic here? You'd capture rc above from check_pcc_chan() and
even if it is != 0, you'd pass it into the mbox* functions? I guess that
weirdness deserves a comment...

> +}
> +EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
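
The MRTT and MPAR handling quoted in ras2_send_pcc_cmd() above can be modelled
in plain userspace C. This is a sketch under stated assumptions, not the
driver code: struct pcc_rate_model and the helper names are hypothetical, and
time is passed in as a parameter instead of being read with ktime_get().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical bookkeeping, loosely mirroring the fields used in the patch. */
struct pcc_rate_model {
	unsigned int mrtt_us;		/* minimum request turnaround time, 0 = none */
	unsigned int mpar;		/* max commands per minute, 0 = unlimited */
	unsigned int mpar_count;	/* commands left in the current window */
	int64_t last_mpar_reset_ms;	/* start of the current 60 s window */
};

/* MRTT: how long the caller still has to wait before the next command. */
static unsigned int pcc_model_mrtt_wait_us(const struct pcc_rate_model *m,
					   unsigned int since_last_cmd_us)
{
	if (m->mrtt_us > since_last_cmd_us)
		return m->mrtt_us - since_last_cmd_us;
	return 0;
}

/* MPAR: returns 0 if a command may be sent now, -EIO if the budget is used up. */
static int pcc_model_mpar_admit(struct pcc_rate_model *m, int64_t now_ms)
{
	if (!m->mpar)
		return 0;

	if (!m->mpar_count) {
		/* Budget exhausted: only refill once 60 s have passed. */
		if (now_ms - m->last_mpar_reset_ms < 60 * 1000)
			return -EIO;
		m->last_mpar_reset_ms = now_ms;
		m->mpar_count = m->mpar;
	}
	m->mpar_count--;
	return 0;
}

/* Usage: two commands per minute, 100 us turnaround time. */
static void pcc_model_selftest(void)
{
	struct pcc_rate_model m = {
		.mrtt_us = 100, .mpar = 2, .last_mpar_reset_ms = -60000,
	};

	assert(pcc_model_mrtt_wait_us(&m, 30) == 70);
	assert(pcc_model_mrtt_wait_us(&m, 250) == 0);

	assert(pcc_model_mpar_admit(&m, 0) == 0);	/* new window, 1 left */
	assert(pcc_model_mpar_admit(&m, 10) == 0);	/* 0 left */
	assert(pcc_model_mpar_admit(&m, 20) == -EIO);	/* budget exhausted */
	assert(pcc_model_mpar_admit(&m, 60000) == 0);	/* 60 s later: new window */
}
```

Note that the window is only re-armed when the budget has run out, so this is
a "refill on exhaustion" limiter rather than a sliding window.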
> +
> +static int register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, int pcc_id)
> +{
> +	struct ras2_sspcc *sspcc;
> +	struct pcc_mbox_chan *pcc_chan;
> +	struct mbox_client *mbox_cl;
> +
> +	if (pcc_id < 0)
> +		return -EINVAL;
> +
> +	sspcc = kzalloc(sizeof(*sspcc), GFP_KERNEL);
> +	if (!sspcc)
> +		return -ENOMEM;
> +
> +	mbox_cl			= &sspcc->mbox_client;
> +	mbox_cl->knows_txdone	= true;
> +
> +	pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_id);
> +	if (IS_ERR(pcc_chan)) {
> +		kfree(sspcc);
> +		return PTR_ERR(pcc_chan);
> +	}
> +
> +	sspcc->pcc_id		= pcc_id;
> +	sspcc->pcc_chan		= pcc_chan;
> +	sspcc->comm_addr	= pcc_chan->shmem;
> +	sspcc->deadline_us	= PCC_NUM_RETRIES * pcc_chan->latency;
> +	sspcc->pcc_mrtt		= pcc_chan->min_turnaround_time;
> +	sspcc->pcc_mpar		= pcc_chan->max_access_rate;
> +	sspcc->mbox_client.knows_txdone	= true;
> +	sspcc->pcc_chnl_acq	= true;
> +
> +	ras2_ctx->sspcc		= sspcc;
> +	ras2_ctx->comm_addr	= sspcc->comm_addr;
> +	ras2_ctx->dev		= pcc_chan->mchan->mbox->dev;
> +
> +	mutex_init(&sspcc->pcc_lock);
> +	ras2_ctx->pcc_lock	= &sspcc->pcc_lock;
> +
> +	return 0;
> +}
> +
> +static DEFINE_IDA(ras2_ida);
> +static void ras2_release(struct device *device)
> +{
> +	struct auxiliary_device *auxdev = to_auxiliary_dev(device);
> +	struct ras2_sspcc *sspcc;
> +	struct ras2_mem_ctx *ras2_ctx =

No ugly linebreaks like that pls.

> +		container_of(auxdev, struct ras2_mem_ctx, adev);
> +
> +	ida_free(&ras2_ida, auxdev->id);
> +	sspcc = ras2_ctx->sspcc;
> +	pcc_mbox_free_channel(sspcc->pcc_chan);
> +	kfree(sspcc);
> +	kfree(ras2_ctx);
> +}
> +
> +static struct ras2_mem_ctx *

No ugly linebreaks like that pls.

> +add_aux_device(char *name, int channel, u32 pxm_inst)
> +{
> +	struct ras2_mem_ctx *ras2_ctx;
> +	struct ras2_sspcc *sspcc;
> +	int id, rc;
> +
> +	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
> +	if (!ras2_ctx)
> +		return ERR_PTR(-ENOMEM);
> +
> +	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);

This needs to handle NUMA_NO_NODE retval.

> +	rc = register_pcc_channel(ras2_ctx, channel);
> +	if (rc < 0) {
> +		pr_debug("Failed to register pcc channel rc=%d\n", rc);

Make that error message more informative by dumping pxm_inst, channel and
whatever else would be helpful in error debugging.

> +		goto ctx_free;
> +	}
> +
> +	id = ida_alloc(&ras2_ida, GFP_KERNEL);
> +	if (id < 0) {
> +		rc = id;
> +		goto pcc_free;
> +	}
> +
> +	ras2_ctx->adev.id		= id;
> +	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;

Wouldn't it make sense to have id be part of the name? I.e.,
"acpi_ras2_mem%d" with id at the %d?


> +	ras2_ctx->adev.dev.release	= ras2_release;
> +	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
> +
> +	rc = auxiliary_device_init(&ras2_ctx->adev);
> +	if (rc)
> +		goto ida_free;
> +
> +	rc = auxiliary_device_add(&ras2_ctx->adev);
> +	if (rc) {
> +		auxiliary_device_uninit(&ras2_ctx->adev);
> +		return ERR_PTR(rc);
> +	}
> +
> +	return ras2_ctx;
> +
> +ida_free:
> +	ida_free(&ras2_ida, id);
> +pcc_free:
> +	sspcc = ras2_ctx->sspcc;
> +	pcc_mbox_free_channel(sspcc->pcc_chan);
> +	kfree(sspcc);
> +ctx_free:
> +	kfree(ras2_ctx);
> +
> +	return ERR_PTR(rc);
> +}
> +
> +static void acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)

"parse_ras2_table"

> +{
> +	struct acpi_ras2_pcc_desc *pcc_desc_list;
> +	struct ras2_mem_ctx *ras2_ctx;
> +	u16 i, count;
> +
> +	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
> +		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too short, size=%u)\n",
> +			ras2_tab->header.length);
> +		return;
> +	}
> +
> +	if (!ras2_tab->num_pcc_descs) {
> +		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
> +		return;
> +	}

You need to sanity-check the number of descs so that the below allocation
doesn't go nuts.

> +
> +	struct ras2_mem_ctx **pctx_list __free(kfree) = kzalloc(ras2_tab->num_pcc_descs * sizeof(*pctx_list), GFP_KERNEL);

Function member declarations at the beginning of the function, pls, and then
you can remove this ugly linebreak too.

> +	if (!pctx_list)
> +		return;
> +
> +	count = 0;
> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
> +			continue;
> +
> +		ras2_ctx = add_aux_device(RAS2_MEM_DEV_ID_NAME,
> +					  pcc_desc_list->channel_id,
> +					  pcc_desc_list->instance);
> +		if (IS_ERR(ras2_ctx)) {
> +			pr_warn("Failed to add RAS2 auxiliary device rc=%ld\n",
> +				PTR_ERR(ras2_ctx));
> +			for (i = count; i > 0; i--)

You don't need that count var - can use i directly.

> +				auxiliary_device_uninit(&pctx_list[i - 1]->adev);
> +			return;

When you return here you have dangling pointers in that pctx_list array.

> +		}
> +		pctx_list[count++] = ras2_ctx;


Also, what's the point of that pctx_list array at all? So that you can do
uninit on the ->adev in case you encounter a failure?

> +	}
> +}
> +
> +/**
> + * acpi_ras2_init - RAS2 driver initialization function.
> + *
> + * Extracts the ACPI RAS2 table and retrieves ID for the PCC channel subspace
> + * for communicating with the ACPI compliant HW platform. Driver adds an
> + * auxiliary device, which binds to the memory ACPI RAS2 driver, for each RAS2
> + * memory feature.
> + *
> + * Returns: none.
> + */
> +void __init acpi_ras2_init(void)
> +{
> +	struct acpi_table_ras2 *ras2_tab;
> +	acpi_status status;
> +
> +	status = acpi_get_table(ACPI_SIG_RAS2, 0,
> +				(struct acpi_table_header **)&ras2_tab);
> +	if (ACPI_FAILURE(status)) {
> +		pr_err("Failed to get table, %s\n", acpi_format_exception(status));

Looks like pr_debug to me.

> +		return;
> +	}
> +
> +	acpi_ras2_parse(ras2_tab);

This function does some table sanity checking and warns. What it should do is
fail the driver load if the table is broken.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette



* RE: [PATCH v13 1/2] ACPI:RAS2: Add driver for the ACPI RAS2 feature table
  2025-11-25  7:36   ` Borislav Petkov
@ 2025-11-25 13:28     ` Shiju Jose
  0 siblings, 0 replies; 10+ messages in thread
From: Shiju Jose @ 2025-11-25 13:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: rafael, akpm, rppt, dferguson, linux-edac, linux-acpi, linux-mm,
	linux-doc, tony.luck, lenb, leo.duran, Yazen.Ghannam, mchehab,
	Jonathan Cameron, Linuxarm, rientjes, jiaqiyan, Jon.Grimm,
	dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	wbs, nifan.cxl, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang

>-----Original Message-----
>From: Borislav Petkov <bp@alien8.de>
>Sent: 25 November 2025 07:36
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: rafael@kernel.org; akpm@linux-foundation.org; rppt@kernel.org;
>dferguson@amperecomputing.com; linux-edac@vger.kernel.org; linux-
>acpi@vger.kernel.org; linux-mm@kvack.org; linux-doc@vger.kernel.org;
>tony.luck@intel.com; lenb@kernel.org; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; mchehab@kernel.org; Jonathan Cameron
><jonathan.cameron@huawei.com>; Linuxarm <linuxarm@huawei.com>;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
>Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
>wanghuiqiang <wanghuiqiang@huawei.com>
>Subject: Re: [PATCH v13 1/2] ACPI:RAS2: Add driver for the ACPI RAS2 feature
>table
>
>On Fri, Nov 21, 2025 at 06:28:20PM +0000, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> ACPI 6.5 Specification, section 5.2.21, defined RAS2 feature table (RAS2).
>> Driver adds support for RAS2 feature table, which provides interfaces
>> for platform RAS features, for eg. HW-based memory scrubbing, and
>> logical to PA translation service. RAS2 uses PCC channel subspace for
>> communicating with the ACPI compliant HW platform.
>>
>> Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
>> Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>>  drivers/acpi/Kconfig  |  12 ++
>>  drivers/acpi/Makefile |   1 +
>>  drivers/acpi/bus.c    |   3 +
>>  drivers/acpi/ras2.c   | 398
>++++++++++++++++++++++++++++++++++++++++++
>>  include/acpi/ras2.h   |  57 ++++++
>>  5 files changed, 471 insertions(+)
>>  create mode 100644 drivers/acpi/ras2.c  create mode 100644
>> include/acpi/ras2.h
>>
>> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index
>> ca00a5dbcf75..bfa9f3f4def5 100644
>> --- a/drivers/acpi/Kconfig
>> +++ b/drivers/acpi/Kconfig
>> @@ -293,6 +293,18 @@ config ACPI_CPPC_LIB
>>  	  If your platform does not support CPPC in firmware,
>>  	  leave this option disabled.
>>
>> +config ACPI_RAS2
>> +	bool "ACPI RAS2 driver"
>> +	select AUXILIARY_BUS
>> +	select MAILBOX
>> +	select PCC
>
>Why are those select instead of depend?

Thanks Borislav for reviewing and feedback.

I will change these to 'depends on'. I had followed the existing config ACPI_CPPC_LIB.
>
>> +	depends on NUMA_KEEP_MEMINFO
>> +	help
>> +	  This driver adds support for RAS2 feature table provides interfaces
>> +	  for platform RAS features, for eg. HW-based memory scrubbing.
>> +	  If your platform does not support RAS2 in firmware, leave this
>> +	  option disabled.
>
>So this driver is so niche that the majority of users want to leave it disabled. Pls
>explain that in the help.

Will change.
>
>>  config ACPI_PROCESSOR
>>  	tristate "Processor"
>>  	depends on X86 || ARM64 || LOONGARCH || RISCV diff --git
>> a/drivers/acpi/Makefile b/drivers/acpi/Makefile index
>> d1b0affb844f..abfec6745724 100644
>> --- a/drivers/acpi/Makefile
>> +++ b/drivers/acpi/Makefile
>> @@ -105,6 +105,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
>>  obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
>>  obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
>>  obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
>> +obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
>>  obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
>>  obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
>>  obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o
>
>...
>
>> +static int check_pcc_chan(struct ras2_sspcc *sspcc) {
>> +	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc-
>>comm_addr;
>> +	u32 cap_status;
>> +	u16 status;
>> +	int rc;
>> +
>> +	/*
>> +	 * As per ACPI spec, the PCC space will be initialized by
>								  ^
>								 the
>
>> +	 * platform and should have set the command completion bit when
>> +	 * PCC can be used by OSPM.
>> +	 *
>> +	 * Poll PCC status register every 3us for maximum of 600ULL * PCC
>> +	 * channel latency until PCC command complete bit is set.
>> +	 */
>> +	rc = readw_relaxed_poll_timeout(&gen_comm_base->status, status,
>> +					status &
>PCC_STATUS_CMD_COMPLETE, 3,
>> +					sspcc->deadline_us);
>> +	if (rc) {
>> +		pr_warn("PCC check channel timeout for pcc_id=%d rc=%d\n",
>> +			sspcc->pcc_id, rc);
>> +		return rc;
>> +	}
>> +
>> +	if (status & PCC_STATUS_ERROR) {
>> +		pr_warn("Error in executing last command=%d for
>pcc_id=%d\n",
>
>Commands are better printed in hex, no?
>
>IOW, "... command: 0x%x for ..."

Sure.
>
>> +			sspcc->last_cmd, sspcc->pcc_id);
>> +		status &= ~PCC_STATUS_ERROR;
>> +		writew_relaxed(status, &gen_comm_base->status);
>> +		return -EIO;
>> +	}
>> +
>> +	cap_status = readw_relaxed(&gen_comm_base->set_caps_status);
>
>Is that register read always successful or you need to handle errors here too?

The value read from 'set capability status' is decoded by the function call below,
'return decode_cap_error(cap_status)', which returns an error code in the error case.
>
>> +	writew_relaxed(0x0, &gen_comm_base->set_caps_status);
>> +	return decode_cap_error(cap_status); }
>> +
>> +/**
>> + * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
>> + * @ras2_ctx:	pointer to the RAS2 context structure
>> + * @cmd:	command to send
>> + *
>> + * Returns: 0 on success, an error otherwise  */ int
>> +ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd) {
>> +	struct ras2_sspcc *sspcc = ras2_ctx->sspcc;
>
>No check for ras2_ctx before dereffing it? Especially if this is an exported
>function.

I will add a validity check for ras2_ctx.
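
A minimal userspace sketch of such a check, assuming the function should
return -EINVAL for a missing context or channel. struct model_ctx, struct
model_sspcc and model_send_cmd() are hypothetical stand-ins, not the driver's
types, and the choice of error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's context structures. */
struct model_sspcc {
	uint16_t last_cmd;
};

struct model_ctx {
	struct model_sspcc *sspcc;
};

/* Validate the caller-supplied pointers before the first dereference. */
static int model_send_cmd(struct model_ctx *ctx, uint16_t cmd)
{
	struct model_sspcc *sspcc;

	if (!ctx || !ctx->sspcc)
		return -EINVAL;	/* assumed error code */

	sspcc = ctx->sspcc;
	sspcc->last_cmd = cmd;
	return 0;
}

/* Usage: NULL context, context without channel, valid context. */
static void model_ctx_selftest(void)
{
	struct model_sspcc sspcc = { 0 };
	struct model_ctx good = { .sspcc = &sspcc };
	struct model_ctx no_chan = { .sspcc = NULL };

	assert(model_send_cmd(NULL, 0x1) == -EINVAL);
	assert(model_send_cmd(&no_chan, 0x1) == -EINVAL);
	assert(model_send_cmd(&good, 0x1) == 0 && sspcc.last_cmd == 0x1);
}
```

The point is only that the assignment from ctx->sspcc moves below the check
instead of sitting in the declaration.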
>
>> +	struct acpi_ras2_shmem __iomem *gen_comm_base = sspcc-
>>comm_addr;
>> +	struct mbox_chan *pcc_channel;
>> +	unsigned int time_delta;
>> +	int rc;
>> +
>> +	rc = check_pcc_chan(sspcc);
>> +	if (rc < 0)
>> +		return rc;
>> +
>> +	pcc_channel = sspcc->pcc_chan->mchan;
>> +
>> +	/*
>> +	 * Handle the Minimum Request Turnaround Time (MRTT).
>> +	 * "The minimum amount of time that OSPM must wait after the
>completion
>> +	 * of a command before issuing the next command, in microseconds."
>> +	 */
>> +	if (sspcc->pcc_mrtt) {
>> +		time_delta = ktime_us_delta(ktime_get(),
>
>Remove that linebreak pls. Audit your whole code for those pls.

Sure.
>
>> +					    sspcc->last_cmd_cmpl_time);
[...]
>> +	/* Ring doorbell */
>> +	rc = mbox_send_message(pcc_channel, &cmd);
>> +	if (rc < 0) {
>> +		dev_warn(ras2_ctx->dev,
>> +			 "Err sending PCC mbox message. cmd:%d, rc:%d\n",
>
>Yeah, you can say "Error". It is easier for all those dmesg greppers :)

Sure.
>
>> +			 cmd, rc);
>> +		return rc;
>> +	}
>> +
>> +	sspcc->last_cmd = cmd;
>> +
>> +	/*
>> +	 * If Minimum Request Turnaround Time is non-zero, need
>> +	 * to record the completion time of both READ and WRITE
>> +	 * command for proper handling of MRTT, so need to check
>> +	 * for pcc_mrtt in addition to PCC_CMD_EXEC_RAS2.
>
>	 * If Minimum Request Turnaround Time is non-zero, need to record the
>	 * completion time of both READ and WRITE command for proper
>handling
>	 * of MRTT, so need to check for pcc_mrtt in addition to
>	 * PCC_CMD_EXEC_RAS2.
>
>Looks properly formatted to me.

Will correct.
>
>> +	 */
>> +	if (cmd == PCC_CMD_EXEC_RAS2 || sspcc->pcc_mrtt) {
>> +		rc = check_pcc_chan(sspcc);
>> +		if (sspcc->pcc_mrtt)
>> +			sspcc->last_cmd_cmpl_time = ktime_get();
>> +	}
>> +
>> +	if (pcc_channel->mbox->txdone_irq)
>> +		mbox_chan_txdone(pcc_channel, rc);
>> +	else
>> +		mbox_client_txdone(pcc_channel, rc);
>> +
>> +	return rc < 0 ? rc : 0;
>
>So you mean simply
>
>	return rc;
>
>no? rc can be 0 too so what's the point of the ternary expression?

This was added to handle the case where rc = check_pcc_chan(sspcc) is not called
and the last rc is the one returned by mbox_send_message(), because mbox_send_message()
returns a non-negative value on success and a negative value on failure, as per the documentation.
https://elixir.bootlin.com/linux/v6.18-rc7/source/drivers/mailbox/mailbox.c#L241
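
The return-value handling described here can be modelled as follows. This is
a userspace sketch only: model_cmd_result() and its parameters are
hypothetical, with send_rc standing for the value returned by
mbox_send_message() and check_rc for the value returned by check_pcc_chan().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * send_rc:   negative errno on failure, otherwise a non-negative value
 * ran_check: whether a completion check ran after the send
 * check_rc:  result of that check, 0 or a negative errno
 */
static int model_cmd_result(int send_rc, bool ran_check, int check_rc)
{
	int rc = send_rc;

	if (rc < 0)
		return rc;		/* send failed, nothing more to do */

	if (ran_check)
		rc = check_rc;		/* the check result replaces the send result */

	/* Without the check, rc may still be a positive value from the send. */
	return rc < 0 ? rc : 0;
}

/* Usage: the four combinations that matter. */
static void model_cmd_result_selftest(void)
{
	assert(model_cmd_result(3, false, 0) == 0);		/* positive send rc -> 0 */
	assert(model_cmd_result(3, true, 0) == 0);
	assert(model_cmd_result(3, true, -EIO) == -EIO);	/* check failed */
	assert(model_cmd_result(-EBUSY, true, 0) == -EBUSY);	/* send failed */
}
```

An alternative that avoids the ternary would be to reset rc to 0 right after
a successful send, which might read more clearly.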

>
>And what's the logic here? You'd capture rc above from check_pcc_chan() and
>even if it is != 0, you'd pass it into the mbox* functions? I guess that weirdness
>deserves a comment...

Both mbox_chan_txdone() and mbox_client_txdone() require the status of the
last transmission as their second argument.
https://elixir.bootlin.com/linux/v6.18-rc7/source/drivers/mailbox/mailbox.c#L159
https://elixir.bootlin.com/linux/v6.18-rc7/source/drivers/mailbox/mailbox.c#L180

>
>> +}
>> +EXPORT_SYMBOL_GPL(ras2_send_pcc_cmd);
>> +
[...]
>> +static DEFINE_IDA(ras2_ida);
>> +static void ras2_release(struct device *device) {
>> +	struct auxiliary_device *auxdev = to_auxiliary_dev(device);
>> +	struct ras2_sspcc *sspcc;
>> +	struct ras2_mem_ctx *ras2_ctx =
>
>No ugly linebreaks like that pls.

Will fix.
>
>> +		container_of(auxdev, struct ras2_mem_ctx, adev);
>> +
>> +	ida_free(&ras2_ida, auxdev->id);
>> +	sspcc = ras2_ctx->sspcc;
>> +	pcc_mbox_free_channel(sspcc->pcc_chan);
>> +	kfree(sspcc);
>> +	kfree(ras2_ctx);
>> +}
>> +
>> +static struct ras2_mem_ctx *
>
>No ugly linebreaks like that pls.

Will fix.
>
>> +add_aux_device(char *name, int channel, u32 pxm_inst) {
>> +	struct ras2_mem_ctx *ras2_ctx;
>> +	struct ras2_sspcc *sspcc;
>> +	int id, rc;
>> +
>> +	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
>> +	if (!ras2_ctx)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	ras2_ctx->sys_comp_nid = pxm_to_node(pxm_inst);
>
>This needs to handle NUMA_NO_NODE retval.

Sure. Will correct.
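
A minimal userspace sketch of handling the "no node" case, under the
assumption that the device should not be created when the proximity domain
has no NUMA node. model_pxm_to_node(), its two-entry map and the -ENODEV
return value are hypothetical; MODEL_NUMA_NO_NODE mirrors the kernel's
NUMA_NO_NODE value of -1.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUMA_NO_NODE	(-1)

/* Hypothetical proximity-domain map: only domains 0 and 1 have a node. */
static int model_pxm_to_node(uint32_t pxm)
{
	static const int pxm_map[] = { 0, 1 };

	if (pxm >= sizeof(pxm_map) / sizeof(pxm_map[0]))
		return MODEL_NUMA_NO_NODE;
	return pxm_map[pxm];
}

/* Resolve the node and refuse to continue if there is none. */
static int model_resolve_node(uint32_t pxm, int *nid)
{
	int node = model_pxm_to_node(pxm);

	if (node == MODEL_NUMA_NO_NODE)
		return -ENODEV;	/* assumed error code */

	*nid = node;
	return 0;
}

/* Usage: a mapped and an unmapped proximity domain. */
static void model_node_selftest(void)
{
	int nid = 42;

	assert(model_resolve_node(1, &nid) == 0 && nid == 1);
	/* On failure the output is left untouched. */
	assert(model_resolve_node(7, &nid) == -ENODEV && nid == 1);
}
```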
>
>> +	rc = register_pcc_channel(ras2_ctx, channel);
>> +	if (rc < 0) {
>> +		pr_debug("Failed to register pcc channel rc=%d\n", rc);
>
>Make that error message more informative by dumping pxm_inst, channel and
>whatever else would be helpful in error debugging.

Sure.
>
>> +		goto ctx_free;
>> +	}
>> +
>> +	id = ida_alloc(&ras2_ida, GFP_KERNEL);
>> +	if (id < 0) {
>> +		rc = id;
>> +		goto pcc_free;
>> +	}
>> +
>> +	ras2_ctx->adev.id		= id;
>> +	ras2_ctx->adev.name		= RAS2_MEM_DEV_ID_NAME;
>
>Wouldn't it make sense to have id be part of the name? I.e., "acpi_ras2_mem%d"
>with id at the %d?
adev.name and the name of the auxiliary_driver (in patch 2) must match for the
driver to probe successfully. Both are set to the constant name RAS2_MEM_DEV_ID_NAME ("acpi_ras2_mem").

auxiliary_driver (in patch 2):
static struct auxiliary_driver ras2_mem_driver = {
	.name = RAS2_MEM_DEV_ID_NAME,
	.probe = ras2_probe,
	.id_table = ras2_mem_dev_id_table,
};
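
For reference, a userspace model of how the auxiliary bus name matching works,
as far as I understand it: the bus builds the device name as
"<modname>.<name>.<id>" and compares the part before the last '.' with the
entries in the driver's id_table. The helper names and the "modname" string
below are hypothetical; this is a sketch, not the bus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build the device name the way the bus is assumed to: modname.name.id */
static int model_aux_dev_name(char *buf, size_t len, const char *modname,
			      const char *name, unsigned int id)
{
	return snprintf(buf, len, "%s.%s.%u", modname, name, id);
}

/* Match the device name, minus its trailing ".<id>", against an id_table name. */
static bool model_aux_match(const char *dev_name, const char *id_name)
{
	const char *p = strrchr(dev_name, '.');
	size_t match_size;

	if (!p)
		return false;

	match_size = (size_t)(p - dev_name);
	return strlen(id_name) == match_size &&
	       !strncmp(dev_name, id_name, match_size);
}

/* Usage: two instances of the same device type match one table entry. */
static void model_aux_selftest(void)
{
	char n0[64], n1[64];

	model_aux_dev_name(n0, sizeof(n0), "modname", "acpi_ras2_mem", 0);
	model_aux_dev_name(n1, sizeof(n1), "modname", "acpi_ras2_mem", 1);

	assert(strcmp(n0, "modname.acpi_ras2_mem.0") == 0);
	assert(model_aux_match(n0, "modname.acpi_ras2_mem"));
	assert(model_aux_match(n1, "modname.acpi_ras2_mem"));
	/* Folding the id into the name would no longer match the entry. */
	assert(!model_aux_match(n0, "modname.acpi_ras2_mem0"));
}
```

In this model the id already shows up as the suffix of the device name, so
each instance stays distinguishable without changing adev.name.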

>
>
>> +	ras2_ctx->adev.dev.release	= ras2_release;
>> +	ras2_ctx->adev.dev.parent	= ras2_ctx->dev;
>> +
>> +	rc = auxiliary_device_init(&ras2_ctx->adev);
>> +	if (rc)
>> +		goto ida_free;
>> +
>> +	rc = auxiliary_device_add(&ras2_ctx->adev);
>> +	if (rc) {
>> +		auxiliary_device_uninit(&ras2_ctx->adev);
>> +		return ERR_PTR(rc);
>> +	}
>> +
>> +	return ras2_ctx;
>> +
>> +ida_free:
>> +	ida_free(&ras2_ida, id);
>> +pcc_free:
>> +	sspcc = ras2_ctx->sspcc;
>> +	pcc_mbox_free_channel(sspcc->pcc_chan);
>> +	kfree(sspcc);
>> +ctx_free:
>> +	kfree(ras2_ctx);
>> +
>> +	return ERR_PTR(rc);
>> +}
>> +
>> +static void acpi_ras2_parse(struct acpi_table_ras2 *ras2_tab)
>
>"parse_ras2_table"

Sure. Will change.
>
>> +{
>> +	struct acpi_ras2_pcc_desc *pcc_desc_list;
>> +	struct ras2_mem_ctx *ras2_ctx;
>> +	u16 i, count;
>> +
>> +	if (ras2_tab->header.length < sizeof(*ras2_tab)) {
>> +		pr_warn(FW_WARN "ACPI RAS2 table present but broken (too
>short, size=%u)\n",
>> +			ras2_tab->header.length);
>> +		return;
>> +	}
>> +
>> +	if (!ras2_tab->num_pcc_descs) {
>> +		pr_warn(FW_WARN "No PCC descs in ACPI RAS2 table\n");
>> +		return;
>> +	}
>
>You need to sanity-check the number of descs so that the below allocation
>doesn't go nuts.
Sorry, can you give more information?
Is the above check, 'if (!ras2_tab->num_pcc_descs) { }', not enough?
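
One possible reading of the request, offered as an assumption rather than as
what was meant: besides rejecting zero, the descriptor count could be bounded
by the space the table length actually leaves for descriptors. A userspace
sketch with simplified, hypothetical structures (the real ACPI layouts
differ):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical table layouts for illustration only. */
struct model_ras2_tab {
	uint32_t length;	/* total table length in bytes */
	uint16_t reserved;
	uint16_t num_pcc_descs;	/* descriptor count claimed by firmware */
};

struct model_pcc_desc {
	uint32_t instance;
	uint16_t feature_type;
	uint16_t channel_id;
};

/* True if the claimed descriptors fit inside the table. */
static bool model_descs_fit(const struct model_ras2_tab *tab)
{
	size_t avail;

	if (tab->length < sizeof(*tab))
		return false;		/* table shorter than its own header */
	if (!tab->num_pcc_descs)
		return false;		/* nothing to parse */

	avail = tab->length - sizeof(*tab);
	return tab->num_pcc_descs <= avail / sizeof(struct model_pcc_desc);
}

/* Usage: a table with room for exactly two descriptors. */
static void model_descs_selftest(void)
{
	struct model_ras2_tab t = {
		.length = sizeof(struct model_ras2_tab) +
			  2 * sizeof(struct model_pcc_desc),
		.num_pcc_descs = 2,
	};

	assert(model_descs_fit(&t));

	t.num_pcc_descs = 3;	/* claims more than the table can hold */
	assert(!model_descs_fit(&t));

	t.num_pcc_descs = 0;
	assert(!model_descs_fit(&t));

	t.length = 4;		/* truncated table */
	t.num_pcc_descs = 1;
	assert(!model_descs_fit(&t));
}
```

With such a bound, both the allocation size and the walk over the descriptor
list stay within what the firmware table can actually contain.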
>
>> +
>> +	struct ras2_mem_ctx **pctx_list __free(kfree) =
>> +kzalloc(ras2_tab->num_pcc_descs * sizeof(*pctx_list), GFP_KERNEL);
>
>Function member declarations at the beginning of the function, pls, and then you
>can remove this ugly linebreak too.
>
>> +	if (!pctx_list)
>> +		return;

Sure.
>> +
>> +	count = 0;
>> +	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(ras2_tab + 1);
>> +	for (i = 0; i < ras2_tab->num_pcc_descs; i++, pcc_desc_list++) {
>> +		if (pcc_desc_list->feature_type != RAS2_FEAT_TYPE_MEMORY)
>> +			continue;
>> +
>> +		ras2_ctx = add_aux_device(RAS2_MEM_DEV_ID_NAME,
>> +					  pcc_desc_list->channel_id,
>> +					  pcc_desc_list->instance);
>> +		if (IS_ERR(ras2_ctx)) {
>> +			pr_warn("Failed to add RAS2 auxiliary device rc=%ld\n",
>> +				PTR_ERR(ras2_ctx));
>> +			for (i = count; i > 0; i--)
>
>You don't need that count var - can use i directly.

Will change.
>
>> +				auxiliary_device_uninit(&pctx_list[i - 1]->adev);
>> +			return;
>
>When you return here you have dangling pointers in that pctx_list array.

pctx_list will be freed when exiting this function.
>
>> +		}
>> +		pctx_list[count++] = ras2_ctx;
>
>
>Also, what's the point of that pctx_list array at all? So that you can do uninit on
>the ->adev in case you encounter a failure?
The local variable ras2_ctx is updated by the add_aux_device() call in each iteration, as
add_aux_device() allocates memory for a struct ras2_mem_ctx for the corresponding PCC
descriptor in the RAS2 table.
The pointer to each ras2_ctx is therefore stored in pctx_list[] so that all the previously
added auxiliary devices can be uninitialized with auxiliary_device_uninit(->adev) when a
failure is encountered in a later iteration.
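
The unwind pattern described above, reduced to a userspace sketch. All names
are hypothetical, and malloc()/free() stand in for adding and uninitializing
an auxiliary device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_dev {
	int id;
};

static int model_live;	/* number of devices currently allocated */

/* Stand-in for adding a device; fails on purpose when id == fail_at. */
static struct model_dev *model_add(int id, int fail_at)
{
	struct model_dev *d;

	if (id == fail_at)
		return NULL;

	d = malloc(sizeof(*d));
	if (!d)
		return NULL;

	d->id = id;
	model_live++;
	return d;
}

/* Stand-in for removing a device. */
static void model_del(struct model_dev *d)
{
	free(d);
	model_live--;
}

/* Add n devices; on failure undo the earlier ones in reverse order. */
static int model_add_all(struct model_dev **list, int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		list[i] = model_add(i, fail_at);
		if (!list[i]) {
			while (i--) {
				model_del(list[i]);
				list[i] = NULL;	/* leave no stale pointer behind */
			}
			return -1;
		}
	}
	return 0;
}

/* Usage: a failure at the third device unwinds the first two. */
static void model_unwind_selftest(void)
{
	struct model_dev *list[4] = { NULL };
	int i;

	assert(model_add_all(list, 4, 2) == -1);
	assert(model_live == 0 && !list[0] && !list[1]);

	assert(model_add_all(list, 4, -1) == 0);
	assert(model_live == 4);
	for (i = 0; i < 4; i++)
		model_del(list[i]);
	assert(model_live == 0);
}
```

In this model every iteration adds a device, so the loop index doubles as the
count of devices to undo. That would not hold for a loop that skips some
entries, as the quoted loop does for non-memory features.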

>
>> +	}
>> +}
>> +
>> +/**
>> + * acpi_ras2_init - RAS2 driver initialization function.
>> + *
>> + * Extracts the ACPI RAS2 table and retrieves ID for the PCC channel
>> +subspace
>> + * for communicating with the ACPI compliant HW platform. Driver adds
>> +an
>> + * auxiliary device, which binds to the memory ACPI RAS2 driver, for
>> +each RAS2
>> + * memory feature.
>> + *
>> + * Returns: none.
>> + */
>> +void __init acpi_ras2_init(void)
>> +{
>> +	struct acpi_table_ras2 *ras2_tab;
>> +	acpi_status status;
>> +
>> +	status = acpi_get_table(ACPI_SIG_RAS2, 0,
>> +				(struct acpi_table_header **)&ras2_tab);
>> +	if (ACPI_FAILURE(status)) {
>> +		pr_err("Failed to get table, %s\n",
>acpi_format_exception(status));
>
>Looks like pr_debug to me.

Sure. Will change.
>
>> +		return;
>> +	}
>> +
>> +	acpi_ras2_parse(ras2_tab);
>
>This function does some table sanity checking and warns. What it should do is fail
>the driver load if the table is broken.

Sure.
If acpi_ras2_parse(), and thus acpi_ras2_init(), returns an error, can you advise
how to handle this error in acpi_init(void), where acpi_ras2_init() is called?
Something similar to the below?
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index b02ceb2837c6..8b4fc572a05b 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -1475,7 +1475,12 @@ static int __init acpi_init(void)
        acpi_debugger_init();
        acpi_setup_sb_notify_handler();
        acpi_viot_init();
-       acpi_ras2_init();
+       result = acpi_ras2_init();
+       if (result) {
+               kobject_put(acpi_kobj);
+               disable_acpi();
+               return result;
+       }
 
        return 0;
 }
>

Thanks,
Shiju



end of thread, other threads:[~2025-11-25 13:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-21 18:28 [PATCH v13 0/2] ACPI: Add support for ACPI RAS2 feature table shiju.jose
2025-11-21 18:28 ` [PATCH v13 1/2] ACPI:RAS2: Add driver for the " shiju.jose
2025-11-22  5:18   ` Randy Dunlap
2025-11-25  7:36   ` Borislav Petkov
2025-11-25 13:28     ` Shiju Jose
2025-11-21 18:28 ` [PATCH v13 2/2] ras: mem: Add ACPI RAS2 memory driver shiju.jose
2025-11-22  5:18   ` Randy Dunlap
2025-11-24  9:29     ` Shiju Jose
2025-11-22  5:22   ` Randy Dunlap
2025-11-24 10:00     ` Shiju Jose
