From: <shiju.jose@huawei.com>
To: <linux-edac@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-acpi@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>
Cc: <bp@alien8.de>, <tony.luck@intel.com>, <rafael@kernel.org>,
<lenb@kernel.org>, <mchehab@kernel.org>,
<dan.j.williams@intel.com>, <dave@stgolabs.net>,
<jonathan.cameron@huawei.com>, <dave.jiang@intel.com>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<ira.weiny@intel.com>, <david@redhat.com>,
<Vilas.Sridharan@amd.com>, <leo.duran@amd.com>,
<Yazen.Ghannam@amd.com>, <rientjes@google.com>,
<jiaqiyan@google.com>, <Jon.Grimm@amd.com>,
<dave.hansen@linux.intel.com>, <naoya.horiguchi@nec.com>,
<james.morse@arm.com>, <jthoughton@google.com>,
<somasundaram.a@hpe.com>, <erdemaktas@google.com>,
<pgonda@google.com>, <duenwen@google.com>, <gthelen@google.com>,
<wschwartz@amperecomputing.com>, <dferguson@amperecomputing.com>,
<wbs@os.amperecomputing.com>, <nifan.cxl@gmail.com>,
<yazen.ghannam@amd.com>, <tanxiaofei@huawei.com>,
<prime.zeng@hisilicon.com>, <roberto.sassu@huawei.com>,
<kangkang.shen@futurewei.com>, <wanghuiqiang@huawei.com>,
<linuxarm@huawei.com>, <shiju.jose@huawei.com>
Subject: [PATCH v17 01/18] EDAC: Add support for EDAC device features control
Date: Fri, 22 Nov 2024 18:03:58 +0000
Message-ID: <20241122180416.1932-2-shiju.jose@huawei.com>
In-Reply-To: <20241122180416.1932-1-shiju.jose@huawei.com>
From: Shiju Jose <shiju.jose@huawei.com>
Add generic EDAC device feature controls supporting the registration
of RAS features available in the system. The driver exposes control
attributes for these features to userspace in
/sys/bus/edac/devices/<dev-name>/<ras-feature>/
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
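Illustration only, not part of the patch: a minimal userspace sketch of
the count-then-fill pattern that edac_dev_register() applies to its
attribute groups (one pass to size the NULL-terminated array, a second
pass to populate it, with cleanup on error). Every demo_* and DEMO_*
name is hypothetical; the patch itself defines only RAS_FEAT_MAX and
leaves the per-feature cases to later patches in the series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical feature types, standing in for enum edac_dev_feat. */
enum demo_feat { DEMO_FEAT_SCRUB, DEMO_FEAT_ECS, DEMO_FEAT_MAX };

struct demo_feature {
	enum demo_feat type;
	unsigned char instance;
};

/* Stand-in for struct attribute_group: only a name is kept here. */
struct demo_group { const char *name; };

static const struct demo_group scrub_group = { "scrub" };
static const struct demo_group ecs_group = { "ecs" };

/* Number of groups a feature contributes, or -EINVAL if unknown. */
static int demo_group_count(enum demo_feat type)
{
	switch (type) {
	case DEMO_FEAT_SCRUB:
	case DEMO_FEAT_ECS:
		return 1;
	default:
		return -EINVAL;
	}
}

/*
 * Pass 1 counts the groups, pass 2 fills a NULL-terminated array.
 * Returns the number of groups and stores the array in *out (the
 * caller frees it); returns a negative errno without leaking on error.
 */
int demo_build_groups(const struct demo_feature *feats, int num,
		      const struct demo_group ***out)
{
	const struct demo_group **groups;
	int total = 0, filled = 0, i, n;

	if (!feats || num <= 0 || !out)
		return -EINVAL;

	for (i = 0; i < num; i++) {
		n = demo_group_count(feats[i].type);
		if (n < 0)
			return n;
		total += n;
	}

	/* One extra zeroed slot serves as the NULL terminator. */
	groups = calloc(total + 1, sizeof(*groups));
	if (!groups)
		return -ENOMEM;

	for (i = 0; i < num; i++) {
		switch (feats[i].type) {
		case DEMO_FEAT_SCRUB:
			groups[filled++] = &scrub_group;
			break;
		case DEMO_FEAT_ECS:
			groups[filled++] = &ecs_group;
			break;
		default:
			free(groups);
			return -EINVAL;
		}
	}

	/* Both passes must agree, or the array was sized wrongly. */
	assert(filled == total);
	*out = groups;
	return filled;
}
```

In the kernel function the array is handed to dev.groups and released
by the device's release callback; in this sketch the caller simply
calls free() on it.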
Documentation/edac/features.rst | 94 ++++++++++++++++++++++++++++++
Documentation/edac/index.rst | 10 ++++
drivers/edac/edac_device.c | 100 ++++++++++++++++++++++++++++++++
include/linux/edac.h | 28 +++++++++
4 files changed, 232 insertions(+)
create mode 100644 Documentation/edac/features.rst
create mode 100644 Documentation/edac/index.rst
diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
new file mode 100644
index 000000000000..e7a63146e708
--- /dev/null
+++ b/Documentation/edac/features.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================
+Augmenting EDAC for controlling RAS features
+============================================
+
+Copyright (c) 2024 HiSilicon Limited.
+
+:Author: Shiju Jose <shiju.jose@huawei.com>
+:License: The GNU Free Documentation License, Version 1.2
+ (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.13
+
+Introduction
+------------
+EDAC is extended to control RAS features and to expose their control
+attributes to userspace via sysfs. Some examples:
+
+* Scrub control
+
+* Error Check Scrub (ECS) control
+
+* ACPI RAS2 features
+
+* Post Package Repair (PPR) control
+
+* Memory sparing repair control
+
+The high-level design is illustrated in the following diagram::
+
+ _______________________________________________
+ | Userspace - Rasdaemon |
+ | _____________ |
+ | | RAS CXL mem | _______________ |
+ | |error handler|---->| | |
+ | |_____________| | RAS dynamic | |
+ | _____________ | scrub, memory | |
+ | | RAS memory |---->| repair control| |
+ | |error handler| |_______________| |
+ | |_____________| | |
+ |__________________________|____________________|
+ |
+ |
+ _______________________________|______________________________
+ | Kernel EDAC extension for | controlling RAS Features |
+ | ______________________________|____________________________ |
+ || EDAC Core Sysfs EDAC| Bus | |
+ || __________________________|_________ _____________ | |
+ || |/sys/bus/edac/devices/<dev>/scrubX/ | | EDAC device || |
+ || |/sys/bus/edac/devices/<dev>/ecsX/ |<->| EDAC MC || |
+ || |/sys/bus/edac/devices/<dev>/repairX | | EDAC sysfs || |
+ || |____________________________________| |_____________|| |
+ || EDAC|Bus | |
+ || | | |
+ || __________ Get feature | Get feature | |
+ || | |desc _________|______ desc __________ | |
+ || |EDAC scrub|<-----| EDAC device | | | | |
+ || |__________| | driver- RAS |---->| EDAC mem | | |
+ || __________ | feature control| | repair | | |
+ || | |<-----|________________| |__________| | |
+ || |EDAC ECS | Register RAS|features | |
+ || |__________| | | |
+ || ______________________|_____________ | |
+ ||_________|_______________|__________________|______________| |
+ | _______|____ _______|_______ ____|__________ |
+ | | | | CXL mem driver| | Client driver | |
+ | | ACPI RAS2 | | scrub, ECS, | | memory repair | |
+ | | driver | | sparing, PPR | | features | |
+ | |____________| |_______________| |_______________| |
+ | | | | |
+ |________|_________________|____________________|______________|
+ | | |
+ ________|_________________|____________________|______________
+ | ___|_________________|____________________|_______ |
+ | | | |
+ | | Platform HW and Firmware | |
+ | |__________________________________________________| |
+ |______________________________________________________________|
+
+
+1. EDAC feature components - Create feature-specific descriptors, for
+example EDAC scrub, EDAC ECS and EDAC memory repair in the above
+diagram.
+
+2. EDAC device driver for controlling RAS features - Gets the feature's
+attribute descriptors from the EDAC RAS feature components, registers the
+device's RAS features with the EDAC bus and exposes the feature control
+attributes via sysfs, for example /sys/bus/edac/devices/<dev-name>/<feature>X/
+
+3. RAS dynamic feature controller - Sample userspace modules in rasdaemon
+that issue scrubbing/repair dynamically when an excessive number of
+corrected memory errors is reported in a short span of time.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
new file mode 100644
index 000000000000..b6c265a4cffb
--- /dev/null
+++ b/Documentation/edac/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+EDAC Subsystem
+==============
+
+.. toctree::
+ :maxdepth: 1
+
+ features
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 621dc2a5d034..9fce46dd7405 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -570,3 +570,103 @@ void edac_device_handle_ue_count(struct edac_device_ctl_info *edac_dev,
block ? block->name : "N/A", count, msg);
}
EXPORT_SYMBOL_GPL(edac_device_handle_ue_count);
+
+static void edac_dev_release(struct device *dev)
+{
+ struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev);
+
+ kfree(ctx->dev.groups);
+ kfree(ctx);
+}
+
+const struct device_type edac_dev_type = {
+ .name = "edac_dev",
+ .release = edac_dev_release,
+};
+
+static void edac_dev_unreg(void *data)
+{
+ device_unregister(data);
+}
+
+/**
+ * edac_dev_register - register device for RAS features with EDAC
+ * @parent: parent device.
+ * @name: parent device's name.
+ * @private: parent driver's data to store in the context if any.
+ * @num_features: number of RAS features to register.
+ * @ras_features: list of RAS features to register.
+ *
+ * Return:
+ * * %0 - Success.
+ * * %-EINVAL - Invalid parameters passed.
+ * * %-ENOMEM - Dynamic memory allocation failed.
+ *
+ */
+int edac_dev_register(struct device *parent, char *name,
+ void *private, int num_features,
+ const struct edac_dev_feature *ras_features)
+{
+ const struct attribute_group **ras_attr_groups;
+ struct edac_dev_feat_ctx *ctx;
+ int attr_gcnt = 0;
+ int ret, feat;
+
+ if (!parent || !name || !num_features || !ras_features)
+ return -EINVAL;
+
+ /* Two passes: count the attribute groups first, then populate them */
+ for (feat = 0; feat < num_features; feat++) {
+ switch (ras_features[feat].ft_type) {
+ /* Add feature specific code */
+ default:
+ return -EINVAL;
+ }
+ }
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return -ENOMEM;
+
+ ras_attr_groups = kcalloc(attr_gcnt + 1, sizeof(*ras_attr_groups), GFP_KERNEL);
+ if (!ras_attr_groups) {
+ ret = -ENOMEM;
+ goto ctx_free;
+ }
+
+ attr_gcnt = 0;
+ for (feat = 0; feat < num_features; feat++, ras_features++) {
+ switch (ras_features->ft_type) {
+ /* Add feature specific code */
+ default:
+ ret = -EINVAL;
+ goto groups_free;
+ }
+ }
+
+ ctx->dev.parent = parent;
+ ctx->dev.bus = edac_get_sysfs_subsys();
+ ctx->dev.type = &edac_dev_type;
+ ctx->dev.groups = ras_attr_groups;
+ ctx->private = private;
+ dev_set_drvdata(&ctx->dev, ctx);
+
+ ret = dev_set_name(&ctx->dev, "%s", name);
+ if (ret)
+ goto groups_free;
+
+ ret = device_register(&ctx->dev);
+ if (ret) {
+ put_device(&ctx->dev);
+ return ret;
+ }
+
+ return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev);
+
+groups_free:
+ kfree(ras_attr_groups);
+ctx_free:
+ kfree(ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(edac_dev_register);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index b4ee8961e623..521b17113d4d 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -661,4 +661,32 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
return mci->dimms[index];
}
+
+#define EDAC_FEAT_NAME_LEN 128
+
+/* RAS feature type */
+enum edac_dev_feat {
+ RAS_FEAT_MAX
+};
+
+/* EDAC device feature information structure */
+struct edac_dev_data {
+ u8 instance;
+ void *private;
+};
+
+struct edac_dev_feat_ctx {
+ struct device dev;
+ void *private;
+};
+
+struct edac_dev_feature {
+ enum edac_dev_feat ft_type;
+ u8 instance;
+ void *ctx;
+};
+
+int edac_dev_register(struct device *parent, char *name,
+ void *private, int num_features,
+ const struct edac_dev_feature *ras_features);
#endif /* _LINUX_EDAC_H_ */
--
2.43.0