* [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
@ 2025-02-07 14:44 shiju.jose
  2025-02-07 14:44 ` [PATCH v19 01/15] EDAC: Add support for EDAC device features control shiju.jose
                   ` (15 more replies)
  0 siblings, 16 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

The CXL patches in this series have a dependency on Dave's CXL fwctl
series [1].

The code is based on v3 of the CXL fwctl series [1] posted by Dave and
v3 of the FWCTL series [2] posted by Jason, and is rebased on top of
v6.14-rc1.

[1]: https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jiang@intel.com/
[2]: https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@nvidia.com/#r


Userspace code for the CXL memory repair features is available at [3] and
a sample boot-script for CXL memory repair at [4].

[3]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
[4]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/
  
Augmenting EDAC for controlling RAS features
============================================
This series proposes expanding EDAC to control RAS features and to expose
feature control attributes to userspace via sysfs.
Some examples:
 - Scrub control
 - Error Check Scrub (ECS) control
 - ACPI RAS2 features
 - Post Package Repair (PPR) control
 - Memory Sparing Repair control etc.

High level design is illustrated in the following diagram.
 
        +-----------------------------------------------+
        |   Userspace - Rasdaemon                       |
        | +-------------+                               |
        | | RAS CXL mem |     +---------------+         |
        | |error handler|---->|               |         |
        | +-------------+     | RAS dynamic   |         |
        | +-------------+     | scrub, memory |         |
        | | RAS memory  |---->| repair control|         |
        | |error handler|     +----|----------+         |
        | +-------------+          |                    |
        +--------------------------|--------------------+
                                   |
                                   |
   +-------------------------------|------------------------------+
   |     Kernel EDAC extension for | controlling RAS Features     |
   |+------------------------------|----------------------------+ |
   || EDAC Core          Sysfs EDAC| Bus                        | |
   ||   +--------------------------|---------------------------+| |
   ||   |/sys/bus/edac/devices/<dev>/scrubX/ |   | EDAC device || |
   ||   |/sys/bus/edac/devices/<dev>/ecsX/   |<->| EDAC MC     || |
   ||   |/sys/bus/edac/devices/<dev>/repairX |   | EDAC sysfs  || |
   ||   +---------------------------|--------------------------+| |
   ||                           EDAC|Bus                        | |
   ||                               |                           | |
   ||   +----------+ Get feature    |      Get feature          | |
   ||   |          | desc +---------|------+ desc +----------+  | |
   ||   |EDAC scrub|<-----| EDAC device    |      |          |  | |
   ||   +----------+      | driver- RAS    |----->| EDAC mem |  | |
   ||   +----------+      | feature control|      | repair   |  | |
   ||   |          |<-----|                |      +----------+  | |
   ||   |EDAC ECS  |      +---------|------+                    | |
   ||   +----------+    Register RAS|features                   | |
   ||         ______________________|_____________              | |
   |+---------|---------------|------------------|--------------+ |
   |  +-------|----+  +-------|-------+     +----|----------+     |
   |  |            |  | CXL mem driver|     | Client driver |     |
   |  | ACPI RAS2  |  | scrub, ECS,   |     | memory repair |     |
   |  | driver     |  | sparing, PPR  |     | features      |     |
   |  +-----|------+  +-------|-------+     +------|--------+     |
   |        |                 |                    |              |
   +--------|-----------------|--------------------|--------------+
            |                 |                    |
   +--------|-----------------|--------------------|--------------+
   |    +---|-----------------|--------------------|-------+      |
   |    |                                                  |      |
   |    |            Platform HW and Firmware              |      |
   |    +--------------------------------------------------+      |
   +--------------------------------------------------------------+


1. EDAC RAS feature components - create feature-specific descriptors,
   for example EDAC scrub, EDAC ECS and EDAC memory repair in the above
   diagram.
2. EDAC device driver for controlling RAS features - gets the feature's
   attribute descriptors from the EDAC RAS feature components, registers
   the device's RAS features with the EDAC bus and exposes the feature's
   sysfs attributes under the sysfs EDAC bus.
3. RAS dynamic scrub controller - a userspace sample module added to
   rasdaemon for scrub control, which issues scrubbing when an excess
   number of memory errors is reported in a short span of time.

The added EDAC feature-specific components (e.g. EDAC scrub, EDAC ECS,
EDAC memory repair) call back into the parent driver (e.g. the CXL driver,
the ACPI RAS driver) for the controls, rather than just letting the caller
deal with it, for the following reasons (see the registration sketch after
this list):
1. It enforces a common API across multiple implementations. That could be
   done via review instead, but that has generally not gone well in the
   long run for subsystems which tried it (several have later moved to
   callback and feature-list based approaches).
2. It gives a path for 'intercepting' in the EDAC feature driver.
   For example, we could intercept PPR repair calls and sanity check that
   the memory in question is offline before passing the request back to
   the underlying code. We could rely on doing that via some additional
   calls from the parent driver, but the ABI would get messier.
3. (Speculative) we may get in-kernel users of some features in the
   long run.
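
To make the registration flow above concrete, here is a rough sketch of how
a parent driver is expected to hook into this, based on the
edac_dev_register() interface added in patch 01. The RAS_FEAT_SCRUB
enumerator and the "my_dev" naming are assumptions for illustration only;
the real feature types arrive with the later feature patches.

/*
 * Hypothetical sketch only: edac_dev_register() and struct
 * edac_dev_feature are from patch 01; RAS_FEAT_SCRUB and the device
 * name are assumed names for illustration.
 */
#include <linux/device.h>
#include <linux/edac.h>

static int my_dev_ras_features_init(struct device *dev, void *pvt)
{
	struct edac_dev_feature ras_features[] = {
		{
			.ft_type  = RAS_FEAT_SCRUB,	/* assumed enumerator */
			.instance = 0,
			.ctx      = pvt,	/* handed back in feature callbacks */
		},
	};

	/* On success this creates /sys/bus/edac/devices/my_dev0/scrub0/ */
	return edac_dev_register(dev, "my_dev0", pvt,
				 ARRAY_SIZE(ras_features), ras_features);
}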

More details of the common RAS features are described in the following
sections.

Memory Scrubbing
================
Increasing DRAM size and cost have made memory subsystem reliability an
important concern. These memory modules are used where potentially
corrupted data could cause expensive or fatal issues. Memory errors are
one of the top hardware failures that cause server and workload crashes.

Memory scrub is a feature where an ECC engine reads data from
each memory media location, corrects with an ECC if necessary and
writes the corrected data back to the same memory media location.

Memory DIMMs can be scrubbed at a configurable rate to detect uncorrected
memory errors and to attempt recovery from detected memory errors,
providing the following benefits.
- Proactively scrubbing memory DIMMs reduces the chance of a correctable
  error becoming uncorrectable.
- Once detected, uncorrected errors caught in unallocated memory pages are
  isolated and prevented from being allocated to an application or the OS.
- The probability of software/hardware products encountering memory
  errors is reduced.
Some background details can be found in Reference [5].

There are two types of memory scrubbing:
1. Background (patrol) scrubbing of the RAM whilst the RAM is otherwise
   idle.
2. On-demand scrubbing of a specific address range/region of memory.

Several types of interfaces to HW memory scrubbers have been identified,
such as ACPI NVDIMM ARS (Address Range Scrub), CXL memory device patrol
scrub, CXL DDR5 ECS and ACPI RAS2 memory scrubbing.

The scrub controls vary between different memory scrubbers. To allow for
standard userspace tooling, these controls need to be presented with a
standard ABI.

Introduce a generic EDAC memory scrub control which allows the user to
control the underlying scrubbers in the system via a generic sysfs scrub
control interface. The common sysfs scrub control interface abstracts the
control of arbitrary scrubbing functionality into a common set of
functions.
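
As a rough illustration of what this looks like from userspace, below is a
minimal C sketch driving the sysfs scrub ABI added later in this series
(see Documentation/ABI/testing/sysfs-edac-scrub); the device name
"cxl_mem0" and scrub instance 0 are placeholders.

/* Userspace sketch against the sysfs-edac-scrub ABI in this series;
 * "cxl_mem0" and scrub instance 0 are placeholders.
 */
#include <stdio.h>

static int scrub_write(const char *attr, const char *val)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/edac/devices/cxl_mem0/scrub0/%s", attr);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%s", val);
	return fclose(f);
}

int main(void)
{
	/* Set the scrub cycle to 12 hours (the attribute is in seconds) */
	scrub_write("current_cycle_duration", "43200");
	/* Start background (patrol) scrubbing */
	scrub_write("enable_background", "1");

	/* On-demand scrub of a 1 GiB range: set size first, then addr,
	 * since writing addr starts the scrub.
	 */
	scrub_write("size", "0x40000000");
	scrub_write("addr", "0x100000000");
	return 0;
}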

Use case of common scrub control feature
========================================
1. Several types of interfaces to HW memory scrubbers have been identified,
   such as ACPI NVDIMM ARS (Address Range Scrub), CXL memory device patrol
   scrub, CXL DDR5 ECS, ACPI RAS2 memory scrubbing features and a software
   based memory scrubber (discussed in the community, Reference [5]).
   Some scrubbers support controlling (background) patrol scrubbing
   (ACPI RAS2, CXL) and/or on-demand scrubbing (ACPI RAS2, ACPI ARS).
   However, the scrub controls vary between memory scrubbers, so a standard
   set of generic sysfs scrub controls exposed to userspace is needed for
   seamless control of the HW/SW scrubbers in the system by
   admins/scripts/tools etc.
2. Scrub controls in user space allow the user to disable scrubbing in case
   disabling of background patrol scrubbing or changing the scrub rate is
   needed for other purposes, such as performance-aware operations which
   require the background operations to be turned off or reduced.
3. They allow on-demand scrubbing of a specific address range, if supported
   by the scrubber.
4. User space tools can scrub the memory DIMMs regularly at a configurable
   scrub rate using the sysfs scrub controls discussed here, which helps
   - to detect uncorrectable memory errors early, before userspace accesses
     the memory, and so helps recovery from the detected memory errors.
   - to reduce the chance of a correctable error becoming uncorrectable.
5. Policy control for hotplugged memory. There is not necessarily a
   system-wide BIOS or similar in the loop to control the scrub settings on
   a CXL device that wasn't there at boot. What that setting should be is a
   policy decision, as we are trading off reliability vs performance - hence
   it should be under the control of userspace. As such, 'an' interface is
   needed, and it seems more sensible to try and unify it with other similar
   interfaces than to spin yet another one.

A draft version of userspace code for dynamic scrub control, based on the
frequency of memory errors reported to userspace, has been added to
rasdaemon and tested with the CXL device patrol scrubbing feature and the
ACPI RAS2 based scrubbing feature.

https://github.com/shijujose4/rasdaemon/tree/ras_feature_control

ToDo: For memory repair features such as PPR and memory sparing, rasdaemon
collates records and decides to replace a row if there are lots of
corrected errors, a single uncorrected error, or an error record received
with the 'maintenance needed' flag set, as in some CXL event records.
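
A rough sketch of the kind of decision policy described above (the record
layout and the threshold below are made up for illustration and are not
rasdaemon code):

/* Illustrative repair policy only; the record layout and threshold are
 * invented for this sketch.
 */
#include <stdbool.h>
#include <stdint.h>

struct mem_err_record {
	uint64_t dpa;
	bool     uncorrected;
	bool     maintenance_needed;	/* e.g. flag in a CXL event record */
};

#define CE_REPAIR_THRESHOLD	16	/* corrected errors per row, illustrative */

static bool row_needs_repair(const struct mem_err_record *rec,
			     unsigned int ce_count_for_row)
{
	/* A single uncorrected error, a device request for maintenance, or
	 * an excess of corrected errors triggers a repair request.
	 */
	if (rec->uncorrected || rec->maintenance_needed)
		return true;
	return ce_count_for_row >= CE_REPAIR_THRESHOLD;
}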

Comparison of scrubbing features
================================
 +--------------+-----------+-----------+-----------+-----------+
 |              |   ACPI    | CXL patrol|  CXL ECS  |  ARS      |
 |  Name        |   RAS2    | scrub     |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | On-demand    | Supported | No        | No        | Supported |
 | Scrubbing    |           |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Background   | Supported | Supported | Supported | No        |
 | scrubbing    |           |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Mode of      | Scrub ctrl| per device| per memory|  Unknown  |
 | scrubbing    | per NUMA  |           | media     |           |
 |              | domain.   |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Query scrub  | Supported | Supported | Supported | Supported |
 | capabilities |           |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Setting      | Supported | No        | No        | Supported |
 | address range|           |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Setting      | Supported | Supported | No        | No        |
 | scrub rate   |           |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              |           |           |           |           |
 | Unit for     | Not       | in hours  | No        | No        |
 | scrub rate   | Defined   |           |           |           |
 |              |           |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 |              | Supported |           |           |           |
 | Scrub        | on-demand | No        | No        | Supported |
 | status/      | scrubbing |           |           |           |
 | Completion   | only      |           |           |           |
 +--------------+-----------+-----------+-----------+-----------+
 | UC error     |           |CXL general|CXL general| ACPI UCE  |
 | reporting    | Exception |media/DRAM |media/DRAM | notify and|
 |              |           |event/media|event/media| query     |
 |              |           |scan?      |scan?      | ARS status|
 +--------------+-----------+-----------+-----------+-----------+

CXL Memory Scrubbing features
=============================
CXL spec r3.1 section 8.2.9.9.11.1 describes the memory device patrol
scrub control feature. The device patrol scrub proactively locates and
corrects errors on a regular cycle. The patrol scrub control allows the
requester to configure the patrol scrubber's input configurations.

The patrol scrub control allows the requester to specify the number of
hours in which the patrol scrub cycles must be completed, provided that
the requested number is not less than the minimum number of hours for the
patrol scrub cycle that the device is capable of. In addition, the patrol
scrub controls allow the host to disable and enable the feature in case
disabling of the feature is needed for other purposes such as
performance-aware operations which require the background operations to be
turned off.
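
For illustration, the CXL control above works in hours while the generic
EDAC scrub attributes are exposed in seconds, so a driver ends up doing a
conversion and minimum-cycle check along these lines (a kernel-side sketch
under that assumption, not the actual CXL scrub driver code):

/* Sketch of the seconds-to-hours handling implied above; not the
 * series' CXL scrub code.
 */
#include <linux/errno.h>

static int cxl_like_set_cycle(unsigned int req_secs,
			      unsigned int dev_min_hours,
			      unsigned int *out_hours)
{
	unsigned int hours = req_secs / 3600;

	/* The device rejects cycles shorter than its minimum capability */
	if (hours < dev_min_hours)
		return -EINVAL;

	*out_hours = hours;
	return 0;
}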

The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
Specification (JESD79-5) and allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.

A DDR5 device contains a number of memory media FRUs. The DDR5 ECS
feature, and thus the ECS control driver, supports configuring the ECS
parameters per FRU.

ACPI RAS2 Hardware-based Memory Scrubbing
=========================================
ACPI spec r6.5 section 5.2.21 describes the ACPI RAS2 table, which
provides interfaces for platform RAS features and supports independent
RAS controls and capabilities for a given RAS feature for multiple
instances of the same component in a given system.
Memory RAS features apply to RAS capabilities, controls and operations
that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS
features have a Feature Type of 0x00 (Memory).

The platform can use the hardware-based memory scrubbing feature to expose
controls and capabilities associated with hardware-based memory scrub
engines. The RAS2 memory scrubbing feature supports the following, as per
the spec:
 - Independent memory scrubbing controls for each NUMA domain, identified
   using its proximity domain.
   Note: however, AmpereComputing has a single entry repeated, as they have
         centralized controls.
 - Provision for background (patrol) scrubbing of the entire memory system,
   as well as on-demand scrubbing for a specific region of memory.

ACPI Address Range Scrubbing (ARS)
==================================
ARS allows the platform to communicate memory errors to system software.
This capability allows system software to prevent accesses to addresses
with uncorrectable errors in memory. ARS functions manage all NVDIMMs
present in the system. Only one scrub can be in progress system wide
at any given time.
The following functions are supported as per the specification.
1. Query ARS Capabilities for a given address range; indicates whether the
   platform supports the ACPI NVDIMM Root Device Unconsumed Error
   Notification.
2. Start ARS triggers an Address Range Scrub for the given memory range.
   Address scrubbing can be done for volatile memory, persistent memory,
   or both.
3. Query ARS Status command allows software to get the status of ARS,  
   including the progress of ARS and ARS error record.
4. Clear Uncorrectable Error.
5. Translate SPA
6. ARS Error Inject etc.
Note: Support for ARS is not added in this series, to reduce the amount of
code for review; it could be added after the initial code is merged.
We'd like feedback on whether this is of interest to the ARS community.

Post Package Repair (PPR)
=========================
A PPR (Post Package Repair) maintenance operation requests the memory
device to perform a repair operation on its media, if supported. A memory
device may support two types of PPR: Hard PPR (hPPR), for a permanent row
repair, and Soft PPR (sPPR), for a temporary row repair. sPPR is much
faster than hPPR, but the repair is lost with a power cycle. During the
execution of a PPR maintenance operation, a memory device may or may not
retain data and may or may not be able to process memory requests
correctly. An sPPR maintenance operation may be executed at runtime, if
data is retained and memory requests are correctly processed. An hPPR
maintenance operation may be executed only at boot because data would not
be retained.

Use cases of common PPR control feature
=======================================
1. Soft PPR (sPPR) and Hard PPR (hPPR) share similar control interfaces,
thus a standard set of generic sysfs PPR controls exposed to userspace is
required for seamless control of the PPR features in the system by
admins/scripts/tools etc.
2. When a CXL device identifies a failure on a memory component, the device
may inform the host about the need for a PPR maintenance operation by using
an event record where the maintenance needed flag is set. The event record
specifies the DPA that should be repaired. The kernel reports the
corresponding CXL general media or DRAM trace event to userspace. A
userspace tool, e.g. rasdaemon, initiates a PPR maintenance operation in
response to the device request using the sysfs PPR control (see the sketch
after this list).
3. Userspace tools, e.g. rasdaemon, request PPR on a memory region when an
uncorrected memory error or an excess of corrected memory errors is
reported on that memory.
4. Multiple instances of PPR are likely present per memory device.
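
To make the flow in points 2 and 3 above concrete, a userspace sketch along
these lines could drive a repair instance through its sysfs directory. The
repairX directory comes from this series, but the attribute names used here
("dpa" and the "repair" trigger) are placeholders and may differ from the
ABI added in the memory repair patches.

/* Userspace sketch only: repair0 comes from this series, but the
 * "dpa" and "repair" attribute names below are placeholders.
 */
#include <stdio.h>
#include <inttypes.h>

static int repair_write(const char *attr, const char *val)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/edac/devices/cxl_mem0/repair0/%s", attr);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%s", val);
	return fclose(f);
}

/* Request a repair of the DPA reported in an event record that has the
 * maintenance-needed flag set, or that reported an uncorrected error.
 */
static int request_repair(uint64_t dpa)
{
	char buf[32];

	snprintf(buf, sizeof(buf), "0x%" PRIx64, dpa);
	if (repair_write("dpa", buf))		/* placeholder attribute */
		return -1;
	return repair_write("repair", "1");	/* placeholder trigger */
}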

Memory Sparing
==============
Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. A userspace
tool, e.g. rasdaemon, may request a sparing operation for a given address
for which an uncorrectable error is reported. In CXL (CXL spec r3.1
section 8.2.9.7.1.4), the subclasses for the sparing operation vary
in terms of the scope of the sparing being performed. The cacheline sparing
subclass refers to a sparing action that can replace a full cacheline.
Row sparing is provided as an alternative to PPR sparing functions and its
scope is that of a single DDR row. Bank sparing allows an entire bank to
be replaced. Rank sparing is defined as an operation in which an entire
DDR rank is replaced.

The series adds:
1. An EDAC device driver extended for controlling RAS features; the EDAC
   scrub, EDAC ECS and EDAC memory repair drivers support memory scrub
   control, ECS control and memory repair (PPR, sparing) control
   respectively.
2. Several common patches from Dave's cxl/fwctl series.
3. Support for the CXL feature mailbox commands, which are used by the CXL
   device scrubbing and memory repair features.
4. A CXL features driver supporting patrol scrub control (device and
   region based).
5. A CXL features driver supporting the ECS control feature.
6. An ACPI RAS2 driver which adds the OS interface for RAS2 communication
   through the PCC mailbox, extracts the ACPI RAS2 feature table (RAS2) and
   creates a platform device for the RAS memory features, which binds
   to the memory ACPI RAS2 driver.
7. A memory ACPI RAS2 driver which gets the PCC subspace for communicating
   with an ACPI compliant platform that supports ACPI RAS2, adds callback
   functions and registers with the EDAC device to allow the user to
   control the HW patrol scrubbers exposed to the kernel via the
   ACPI RAS2 table.
8. Support for the CXL maintenance mailbox command, which is used by the
   CXL device memory repair feature.
9. CXL features drivers supporting the PPR and memory sparing features.
   Note: there are other PPR and memory sparing drivers to come.

Open questions, based on feedback from the community:
1. Jonathan:
   - Is there any need for discoverability to userspace of the capability to
     scan different regions, such as global PA space? Left as a future
     extension.

2. Jiaqi:
   - STOP_PATROL_SCRUBBER from RAS2 must be blocked and must not be exposed
     to the OS/userspace. Stopping the patrol scrubber is unacceptable on
     platforms where the OEM has enabled it, because the patrol scrubber is
     a key part of logging and is repurposed for other RAS actions.
     Response: if the OEM does not want to expose this control, they should
     lock it down so the interface is not exposed to the OS. These features
     are optional after all.
   - "Requested Address Range"/"Actual Address Range" (region to scrub) is a
     similarly bad thing to expose in RAS2.
     Response: if the OEM does not want to expose this, they should lock it
     down so the interface is not exposed to the OS. These features are
     optional after all.
   As per the LPC discussion, support for stop and the address range
   attributes are to be exposed to userspace.
3. Borislav:
   - How will the scrub control exposed to userspace be used?
     A POC has been added in rasdaemon with dynamic scrub control for CXL
     memory media errors and memory errors reported to userspace.
     https://github.com/shijujose4/rasdaemon/tree/scrub_control_6_june_2024
   - Is the scrub interface sufficient for the use cases?
   - Who is going to use the scrub controls - tools/admin/scripts?
     1) Rasdaemon for dynamic control.
     2) A udev script for more static 'defaults' on hotplug etc.
   - Sample userspace code for the memory repair feature:
     https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
   - Sample boot-script for memory repair:
     https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/

     
4. PPR
   - For PPR, rasdaemon collates records and decides to replace a row if
     there are lots of corrected errors, a single uncorrected error, or an
     error record received with the maintenance request flag set, as in the
     CXL DRAM error record.
   - Is sPPR more or less startup-only (so faking hPPR), or is it actually
     useful in a running system (if not the safe version that keeps
     everything running whilst replacement is ongoing)?
   - Is future proofing for multiple PPR units useful, given we've mashed
     together hPPR and sPPR for CXL?

Implementation
==============
1. Linux kernel
Version 19 of the kernel implementation of the RAS feature controls is
available at:
https://github.com/shijujose4/linux.git
Branch: edac-enhancement-ras-features_v19

The code is based on v3 of the CXL fwctl series [1] posted by Dave and
v3 of the FWCTL series [2] posted by Jason, and is rebased on top of
v6.14-rc1.

[1]: https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jiang@intel.com/
[2]: https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@nvidia.com/#r
 
2. QEMU emulation
The QEMU emulation of the CXL RAS features is available at:
https://gitlab.com/shiju.jose/qemu.git
Branch: cxl-ras-features-2024-10-24

3. Userspace - rasdaemon
3.1. Memory scrubbing 
   A draft version of userspace sample code for dynamic scrub control,
   based on the frequency of memory errors reported to userspace, has been
   added to rasdaemon, enabled and tested with the CXL device patrol
   scrubbing feature and the ACPI RAS2 based scrubbing feature. This
   required updating for the latest sysfs scrub interface.
   https://github.com/shijujose4/rasdaemon/tree/ras_feature_control

3.2. Memory repair
   - Sample userspace code for memory repair feature [3].

   - Sample boot-script for memory repair [4].

[3]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
[4]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/

References:
1. ACPI spec r6.5 section 5.2.21 ACPI RAS2.
2. ACPI spec r6.5 section 9.19.7.2 ARS.
3. CXL spec r3.1 section 8.2.9.9.11.1 Device patrol scrub control feature.
4. CXL spec r3.1 section 8.2.9.9.11.2 DDR5 ECS feature.
5. CXL spec r3.1 section 8.2.9.7.1.1 PPR Maintenance Operations.
6. CXL spec r3.1 section 8.2.9.7.2.1 sPPR Feature Discovery and Configuration.
7. CXL spec r3.1 section 8.2.9.7.2.2 hPPR Feature Discovery and Configuration.
8. Background information about kernel support for memory scan, memory
   error detection and ACPI RASF.
   https://lore.kernel.org/all/20221103155029.2451105-1-jiaqiyan@google.com/
9. Discussions on RASF:
   https://lore.kernel.org/lkml/20230915172818.761-1-shiju.jose@huawei.com/#r 

Changes
=======
v18 -> v19:
1. Added config EDAC_SCRUB, EDAC_ECS and EDAC_MEM_REPAIR and
   associated changes for feedback from Borislav.

2. Control attributes related to the memory-sparing features
   in EDAC memory repair moved into another patch.

3. Feedback from Dan on CXL patches, 
   - Added more comments to the examples given for CXL features
     in the Documentation/edac/scrub.rst and 
     Documentation/edac/memory_repair.rst  
   - Removed CXL_PCI dependency for the CONFIG_CXL_RAS_FEATURES 
   - Updated description for CONFIG_CXL_RAS_FEATURES as Dan suggested.
   - Added specification reference comments for CXL patrol scrub and ECS.  
   - kmalloc -> kzalloc
   - Added comment and/or hold lock for region based memory scrubbing code. 
     In addition, Added common helper functions to hold and release rwsem for read.
   - Removed unwanted logs and dev_err -> dev_dbg in other places.
   - Added separate helper functions for cxl_mem_ras_features_init() for memdev and region.
     This solved a bug pointed by Dan in region based init.
   - Replaced "cxl_region%d", cxlr->id with "%s_%s", "cxl", dev_name(&cxlr->dev)  
   - removed module soft dep on cxl_features.  

4. Mauro's review comments,
   - Fixed ABI documentation issues, update diagram & table for ReST
     and update .rst docs licensing update.
   - Changed sysfs memory repair_type from integer to string format.
   - Reorder, diagram and table updates in the patch[0].

5. Feedback on memory repair from the community and Jonathan,
   - Removed the min and max range attributes from EDAC mem repair for
     bank, rank, channel, etc.; those were added in v16.
   - For range attributes the series continues to use min_ and max_
     attributes, e.g. scrub rate in EDAC scrub and dpa & hpa in
     EDAC memory repair. Not sure whether to change to the scheme
     suggested by Borislav for exposing ranges in sysfs.
   - CXL memory repair - added a check that the memory is offline and that
     the memory to repair matches attributes reported in an event record in
     the current boot; otherwise do not issue the repair command.
   etc.

6. Feedback from Jonathan from internal review, 
   - Fixes in the .rst documentations
   - Update date in file headers.
   - Change 'persist_mode' in memory repair to boolean.
   - Moved some enums from the EDAC mem repair patch to the
     CXL repair patches.
   - Few optimizations and tidy up in RAS2 drivers.
   - Removed extra return error code check for set_feature and get_feature
     functions in CXL RAS code.
   - Re-order structure definitions in CXL RAS driver for better packaging.
   - Fixed multi line comment format in CXL memory repair drivers.
   - Removed support for 'query' command from the CXL memory repair drivers.
   - Removed zeroing of sparing attributes if the spare request to the device returns an error.
   etc.

7.
 - The code is based on v3 of the CXL fwctl series [1] posted by Dave and
   v3 of the FWCTL series [2] posted by Jason, and rebased on top of v6.14-rc1.
 - Modified the CXL code for Dave's changes, based on Dan's suggestions, so
   that the input for the feature calls is mbox based.
 - In addition, added a separate patch for cxl_get_supported_feature_entry()
   to the EDAC series.

v17 -> v18:
1. Rebased to kernel version 6.13-rc5
2. Reordered patches for feedback on v17.
3.
3.1 Took updated patches for CXL feature infrastructure and feature commands
   from Dave's cxl/features branch, debugged and tested them.
3.2. RAS features in the cxl/core/memfeature.c updated for interface
     changes in the CXL feature commands.
4. Modified ACPI RAS2 code for the recent interface changes in the
   PCC mbox code.

v16 -> v17:
1. 
1.1 Took several patches for CXL feature commands from Dave's 
   fwctl/cxl series and added fixes pointed out by Jonathan in those patches.
1.2. cxl/core/memfeature.c updated for interface changes in the
   Get Supported Feature, Get Feature and Set Feature functions.
1.3. Used the UUID's for RAS features in CXL features code from
     include/cxl/features.h    
2. Changes based on feedback from Boris
 - Added attributes in EDAC memory repair to return the range for DPA
   and other control attributes, and added callback functions for the
   DPA range in CXL PPR and memory sparing code, which is the only one
   supported in the CXL.
 - Removed 'query' attribute for memory repair feature.

v15 -> v16:
1. Changes and fixes for feedback from Boris
 - Modified documentations and interface file for EDAC memory repair
   to add more details and use cases.
 - Merged documentations to corresponding patches instead of common patch
   for full documentation for better readability.
 - Removed 'persist_mode_avail' attribute for memory repair feature.
2. Changes for feedback from Dave Jiang
 - Dave suggested helper function for ECS initialization in cxl/core/memfeature.c,
   which added for all CXl RAS features, scrub, ECS, PPR and memory sparing features.
 - Fixed the endian conversion pointed out by Dave in CXL memory sparing. Also
   fixed similar issues in the CXL scrub, ECS and PPR features.
3. Changes for feedback from Ni Fan.
 - Fixed a memory leak in edac_dev_register() for the memory repair feature
   and addressed a few suggestions from Ni Fan.

v14 -> v15:
1. Changes and fixes for feedback from Boris
  - Added documentations for edac features, scrub and memory_repair etc
    and placed in a separate patch.
  - Deleted extra 2 attributes for EDAC ECS log_entry_type_per_* and
    mode_counts_*.
  - Resolved issues reported in Documentation/ABI/testing/sysfs-edac-ecs.
  - Deleted unused pr_fmt() from a few files.
  - Fixed some formatting issues in the EDAC ECS code and similar issues in
    other files.
    etc.
2. Change for feedback from Dave Jiang
  - In CXL code for patrol scrub control, Dave suggested replace
    void *drv_data with a union of parameters in cxl_ps_get_attrs() and
    similar functions.
    This is fixed by replacing void *drv_data with the corresponding context
    structure (struct cxl_patrol_scrub_context) in the CXL local functions,
    since struct cxl_patrol_scrub_context cannot be visible in the generic
    EDAC control interface. Similar changes are made for CXL ECS,
    CXL PPR and CXL memory sparing local functions.

v13 -> v14:
1. Changes and Fixes for feedback from Boris
  - Check grammar of patch description.
  - Changed scrub control attributes for memory scrub range to "addr" and "size".
  - Fixed unreached code in edac_dev_register(). 
  - Removed enable_on_demand attribute from EDAC scrub control and modified
    RAS2 driver for the same.
  - Updated ABI documentation for EDAC scrub control.
    etc.

2. Changes for feedback from Greg/Rafael/Jonathan for ACPI RAS2
  - Replaced platform device creation and binding with
    auxiliary device creation and binding with ACPI RAS2
    memory auxiliary driver.

3. Changes and Fixes for feedback from Jonathan
  - Fixed unreached code in edac_dev_register(). 
  - Optimize callback functions in CXL ECS using macros.
  - Add readback attributes for the EDAC memory repair feature
    and add support in the CXL driver for PPR and memory sparing.
  - Add refactoring in the CXL driver for PPR and memory sparing
    for query/repair maintenance commands.
  - Add cxl_dpa_to_region_locked() function.  
  - Some more cleanups in the ACPI RAS2 and RAS2 memory drivers.
    etc.

4. Changes and Fixes for feedback from Ni Fan
   - Fixed compilation error - cxl_mem_ras_features_init refined, when CXL components
     build as module.

5. Optimize callback functions in CXL memory sparing using macros.
   etc.
   
v12 -> v13:
1. Changes and Fixes for feedback from Boris
  - Function edac_dev_feat_init() merge with edac_dev_register()
  - Add macros in EDAC feature specific code for repeated code.
  - Correct spelling mistakes.
  - Removed feature specific code from the patch "EDAC: Add support
    for EDAC device features control"
2. Changes for feedbacks from Dave Jiang
   - Move fields num_features and entries to struct cxl_mailbox,
     in "cxl: Add Get Supported Features command for kernel usage"
   - Use series from 
     https://lore.kernel.org/linux-cxl/20240905223711.1990186-1-dave.jiang@intel.com/   
3. Changes and Fixes for feedback from Ni Fan
   - In documentation scrub* to scrubX, ecs_fru* to ecs_fruX
   - Corrected some grammar mistakes in the patch headers.
   - Fixed an error print for min_scrub_cycle_hrs in the CXL patrol scrub
     code.
   - Improved an error print in the CXL ECS code.
   - bool -> tristate for config CXL_RAS_FEAT
4. Add support for CXL memory sparing feature.
5. Add common EDAC memory repair driver for controlling memory repair
   features, PPR, memory sparing etc.

v11 -> v12:
1. Changes and Fixes for feedback from Boris mainly for
    patch "EDAC: Add support for EDAC device features control"
    and other generic comments.

2. Took CXL patches from Dave Jiang for "Add Get Supported Features
   command for kernel usage" and other related patches. Merged helper
   functions from this series to the above patch. Modifications of
   CXL code in this series due to refactoring of CXL mailbox in Dave's
   patches.

3. Modified EDAC scrub control code to support multiple scrub instances
   per device.

v10 -> v11:
1. Feedback from Borislav:
   - Add generic EDAC code for control device features to
     /drivers/edac/edac_device.c.
   - Add common structure in edac for device feature's data.
   
2. Some more optimizations in generic EDAC code for control
   device features.

3. Changes for feedback from Fan for ACPI RAS2 memory driver.

4. Add support for control memory PPR (Post Package Repair) features
   in EDAC.
   
5. Add support for maintenance command in the CXL mailbox code,
   which is needed for support PPR features in CXL driver.  

6. Add support for control memory PPR (Post Package Repair) features
   and do perform PPR maintenance operation in CXL driver.

7. Rename drivers/cxl/core/memscrub.c to drivers/cxl/core/memfeature.c

v9 -> v10:
1. Feedback from Mauro Carvalho Chehab:
   - Changes suggested in EDAC RAS feature driver.
     use uppercase for enums, if else to switch-case, documentation for
     static scrub and ecs init functions etc.
   - Changes suggested in EDAC scrub.
     unit of scrub cycle hour to seconds.
     attribute node cycle_in_hours_available to min_cycle_duration and 
     max_cycle_duration.
     attribute node cycle_in_hours to current_cycle_duration.
     Use base 0 for kstrtou64() and kstrtol() functions.
     etc.
   - Changes suggested in EDAC ECS.
     uppercase for enums
     add ABI documentation. etc
        
2. Feedback from Fan:
   - Changes suggested in EDAC RAS feature driver.
     use uppercase for enums, change if...else to switch-case. 
     some optimization in edac_ras_dev_register() function
     add missing goto free_ctx
   - Changes suggested in the code for feature commands.  
   - CXL driver scrub and ECS code
     use uppercase for enums, fix typos, use enum type for mode,
     fix long lines etc.
       
v8 -> v9:
1. Feedback from Borislav:
   - Add scrub control driver to the EDAC on feedback from Borislav.
   - Changed DEVICE_ATTR_..() static.
   - Changed the write permissions for scrub control sysfs files as
     root-only.
2. Feedback from Fan:
   - Optimized cxl_get_feature() function by using min() and removed
     feat_out_min_size.
   - Removed unreached return from cxl_set_feature() function.
   - Changed the term  "rate" to "cycle_in_hours" in all the
     scrub control code.
   - Allow cxl_mem_probe() continue if cxl_mem_patrol_scrub_init() fail,
     with just a debug warning.
      
3. Feedback from Jonathan:
   - Removed patch __free() based cleanup function for acpi_put_table.
     and added fix in the acpi RAS2 driver.

4. Feedback from Dan Williams:
   - Allow cxl_mem_probe() continue if cxl_mem_patrol_scrub_init() fail,
     with just a debug warning.
   - Add support for CXL region based scrub control.

5. Feedback from Daniel Ferguson on RAS2 drivers:
    In the ACPI RAS2 driver,
  - Incorporated the changes given for clearing error reported.
  - Incorporated the changes given for check the Set RAS Capability
    status and return an appropriate error.
    In the RAS2 memory driver,
  - Added more checks for start/stop bg and on-demand scrubbing
    so that addr range in cache do not get cleared and restrict
    permitted operations during scrubbing.

History for v1 to v8 is available here.
https://lore.kernel.org/lkml/20240726160556.2079-1-shiju.jose@huawei.com/

Shiju Jose (15):
  EDAC: Add support for EDAC device features control
  EDAC: Add scrub control feature
  EDAC: Add ECS control feature
  EDAC: Add memory repair control feature
  ACPI:RAS2: Add ACPI RAS2 driver
  ras: mem: Add memory ACPI RAS2 driver
  cxl: Add helper function to retrieve a feature entry
  cxl/memfeature: Add CXL memory device patrol scrub control feature
  cxl/memfeature: Add CXL memory device ECS control feature
  cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
  cxl/region: Add helper function to determine memory is online
  cxl: Support for finding memory operation attributes from the current
    boot
  cxl/memfeature: Add CXL memory device soft PPR control feature
  EDAC: Update memory repair control interface for memory sparing
    feature
  cxl/memfeature: Add CXL memory device memory sparing control feature

 Documentation/ABI/testing/sysfs-edac-ecs      |   74 +
 .../ABI/testing/sysfs-edac-memory-repair      |  206 ++
 Documentation/ABI/testing/sysfs-edac-scrub    |   69 +
 Documentation/edac/features.rst               |  104 +
 Documentation/edac/index.rst                  |   12 +
 Documentation/edac/memory_repair.rst          |  209 ++
 Documentation/edac/scrub.rst                  |  393 ++++
 drivers/acpi/Kconfig                          |   11 +
 drivers/acpi/Makefile                         |    1 +
 drivers/acpi/ras2.c                           |  417 ++++
 drivers/cxl/Kconfig                           |   18 +
 drivers/cxl/core/Makefile                     |    1 +
 drivers/cxl/core/core.h                       |    9 +
 drivers/cxl/core/features.c                   |   21 +
 drivers/cxl/core/mbox.c                       |   45 +-
 drivers/cxl/core/memdev.c                     |    9 +
 drivers/cxl/core/memfeature.c                 | 1723 +++++++++++++++++
 drivers/cxl/core/ras.c                        |  151 ++
 drivers/cxl/core/region.c                     |   15 +
 drivers/cxl/cxlmem.h                          |   82 +
 drivers/cxl/mem.c                             |    4 +
 drivers/cxl/pci.c                             |    3 +
 drivers/edac/Kconfig                          |   28 +
 drivers/edac/Makefile                         |    4 +
 drivers/edac/ecs.c                            |  207 ++
 drivers/edac/edac_device.c                    |  185 ++
 drivers/edac/mem_repair.c                     |  365 ++++
 drivers/edac/scrub.c                          |  209 ++
 drivers/ras/Kconfig                           |   10 +
 drivers/ras/Makefile                          |    1 +
 drivers/ras/acpi_ras2.c                       |  383 ++++
 include/acpi/ras2_acpi.h                      |   47 +
 include/cxl/features.h                        |    2 +
 include/linux/edac.h                          |  228 +++
 34 files changed, 5244 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-edac-ecs
 create mode 100644 Documentation/ABI/testing/sysfs-edac-memory-repair
 create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
 create mode 100644 Documentation/edac/features.rst
 create mode 100644 Documentation/edac/index.rst
 create mode 100644 Documentation/edac/memory_repair.rst
 create mode 100644 Documentation/edac/scrub.rst
 create mode 100755 drivers/acpi/ras2.c
 create mode 100644 drivers/cxl/core/memfeature.c
 create mode 100644 drivers/cxl/core/ras.c
 create mode 100755 drivers/edac/ecs.c
 create mode 100755 drivers/edac/mem_repair.c
 create mode 100755 drivers/edac/scrub.c
 create mode 100644 drivers/ras/acpi_ras2.c
 create mode 100644 include/acpi/ras2_acpi.h

-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 01/15] EDAC: Add support for EDAC device features control
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 02/15] EDAC: Add scrub control feature shiju.jose
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add generic EDAC device feature controls supporting the registration
of RAS features available in the system. The driver exposes control
attributes for these features to userspace in
/sys/bus/edac/devices/<dev-name>/<ras-feature>/

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/features.rst |  94 +++++++++++++++++++++++++++++
 Documentation/edac/index.rst    |  10 ++++
 drivers/edac/edac_device.c      | 102 ++++++++++++++++++++++++++++++++
 include/linux/edac.h            |  26 ++++++++
 4 files changed, 232 insertions(+)
 create mode 100644 Documentation/edac/features.rst
 create mode 100644 Documentation/edac/index.rst

diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
new file mode 100644
index 000000000000..6b0fdc6f5d6e
--- /dev/null
+++ b/Documentation/edac/features.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+============================================
+Augmenting EDAC for controlling RAS features
+============================================
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author:   Shiju Jose <shiju.jose@huawei.com>
+:License:  The GNU Free Documentation License, Version 1.2 without
+           Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+           (dual licensed under the GPL v2)
+
+- Written for: 6.15
+
+Introduction
+------------
+The expansion of EDAC for controlling RAS features and exposing features
+control attributes to userspace via sysfs. Some Examples:
+
+1. Scrub control
+
+2. Error Check Scrub (ECS) control
+
+3. ACPI RAS2 features
+
+4. Post Package Repair (PPR) control
+
+5. Memory Sparing Repair control etc.
+
+High level design is illustrated in the following diagram::
+
+        +-----------------------------------------------+
+        |   Userspace - Rasdaemon                       |
+        | +-------------+                               |
+        | | RAS CXL mem |     +---------------+         |
+        | |error handler|---->|               |         |
+        | +-------------+     | RAS dynamic   |         |
+        | +-------------+     | scrub, memory |         |
+        | | RAS memory  |---->| repair control|         |
+        | |error handler|     +----|----------+         |
+        | +-------------+          |                    |
+        +--------------------------|--------------------+
+                                   |
+                                   |
+   +-------------------------------|------------------------------+
+   |     Kernel EDAC extension for | controlling RAS Features     |
+   |+------------------------------|----------------------------+ |
+   || EDAC Core          Sysfs EDAC| Bus                        | |
+   ||   +--------------------------|---------------------------+| |
+   ||   |/sys/bus/edac/devices/<dev>/scrubX/ |   | EDAC device || |
+   ||   |/sys/bus/edac/devices/<dev>/ecsX/   |<->| EDAC MC     || |
+   ||   |/sys/bus/edac/devices/<dev>/repairX |   | EDAC sysfs  || |
+   ||   +---------------------------|--------------------------+| |
+   ||                           EDAC|Bus                        | |
+   ||                               |                           | |
+   ||   +----------+ Get feature    |      Get feature          | |
+   ||   |          | desc +---------|------+ desc +----------+  | |
+   ||   |EDAC scrub|<-----| EDAC device    |      |          |  | |
+   ||   +----------+      | driver- RAS    |----->| EDAC mem |  | |
+   ||   +----------+      | feature control|      | repair   |  | |
+   ||   |          |<-----|                |      +----------+  | |
+   ||   |EDAC ECS  |      +---------|------+                    | |
+   ||   +----------+    Register RAS|features                   | |
+   ||         ______________________|_____________              | |
+   |+---------|---------------|------------------|--------------+ |
+   |  +-------|----+  +-------|-------+     +----|----------+     |
+   |  |            |  | CXL mem driver|     | Client driver |     |
+   |  | ACPI RAS2  |  | scrub, ECS,   |     | memory repair |     |
+   |  | driver     |  | sparing, PPR  |     | features      |     |
+   |  +-----|------+  +-------|-------+     +------|--------+     |
+   |        |                 |                    |              |
+   +--------|-----------------|--------------------|--------------+
+            |                 |                    |
+   +--------|-----------------|--------------------|--------------+
+   |    +---|-----------------|--------------------|-------+      |
+   |    |                                                  |      |
+   |    |            Platform HW and Firmware              |      |
+   |    +--------------------------------------------------+      |
+   +--------------------------------------------------------------+
+
+
+1. EDAC Features components - Create feature specific descriptors.
+   For example, EDAC scrub, EDAC ECS, EDAC memory repair in the above
+   diagram.
+
+2. EDAC device driver for controlling RAS Features - Get feature's attribute
+   descriptors from EDAC RAS feature component and registers device's RAS
+   features with EDAC bus and exposes the features control attributes via
+   the sysfs EDAC bus. For example, /sys/bus/edac/devices/<dev-name>/<feature>X/
+
+3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for
+   dynamic scrub/repair control to issue scrubbing/repair when excess number
+   of corrected memory errors are reported in a short span of time.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
new file mode 100644
index 000000000000..de4a3aa452cb
--- /dev/null
+++ b/Documentation/edac/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+==============
+EDAC Subsystem
+==============
+
+.. toctree::
+   :maxdepth: 1
+
+   features
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 621dc2a5d034..142a661ff543 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -570,3 +570,105 @@ void edac_device_handle_ue_count(struct edac_device_ctl_info *edac_dev,
 		      block ? block->name : "N/A", count, msg);
 }
 EXPORT_SYMBOL_GPL(edac_device_handle_ue_count);
+
+static void edac_dev_release(struct device *dev)
+{
+	struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev);
+
+	kfree(ctx->dev.groups);
+	kfree(ctx);
+}
+
+const struct device_type edac_dev_type = {
+	.name = "edac_dev",
+	.release = edac_dev_release,
+};
+
+static void edac_dev_unreg(void *data)
+{
+	device_unregister(data);
+}
+
+/**
+ * edac_dev_register - register device for RAS features with EDAC
+ * @parent: parent device.
+ * @name: name for the folder in the /sys/bus/edac/devices/,
+ *	  which is derived from the parent device.
+ *	  For eg. /sys/bus/edac/devices/cxl_mem0/
+ * @private: parent driver's data to store in the context if any.
+ * @num_features: number of RAS features to register.
+ * @ras_features: list of RAS features to register.
+ *
+ * Return:
+ *  * %0       - Success.
+ *  * %-EINVAL - Invalid parameters passed.
+ *  * %-ENOMEM - Dynamic memory allocation failed.
+ *
+ */
+int edac_dev_register(struct device *parent, char *name,
+		      void *private, int num_features,
+		      const struct edac_dev_feature *ras_features)
+{
+	const struct attribute_group **ras_attr_groups;
+	struct edac_dev_feat_ctx *ctx;
+	int attr_gcnt = 0;
+	int ret, feat;
+
+	if (!parent || !name || !num_features || !ras_features)
+		return -EINVAL;
+
+	/* Double parse to make space for attributes */
+	for (feat = 0; feat < num_features; feat++) {
+		switch (ras_features[feat].ft_type) {
+		/* Add feature specific code */
+		default:
+			return -EINVAL;
+		}
+	}
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ras_attr_groups = kcalloc(attr_gcnt + 1, sizeof(*ras_attr_groups), GFP_KERNEL);
+	if (!ras_attr_groups) {
+		ret = -ENOMEM;
+		goto ctx_free;
+	}
+
+	attr_gcnt = 0;
+	for (feat = 0; feat < num_features; feat++, ras_features++) {
+		switch (ras_features->ft_type) {
+		/* Add feature specific code */
+		default:
+			ret = -EINVAL;
+			goto groups_free;
+		}
+	}
+
+	ctx->dev.parent = parent;
+	ctx->dev.bus = edac_get_sysfs_subsys();
+	ctx->dev.type = &edac_dev_type;
+	ctx->dev.groups = ras_attr_groups;
+	ctx->private = private;
+	dev_set_drvdata(&ctx->dev, ctx);
+
+	ret = dev_set_name(&ctx->dev, name);
+	if (ret)
+		goto groups_free;
+
+	ret = device_register(&ctx->dev);
+	if (ret) {
+		put_device(&ctx->dev);
+		return ret;
+	}
+
+	return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev);
+
+groups_free:
+	kfree(ras_attr_groups);
+ctx_free:
+	kfree(ctx);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(edac_dev_register);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index b4ee8961e623..8c4b6ca2a994 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -661,4 +661,30 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
 
 	return mci->dimms[index];
 }
+
+/* RAS feature type */
+enum edac_dev_feat {
+	RAS_FEAT_MAX
+};
+
+/* EDAC device feature information structure */
+struct edac_dev_data {
+	u8 instance;
+	void *private;
+};
+
+struct edac_dev_feat_ctx {
+	struct device dev;
+	void *private;
+};
+
+struct edac_dev_feature {
+	enum edac_dev_feat ft_type;
+	u8 instance;
+	void *ctx;
+};
+
+int edac_dev_register(struct device *parent, char *dev_name,
+		      void *parent_pvt_data, int num_features,
+		      const struct edac_dev_feature *ras_features);
 #endif /* _LINUX_EDAC_H_ */
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 02/15] EDAC: Add scrub control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
  2025-02-07 14:44 ` [PATCH v19 01/15] EDAC: Add support for EDAC device features control shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 03/15] EDAC: Add ECS " shiju.jose
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add a generic EDAC scrub control to manage memory scrubbers in the system.
Devices with a scrub feature register with the EDAC device driver, which
retrieves the scrub descriptor from the EDAC scrub driver and exposes the
sysfs scrub control attributes for a scrub instance to userspace at
/sys/bus/edac/devices/<dev-name>/scrubX/.

The common sysfs scrub control interface abstracts the control of
arbitrary scrubbing functionality into a common set of functions. The
sysfs scrub attribute nodes are only present if the client driver has
implemented the corresponding attribute callback function and passed the
operations (ops) to the EDAC device driver during registration.

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/ABI/testing/sysfs-edac-scrub |  69 ++++++
 Documentation/edac/features.rst            |   6 +
 Documentation/edac/index.rst               |   1 +
 Documentation/edac/scrub.rst               | 259 +++++++++++++++++++++
 drivers/edac/Kconfig                       |   9 +
 drivers/edac/Makefile                      |   2 +
 drivers/edac/edac_device.c                 |  41 +++-
 drivers/edac/scrub.c                       | 209 +++++++++++++++++
 include/linux/edac.h                       |  43 ++++
 9 files changed, 635 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
 create mode 100644 Documentation/edac/scrub.rst
 create mode 100755 drivers/edac/scrub.c

diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
new file mode 100644
index 000000000000..a3c0ad40b2b0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-edac-scrub
@@ -0,0 +1,69 @@
+What:		/sys/bus/edac/devices/<dev-name>/scrubX
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory
+		belongs to an instance of the memory scrub control feature,
+		where <dev-name> directory corresponds to a device/memory
+		region registered with the EDAC device driver for the
+		scrub control feature.
+
+		The sysfs scrub attr nodes are only present if the parent
+		driver has implemented the corresponding attr callback
+		function and provided the necessary operations to the EDAC
+		device driver during registration.
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/addr
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The base address of the memory region to be scrubbed
+		(on-demand scrubbing). The size must be set first; writing
+		the address then starts scrubbing.
+
+		The readback addr value is non-zero if the requested
+		on-demand scrubbing is in progress, zero otherwise.
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/size
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The size of the memory region to be scrubbed
+		(on-demand scrubbing).
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/enable_background
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Start/Stop background(patrol) scrubbing if supported.
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/min_cycle_duration
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RO) Minimum scrub cycle duration in seconds supported by
+		the memory scrubber.
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/max_cycle_duration
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RO) Maximum scrub cycle duration in seconds supported by
+		the memory scrubber.
+
+What:		/sys/bus/edac/devices/<dev-name>/scrubX/current_cycle_duration
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The current scrub cycle duration in seconds. It must be
+		within the range supported by the memory scrubber.
+
+		Scrubbing imposes an overhead while it runs; that overhead can
+		be reduced by allowing a longer cycle duration.
diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
index 6b0fdc6f5d6e..942d7a92b8d7 100644
--- a/Documentation/edac/features.rst
+++ b/Documentation/edac/features.rst
@@ -92,3 +92,9 @@ High level design is illustrated in the following diagram::
 3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for
    dynamic scrub/repair control to issue scrubbing/repair when excess number
    of corrected memory errors are reported in a short span of time.
+
+RAS features
+------------
+1. Memory Scrub
+
+Memory scrub features are documented in `Documentation/edac/scrub.rst`.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
index de4a3aa452cb..0a00c23838b6 100644
--- a/Documentation/edac/index.rst
+++ b/Documentation/edac/index.rst
@@ -8,3 +8,4 @@ EDAC Subsystem
    :maxdepth: 1
 
    features
+   scrub
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
new file mode 100644
index 000000000000..50bb44b126fa
--- /dev/null
+++ b/Documentation/edac/scrub.rst
@@ -0,0 +1,259 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+===================
+EDAC Scrub Control
+===================
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author:   Shiju Jose <shiju.jose@huawei.com>
+:License:  The GNU Free Documentation License, Version 1.2 without
+           Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+           (dual licensed under the GPL v2)
+
+- Written for: 6.15
+
+Introduction
+------------
+Increasing DRAM size and cost have made memory subsystem reliability an
+important concern. Memory modules are used where potentially corrupted data
+could cause expensive or fatal issues. Memory errors are among the top
+hardware failures that cause server and workload crashes.
+
+Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine
+reads data from each memory media location, corrects with an ECC if
+necessary and writes the corrected data back to the same memory media
+location.
+
+The memory DIMMs can be scrubbed at a configurable rate to detect
+uncorrected memory errors and attempt recovery from detected errors,
+providing the following benefits.
+
+1. Proactively scrubbing memory DIMMs reduces the chance of a correctable
+   error becoming uncorrectable.
+
+2. When detected, uncorrected errors caught in unallocated memory pages are
+   isolated and prevented from being allocated to an application or the OS.
+
+3. This reduces the likelihood of software or hardware products encountering
+   memory errors.
+
+4. The additional data on failures in memory may be used to build up statistics
+   that are later used to decide whether to use memory repair technologies
+   such as Post Package Repair or Sparing.
+
+There are 2 types of memory scrubbing:
+
+1. Background (patrol) scrubbing of the RAM while the RAM is otherwise
+   idle.
+
+2. On-demand scrubbing for a specific address range or region of memory.
+
+Several types of interfaces to hardware memory scrubbers have been
+identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI
+RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub).
+
+The control mechanisms vary across different memory scrubbers. To enable
+standardized userspace tooling, there is a need to present these controls
+through a standardized ABI.
+
+Introduce a generic memory EDAC scrub control that allows users to manage
+underlying scrubbers in the system through a standardized sysfs scrub
+control interface. This common sysfs scrub control interface abstracts the
+management of various scrubbing functionalities into a unified set of
+functions.
+
+Use cases of common scrub control feature
+-----------------------------------------
+1. Several types of interfaces for hardware (HW) memory scrubbers have
+   been identified, including the CXL memory device patrol scrub, CXL DDR5
+   ECS, ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range
+   Scrub), and software-based memory scrubbers. Of the identified interfaces
+   to hardware memory scrubbers, some support control over patrol (background)
+   scrubbing (e.g., ACPI RAS2, CXL) and/or on-demand scrubbing (e.g., ACPI RAS2,
+   ACPI ARS). However, the scrub control interfaces vary between memory
+   scrubbers, highlighting the need for a standardized, generic sysfs scrub
+   control interface that is accessible to userspace for administration and use
+   by scripts/tools.
+
+2. User-space scrub controls allow users to disable scrubbing if necessary,
+   for example, to disable background patrol scrubbing or adjust the scrub
+   rate for performance-aware operations where background activities need to
+   be minimized or disabled.
+
+3. User-space tools enable on-demand scrubbing for specific address ranges,
+   provided that the scrubber supports this functionality.
+
+4. User-space tools can also control memory DIMM scrubbing at a configurable
+   scrub rate via sysfs scrub controls. This approach offers several benefits:
+
+   4.1. Detects uncorrectable memory errors early, before user access to affected
+        memory, helping facilitate recovery.
+
+   4.2. Reduces the likelihood of correctable errors developing into uncorrectable
+        errors.
+
+5. Policy control for hotplugged memory is necessary because there may not
+   be a system-wide BIOS or similar control to manage scrub settings for a CXL
+   device added after boot. Determining these settings is a policy decision,
+   balancing reliability against performance, so userspace should control it.
+   Therefore, a unified interface is recommended for handling this function in
+   a way that aligns with other similar interfaces, rather than creating a
+   separate one.
+
+Scrubbing features
+------------------
+
+CXL Memory Scrubbing features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+CXL spec r3.1 [1]_ section 8.2.9.9.11.1 describes the memory device patrol
+scrub control feature. The device patrol scrub proactively locates and
+corrects errors on a regular cycle. The patrol scrub control allows
+userspace to request changes to the CXL patrol scrubber's configuration.
+
+The patrol scrub control allows the requester to specify the number of
+hours in which a patrol scrub cycle must be completed, provided the
+requested rate is within the range supported by the device. In the CXL
+driver, the number of seconds per scrub cycle, which the user requests
+via sysfs, is rescaled to hours per scrub cycle. In addition, the patrol
+scrub controls allow the host to enable and disable the feature, for
+example when the feature needs to be disabled for performance-aware
+operations which require the background operations to be turned off.
+
+Error Check Scrub (ECS)
+~~~~~~~~~~~~~~~~~~~~~~~
+CXL spec r3.1 [1]_ section 8.2.9.9.11.2 describes the Error Check Scrub
+(ECS). ECS is a feature defined in the JEDEC DDR5 SDRAM Specification
+(JESD79-5) that allows the DRAM to internally read, correct single-bit
+errors, and write back corrected data bits to the DRAM array while
+providing transparency into error counts.
+
+A DDR5 device contains a number of memory media FRUs. The DDR5 ECS
+feature, and thus the ECS control driver, supports configuring the ECS
+parameters per FRU.
+
+ACPI RAS2 Hardware-based Memory Scrubbing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ACPI spec 6.5 [2]_ section 5.2.21 describes the ACPI RAS2 table, which
+provides interfaces for platform RAS features and supports independent
+RAS controls and capabilities for a given RAS feature for multiple
+instances of the same component in a given system.
+Memory RAS features apply to RAS capabilities, controls and operations
+that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS
+features have a Feature Type of 0x00 (Memory).
+
+The platform can use the hardware-based memory scrubbing feature to expose
+controls and capabilities associated with hardware-based memory scrub
+engines. The RAS2 memory scrubbing feature supports the following as per spec:
+
+1. Independent memory scrubbing controls for each NUMA domain, identified
+   using its proximity domain.
+
+2. Provision for background (patrol) scrubbing of the entire memory system,
+   as well as on-demand scrubbing for a specific region of memory.
+
+ACPI Address Range Scrubbing (ARS)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ACPI spec 6.5 [2]_ section 9.19.7.2 describes Address Range Scrubbing (ARS).
+ARS allows the platform to communicate memory errors to system software.
+This capability allows system software to prevent accesses to addresses
+with uncorrectable errors in memory. ARS functions manage all NVDIMMs
+present in the system. Only one scrub can be in progress system wide
+at any given time.
+The following functions are supported, as per the specification.
+
+1. Query ARS Capabilities for a given address range; this indicates whether
+   the platform supports the ACPI NVDIMM Root Device Unconsumed Error
+   Notification.
+
+2. Start ARS triggers an Address Range Scrub for the given memory range.
+   Address scrubbing can be done for volatile memory, persistent memory,
+   or both.
+
+3. Query ARS Status command allows software to get the status of ARS,
+   including the progress of ARS and ARS error record.
+
+4. Clear Uncorrectable Error.
+
+5. Translate SPA
+
+6. ARS Error Inject etc.
+
+The kernel already supports an existing control interface for ARS, and ARS
+is currently not supported in EDAC.
+
+.. [1] https://computeexpresslink.org/cxl-specification/
+
+.. [2] https://uefi.org/specs/ACPI/6.5/
+
+Comparison of various scrubbing features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |   ACPI    | CXL patrol|  CXL ECS  |  ARS      |
+ |  Name        |   RAS2    | scrub     |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | On-demand    | Supported | No        | No        | Supported |
+ | Scrubbing    |           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Background   | Supported | Supported | Supported | No        |
+ | scrubbing    |           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Mode of      | Scrub ctrl| per device| per memory|  Unknown  |
+ | scrubbing    | per NUMA  |           | media     |           |
+ |              | domain.   |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Query scrub  | Supported | Supported | Supported | Supported |
+ | capabilities |           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Setting      | Supported | No        | No        | Supported |
+ | address range|           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Setting      | Supported | Supported | No        | No        |
+ | scrub rate   |           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Unit for     | Not       | in hours  | No        | No        |
+ | scrub rate   | Defined   |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ |              | Supported |           |           |           |
+ | Scrub        | on-demand | No        | No        | Supported |
+ | status/      | scrubbing |           |           |           |
+ | Completion   | only      |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+ | UC error     |           |CXL general|CXL general| ACPI UCE  |
+ | reporting    | Exception |media/DRAM |media/DRAM | notify and|
+ |              |           |event/media|event/media| query     |
+ |              |           |scan?      |scan?      | ARS status|
+ +--------------+-----------+-----------+-----------+-----------+
+ |              |           |           |           |           |
+ | Support for  | Supported | Supported | Supported | No        |
+ | EDAC control |           |           |           |           |
+ |              |           |           |           |           |
+ +--------------+-----------+-----------+-----------+-----------+
+
+The File System
+---------------
+
+The control attributes of a registered scrubber instance can be
+accessed under:
+
+/sys/bus/edac/devices/<dev-name>/scrubX/
+
+sysfs
+-----
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-edac-scrub`
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 2051a7c944a5..175d706168ab 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -75,6 +75,15 @@ config EDAC_GHES
 
 	  In doubt, say 'Y'.
 
+config EDAC_SCRUB
+	bool "EDAC scrub feature"
+	help
+	  The EDAC scrub feature is optional and is designed to control the
+	  memory scrubbers in the system. The common sysfs scrub interface
+	  abstracts the control of various arbitrary scrubbing functionalities
+	  into a unified set of functions.
+	  Say 'y/n' to enable/disable EDAC scrub feature.
+
 config EDAC_AMD64
 	tristate "AMD64 (Opteron, Athlon64)"
 	depends on AMD_NB && EDAC_DECODE_MCE
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 89789ba8275f..f2a86ed997b7 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -13,6 +13,8 @@ edac_core-y	+= edac_module.o edac_device_sysfs.o wq.o
 
 edac_core-$(CONFIG_EDAC_DEBUG)		+= debugfs.o
 
+edac_core-$(CONFIG_EDAC_SCRUB)		+= scrub.o
+
 ifdef CONFIG_PCI
 edac_core-y	+= edac_pci.o edac_pci_sysfs.o
 endif
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 142a661ff543..40407f0ee600 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -575,6 +575,7 @@ static void edac_dev_release(struct device *dev)
 {
 	struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev);
 
+	kfree(ctx->scrub);
 	kfree(ctx->dev.groups);
 	kfree(ctx);
 }
@@ -610,8 +611,10 @@ int edac_dev_register(struct device *parent, char *name,
 		      const struct edac_dev_feature *ras_features)
 {
 	const struct attribute_group **ras_attr_groups;
+	struct edac_dev_data *dev_data;
 	struct edac_dev_feat_ctx *ctx;
 	int attr_gcnt = 0;
+	int scrub_cnt = 0;
 	int ret, feat;
 
 	if (!parent || !name || !num_features || !ras_features)
@@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name,
 	/* Double parse to make space for attributes */
 	for (feat = 0; feat < num_features; feat++) {
 		switch (ras_features[feat].ft_type) {
-		/* Add feature specific code */
+		case RAS_FEAT_SCRUB:
+			attr_gcnt++;
+			scrub_cnt++;
+			break;
 		default:
 			return -EINVAL;
 		}
@@ -636,13 +642,38 @@ int edac_dev_register(struct device *parent, char *name,
 		goto ctx_free;
 	}
 
+	if (scrub_cnt) {
+		ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL);
+		if (!ctx->scrub) {
+			ret = -ENOMEM;
+			goto groups_free;
+		}
+	}
+
 	attr_gcnt = 0;
+	scrub_cnt = 0;
 	for (feat = 0; feat < num_features; feat++, ras_features++) {
 		switch (ras_features->ft_type) {
-		/* Add feature specific code */
+		case RAS_FEAT_SCRUB:
+			if (!ras_features->scrub_ops ||
+			    scrub_cnt != ras_features->instance)
+				goto data_mem_free;
+
+			dev_data = &ctx->scrub[scrub_cnt];
+			dev_data->instance = scrub_cnt;
+			dev_data->scrub_ops = ras_features->scrub_ops;
+			dev_data->private = ras_features->ctx;
+			ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt],
+						  ras_features->instance);
+			if (ret)
+				goto data_mem_free;
+
+			scrub_cnt++;
+			attr_gcnt++;
+			break;
 		default:
 			ret = -EINVAL;
-			goto groups_free;
+			goto data_mem_free;
 		}
 	}
 
@@ -655,7 +686,7 @@ int edac_dev_register(struct device *parent, char *name,
 
 	ret = dev_set_name(&ctx->dev, name);
 	if (ret)
-		goto groups_free;
+		goto data_mem_free;
 
 	ret = device_register(&ctx->dev);
 	if (ret) {
@@ -665,6 +696,8 @@ int edac_dev_register(struct device *parent, char *name,
 
 	return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev);
 
+data_mem_free:
+	kfree(ctx->scrub);
 groups_free:
 	kfree(ras_attr_groups);
 ctx_free:
diff --git a/drivers/edac/scrub.c b/drivers/edac/scrub.c
new file mode 100755
index 000000000000..e421d3ebd959
--- /dev/null
+++ b/drivers/edac/scrub.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * The generic EDAC scrub driver controls the memory scrubbers in the
+ * system. The common sysfs scrub interface abstracts the control of
+ * various arbitrary scrubbing functionalities into a unified set of
+ * functions.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ */
+
+#include <linux/edac.h>
+
+enum edac_scrub_attributes {
+	SCRUB_ADDRESS,
+	SCRUB_SIZE,
+	SCRUB_ENABLE_BACKGROUND,
+	SCRUB_MIN_CYCLE_DURATION,
+	SCRUB_MAX_CYCLE_DURATION,
+	SCRUB_CUR_CYCLE_DURATION,
+	SCRUB_MAX_ATTRS
+};
+
+struct edac_scrub_dev_attr {
+	struct device_attribute dev_attr;
+	u8 instance;
+};
+
+struct edac_scrub_context {
+	char name[EDAC_FEAT_NAME_LEN];
+	struct edac_scrub_dev_attr scrub_dev_attr[SCRUB_MAX_ATTRS];
+	struct attribute *scrub_attrs[SCRUB_MAX_ATTRS + 1];
+	struct attribute_group group;
+};
+
+#define TO_SCRUB_DEV_ATTR(_dev_attr)      \
+		container_of(_dev_attr, struct edac_scrub_dev_attr, dev_attr)
+
+#define EDAC_SCRUB_ATTR_SHOW(attrib, cb, type, format)				\
+static ssize_t attrib##_show(struct device *ras_feat_dev,			\
+			     struct device_attribute *attr, char *buf)		\
+{										\
+	u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance;				\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;		\
+	type data;								\
+	int ret;								\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, &data);	\
+	if (ret)								\
+		return ret;							\
+										\
+	return sysfs_emit(buf, format, data);					\
+}
+
+EDAC_SCRUB_ATTR_SHOW(addr, read_addr, u64, "0x%llx\n")
+EDAC_SCRUB_ATTR_SHOW(size, read_size, u64, "0x%llx\n")
+EDAC_SCRUB_ATTR_SHOW(enable_background, get_enabled_bg, bool, "%u\n")
+EDAC_SCRUB_ATTR_SHOW(min_cycle_duration, get_min_cycle, u32, "%u\n")
+EDAC_SCRUB_ATTR_SHOW(max_cycle_duration, get_max_cycle, u32, "%u\n")
+EDAC_SCRUB_ATTR_SHOW(current_cycle_duration, get_cycle_duration, u32, "%u\n")
+
+#define EDAC_SCRUB_ATTR_STORE(attrib, cb, type, conv_func)			\
+static ssize_t attrib##_store(struct device *ras_feat_dev,			\
+			      struct device_attribute *attr,			\
+			      const char *buf, size_t len)			\
+{										\
+	u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance;				\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;		\
+	type data;								\
+	int ret;								\
+										\
+	ret = conv_func(buf, 0, &data);						\
+	if (ret < 0)								\
+		return ret;							\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, data);	\
+	if (ret)								\
+		return ret;							\
+										\
+	return len;								\
+}
+
+EDAC_SCRUB_ATTR_STORE(addr, write_addr, u64, kstrtou64)
+EDAC_SCRUB_ATTR_STORE(size, write_size, u64, kstrtou64)
+EDAC_SCRUB_ATTR_STORE(enable_background, set_enabled_bg, unsigned long, kstrtoul)
+EDAC_SCRUB_ATTR_STORE(current_cycle_duration, set_cycle_duration, unsigned long, kstrtoul)
+
+static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id)
+{
+	struct device *ras_feat_dev = kobj_to_dev(kobj);
+	struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr);
+	u8 inst = TO_SCRUB_DEV_ATTR(dev_attr)->instance;
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+	const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
+
+	switch (attr_id) {
+	case SCRUB_ADDRESS:
+		if (ops->read_addr) {
+			if (ops->write_addr)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case SCRUB_SIZE:
+		if (ops->read_size) {
+			if (ops->write_size)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case SCRUB_ENABLE_BACKGROUND:
+		if (ops->get_enabled_bg) {
+			if (ops->set_enabled_bg)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case SCRUB_MIN_CYCLE_DURATION:
+		if (ops->get_min_cycle)
+			return a->mode;
+		break;
+	case SCRUB_MAX_CYCLE_DURATION:
+		if (ops->get_max_cycle)
+			return a->mode;
+		break;
+	case SCRUB_CUR_CYCLE_DURATION:
+		if (ops->get_cycle_duration) {
+			if (ops->set_cycle_duration)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+#define EDAC_SCRUB_ATTR_RO(_name, _instance)       \
+	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RO(_name), \
+					.instance = _instance })
+
+#define EDAC_SCRUB_ATTR_WO(_name, _instance)       \
+	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_WO(_name), \
+					.instance = _instance })
+
+#define EDAC_SCRUB_ATTR_RW(_name, _instance)       \
+	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RW(_name), \
+					.instance = _instance })
+
+static int scrub_create_desc(struct device *scrub_dev,
+			     const struct attribute_group **attr_groups, u8 instance)
+{
+	struct edac_scrub_context *scrub_ctx;
+	struct attribute_group *group;
+	int i;
+	struct edac_scrub_dev_attr dev_attr[] = {
+		[SCRUB_ADDRESS] = EDAC_SCRUB_ATTR_RW(addr, instance),
+		[SCRUB_SIZE] = EDAC_SCRUB_ATTR_RW(size, instance),
+		[SCRUB_ENABLE_BACKGROUND] = EDAC_SCRUB_ATTR_RW(enable_background, instance),
+		[SCRUB_MIN_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance),
+		[SCRUB_MAX_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance),
+		[SCRUB_CUR_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance)
+	};
+
+	scrub_ctx = devm_kzalloc(scrub_dev, sizeof(*scrub_ctx), GFP_KERNEL);
+	if (!scrub_ctx)
+		return -ENOMEM;
+
+	group = &scrub_ctx->group;
+	for (i = 0; i < SCRUB_MAX_ATTRS; i++) {
+		memcpy(&scrub_ctx->scrub_dev_attr[i], &dev_attr[i], sizeof(dev_attr[i]));
+		scrub_ctx->scrub_attrs[i] = &scrub_ctx->scrub_dev_attr[i].dev_attr.attr;
+	}
+	sprintf(scrub_ctx->name, "%s%d", "scrub", instance);
+	group->name = scrub_ctx->name;
+	group->attrs = scrub_ctx->scrub_attrs;
+	group->is_visible  = scrub_attr_visible;
+
+	attr_groups[0] = group;
+
+	return 0;
+}
+
+/**
+ * edac_scrub_get_desc - get EDAC scrub descriptors
+ * @scrub_dev: client device, with scrub support
+ * @attr_groups: pointer to attribute group container
+ * @instance: device's scrub instance number.
+ *
+ * Return:
+ *  * %0	- Success.
+ *  * %-EINVAL	- Invalid parameters passed.
+ *  * %-ENOMEM	- Dynamic memory allocation failed.
+ */
+int edac_scrub_get_desc(struct device *scrub_dev,
+			const struct attribute_group **attr_groups, u8 instance)
+{
+	if (!scrub_dev || !attr_groups)
+		return -EINVAL;
+
+	return scrub_create_desc(scrub_dev, attr_groups, instance);
+}
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 8c4b6ca2a994..1cbab08720df 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -662,13 +662,54 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
 	return mci->dimms[index];
 }
 
+#define EDAC_FEAT_NAME_LEN	128
+
 /* RAS feature type */
 enum edac_dev_feat {
+	RAS_FEAT_SCRUB,
 	RAS_FEAT_MAX
 };
 
+/**
+ * struct edac_scrub_ops - scrub device operations (all elements optional)
+ * @read_addr: read base address of scrubbing range.
+ * @read_size: read offset of scrubbing range.
+ * @write_addr: set base address of the scrubbing range.
+ * @write_size: set offset of the scrubbing range.
+ * @get_enabled_bg: check if currently performing background scrub.
+ * @set_enabled_bg: start or stop a bg-scrub.
+ * @get_min_cycle: get minimum supported scrub cycle duration in seconds.
+ * @get_max_cycle: get maximum supported scrub cycle duration in seconds.
+ * @get_cycle_duration: get current scrub cycle duration in seconds.
+ * @set_cycle_duration: set current scrub cycle duration in seconds.
+ */
+struct edac_scrub_ops {
+	int (*read_addr)(struct device *dev, void *drv_data, u64 *base);
+	int (*read_size)(struct device *dev, void *drv_data, u64 *size);
+	int (*write_addr)(struct device *dev, void *drv_data, u64 base);
+	int (*write_size)(struct device *dev, void *drv_data, u64 size);
+	int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
+	int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
+	int (*get_min_cycle)(struct device *dev, void *drv_data,  u32 *min);
+	int (*get_max_cycle)(struct device *dev, void *drv_data,  u32 *max);
+	int (*get_cycle_duration)(struct device *dev, void *drv_data, u32 *cycle);
+	int (*set_cycle_duration)(struct device *dev, void *drv_data, u32 cycle);
+};
+
+#if IS_ENABLED(CONFIG_EDAC_SCRUB)
+int edac_scrub_get_desc(struct device *scrub_dev,
+			const struct attribute_group **attr_groups,
+			u8 instance);
+#else
+static inline int edac_scrub_get_desc(struct device *scrub_dev,
+				      const struct attribute_group **attr_groups,
+				      u8 instance)
+{ return -EOPNOTSUPP; }
+#endif /* CONFIG_EDAC_SCRUB */
+
 /* EDAC device feature information structure */
 struct edac_dev_data {
+	const struct edac_scrub_ops *scrub_ops;
 	u8 instance;
 	void *private;
 };
@@ -676,11 +717,13 @@ struct edac_dev_data {
 struct edac_dev_feat_ctx {
 	struct device dev;
 	void *private;
+	struct edac_dev_data *scrub;
 };
 
 struct edac_dev_feature {
 	enum edac_dev_feat ft_type;
 	u8 instance;
+	const struct edac_scrub_ops *scrub_ops;
 	void *ctx;
 };
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 03/15] EDAC: Add ECS control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
  2025-02-07 14:44 ` [PATCH v19 01/15] EDAC: Add support for EDAC device features control shiju.jose
  2025-02-07 14:44 ` [PATCH v19 02/15] EDAC: Add scrub control feature shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 04/15] EDAC: Add memory repair " shiju.jose
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add EDAC ECS (Error Check Scrub) control to manage a memory device's
ECS feature.

The Error Check Scrub (ECS) is a feature defined in the JEDEC DDR5 SDRAM
Specification (JESD79-5) that allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency into error counts.

A DDR5 device contains a number of memory media FRUs. The DDR5 ECS
feature, and thus the ECS control driver, supports configuring the ECS
parameters per FRU.

Memory devices that support the ECS feature register with the EDAC device
driver, which retrieves the ECS descriptor from the EDAC ECS driver.
This driver exposes sysfs ECS control attributes to userspace via
/sys/bus/edac/devices/<dev-name>/ecs_fruX/.

The common sysfs ECS control interface abstracts the control of
arbitrary ECS functionality into a common set of functions.

Support for the ECS feature is added separately because the control
attributes of the DDR5 ECS feature differ from those of the scrub
feature.

The sysfs ECS attribute nodes are only present if the client driver
has implemented the corresponding attribute callback function and
passed the necessary operations to the EDAC RAS feature driver during
registration.
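
For illustration, here is a minimal sketch (not part of this patch) of how a
client driver might describe an ECS feature covering two media FRUs; the
"my_*" names and the FRU count are hypothetical:

static int my_get_threshold(struct device *dev, void *drv_data, int fru_id,
			    u32 *threshold)
{
	*threshold = 256;	/* e.g. current per-Gbit threshold for this FRU */
	return 0;
}

static const struct edac_ecs_ops my_ecs_ops = {
	.get_threshold = my_get_threshold,
	/* callbacks that are not implemented keep their attributes hidden */
};

static const struct edac_dev_feature my_ecs_feature = {
	.ft_type  = RAS_FEAT_ECS,
	.ecs_ops  = &my_ecs_ops,
	.ctx      = NULL,
	.ecs_info = { .num_media_frus = 2 },	/* creates ecs_fru0 and ecs_fru1 */
};

The feature is then passed to edac_dev_register() along with any other RAS
features exposed by the device.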

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/ABI/testing/sysfs-edac-ecs |  74 ++++++++
 Documentation/edac/scrub.rst             |   2 +
 drivers/edac/Kconfig                     |   9 +
 drivers/edac/Makefile                    |   1 +
 drivers/edac/ecs.c                       | 207 +++++++++++++++++++++++
 drivers/edac/edac_device.c               |  17 ++
 include/linux/edac.h                     |  48 +++++-
 7 files changed, 356 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-edac-ecs
 create mode 100755 drivers/edac/ecs.c

diff --git a/Documentation/ABI/testing/sysfs-edac-ecs b/Documentation/ABI/testing/sysfs-edac-ecs
new file mode 100644
index 000000000000..87c885c4eb1a
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-edac-ecs
@@ -0,0 +1,74 @@
+What:		/sys/bus/edac/devices/<dev-name>/ecs_fruX
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		The sysfs EDAC bus devices /<dev-name>/ecs_fruX subdirectory
+		pertains to the memory media ECS (Error Check Scrub) control
+		feature, where <dev-name> directory corresponds to a device
+		registered with the EDAC device driver for the ECS feature.
+		Each /ecs_fruX corresponds to a memory media FRU (Field
+		Replaceable Unit) of the memory device.
+
+		The sysfs ECS attr nodes are only present if the parent
+		driver has implemented the corresponding attr callback
+		function and provided the necessary operations to the EDAC
+		device driver during registration.
+
+What:		/sys/bus/edac/devices/<dev-name>/ecs_fruX/log_entry_type
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The log entry type controlling how the DDR5 ECS log is reported.
+
+		- 0 - per DRAM.
+
+		- 1 - per memory media FRU.
+
+		- All other values are reserved.
+
+What:		/sys/bus/edac/devices/<dev-name>/ecs_fruX/mode
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The mode in which the DDR5 ECS counts the errors.
+		Error count is tracked based on two different modes
+		selected by DDR5 ECS Control Feature - Codeword mode and
+		Row Count mode. If the ECS is under Codeword mode, then
+		the error count increments each time a codeword with check
+		bit errors is detected. If the ECS is under Row Count mode,
+		then the error counter increments each time a row with
+		check bit errors is detected.
+
+		- 0 - ECS counts rows in the memory media that have ECC errors.
+
+		- 1 - ECS counts codewords with errors, specifically, it counts
+		      the number of ECC-detected errors in the memory media.
+
+		- All other values are reserved.
+
+What:		/sys/bus/edac/devices/<dev-name>/ecs_fruX/reset
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(WO) ECS reset ECC counter.
+
+		- 1 - reset ECC counter to the default value.
+
+		- All other values are reserved.
+
+What:		/sys/bus/edac/devices/<dev-name>/ecs_fruX/threshold
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) DDR5 ECS threshold count per gigabits of memory cells.
+		The ECS error count is subject to the ECS Threshold count
+		per Gbit, which masks error counts less than the Threshold.
+
+		Supported values are 256, 1024 and 4096.
+
+		All other values are reserved.
diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index 50bb44b126fa..5f1ff2bf54b0 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -257,3 +257,5 @@ sysfs
 
 Sysfs files are documented in
 `Documentation/ABI/testing/sysfs-edac-scrub`
+
+`Documentation/ABI/testing/sysfs-edac-ecs`
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 175d706168ab..9dfc2ea02df1 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -84,6 +84,15 @@ config EDAC_SCRUB
 	  into a unified set of functions.
 	  Say 'y/n' to enable/disable EDAC scrub feature.
 
+config EDAC_ECS
+	bool "EDAC ECS (Error Check Scrub) feature"
+	help
+	  The EDAC ECS feature is optional and is designed to control on-die
+	  error check scrub (e.g., DDR5 ECS) in the system. The common sysfs
+	  ECS interface abstracts the control of various ECS functionalities
+	  into a unified set of functions.
+	  Say 'y/n' to enable/disable EDAC ECS feature.
+
 config EDAC_AMD64
 	tristate "AMD64 (Opteron, Athlon64)"
 	depends on AMD_NB && EDAC_DECODE_MCE
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index f2a86ed997b7..21334b909ec4 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -14,6 +14,7 @@ edac_core-y	+= edac_module.o edac_device_sysfs.o wq.o
 edac_core-$(CONFIG_EDAC_DEBUG)		+= debugfs.o
 
 edac_core-$(CONFIG_EDAC_SCRUB)		+= scrub.o
+edac_core-$(CONFIG_EDAC_ECS)		+= ecs.o
 
 ifdef CONFIG_PCI
 edac_core-y	+= edac_pci.o edac_pci_sysfs.o
diff --git a/drivers/edac/ecs.c b/drivers/edac/ecs.c
new file mode 100755
index 000000000000..7fd97984e039
--- /dev/null
+++ b/drivers/edac/ecs.c
@@ -0,0 +1,207 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * The generic ECS driver is designed to support control of on-die error
+ * check scrub (e.g., DDR5 ECS). The common sysfs ECS interface abstracts
+ * the control of various ECS functionalities into a unified set of functions.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ */
+
+#include <linux/edac.h>
+
+#define EDAC_ECS_FRU_NAME "ecs_fru"
+
+enum edac_ecs_attributes {
+	ECS_LOG_ENTRY_TYPE,
+	ECS_MODE,
+	ECS_RESET,
+	ECS_THRESHOLD,
+	ECS_MAX_ATTRS
+};
+
+struct edac_ecs_dev_attr {
+	struct device_attribute dev_attr;
+	int fru_id;
+};
+
+struct edac_ecs_fru_context {
+	char name[EDAC_FEAT_NAME_LEN];
+	struct edac_ecs_dev_attr dev_attr[ECS_MAX_ATTRS];
+	struct attribute *ecs_attrs[ECS_MAX_ATTRS + 1];
+	struct attribute_group group;
+};
+
+struct edac_ecs_context {
+	u16 num_media_frus;
+	struct edac_ecs_fru_context *fru_ctxs;
+};
+
+#define TO_ECS_DEV_ATTR(_dev_attr)	\
+	container_of(_dev_attr, struct edac_ecs_dev_attr, dev_attr)
+
+#define EDAC_ECS_ATTR_SHOW(attrib, cb, type, format)				\
+static ssize_t attrib##_show(struct device *ras_feat_dev,			\
+			     struct device_attribute *attr, char *buf)		\
+{										\
+	struct edac_ecs_dev_attr *dev_attr = TO_ECS_DEV_ATTR(attr);		\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_ecs_ops *ops = ctx->ecs.ecs_ops;			\
+	type data;								\
+	int ret;								\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->ecs.private,			\
+		      dev_attr->fru_id, &data);					\
+	if (ret)								\
+		return ret;							\
+										\
+	return sysfs_emit(buf, format, data);					\
+}
+
+EDAC_ECS_ATTR_SHOW(log_entry_type, get_log_entry_type, u32, "%u\n")
+EDAC_ECS_ATTR_SHOW(mode, get_mode, u32, "%u\n")
+EDAC_ECS_ATTR_SHOW(threshold, get_threshold, u32, "%u\n")
+
+#define EDAC_ECS_ATTR_STORE(attrib, cb, type, conv_func)			\
+static ssize_t attrib##_store(struct device *ras_feat_dev,			\
+			      struct device_attribute *attr,			\
+			      const char *buf, size_t len)			\
+{										\
+	struct edac_ecs_dev_attr *dev_attr = TO_ECS_DEV_ATTR(attr);		\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_ecs_ops *ops = ctx->ecs.ecs_ops;			\
+	type data;								\
+	int ret;								\
+										\
+	ret = conv_func(buf, 0, &data);						\
+	if (ret < 0)								\
+		return ret;							\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->ecs.private,			\
+		      dev_attr->fru_id, data);					\
+	if (ret)								\
+		return ret;							\
+										\
+	return len;								\
+}
+
+EDAC_ECS_ATTR_STORE(log_entry_type, set_log_entry_type, unsigned long, kstrtoul)
+EDAC_ECS_ATTR_STORE(mode, set_mode, unsigned long, kstrtoul)
+EDAC_ECS_ATTR_STORE(reset, reset, unsigned long, kstrtoul)
+EDAC_ECS_ATTR_STORE(threshold, set_threshold, unsigned long, kstrtoul)
+
+static umode_t ecs_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id)
+{
+	struct device *ras_feat_dev = kobj_to_dev(kobj);
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+	const struct edac_ecs_ops *ops = ctx->ecs.ecs_ops;
+
+	switch (attr_id) {
+	case ECS_LOG_ENTRY_TYPE:
+		if (ops->get_log_entry_type)  {
+			if (ops->set_log_entry_type)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case ECS_MODE:
+		if (ops->get_mode) {
+			if (ops->set_mode)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case ECS_RESET:
+		if (ops->reset)
+			return a->mode;
+		break;
+	case ECS_THRESHOLD:
+		if (ops->get_threshold) {
+			if (ops->set_threshold)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+#define EDAC_ECS_ATTR_RO(_name, _fru_id)       \
+	((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RO(_name), \
+				     .fru_id = _fru_id })
+
+#define EDAC_ECS_ATTR_WO(_name, _fru_id)       \
+	((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_WO(_name), \
+				     .fru_id = _fru_id })
+
+#define EDAC_ECS_ATTR_RW(_name, _fru_id)       \
+	((struct edac_ecs_dev_attr) { .dev_attr = __ATTR_RW(_name), \
+				     .fru_id = _fru_id })
+
+static int ecs_create_desc(struct device *ecs_dev,
+			   const struct attribute_group **attr_groups, u16 num_media_frus)
+{
+	struct edac_ecs_context *ecs_ctx;
+	u32 fru;
+
+	ecs_ctx = devm_kzalloc(ecs_dev, sizeof(*ecs_ctx), GFP_KERNEL);
+	if (!ecs_ctx)
+		return -ENOMEM;
+
+	ecs_ctx->num_media_frus = num_media_frus;
+	ecs_ctx->fru_ctxs = devm_kcalloc(ecs_dev, num_media_frus,
+					 sizeof(*ecs_ctx->fru_ctxs),
+					 GFP_KERNEL);
+	if (!ecs_ctx->fru_ctxs)
+		return -ENOMEM;
+
+	for (fru = 0; fru < num_media_frus; fru++) {
+		struct edac_ecs_fru_context *fru_ctx = &ecs_ctx->fru_ctxs[fru];
+		struct attribute_group *group = &fru_ctx->group;
+		int i;
+
+		fru_ctx->dev_attr[ECS_LOG_ENTRY_TYPE] =
+					EDAC_ECS_ATTR_RW(log_entry_type, fru);
+		fru_ctx->dev_attr[ECS_MODE] = EDAC_ECS_ATTR_RW(mode, fru);
+		fru_ctx->dev_attr[ECS_RESET] = EDAC_ECS_ATTR_WO(reset, fru);
+		fru_ctx->dev_attr[ECS_THRESHOLD] =
+					EDAC_ECS_ATTR_RW(threshold, fru);
+
+		for (i = 0; i < ECS_MAX_ATTRS; i++)
+			fru_ctx->ecs_attrs[i] = &fru_ctx->dev_attr[i].dev_attr.attr;
+
+		sprintf(fru_ctx->name, "%s%d", EDAC_ECS_FRU_NAME, fru);
+		group->name = fru_ctx->name;
+		group->attrs = fru_ctx->ecs_attrs;
+		group->is_visible  = ecs_attr_visible;
+
+		attr_groups[fru] = group;
+	}
+
+	return 0;
+}
+
+/**
+ * edac_ecs_get_desc - get EDAC ECS descriptors
+ * @ecs_dev: client device, supports ECS feature
+ * @attr_groups: pointer to attribute group container
+ * @num_media_frus: number of media FRUs in the device
+ *
+ * Return:
+ *  * %0	- Success.
+ *  * %-EINVAL	- Invalid parameters passed.
+ *  * %-ENOMEM	- Dynamic memory allocation failed.
+ */
+int edac_ecs_get_desc(struct device *ecs_dev,
+		      const struct attribute_group **attr_groups, u16 num_media_frus)
+{
+	if (!ecs_dev || !attr_groups || !num_media_frus)
+		return -EINVAL;
+
+	return ecs_create_desc(ecs_dev, attr_groups, num_media_frus);
+}
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 40407f0ee600..a8421dc9ab3c 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -627,6 +627,9 @@ int edac_dev_register(struct device *parent, char *name,
 			attr_gcnt++;
 			scrub_cnt++;
 			break;
+		case RAS_FEAT_ECS:
+			attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
+			break;
 		default:
 			return -EINVAL;
 		}
@@ -671,6 +674,20 @@ int edac_dev_register(struct device *parent, char *name,
 			scrub_cnt++;
 			attr_gcnt++;
 			break;
+		case RAS_FEAT_ECS:
+			if (!ras_features->ecs_ops)
+				goto data_mem_free;
+
+			dev_data = &ctx->ecs;
+			dev_data->ecs_ops = ras_features->ecs_ops;
+			dev_data->private = ras_features->ctx;
+			ret = edac_ecs_get_desc(parent, &ras_attr_groups[attr_gcnt],
+						ras_features->ecs_info.num_media_frus);
+			if (ret)
+				goto data_mem_free;
+
+			attr_gcnt += ras_features->ecs_info.num_media_frus;
+			break;
 		default:
 			ret = -EINVAL;
 			goto data_mem_free;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 1cbab08720df..f8346014c14e 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -667,6 +667,7 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
 /* RAS feature type */
 enum edac_dev_feat {
 	RAS_FEAT_SCRUB,
+	RAS_FEAT_ECS,
 	RAS_FEAT_MAX
 };
 
@@ -707,9 +708,47 @@ static inline int edac_scrub_get_desc(struct device *scrub_dev,
 { return -EOPNOTSUPP; }
 #endif /* CONFIG_EDAC_SCRUB */
 
+/**
+ * struct edac_ecs_ops - ECS device operations (all elements optional)
+ * @get_log_entry_type: read the log entry type value.
+ * @set_log_entry_type: set the log entry type value.
+ * @get_mode: read the mode value.
+ * @set_mode: set the mode value.
+ * @reset: reset the ECS counter.
+ * @get_threshold: read the threshold count per gigabits of memory cells.
+ * @set_threshold: set the threshold count per gigabits of memory cells.
+ */
+struct edac_ecs_ops {
+	int (*get_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u32 *val);
+	int (*set_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u32 val);
+	int (*get_mode)(struct device *dev, void *drv_data, int fru_id, u32 *val);
+	int (*set_mode)(struct device *dev, void *drv_data, int fru_id, u32 val);
+	int (*reset)(struct device *dev, void *drv_data, int fru_id, u32 val);
+	int (*get_threshold)(struct device *dev, void *drv_data, int fru_id, u32 *threshold);
+	int (*set_threshold)(struct device *dev, void *drv_data, int fru_id, u32 threshold);
+};
+
+struct edac_ecs_ex_info {
+	u16 num_media_frus;
+};
+
+#if IS_ENABLED(CONFIG_EDAC_ECS)
+int edac_ecs_get_desc(struct device *ecs_dev,
+		      const struct attribute_group **attr_groups,
+		      u16 num_media_frus);
+#else
+static inline int edac_ecs_get_desc(struct device *ecs_dev,
+				    const struct attribute_group **attr_groups,
+				    u16 num_media_frus)
+{ return -EOPNOTSUPP; }
+#endif /* CONFIG_EDAC_ECS */
+
 /* EDAC device feature information structure */
 struct edac_dev_data {
-	const struct edac_scrub_ops *scrub_ops;
+	union {
+		const struct edac_scrub_ops *scrub_ops;
+		const struct edac_ecs_ops *ecs_ops;
+	};
 	u8 instance;
 	void *private;
 };
@@ -718,13 +757,18 @@ struct edac_dev_feat_ctx {
 	struct device dev;
 	void *private;
 	struct edac_dev_data *scrub;
+	struct edac_dev_data ecs;
 };
 
 struct edac_dev_feature {
 	enum edac_dev_feat ft_type;
 	u8 instance;
-	const struct edac_scrub_ops *scrub_ops;
+	union {
+		const struct edac_scrub_ops *scrub_ops;
+		const struct edac_ecs_ops *ecs_ops;
+	};
 	void *ctx;
+	struct edac_ecs_ex_info ecs_info;
 };
 
 int edac_dev_register(struct device *parent, char *dev_name,
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 04/15] EDAC: Add memory repair control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (2 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 03/15] EDAC: Add ECS " shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 05/15] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add a generic EDAC memory repair control driver to manage memory repairs
in the system, such as CXL Post Package Repair (PPR) and other soft and
hard PPR features.

For example, a CXL device with DRAM components that support PPR features
may implement PPR maintenance operations. DRAM components may support two
types of PPR: hard PPR, for a permanent row repair, and soft PPR, for a
temporary row repair. Soft PPR is much faster than hard PPR, but the repair
is lost with a power cycle.

When a CXL device detects an error in its memory, it may report the need
for a repair maintenance operation by using an event record where the
"maintenance needed" flag is set. The event record contains the device
physical address (DPA) and other optional attributes of the memory to
repair. The kernel will report the corresponding CXL general media or
DRAM trace event to userspace, and userspace tools (e.g. rasdaemon) will
initiate a repair operation in response to the device request via the
sysfs repair control.

A device with memory repair features registers with the EDAC device driver,
which retrieves the memory repair descriptor from the EDAC memory repair
driver and exposes the sysfs repair control attributes to userspace in
/sys/bus/edac/devices/<dev-name>/mem_repairX/.

The common memory repair control interface abstracts the control of
arbitrary memory repair functionality into a standardized set of functions.
The sysfs memory repair attribute nodes are only available if the client
driver has implemented the corresponding attribute callback function and
provided operations to the EDAC device driver during registration.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 .../ABI/testing/sysfs-edac-memory-repair      | 149 ++++++++++
 Documentation/edac/features.rst               |   4 +
 Documentation/edac/index.rst                  |   1 +
 Documentation/edac/memory_repair.rst          | 106 +++++++
 drivers/edac/Kconfig                          |  10 +
 drivers/edac/Makefile                         |   1 +
 drivers/edac/edac_device.c                    |  33 +++
 drivers/edac/mem_repair.c                     | 280 ++++++++++++++++++
 include/linux/edac.h                          |  74 +++++
 9 files changed, 658 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-edac-memory-repair
 create mode 100644 Documentation/edac/memory_repair.rst
 create mode 100755 drivers/edac/mem_repair.c

diff --git a/Documentation/ABI/testing/sysfs-edac-memory-repair b/Documentation/ABI/testing/sysfs-edac-memory-repair
new file mode 100644
index 000000000000..c54f59e4497b
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-edac-memory-repair
@@ -0,0 +1,149 @@
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		The sysfs EDAC bus devices /<dev-name>/mem_repairX subdirectory
+		pertains to the memory media repair features control, such as
+		PPR (Post Package Repair), memory sparing etc, where <dev-name>
+		directory corresponds to a device registered with the EDAC
+		device driver for the memory repair features.
+
+		Post Package Repair is a maintenance operation that requests the
+		memory device to perform a repair operation on its media. It is a memory
+		self-healing feature that fixes a failing memory location by
+		replacing it with a spare row in a DRAM device. For example, a
+		CXL memory device with DRAM components that support PPR features may
+		implement PPR maintenance operations. DRAM components may support
+		two types of PPR functions: hard PPR, for a permanent row repair, and
+		soft PPR, for a temporary row repair. Soft PPR may be much faster
+		than hard PPR, but the repair is lost with a power cycle.
+
+		The sysfs attributes nodes for a repair feature are only
+		present if the parent driver has implemented the corresponding
+		attr callback function and provided the necessary operations
+		to the EDAC device driver during registration.
+
+		In some states of system configuration (e.g. before address
+		decoders have been configured), memory devices (e.g. CXL)
+		may not have an active mapping in the main host physical
+		address map. As such, the memory to repair must be
+		identified by a device specific physical addressing scheme
+		using a device physical address (DPA). The DPA and other control
+		attributes to use will be presented in related error records.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_type
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RO) Memory repair type. For eg. post package repair,
+		memory sparing etc. Valid values are:
+
+		- ppr - Post package repair.
+
+		- All other values are reserved.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Get/Set the current persist repair mode for a repair
+		function. Depending on the memory repair function, the device
+		supports repair that is either temporary, which is lost with
+		a power cycle, or permanent. Valid values are:
+
+		- 0 - Soft memory repair (temporary repair).
+
+		- 1 - Hard memory repair (permanent repair).
+
+		- All other values are reserved.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair_safe_when_in_use
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RO) True if the memory media is accessible and data is retained
+		during the memory repair operation.
+		If data is not retained or memory requests are not correctly
+		processed during a repair operation, the repair operation cannot
+		be executed at runtime and the memory must be taken offline.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/hpa
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Host Physical Address (HPA) of the memory to repair.
+		The HPA to use will be provided in related error records.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/dpa
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Device Physical Address (DPA) of the memory to repair.
+		The specific DPA to use will be provided in related error
+		records.
+
+		In some states of system configuration (e.g. before address
+		decoders have been configured), memory devices (e.g. CXL)
+		may not have an active mapping in the main host physical
+		address map. As such, the memory to repair must be
+		identified by a device specific physical addressing scheme
+		using a DPA. The device physical address (DPA) to use will be
+		presented in related error records.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/nibble_mask
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) Nibble mask of the memory to repair.
+		The nibble mask identifies one or more nibbles in error on the
+		memory bus that produced the error event. Nibble Mask bit 0
+		shall be set if nibble 0 on the memory bus produced the
+		event, etc. For example, for CXL PPR and sparing, a nibble
+		mask bit set to 1 indicates a request to perform the repair
+		operation in the specific device; all nibble mask bits set
+		to 1 indicate a request to perform the operation in all
+		devices. E.g. for CXL memory repair, the specific value of
+		the nibble mask to use will be provided in related error records.
+		For more details, see the nibble mask field in CXL spec rev 3.1,
+		section 8.2.9.7.1.2 Table 8-103 (soft PPR), section
+		8.2.9.7.1.3 Table 8-104 (hard PPR) and section 8.2.9.7.1.4
+		Table 8-105 (memory sparing).
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_hpa
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_hpa
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/min_dpa
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/max_dpa
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The supported range of memory addresses to be repaired.
+		The memory device may expose the supported range for these
+		attributes, which depends on the memory device and the portion
+		of memory to repair.
+		Userspace may receive the specific values of the attributes
+		to use for a repair operation from the memory device via
+		related error records and trace events, e.g. CXL DRAM
+		and CXL general media error records for CXL memory devices.
+
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(WO) Issue the memory repair operation for the specified
+		memory repair attributes. The operation may fail if resources
+		are insufficient based on the requirements of the memory
+		device and repair function.
+
+		- 1 - Issue the repair operation.
+
+		- All other values are reserved.
diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst
index 942d7a92b8d7..9ef96734a519 100644
--- a/Documentation/edac/features.rst
+++ b/Documentation/edac/features.rst
@@ -98,3 +98,7 @@ RAS features
 1. Memory Scrub
 
 Memory scrub features are documented in `Documentation/edac/scrub.rst`.
+
+2. Memory Repair
+
+Memory repair features are documented in `Documentation/edac/memory_repair.rst`.
diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst
index 0a00c23838b6..420c6601dbae 100644
--- a/Documentation/edac/index.rst
+++ b/Documentation/edac/index.rst
@@ -8,4 +8,5 @@ EDAC Subsystem
    :maxdepth: 1
 
    features
+   memory_repair
    scrub
diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
new file mode 100644
index 000000000000..7ccca02632f5
--- /dev/null
+++ b/Documentation/edac/memory_repair.rst
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later
+
+==========================
+EDAC Memory Repair Control
+==========================
+
+Copyright (c) 2024-2025 HiSilicon Limited.
+
+:Author:   Shiju Jose <shiju.jose@huawei.com>
+:License:  The GNU Free Documentation License, Version 1.2 without
+           Invariant Sections, Front-Cover Texts nor Back-Cover Texts.
+           (dual licensed under the GPL v2)
+:Original Reviewers:
+
+- Written for: 6.15
+
+Introduction
+------------
+Memory devices may support repair operations to address issues in their
+memory media. Post Package Repair (PPR) and memory sparing are examples
+of such features.
+
+Post Package Repair(PPR)
+~~~~~~~~~~~~~~~~~~~~~~~~
+Post Package Repair is a maintenance operation that requests the memory
+device to perform a repair operation on its media. It is a memory
+self-healing feature that fixes a failing memory location by replacing it
+with a spare row in a DRAM device. For example, a CXL memory device with
+DRAM components that support PPR features may implement PPR maintenance
+operations. DRAM components may support two types of PPR function: hard
+PPR, for a permanent row repair, and soft PPR, for a temporary row repair.
+Soft PPR is much faster than hard PPR, but the repair is lost with a power
+cycle. The data may not be retained and memory requests may not be
+correctly processed during a repair operation. In such cases, the repair
+operation should not be executed at runtime.
+
+For example, in CXL memory devices, soft PPR and hard PPR repair operations
+may be supported. See CXL spec rev 3.1 [1]_ sections 8.2.9.7.1.1 PPR
+Maintenance Operations, 8.2.9.7.1.2 sPPR Maintenance Operation and
+8.2.9.7.1.3 hPPR Maintenance Operation for more details.
+
+Memory Sparing
+~~~~~~~~~~~~~~
+Memory sparing is a repair function that replaces a portion of memory with
+a portion of functional memory at a particular granularity. Memory
+sparing has cacheline/row/bank/rank sparing granularities. For example, in
+rank memory-sparing mode, one memory rank serves as a spare for other ranks
+on the same channel in case they fail. The spare rank is held in reserve and
+not used as active memory until a failure is indicated, with reserved
+capacity subtracted from the total available memory in the system.
+After an error threshold is surpassed in a system protected by memory sparing,
+the content of a failing rank of DIMMs is copied to the spare rank. The
+failing rank is then taken offline and the spare rank placed online for
+use as active memory in place of the failed rank.
+
+For example, CXL memory devices may support various subclasses of sparing
+operations, which vary in terms of the scope of the sparing being performed.
+The cacheline sparing subclass refers to a sparing action that can replace a
+full cacheline. Row sparing is provided as an alternative to PPR sparing
+functions and its scope is that of a single DDR row. Bank sparing allows
+an entire bank to be replaced. Rank sparing is defined as an operation
+in which an entire DDR rank is replaced. See CXL spec 3.1 [1]_ section
+8.2.9.7.1.4 Memory Sparing Maintenance Operations for more details.
+
+.. [1] https://computeexpresslink.org/cxl-specification/
+
+Use cases of generic memory repair features control
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. The soft PPR, hard PPR and memory-sparing features share similar
+   control attributes. Therefore, there is a need for a standardized, generic
+   sysfs repair control that is exposed to userspace and used by
+   administrators, scripts and tools.
+
+2. When a CXL device detects an error in a memory component, it may inform
+   the host of the need for a repair maintenance operation by using an event
+   record where the "maintenance needed" flag is set. The event record
+   specifies the device physical address (DPA) and attributes of the memory that
+   requires repair. The kernel reports the corresponding CXL general media or
+   DRAM trace event to userspace, and userspace tools (e.g. rasdaemon) initiate
+   a repair maintenance operation in response to the device request using the
+   sysfs repair control.
+
+3. Userspace tools, such as rasdaemon, may request a repair operation on a
+   memory region when the "maintenance needed" flag is set, when an
+   uncorrected memory error is reported, when the number of corrected memory
+   errors for that memory exceeds a threshold, or when the "exceed corrected
+   errors threshold" flag is set for that memory.
+
+4. Multiple PPR/sparing instances may be present per memory device.
+
+5. Drivers should enforce that live repair is safe. In systems where memory
+   mapping functions may change between boots, one approach is to log the
+   memory errors seen during the current boot and check live memory repair
+   requests against that log. A sketch of driver-side feature registration
+   is given after this list.
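+
+For illustration, a minimal sketch of how a driver might register a memory
+repair instance with the EDAC device driver is given below. The ``my_*``
+names are placeholders; a real driver fills in device specific logic and
+typically implements more of the optional callbacks::
+
+   /* Requires <linux/edac.h>. */
+
+   static int my_set_dpa(struct device *dev, void *drv_data, u64 dpa)
+   {
+           /* Validate and cache the DPA to repair. */
+           return 0;
+   }
+
+   static int my_do_repair(struct device *dev, void *drv_data, u32 val)
+   {
+           /* Issue the device specific repair for the cached DPA. */
+           return 0;
+   }
+
+   static const struct edac_mem_repair_ops my_repair_ops = {
+           .set_dpa   = my_set_dpa,
+           .do_repair = my_do_repair,
+   };
+
+   static int my_probe(struct device *parent, void *my_drv_data)
+   {
+           struct edac_dev_feature ras_features[] = {
+                   {
+                           .ft_type        = RAS_FEAT_MEM_REPAIR,
+                           .instance       = 0,
+                           .mem_repair_ops = &my_repair_ops,
+                           .ctx            = my_drv_data,
+                   },
+           };
+           char name[] = "my_mem0";
+
+           return edac_dev_register(parent, name, NULL,
+                                    ARRAY_SIZE(ras_features), ras_features);
+   }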
+
+The File System
+---------------
+
+The control attributes of a registered memory repair instance are
+accessible under /sys/bus/edac/devices/<dev-name>/mem_repairX/.
+
+sysfs
+-----
+
+Sysfs files are documented in
+`Documentation/ABI/testing/sysfs-edac-memory-repair`.
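+
+Examples
+--------
+
+A sketch of an on-demand repair sequence for a single mem_repairX instance
+is given below. The attribute values are illustrative only; the specific
+DPA (or HPA) and nibble mask to use are taken from the related error
+records and trace events reported for the memory in question.
+
+1. Query the repair type and the supported device address range.
+
+# cat /sys/bus/edac/devices/<dev-name>/mem_repair0/repair_type
+
+# cat /sys/bus/edac/devices/<dev-name>/mem_repair0/min_dpa
+
+# cat /sys/bus/edac/devices/<dev-name>/mem_repair0/max_dpa
+
+2. Program the attributes of the memory to repair, as reported in the
+   error records.
+
+# echo 0x700000 > /sys/bus/edac/devices/<dev-name>/mem_repair0/dpa
+
+# echo 0x1 > /sys/bus/edac/devices/<dev-name>/mem_repair0/nibble_mask
+
+3. Issue the repair operation.
+
+# echo 1 > /sys/bus/edac/devices/<dev-name>/mem_repair0/repair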
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 9dfc2ea02df1..703522d5d6c3 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -93,6 +93,16 @@ config EDAC_ECS
 	  into a unified set of functions.
 	  Say 'y/n' to enable/disable EDAC ECS feature.
 
+config EDAC_MEM_REPAIR
+	bool "EDAC memory repair feature"
+	help
+	  The EDAC memory repair feature is optional and is designed to control
+	  memory devices with repair features, such as Post Package Repair
+	  (PPR) and memory sparing. The common sysfs memory repair interface
+	  abstracts the control of various memory repair functionalities into
+	  a unified set of functions.
+	  Say 'y/n' to enable/disable EDAC memory repair feature.
+
 config EDAC_AMD64
 	tristate "AMD64 (Opteron, Athlon64)"
 	depends on AMD_NB && EDAC_DECODE_MCE
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 21334b909ec4..a342ea7a5f19 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -15,6 +15,7 @@ edac_core-$(CONFIG_EDAC_DEBUG)		+= debugfs.o
 
 edac_core-$(CONFIG_EDAC_SCRUB)		+= scrub.o
 edac_core-$(CONFIG_EDAC_ECS)		+= ecs.o
+edac_core-$(CONFIG_EDAC_MEM_REPAIR)	+= mem_repair.o
 
 ifdef CONFIG_PCI
 edac_core-y	+= edac_pci.o edac_pci_sysfs.o
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index a8421dc9ab3c..15c96b0bdd5b 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -575,6 +575,7 @@ static void edac_dev_release(struct device *dev)
 {
 	struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev);
 
+	kfree(ctx->mem_repair);
 	kfree(ctx->scrub);
 	kfree(ctx->dev.groups);
 	kfree(ctx);
@@ -613,6 +614,7 @@ int edac_dev_register(struct device *parent, char *name,
 	const struct attribute_group **ras_attr_groups;
 	struct edac_dev_data *dev_data;
 	struct edac_dev_feat_ctx *ctx;
+	int mem_repair_cnt = 0;
 	int attr_gcnt = 0;
 	int scrub_cnt = 0;
 	int ret, feat;
@@ -630,6 +632,10 @@ int edac_dev_register(struct device *parent, char *name,
 		case RAS_FEAT_ECS:
 			attr_gcnt += ras_features[feat].ecs_info.num_media_frus;
 			break;
+		case RAS_FEAT_MEM_REPAIR:
+			attr_gcnt++;
+			mem_repair_cnt++;
+			break;
 		default:
 			return -EINVAL;
 		}
@@ -653,8 +659,17 @@ int edac_dev_register(struct device *parent, char *name,
 		}
 	}
 
+	if (mem_repair_cnt) {
+		ctx->mem_repair = kcalloc(mem_repair_cnt, sizeof(*ctx->mem_repair), GFP_KERNEL);
+		if (!ctx->mem_repair) {
+			ret = -ENOMEM;
+			goto data_mem_free;
+		}
+	}
+
 	attr_gcnt = 0;
 	scrub_cnt = 0;
+	mem_repair_cnt = 0;
 	for (feat = 0; feat < num_features; feat++, ras_features++) {
 		switch (ras_features->ft_type) {
 		case RAS_FEAT_SCRUB:
@@ -688,6 +703,23 @@ int edac_dev_register(struct device *parent, char *name,
 
 			attr_gcnt += ras_features->ecs_info.num_media_frus;
 			break;
+		case RAS_FEAT_MEM_REPAIR:
+			if (!ras_features->mem_repair_ops ||
+			    mem_repair_cnt != ras_features->instance) {
+				ret = -EINVAL;
+				goto data_mem_free;
+			}
+
+			dev_data = &ctx->mem_repair[mem_repair_cnt];
+			dev_data->instance = mem_repair_cnt;
+			dev_data->mem_repair_ops = ras_features->mem_repair_ops;
+			dev_data->private = ras_features->ctx;
+			ret = edac_mem_repair_get_desc(parent, &ras_attr_groups[attr_gcnt],
+						       ras_features->instance);
+			if (ret)
+				goto data_mem_free;
+
+			mem_repair_cnt++;
+			attr_gcnt++;
+			break;
 		default:
 			ret = -EINVAL;
 			goto data_mem_free;
@@ -714,6 +746,7 @@ int edac_dev_register(struct device *parent, char *name,
 	return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev);
 
 data_mem_free:
+	kfree(ctx->mem_repair);
 	kfree(ctx->scrub);
 groups_free:
 	kfree(ras_attr_groups);
diff --git a/drivers/edac/mem_repair.c b/drivers/edac/mem_repair.c
new file mode 100755
index 000000000000..9fe310050464
--- /dev/null
+++ b/drivers/edac/mem_repair.c
@@ -0,0 +1,280 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * The generic EDAC memory repair driver is designed to control memory
+ * devices with memory repair features, such as Post Package Repair (PPR)
+ * and memory sparing. The common sysfs memory repair interface abstracts
+ * the control of various memory repair functionalities into a unified set
+ * of functions.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ */
+
+#include <linux/edac.h>
+
+enum edac_mem_repair_attributes {
+	MEM_REPAIR_TYPE,
+	MEM_REPAIR_PERSIST_MODE,
+	MEM_REPAIR_SAFE_IN_USE,
+	MEM_REPAIR_HPA,
+	MEM_REPAIR_MIN_HPA,
+	MEM_REPAIR_MAX_HPA,
+	MEM_REPAIR_DPA,
+	MEM_REPAIR_MIN_DPA,
+	MEM_REPAIR_MAX_DPA,
+	MEM_REPAIR_NIBBLE_MASK,
+	MEM_DO_REPAIR,
+	MEM_REPAIR_MAX_ATTRS
+};
+
+struct edac_mem_repair_dev_attr {
+	struct device_attribute dev_attr;
+	u8 instance;
+};
+
+struct edac_mem_repair_context {
+	char name[EDAC_FEAT_NAME_LEN];
+	struct edac_mem_repair_dev_attr mem_repair_dev_attr[MEM_REPAIR_MAX_ATTRS];
+	struct attribute *mem_repair_attrs[MEM_REPAIR_MAX_ATTRS + 1];
+	struct attribute_group group;
+};
+
+#define TO_MEM_REPAIR_DEV_ATTR(_dev_attr)      \
+	container_of(_dev_attr, struct edac_mem_repair_dev_attr, dev_attr)
+
+#define EDAC_MEM_REPAIR_ATTR_SHOW(attrib, cb, type, format)			\
+static ssize_t attrib##_show(struct device *ras_feat_dev,			\
+			     struct device_attribute *attr, char *buf)		\
+{										\
+	u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance;			\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_mem_repair_ops *ops =					\
+		ctx->mem_repair[inst].mem_repair_ops;				\
+	type data;								\
+	int ret;								\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private,	\
+		      &data);							\
+	if (ret)								\
+		return ret;							\
+										\
+	return sysfs_emit(buf, format, data);					\
+}
+
+EDAC_MEM_REPAIR_ATTR_SHOW(repair_type, get_repair_type, const char *, "%s\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(persist_mode, get_persist_mode, bool, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(repair_safe_when_in_use, get_repair_safe_when_in_use, bool, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(hpa, get_hpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(min_hpa, get_min_hpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(max_hpa, get_max_hpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(dpa, get_dpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(min_dpa, get_min_dpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(max_dpa, get_max_dpa, u64, "0x%llx\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(nibble_mask, get_nibble_mask, u32, "0x%x\n")
+
+#define EDAC_MEM_REPAIR_ATTR_STORE(attrib, cb, type, conv_func)			\
+static ssize_t attrib##_store(struct device *ras_feat_dev,			\
+			      struct device_attribute *attr,			\
+			      const char *buf, size_t len)			\
+{										\
+	u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance;			\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);		\
+	const struct edac_mem_repair_ops *ops =					\
+		ctx->mem_repair[inst].mem_repair_ops;				\
+	type data;								\
+	int ret;								\
+										\
+	ret = conv_func(buf, 0, &data);						\
+	if (ret < 0)								\
+		return ret;							\
+										\
+	ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private,	\
+		      data);							\
+	if (ret)								\
+		return ret;							\
+										\
+	return len;								\
+}
+
+EDAC_MEM_REPAIR_ATTR_STORE(persist_mode, set_persist_mode, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(hpa, set_hpa, u64, kstrtou64)
+EDAC_MEM_REPAIR_ATTR_STORE(dpa, set_dpa, u64, kstrtou64)
+EDAC_MEM_REPAIR_ATTR_STORE(nibble_mask, set_nibble_mask, unsigned long, kstrtoul)
+
+#define EDAC_MEM_REPAIR_DO_OP(attrib, cb)						\
+static ssize_t attrib##_store(struct device *ras_feat_dev,				\
+			      struct device_attribute *attr,				\
+			      const char *buf, size_t len)				\
+{											\
+	u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance;				\
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);			\
+	const struct edac_mem_repair_ops *ops = ctx->mem_repair[inst].mem_repair_ops;	\
+	unsigned long data;								\
+	int ret;									\
+											\
+	ret = kstrtoul(buf, 0, &data);							\
+	if (ret < 0)									\
+		return ret;								\
+											\
+	ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private, data);	\
+	if (ret)									\
+		return ret;								\
+											\
+	return len;									\
+}
+
+EDAC_MEM_REPAIR_DO_OP(repair, do_repair)
+
+static umode_t mem_repair_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id)
+{
+	struct device *ras_feat_dev = kobj_to_dev(kobj);
+	struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr);
+	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
+	u8 inst = TO_MEM_REPAIR_DEV_ATTR(dev_attr)->instance;
+	const struct edac_mem_repair_ops *ops = ctx->mem_repair[inst].mem_repair_ops;
+
+	switch (attr_id) {
+	case MEM_REPAIR_TYPE:
+		if (ops->get_repair_type)
+			return a->mode;
+		break;
+	case MEM_REPAIR_PERSIST_MODE:
+		if (ops->get_persist_mode) {
+			if (ops->set_persist_mode)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_SAFE_IN_USE:
+		if (ops->get_repair_safe_when_in_use)
+			return a->mode;
+		break;
+	case MEM_REPAIR_HPA:
+		if (ops->get_hpa) {
+			if (ops->set_hpa)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_MIN_HPA:
+		if (ops->get_min_hpa)
+			return a->mode;
+		break;
+	case MEM_REPAIR_MAX_HPA:
+		if (ops->get_max_hpa)
+			return a->mode;
+		break;
+	case MEM_REPAIR_DPA:
+		if (ops->get_dpa) {
+			if (ops->set_dpa)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_MIN_DPA:
+		if (ops->get_min_dpa)
+			return a->mode;
+		break;
+	case MEM_REPAIR_MAX_DPA:
+		if (ops->get_max_dpa)
+			return a->mode;
+		break;
+	case MEM_REPAIR_NIBBLE_MASK:
+		if (ops->get_nibble_mask) {
+			if (ops->set_nibble_mask)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_DO_REPAIR:
+		if (ops->do_repair)
+			return a->mode;
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+#define EDAC_MEM_REPAIR_ATTR_RO(_name, _instance)       \
+	((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_RO(_name), \
+					     .instance = _instance })
+
+#define EDAC_MEM_REPAIR_ATTR_WO(_name, _instance)       \
+	((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_WO(_name), \
+					     .instance = _instance })
+
+#define EDAC_MEM_REPAIR_ATTR_RW(_name, _instance)       \
+	((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_RW(_name), \
+					     .instance = _instance })
+
+static int mem_repair_create_desc(struct device *dev,
+				  const struct attribute_group **attr_groups,
+				  u8 instance)
+{
+	struct edac_mem_repair_context *ctx;
+	struct attribute_group *group;
+	int i;
+	struct edac_mem_repair_dev_attr dev_attr[] = {
+		[MEM_REPAIR_TYPE] = EDAC_MEM_REPAIR_ATTR_RO(repair_type,
+							    instance),
+		[MEM_REPAIR_PERSIST_MODE] =
+				EDAC_MEM_REPAIR_ATTR_RW(persist_mode, instance),
+		[MEM_REPAIR_SAFE_IN_USE] =
+				EDAC_MEM_REPAIR_ATTR_RO(repair_safe_when_in_use,
+							instance),
+		[MEM_REPAIR_HPA] = EDAC_MEM_REPAIR_ATTR_RW(hpa, instance),
+		[MEM_REPAIR_MIN_HPA] = EDAC_MEM_REPAIR_ATTR_RO(min_hpa, instance),
+		[MEM_REPAIR_MAX_HPA] = EDAC_MEM_REPAIR_ATTR_RO(max_hpa, instance),
+		[MEM_REPAIR_DPA] = EDAC_MEM_REPAIR_ATTR_RW(dpa, instance),
+		[MEM_REPAIR_MIN_DPA] = EDAC_MEM_REPAIR_ATTR_RO(min_dpa, instance),
+		[MEM_REPAIR_MAX_DPA] = EDAC_MEM_REPAIR_ATTR_RO(max_dpa, instance),
+		[MEM_REPAIR_NIBBLE_MASK] =
+				EDAC_MEM_REPAIR_ATTR_RW(nibble_mask, instance),
+		[MEM_DO_REPAIR] = EDAC_MEM_REPAIR_ATTR_WO(repair, instance)
+	};
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	for (i = 0; i < MEM_REPAIR_MAX_ATTRS; i++) {
+		memcpy(&ctx->mem_repair_dev_attr[i].dev_attr,
+		       &dev_attr[i], sizeof(dev_attr[i]));
+		ctx->mem_repair_attrs[i] =
+			&ctx->mem_repair_dev_attr[i].dev_attr.attr;
+	}
+
+	sprintf(ctx->name, "%s%d", "mem_repair", instance);
+	group = &ctx->group;
+	group->name = ctx->name;
+	group->attrs = ctx->mem_repair_attrs;
+	group->is_visible  = mem_repair_attr_visible;
+	attr_groups[0] = group;
+
+	return 0;
+}
+
+/**
+ * edac_mem_repair_get_desc - get EDAC memory repair descriptors
+ * @dev: client device with memory repair feature
+ * @attr_groups: pointer to attribute group container
+ * @instance: device's memory repair instance number.
+ *
+ * Return:
+ *  * %0	- Success.
+ *  * %-EINVAL	- Invalid parameters passed.
+ *  * %-ENOMEM	- Dynamic memory allocation failed.
+ */
+int edac_mem_repair_get_desc(struct device *dev,
+			     const struct attribute_group **attr_groups, u8 instance)
+{
+	if (!dev || !attr_groups)
+		return -EINVAL;
+
+	return mem_repair_create_desc(dev, attr_groups, instance);
+}
diff --git a/include/linux/edac.h b/include/linux/edac.h
index f8346014c14e..cfb2ef41ab95 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -668,6 +668,7 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
 enum edac_dev_feat {
 	RAS_FEAT_SCRUB,
 	RAS_FEAT_ECS,
+	RAS_FEAT_MEM_REPAIR,
 	RAS_FEAT_MAX
 };
 
@@ -743,11 +744,82 @@ static inline int edac_ecs_get_desc(struct device *ecs_dev,
 { return -EOPNOTSUPP; }
 #endif /* CONFIG_EDAC_ECS */
 
+enum edac_mem_repair_type {
+	EDAC_REPAIR_MAX
+};
+
+enum edac_mem_repair_cmd {
+	EDAC_DO_MEM_REPAIR = 1,
+};
+
+/**
+ * struct edac_mem_repair_ops - memory repair operations
+ * (all elements are optional except do_repair, set_hpa/set_dpa)
+ * @get_repair_type: get the memory repair type, listed in
+ *			 enum edac_mem_repair_type.
+ * @get_persist_mode: get the current persist mode.
+ *		      false - Soft repair type (temporary repair).
+ *		      true - Hard memory repair type (permanent repair).
+ * @set_persist_mode: set the persist mode of the memory repair instance.
+ * @get_repair_safe_when_in_use: get whether memory media is accessible and
+ *				 data is retained during repair operation.
+ * @get_hpa: get current host physical address (HPA) of memory to repair.
+ * @set_hpa: set host physical address (HPA) of memory to repair.
+ * @get_min_hpa: get the minimum supported host physical address (HPA).
+ * @get_max_hpa: get the maximum supported host physical address (HPA).
+ * @get_dpa: get current device physical address (DPA) of memory to repair.
+ * @set_dpa: set device physical address (DPA) of memory to repair.
+ *	     In some states of system configuration (e.g. before address decoders
+ *	     have been configured), memory devices (e.g. CXL) may not have an active
+ *	     mapping in the host physical address map. As such, the memory
+ *	     to repair must be identified by a device specific physical addressing
+ *	     scheme using a device physical address(DPA). The DPA and other control
+ *	     attributes to use for the repair operations will be presented in related
+ *	     error records.
+ * @get_min_dpa: get the minimum supported device physical address (DPA).
+ * @get_max_dpa: get the maximum supported device physical address (DPA).
+ * @get_nibble_mask: get current nibble mask of memory to repair.
+ * @set_nibble_mask: set nibble mask of memory to repair.
+ * @do_repair: Issue memory repair operation for the HPA/DPA and
+ *	       other control attributes set for the memory to repair.
+ *
+ * All elements are optional except do_repair and at least one of set_hpa/set_dpa.
+ */
+struct edac_mem_repair_ops {
+	int (*get_repair_type)(struct device *dev, void *drv_data, const char **type);
+	int (*get_persist_mode)(struct device *dev, void *drv_data, bool *persist);
+	int (*set_persist_mode)(struct device *dev, void *drv_data, bool persist);
+	int (*get_repair_safe_when_in_use)(struct device *dev, void *drv_data, bool *safe);
+	int (*get_hpa)(struct device *dev, void *drv_data, u64 *hpa);
+	int (*set_hpa)(struct device *dev, void *drv_data, u64 hpa);
+	int (*get_min_hpa)(struct device *dev, void *drv_data, u64 *hpa);
+	int (*get_max_hpa)(struct device *dev, void *drv_data, u64 *hpa);
+	int (*get_dpa)(struct device *dev, void *drv_data, u64 *dpa);
+	int (*set_dpa)(struct device *dev, void *drv_data, u64 dpa);
+	int (*get_min_dpa)(struct device *dev, void *drv_data, u64 *dpa);
+	int (*get_max_dpa)(struct device *dev, void *drv_data, u64 *dpa);
+	int (*get_nibble_mask)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_nibble_mask)(struct device *dev, void *drv_data, u32 val);
+	int (*do_repair)(struct device *dev, void *drv_data, u32 val);
+};
+
+#if IS_ENABLED(CONFIG_EDAC_MEM_REPAIR)
+int edac_mem_repair_get_desc(struct device *dev,
+			     const struct attribute_group **attr_groups,
+			     u8 instance);
+#else
+static inline int edac_mem_repair_get_desc(struct device *dev,
+					   const struct attribute_group **attr_groups,
+					   u8 instance)
+{ return -EOPNOTSUPP; }
+#endif /* CONFIG_EDAC_MEM_REPAIR */
+
 /* EDAC device feature information structure */
 struct edac_dev_data {
 	union {
 		const struct edac_scrub_ops *scrub_ops;
 		const struct edac_ecs_ops *ecs_ops;
+		const struct edac_mem_repair_ops *mem_repair_ops;
 	};
 	u8 instance;
 	void *private;
@@ -758,6 +830,7 @@ struct edac_dev_feat_ctx {
 	void *private;
 	struct edac_dev_data *scrub;
 	struct edac_dev_data ecs;
+	struct edac_dev_data *mem_repair;
 };
 
 struct edac_dev_feature {
@@ -766,6 +839,7 @@ struct edac_dev_feature {
 	union {
 		const struct edac_scrub_ops *scrub_ops;
 		const struct edac_ecs_ops *ecs_ops;
+		const struct edac_mem_repair_ops *mem_repair_ops;
 	};
 	void *ctx;
 	struct edac_ecs_ex_info ecs_info;
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 05/15] ACPI:RAS2: Add ACPI RAS2 driver
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (3 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 04/15] EDAC: Add memory repair " shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 06/15] ras: mem: Add memory " shiju.jose
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add support for the ACPI RAS2 feature table (RAS2) defined in the
ACPI 6.5 Specification, section 5.2.21.
The driver contains the RAS2 init code, which extracts the RAS2 table
and adds an auxiliary device for each memory feature. These devices
bind to the RAS2 memory driver.

The driver uses a PCC mailbox to communicate with the ACPI platform
and adds OSPM interfaces to send RAS2 commands.
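
For illustration, a client driver bound to one of the auxiliary devices
is expected to drive the interface roughly as below (a sketch only; the
memory RAS2 driver added later in this series is the actual user, and
ras2_ctx is the context it is handed):

	struct acpi_ras2_shared_memory __iomem *common =
		(void *)ras2_ctx->pcc_comm_addr;
	int ret;

	/* Select the RAS2 memory feature, then execute the command via PCC. */
	common->set_capabilities[0] = BIT(0);	/* patrol scrub */
	ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
	if (ret)
		return ret;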

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Co-developed-by: A Somasundaram <somasundaram.a@hpe.com>
Signed-off-by: A Somasundaram <somasundaram.a@hpe.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/acpi/Kconfig     |  11 ++
 drivers/acpi/Makefile    |   1 +
 drivers/acpi/ras2.c      | 417 +++++++++++++++++++++++++++++++++++++++
 include/acpi/ras2_acpi.h |  47 +++++
 4 files changed, 476 insertions(+)
 create mode 100755 drivers/acpi/ras2.c
 create mode 100644 include/acpi/ras2_acpi.h

diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index d81b55f5068c..bae9a47c829d 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -293,6 +293,17 @@ config ACPI_CPPC_LIB
 	  If your platform does not support CPPC in firmware,
 	  leave this option disabled.
 
+config ACPI_RAS2
+	bool "ACPI RAS2 driver"
+	select AUXILIARY_BUS
+	select MAILBOX
+	select PCC
+	help
+	  The driver adds support for the ACPI RAS2 feature table (it extracts
+	  the RAS2 table from the ACPI system tables) and OSPM interfaces to
+	  send RAS2 commands via a PCC mailbox subspace. The driver adds an
+	  auxiliary device for each RAS2 memory feature, which binds to the
+	  RAS2 memory driver.
+
 config ACPI_PROCESSOR
 	tristate "Processor"
 	depends on X86 || ARM64 || LOONGARCH || RISCV
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 40208a0f5dfb..797b38cdc2f3 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -100,6 +100,7 @@ obj-$(CONFIG_ACPI_EC_DEBUGFS)	+= ec_sys.o
 obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
 obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
+obj-$(CONFIG_ACPI_RAS2)		+= ras2.o
 obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
 obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
 obj-$(CONFIG_ACPI_PFRUT)	+= pfr_update.o pfr_telemetry.o
diff --git a/drivers/acpi/ras2.c b/drivers/acpi/ras2.c
new file mode 100755
index 000000000000..cc8eef49c158
--- /dev/null
+++ b/drivers/acpi/ras2.c
@@ -0,0 +1,417 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Implementation of ACPI RAS2 driver.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ * Support for RAS2 - ACPI 6.5 Specification, section 5.2.21
+ *
+ * Driver contains ACPI RAS2 init, which extracts the ACPI RAS2 table and
+ * gets the PCC channel subspace for communicating with the ACPI compliant
+ * HW platform which supports ACPI RAS2. Driver adds auxiliary devices
+ * for each RAS2 memory feature, which bind to the RAS2 memory driver.
+ */
+
+#define pr_fmt(fmt)    "ACPI RAS2: " fmt
+
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/ktime.h>
+#include <linux/platform_device.h>
+#include <acpi/pcc.h>
+#include <acpi/ras2_acpi.h>
+
+/* Data structure for PCC communication */
+struct ras2_pcc_subspace {
+	int pcc_subspace_id;
+	struct mbox_client mbox_client;
+	struct pcc_mbox_chan *pcc_chan;
+	struct acpi_ras2_shared_memory __iomem *pcc_comm_addr;
+	bool pcc_channel_acquired;
+	ktime_t deadline;
+	unsigned int pcc_mpar;
+	unsigned int pcc_mrtt;
+	struct list_head elem;
+	u16 ref_count;
+};
+
+/*
+ * Arbitrary Retries for PCC commands because the
+ * remote processor could be much slower to reply.
+ */
+#define RAS2_NUM_RETRIES 600
+
+#define RAS2_FEATURE_TYPE_MEMORY        0x00
+
+/* global variables for the RAS2 PCC subspaces */
+static DEFINE_MUTEX(ras2_pcc_subspace_lock);
+static LIST_HEAD(ras2_pcc_subspaces);
+
+static int ras2_report_cap_error(u32 cap_status)
+{
+	switch (cap_status) {
+	case ACPI_RAS2_NOT_VALID:
+	case ACPI_RAS2_NOT_SUPPORTED:
+		return -EPERM;
+	case ACPI_RAS2_BUSY:
+		return -EBUSY;
+	case ACPI_RAS2_FAILED:
+	case ACPI_RAS2_ABORTED:
+	case ACPI_RAS2_INVALID_DATA:
+		return -EINVAL;
+	default: /* 0 or other, Success */
+		return 0;
+	}
+}
+
+static int ras2_check_pcc_chan(struct ras2_pcc_subspace *pcc_subspace)
+{
+	struct acpi_ras2_shared_memory __iomem *generic_comm_base = pcc_subspace->pcc_comm_addr;
+	ktime_t next_deadline = ktime_add(ktime_get(), pcc_subspace->deadline);
+	u32 cap_status;
+	u16 status;
+	int ret;
+
+	while (!ktime_after(ktime_get(), next_deadline)) {
+		/*
+		 * As per ACPI spec, the PCC space will be initialized by
+		 * platform and should have set the command completion bit when
+		 * PCC can be used by OSPM
+		 */
+		status = readw_relaxed(&generic_comm_base->status);
+		if (status & RAS2_PCC_CMD_ERROR) {
+			cap_status = readw_relaxed(&generic_comm_base->set_capabilities_status);
+			ret = ras2_report_cap_error(cap_status);
+
+			status &= ~RAS2_PCC_CMD_ERROR;
+			writew_relaxed(status, &generic_comm_base->status);
+			return ret;
+		}
+		if (status & RAS2_PCC_CMD_COMPLETE)
+			return 0;
+		/*
+		 * Reducing the bus traffic in case this loop takes longer than
+		 * a few retries.
+		 */
+		msleep(10);
+	}
+
+	return -EIO;
+}
+
+/**
+ * ras2_send_pcc_cmd() - Send RAS2 command via PCC channel
+ * @ras2_ctx:	pointer to the RAS2 context structure
+ * @cmd:	command to send
+ *
+ * Returns: 0 on success, an error otherwise
+ */
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd)
+{
+	struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
+	struct acpi_ras2_shared_memory *generic_comm_base = pcc_subspace->pcc_comm_addr;
+	static ktime_t last_cmd_cmpl_time, last_mpar_reset;
+	struct mbox_chan *pcc_channel;
+	unsigned int time_delta;
+	static int mpar_count;
+	int ret;
+
+	guard(mutex)(&ras2_pcc_subspace_lock);
+	ret = ras2_check_pcc_chan(pcc_subspace);
+	if (ret < 0)
+		return ret;
+	pcc_channel = pcc_subspace->pcc_chan->mchan;
+
+	/*
+	 * Handle the Minimum Request Turnaround Time(MRTT)
+	 * "The minimum amount of time that OSPM must wait after the completion
+	 * of a command before issuing the next command, in microseconds"
+	 */
+	if (pcc_subspace->pcc_mrtt) {
+		time_delta = ktime_us_delta(ktime_get(), last_cmd_cmpl_time);
+		if (pcc_subspace->pcc_mrtt > time_delta)
+			udelay(pcc_subspace->pcc_mrtt - time_delta);
+	}
+
+	/*
+	 * Handle the non-zero Maximum Periodic Access Rate(MPAR)
+	 * "The maximum number of periodic requests that the subspace channel can
+	 * support, reported in commands per minute. 0 indicates no limitation."
+	 *
+	 * This parameter should be ideally zero or large enough so that it can
+	 * handle maximum number of requests that all the cores in the system can
+	 * collectively generate. If it is not, we will follow the spec and just
+	 * not send the request to the platform after hitting the MPAR limit in
+	 * any 60s window
+	 */
+	if (pcc_subspace->pcc_mpar) {
+		if (mpar_count == 0) {
+			time_delta = ktime_ms_delta(ktime_get(), last_mpar_reset);
+			if (time_delta < 60 * MSEC_PER_SEC) {
+				dev_dbg(ras2_ctx->dev,
+					"PCC cmd not sent due to MPAR limit");
+				return -EIO;
+			}
+			last_mpar_reset = ktime_get();
+			mpar_count = pcc_subspace->pcc_mpar;
+		}
+		mpar_count--;
+	}
+
+	/* Write to the shared comm region. */
+	writew_relaxed(cmd, &generic_comm_base->command);
+
+	/* Flip CMD COMPLETE bit */
+	writew_relaxed(0, &generic_comm_base->status);
+
+	/* Ring doorbell */
+	ret = mbox_send_message(pcc_channel, &cmd);
+	if (ret < 0) {
+		dev_err(ras2_ctx->dev,
+			"Err sending PCC mbox message. cmd:%d, ret:%d\n",
+			cmd, ret);
+		return ret;
+	}
+
+	/*
+	 * If Minimum Request Turnaround Time is non-zero, we need
+	 * to record the completion time of both READ and WRITE
+	 * command for proper handling of MRTT, so we need to check
+	 * for pcc_mrtt in addition to CMD_READ
+	 */
+	if (cmd == RAS2_PCC_CMD_EXEC || pcc_subspace->pcc_mrtt) {
+		ret = ras2_check_pcc_chan(pcc_subspace);
+		if (pcc_subspace->pcc_mrtt)
+			last_cmd_cmpl_time = ktime_get();
+	}
+
+	if (pcc_channel->mbox->txdone_irq)
+		mbox_chan_txdone(pcc_channel, ret);
+	else
+		mbox_client_txdone(pcc_channel, ret);
+
+	return ret >= 0 ? 0 : ret;
+}
+EXPORT_SYMBOL_NS_GPL(ras2_send_pcc_cmd, "ACPI_RAS2");
+
+static int ras2_register_pcc_channel(struct ras2_mem_ctx *ras2_ctx, int pcc_subspace_id)
+{
+	struct ras2_pcc_subspace *pcc_subspace;
+	struct pcc_mbox_chan *pcc_chan;
+	struct mbox_client *mbox_cl;
+
+	if (pcc_subspace_id < 0)
+		return -EINVAL;
+
+	mutex_lock(&ras2_pcc_subspace_lock);
+	list_for_each_entry(pcc_subspace, &ras2_pcc_subspaces, elem) {
+		if (pcc_subspace->pcc_subspace_id != pcc_subspace_id)
+			continue;
+		ras2_ctx->pcc_subspace = pcc_subspace;
+		pcc_subspace->ref_count++;
+		mutex_unlock(&ras2_pcc_subspace_lock);
+		return 0;
+	}
+	mutex_unlock(&ras2_pcc_subspace_lock);
+
+	pcc_subspace = kcalloc(1, sizeof(*pcc_subspace), GFP_KERNEL);
+	if (!pcc_subspace)
+		return -ENOMEM;
+	mbox_cl = &pcc_subspace->mbox_client;
+	mbox_cl->knows_txdone = true;
+
+	pcc_chan = pcc_mbox_request_channel(mbox_cl, pcc_subspace_id);
+	if (IS_ERR(pcc_chan)) {
+		kfree(pcc_subspace);
+		return PTR_ERR(pcc_chan);
+	}
+	*pcc_subspace = (struct ras2_pcc_subspace) {
+		.pcc_subspace_id = pcc_subspace_id,
+		.pcc_chan = pcc_chan,
+		.pcc_comm_addr = acpi_os_ioremap(pcc_chan->shmem_base_addr,
+						 pcc_chan->shmem_size),
+		.deadline = ns_to_ktime(RAS2_NUM_RETRIES *
+					pcc_chan->latency *
+					NSEC_PER_USEC),
+		.pcc_mrtt = pcc_chan->min_turnaround_time,
+		.pcc_mpar = pcc_chan->max_access_rate,
+		.mbox_client = {
+			.knows_txdone = true,
+		},
+		.pcc_channel_acquired = true,
+	};
+	mutex_lock(&ras2_pcc_subspace_lock);
+	list_add(&pcc_subspace->elem, &ras2_pcc_subspaces);
+	pcc_subspace->ref_count++;
+	mutex_unlock(&ras2_pcc_subspace_lock);
+	ras2_ctx->pcc_subspace = pcc_subspace;
+	ras2_ctx->pcc_comm_addr = pcc_subspace->pcc_comm_addr;
+	ras2_ctx->dev = pcc_chan->mchan->mbox->dev;
+
+	return 0;
+}
+
+static DEFINE_IDA(ras2_ida);
+static void ras2_remove_pcc(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct ras2_pcc_subspace *pcc_subspace = ras2_ctx->pcc_subspace;
+
+	guard(mutex)(&ras2_pcc_subspace_lock);
+	if (pcc_subspace->ref_count > 0)
+		pcc_subspace->ref_count--;
+	if (!pcc_subspace->ref_count) {
+		list_del(&pcc_subspace->elem);
+		pcc_mbox_free_channel(pcc_subspace->pcc_chan);
+		kfree(pcc_subspace);
+	}
+}
+
+static void ras2_release(struct device *device)
+{
+	struct auxiliary_device *auxdev = container_of(device, struct auxiliary_device, dev);
+	struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct ras2_mem_ctx, adev);
+
+	ida_free(&ras2_ida, auxdev->id);
+	ras2_remove_pcc(ras2_ctx);
+	kfree(ras2_ctx);
+}
+
+static struct ras2_mem_ctx *ras2_add_aux_device(char *name, int channel)
+{
+	struct ras2_mem_ctx *ras2_ctx;
+	int id, ret;
+
+	ras2_ctx = kzalloc(sizeof(*ras2_ctx), GFP_KERNEL);
+	if (!ras2_ctx)
+		return ERR_PTR(-ENOMEM);
+
+	mutex_init(&ras2_ctx->lock);
+
+	ret = ras2_register_pcc_channel(ras2_ctx, channel);
+	if (ret < 0) {
+		pr_debug("failed to register pcc channel ret=%d\n", ret);
+		goto ctx_free;
+	}
+
+	id = ida_alloc(&ras2_ida, GFP_KERNEL);
+	if (id < 0) {
+		ret = id;
+		goto pcc_free;
+	}
+	ras2_ctx->id = id;
+	ras2_ctx->adev.id = id;
+	ras2_ctx->adev.name = RAS2_MEM_DEV_ID_NAME;
+	ras2_ctx->adev.dev.release = ras2_release;
+	ras2_ctx->adev.dev.parent = ras2_ctx->dev;
+
+	ret = auxiliary_device_init(&ras2_ctx->adev);
+	if (ret)
+		goto ida_free;
+
+	ret = auxiliary_device_add(&ras2_ctx->adev);
+	if (ret) {
+		auxiliary_device_uninit(&ras2_ctx->adev);
+		return ERR_PTR(ret);
+	}
+
+	return ras2_ctx;
+
+ida_free:
+	ida_free(&ras2_ida, id);
+pcc_free:
+	ras2_remove_pcc(ras2_ctx);
+ctx_free:
+	kfree(ras2_ctx);
+	return ERR_PTR(ret);
+}
+
+static int ras2_acpi_parse_table(struct acpi_table_header *pAcpiTable)
+{
+	struct acpi_ras2_pcc_desc *pcc_desc_list;
+	struct acpi_table_ras2 *pRas2Table;
+	struct ras2_mem_ctx *ras2_ctx;
+	int pcc_subspace_id;
+	acpi_size ras2_size;
+	u8 count = 0, i;
+
+	ras2_size = pAcpiTable->length;
+	if (ras2_size < sizeof(struct acpi_table_ras2)) {
+		pr_err("ACPI RAS2 table present but broken (too short #1)\n");
+		return -EINVAL;
+	}
+
+	pRas2Table = (struct acpi_table_ras2 *)pAcpiTable;
+	if (pRas2Table->num_pcc_descs <= 0) {
+		pr_err("ACPI RAS2 table does not contain PCC descriptors\n");
+		return -EINVAL;
+	}
+
+	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(pRas2Table + 1);
+	/* Double scan for the case of only one actual controller */
+	pcc_subspace_id = -1;
+	count = 0;
+	for (i = 0; i < pRas2Table->num_pcc_descs; i++, pcc_desc_list++) {
+		if (pcc_desc_list->feature_type != RAS2_FEATURE_TYPE_MEMORY)
+			continue;
+		if (pcc_subspace_id == -1) {
+			pcc_subspace_id = pcc_desc_list->channel_id;
+			count++;
+		}
+		if (pcc_desc_list->channel_id != pcc_subspace_id)
+			count++;
+	}
+	/*
+	 * Workaround for the client platform with multiple scrub devices
+	 * but uses single PCC subspace for communication.
+	 */
+	if (count == 1) {
+		/* Add auxiliary device and bind ACPI RAS2 memory driver */
+		ras2_ctx = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME, pcc_subspace_id);
+		if (IS_ERR(ras2_ctx))
+			return PTR_ERR(ras2_ctx);
+		return 0;
+	}
+
+	pcc_desc_list = (struct acpi_ras2_pcc_desc *)(pRas2Table + 1);
+	count = 0;
+	for (i = 0; i < pRas2Table->num_pcc_descs; i++, pcc_desc_list++) {
+		if (pcc_desc_list->feature_type != RAS2_FEATURE_TYPE_MEMORY)
+			continue;
+		pcc_subspace_id = pcc_desc_list->channel_id;
+		/* Add auxiliary device and bind ACPI RAS2 memory driver */
+		ras2_ctx = ras2_add_aux_device(RAS2_MEM_DEV_ID_NAME, pcc_subspace_id);
+		if (IS_ERR(ras2_ctx))
+			return PTR_ERR(ras2_ctx);
+	}
+
+	return 0;
+}
+
+static int __init ras2_acpi_init(void)
+{
+	struct acpi_table_header *pAcpiTable = NULL;
+	acpi_status status;
+	int ret;
+
+	status = acpi_get_table("RAS2", 0, &pAcpiTable);
+	if (ACPI_FAILURE(status) || !pAcpiTable) {
+		pr_err("ACPI RAS2 driver failed to initialize, get table failed\n");
+		return -EINVAL;
+	}
+
+	ret = ras2_acpi_parse_table(pAcpiTable);
+	if (ret)
+		pr_err("ras2_acpi_parse_table failed\n");
+
+	acpi_put_table(pAcpiTable);
+
+	return ret;
+}
+late_initcall(ras2_acpi_init)
diff --git a/include/acpi/ras2_acpi.h b/include/acpi/ras2_acpi.h
new file mode 100644
index 000000000000..0d24c42eb34f
--- /dev/null
+++ b/include/acpi/ras2_acpi.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * RAS2 ACPI driver header file
+ *
+ * (C) Copyright 2014, 2015 Hewlett-Packard Enterprises
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited
+ */
+
+#ifndef _RAS2_ACPI_H
+#define _RAS2_ACPI_H
+
+#include <linux/acpi.h>
+#include <linux/auxiliary_bus.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/types.h>
+
+#define RAS2_PCC_CMD_COMPLETE	BIT(0)
+#define RAS2_PCC_CMD_ERROR	BIT(2)
+
+/* RAS2 specific PCC commands */
+#define RAS2_PCC_CMD_EXEC 0x01
+
+#define RAS2_AUX_DEV_NAME "ras2"
+#define RAS2_MEM_DEV_ID_NAME "acpi_ras2_mem"
+
+/* Data structure for the RAS2 memory feature context */
+struct ras2_mem_ctx {
+	struct auxiliary_device adev;
+	/* Lock to provide mutually exclusive access to PCC channel */
+	struct mutex lock;
+	struct device *dev;
+	struct device *scrub_dev;
+	struct acpi_ras2_shared_memory __iomem *pcc_comm_addr;
+	void *pcc_subspace;
+	u64 base, size;
+	int id;
+	u8 instance;
+	u8 scrub_cycle_hrs;
+	u8 min_scrub_cycle;
+	u8 max_scrub_cycle;
+	bool bg;
+};
+
+int ras2_send_pcc_cmd(struct ras2_mem_ctx *ras2_ctx, u16 cmd);
+#endif /* _RAS2_ACPI_H */
-- 
2.43.0




^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 06/15] ras: mem: Add memory ACPI RAS2 driver
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (4 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 05/15] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry shiju.jose
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

The memory ACPI RAS2 auxiliary driver binds to the auxiliary device
added by the ACPI RAS2 table parser.

The driver uses a PCC subspace for communicating with the ACPI
compliant platform.

A device with the ACPI RAS2 scrub feature registers with the EDAC
device driver, which retrieves the scrub descriptor from the EDAC
scrub feature and exposes the scrub control attributes for the RAS2
scrub instance to userspace in
/sys/bus/edac/devices/acpi_ras_mem0/scrubX/.

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Daniel Ferguson <danielf@os.amperecomputing.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/scrub.rst |  73 +++++++
 drivers/ras/Kconfig          |  10 +
 drivers/ras/Makefile         |   1 +
 drivers/ras/acpi_ras2.c      | 383 +++++++++++++++++++++++++++++++++++
 4 files changed, 467 insertions(+)
 create mode 100644 drivers/ras/acpi_ras2.c

diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index 5f1ff2bf54b0..3fe46b788eb6 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -259,3 +259,76 @@ Sysfs files are documented in
 `Documentation/ABI/testing/sysfs-edac-scrub`
 
 `Documentation/ABI/testing/sysfs-edac-ecs`
+
+Examples
+--------
+
+The usage takes the form shown in these examples:
+
+1. ACPI RAS2
+
+1.1 On demand scrubbing for a specific memory region.
+
+1.1.1. Query the device default/current scrub cycle setting.
+
+       Applicable to both on-demand and background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+36000
+
+1.1.2. Query the range of scrub cycles supported by the device for a memory region.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration
+
+86400
+
+1.1.3. Program scrubbing for the memory region in the RAS2 device to repeat every 43200 seconds (half a day).
+
+# echo 43200 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+1.1.4. Program address and size of the memory region to scrub
+
+Read back 'addr': non-zero means demand scrub is in progress, zero means scrub is finished.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+Write 'size' of the memory region to scrub.
+
+# echo 0x300000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/size
+
+Writing 'addr' starts demand scrubbing; make sure the other attributes are set prior to that.
+
+# echo 0x200000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+Read back 'addr': non-zero means demand scrub is in progress, zero means scrub is finished.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0x200000
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr
+
+0
+
+1.2 Background scrubbing the entire memory
+
+1.2.1. Query the status of background scrubbing.
+
+# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+0
+
+1.2.2. Program background scrubbing for the RAS2 device to repeat every 21600 seconds (a quarter of a day).
+
+# echo 21600 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration
+
+1.2.3. Start background scrubbing.
+
+# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index fc4f4bb94a4c..c907e44360c8 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -46,4 +46,14 @@ config RAS_FMPM
 	  Memory will be retired during boot time and run time depending on
 	  platform-specific policies.
 
+config MEM_ACPI_RAS2
+	tristate "Memory ACPI RAS2 driver"
+	depends on ACPI_RAS2
+	depends on EDAC_SCRUB
+	help
+	  The driver binds to the auxiliary device added by the ACPI RAS2
+	  table parser. It uses a PCC channel subspace for communicating with
+	  the ACPI compliant platform and provides control of memory scrub
+	  parameters to the user via the EDAC scrub interface.
+
 endif
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
index 11f95d59d397..a0e6e903d6b0 100644
--- a/drivers/ras/Makefile
+++ b/drivers/ras/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_RAS)	+= ras.o
 obj-$(CONFIG_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_RAS_CEC)	+= cec.o
+obj-$(CONFIG_MEM_ACPI_RAS2)	+= acpi_ras2.o
 
 obj-$(CONFIG_RAS_FMPM)	+= amd/fmpm.o
 obj-y			+= amd/atl/
diff --git a/drivers/ras/acpi_ras2.c b/drivers/ras/acpi_ras2.c
new file mode 100644
index 000000000000..99cbe5f74b9d
--- /dev/null
+++ b/drivers/ras/acpi_ras2.c
@@ -0,0 +1,383 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ACPI RAS2 memory driver
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ */
+
+#define pr_fmt(fmt)	"MEMORY ACPI RAS2: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/edac.h>
+#include <linux/platform_device.h>
+#include <acpi/ras2_acpi.h>
+
+#define RAS2_DEV_NUM_RAS_FEATURES	1
+
+#define RAS2_SUPPORT_HW_PARTOL_SCRUB	BIT(0)
+#define RAS2_TYPE_PATROL_SCRUB	0x0000
+
+#define RAS2_GET_PATROL_PARAMETERS	0x01
+#define	RAS2_START_PATROL_SCRUBBER	0x02
+#define	RAS2_STOP_PATROL_SCRUBBER	0x03
+
+#define RAS2_PATROL_SCRUB_SCHRS_IN_MASK	GENMASK(15, 8)
+#define RAS2_PATROL_SCRUB_EN_BACKGROUND	BIT(0)
+#define RAS2_PATROL_SCRUB_SCHRS_OUT_MASK	GENMASK(7, 0)
+#define RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK	GENMASK(15, 8)
+#define RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK	GENMASK(23, 16)
+#define RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING	BIT(0)
+
+#define RAS2_SCRUB_NAME_LEN      128
+#define RAS2_HOUR_IN_SECS    3600
+
+struct acpi_ras2_ps_shared_mem {
+	struct acpi_ras2_shared_memory common;
+	struct acpi_ras2_patrol_scrub_parameter params;
+};
+
+static int ras2_is_patrol_scrub_support(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_shared_memory __iomem *common =
+		(void *)ras2_ctx->pcc_comm_addr;
+
+	guard(mutex)(&ras2_ctx->lock);
+	common->set_capabilities[0] = 0;
+
+	return common->features[0] & RAS2_SUPPORT_HW_PARTOL_SCRUB;
+}
+
+static int ras2_update_patrol_scrub_params_cache(struct ras2_mem_ctx *ras2_ctx)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		(void *)ras2_ctx->pcc_comm_addr;
+	int ret;
+
+	ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	ras2_ctx->min_scrub_cycle = FIELD_GET(RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	ras2_ctx->max_scrub_cycle = FIELD_GET(RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+	if (!ras2_ctx->bg) {
+		ras2_ctx->base = ps_sm->params.actual_address_range[0];
+		ras2_ctx->size = ps_sm->params.actual_address_range[1];
+	}
+	ras2_ctx->scrub_cycle_hrs = FIELD_GET(RAS2_PATROL_SCRUB_SCHRS_OUT_MASK,
+					      ps_sm->params.scrub_params_out);
+
+	return 0;
+}
+
+/* Context - lock must be held */
+static int ras2_get_patrol_scrub_running(struct ras2_mem_ctx *ras2_ctx,
+					 bool *running)
+{
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		(void *)ras2_ctx->pcc_comm_addr;
+	int ret;
+
+	ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "failed to read parameters\n");
+		return ret;
+	}
+
+	*running = ps_sm->params.flags & RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *min)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*min = ras2_ctx->min_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 *max)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*max = ras2_ctx->max_scrub_cycle * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_read(struct device *dev, void *drv_data,
+				    u32 *scrub_cycle_secs)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*scrub_cycle_secs = ras2_ctx->scrub_cycle_hrs * RAS2_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_cycle_write(struct device *dev, void *drv_data,
+				     u32 scrub_cycle_secs)
+{
+	u8 scrub_cycle_hrs = scrub_cycle_secs / RAS2_HOUR_IN_SECS;
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	bool running;
+	int ret;
+
+	guard(mutex)(&ras2_ctx->lock);
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	if (scrub_cycle_hrs < ras2_ctx->min_scrub_cycle ||
+	    scrub_cycle_hrs > ras2_ctx->max_scrub_cycle)
+		return -EINVAL;
+
+	ras2_ctx->scrub_cycle_hrs = scrub_cycle_hrs;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_addr(struct device *dev, void *drv_data, u64 *base)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	int ret;
+
+	/*
+	 * When BG scrubbing is enabled the actual address range is not valid.
+	 * Return -EBUSY for now, until a method to retrieve the actual
+	 * full PA range is found.
+	 */
+	if (ras2_ctx->bg)
+		return -EBUSY;
+
+	/*
+	 * When demand scrubbing is finished firmware must reset actual
+	 * address range to 0. Otherwise userspace assumes demand scrubbing
+	 * is in progress.
+	 */
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+	*base = ras2_ctx->base;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_read_size(struct device *dev, void *drv_data, u64 *size)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	int ret;
+
+	if (ras2_ctx->bg)
+		return -EBUSY;
+
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+	*size = ras2_ctx->size;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_write_addr(struct device *dev, void *drv_data, u64 base)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		(void *)ras2_ctx->pcc_comm_addr;
+	bool running;
+	int ret;
+
+	guard(mutex)(&ras2_ctx->lock);
+	ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	if (ras2_ctx->bg)
+		return -EBUSY;
+
+	if (!base || !ras2_ctx->size) {
+		dev_warn(ras2_ctx->dev,
+			 "%s: Invalid address range, base=0x%llx "
+			 "size=0x%llx\n", __func__,
+			 base, ras2_ctx->size);
+		return -ERANGE;
+	}
+
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK,
+						    ras2_ctx->scrub_cycle_hrs);
+	ps_sm->params.requested_address_range[0] = base;
+	ps_sm->params.requested_address_range[1] = ras2_ctx->size;
+	ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND;
+	ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER;
+
+	ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "Failed to start demand scrubbing\n");
+		return ret;
+	}
+
+	return ras2_update_patrol_scrub_params_cache(ras2_ctx);
+}
+
+static int ras2_hw_scrub_write_size(struct device *dev, void *drv_data, u64 size)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	bool running;
+	int ret;
+
+	guard(mutex)(&ras2_ctx->lock);
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+
+	if (running)
+		return -EBUSY;
+
+	if (!size) {
+		dev_warn(dev, "%s: Invalid address range size=0x%llx\n",
+			 __func__, size);
+		return -EINVAL;
+	}
+
+	ras2_ctx->size = size;
+
+	return 0;
+}
+
+static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+	struct acpi_ras2_ps_shared_mem __iomem *ps_sm =
+		(void *)ras2_ctx->pcc_comm_addr;
+	bool running;
+	int ret;
+
+	guard(mutex)(&ras2_ctx->lock);
+	ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB;
+	ret = ras2_get_patrol_scrub_running(ras2_ctx, &running);
+	if (ret)
+		return ret;
+	if (enable) {
+		if (ras2_ctx->bg || running)
+			return -EBUSY;
+		ps_sm->params.requested_address_range[0] = 0;
+		ps_sm->params.requested_address_range[1] = 0;
+		ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK;
+		ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK,
+							    ras2_ctx->scrub_cycle_hrs);
+		ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER;
+	} else {
+		if (!ras2_ctx->bg)
+			return -EPERM;
+		ps_sm->params.patrol_scrub_command = RAS2_STOP_PATROL_SCRUBBER;
+	}
+	ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND;
+	ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_EN_BACKGROUND,
+						    enable);
+	ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC);
+	if (ret) {
+		dev_err(ras2_ctx->dev, "Failed to %s background scrubbing\n",
+			str_enable_disable(enable));
+		return ret;
+	}
+	if (enable) {
+		ras2_ctx->bg = true;
+		/* Update the cache to account for rounding of supplied parameters and similar */
+		ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	} else {
+		ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+		ras2_ctx->bg = false;
+	}
+
+	return ret;
+}
+
+static int ras2_hw_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct ras2_mem_ctx *ras2_ctx = drv_data;
+
+	*enabled = ras2_ctx->bg;
+
+	return 0;
+}
+
+static const struct edac_scrub_ops ras2_scrub_ops = {
+	.read_addr = ras2_hw_scrub_read_addr,
+	.read_size = ras2_hw_scrub_read_size,
+	.write_addr = ras2_hw_scrub_write_addr,
+	.write_size = ras2_hw_scrub_write_size,
+	.get_enabled_bg = ras2_hw_scrub_get_enabled_bg,
+	.set_enabled_bg = ras2_hw_scrub_set_enabled_bg,
+	.get_min_cycle = ras2_hw_scrub_read_min_scrub_cycle,
+	.get_max_cycle = ras2_hw_scrub_read_max_scrub_cycle,
+	.get_cycle_duration = ras2_hw_scrub_cycle_read,
+	.set_cycle_duration = ras2_hw_scrub_cycle_write,
+};
+
+static int ras2_probe(struct auxiliary_device *auxdev,
+		      const struct auxiliary_device_id *id)
+{
+	struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct ras2_mem_ctx, adev);
+	struct edac_dev_feature ras_features[RAS2_DEV_NUM_RAS_FEATURES];
+	char scrub_name[RAS2_SCRUB_NAME_LEN];
+	int num_ras_features = 0;
+	int ret;
+
+	if (!ras2_is_patrol_scrub_support(ras2_ctx))
+		return -EOPNOTSUPP;
+
+	ret = ras2_update_patrol_scrub_params_cache(ras2_ctx);
+	if (ret)
+		return ret;
+
+	snprintf(scrub_name, sizeof(scrub_name), "acpi_ras_mem%d",
+		 ras2_ctx->id);
+
+	ras_features[num_ras_features].ft_type = RAS_FEAT_SCRUB;
+	ras_features[num_ras_features].instance = ras2_ctx->instance;
+	ras_features[num_ras_features].scrub_ops = &ras2_scrub_ops;
+	ras_features[num_ras_features].ctx = ras2_ctx;
+	num_ras_features++;
+
+	return edac_dev_register(&auxdev->dev, scrub_name, NULL,
+				 num_ras_features, ras_features);
+}
+
+static const struct auxiliary_device_id ras2_mem_dev_id_table[] = {
+	{ .name = RAS2_AUX_DEV_NAME "." RAS2_MEM_DEV_ID_NAME, },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(auxiliary, ras2_mem_dev_id_table);
+
+static struct auxiliary_driver ras2_mem_driver = {
+	.name = RAS2_MEM_DEV_ID_NAME,
+	.probe = ras2_probe,
+	.id_table = ras2_mem_dev_id_table,
+};
+module_auxiliary_driver(ras2_mem_driver);
+
+MODULE_IMPORT_NS("ACPI_RAS2");
+MODULE_DESCRIPTION("ACPI RAS2 memory driver");
+MODULE_LICENSE("GPL");
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (5 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 06/15] ras: mem: Add memory " shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-11  2:39   ` Dave Jiang
  2025-02-07 14:44 ` [PATCH v19 08/15] cxl/memfeature: Add CXL memory device patrol scrub control feature shiju.jose
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add a helper function to retrieve a feature entry from the supported
features list, if the feature is supported.
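
A minimal usage sketch, mirroring how later patches in this series consume
the helper (the UUID and the error handling are illustrative):

	struct cxl_feat_entry *feat_entry;

	feat_entry = cxl_get_feature_entry(cxlmd, &CXL_FEAT_PATROL_SCRUB_UUID);
	if (IS_ERR(feat_entry))
		return -EOPNOTSUPP;	/* feature not advertised by the device */

	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
		return -EOPNOTSUPP;	/* advertised, but not changeable */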

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/cxl/core/features.c | 21 +++++++++++++++++++++
 include/cxl/features.h      |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
index 5f64185a5c7a..bf175e69cda1 100644
--- a/drivers/cxl/core/features.c
+++ b/drivers/cxl/core/features.c
@@ -43,6 +43,27 @@ bool is_cxl_feature_exclusive(struct cxl_feat_entry *entry)
 }
 EXPORT_SYMBOL_NS_GPL(is_cxl_feature_exclusive, "CXL");
 
+struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
+					     const uuid_t *feat_uuid)
+{
+	struct cxl_features_state *cxlfs = cxlmd->cxlfs;
+	struct cxl_feat_entry *feat_entry;
+	int count;
+
+	/*
+	 * Retrieve the feature entry from the supported features list,
+	 * if the feature is supported.
+	 */
+	feat_entry = cxlfs->entries;
+	for (count = 0; count < cxlfs->num_features; count++, feat_entry++) {
+		if (uuid_equal(&feat_entry->uuid, feat_uuid))
+			return feat_entry;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_get_feature_entry, "CXL");
+
 size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
 		       enum cxl_get_feat_selection selection,
 		       void *feat_out, size_t feat_out_size, u16 offset,
diff --git a/include/cxl/features.h b/include/cxl/features.h
index e52d0573f504..563d966beee5 100644
--- a/include/cxl/features.h
+++ b/include/cxl/features.h
@@ -68,6 +68,8 @@ struct cxl_features_state {
 };
 
 struct cxl_mailbox;
+struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
+					     const uuid_t *feat_uuid);
 size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
 		       enum cxl_get_feat_selection selection,
 		       void *feat_out, size_t feat_out_size, u16 offset,
-- 
2.43.0




* [PATCH v19 08/15] cxl/memfeature: Add CXL memory device patrol scrub control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (6 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 09/15] cxl/memfeature: Add CXL memory device ECS " shiju.jose
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
feature. The device patrol scrub proactively locates and makes corrections
to errors on a regular cycle.

Allow specifying the number of hours within which the patrol scrub must be
completed, subject to minimum and maximum limits reported by the device.
Also allow disabling the scrub, trading off error rates against
performance.

Add support for patrol scrub control on CXL memory devices.
Register with the EDAC device driver, which retrieves the scrub attribute
descriptors from EDAC scrub and exposes the sysfs scrub control attributes
to userspace. For example, scrub control for the CXL memory device
"cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/.

Additionally, add support for region-based CXL memory patrol scrub control.
CXL memory regions may be interleaved across one or more CXL memory
devices. For example, region-based scrub control for "cxl_region1" is
exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.
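
For instance, after enabling background scrubbing as described in the
documentation added below, a quick readback might look like this (device
names and values are illustrative):

# cat /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background

1

# cat /sys/bus/edac/devices/cxl_region1/scrub0/current_cycle_duration

43200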

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/scrub.rst  |  59 +++++
 drivers/cxl/Kconfig           |  16 ++
 drivers/cxl/core/Makefile     |   1 +
 drivers/cxl/core/memfeature.c | 474 ++++++++++++++++++++++++++++++++++
 drivers/cxl/core/region.c     |   5 +
 drivers/cxl/cxlmem.h          |  10 +
 drivers/cxl/mem.c             |   4 +
 7 files changed, 569 insertions(+)
 create mode 100644 drivers/cxl/core/memfeature.c

diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
index 3fe46b788eb6..a803089b774e 100644
--- a/Documentation/edac/scrub.rst
+++ b/Documentation/edac/scrub.rst
@@ -332,3 +332,62 @@ Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finishe
 1.2.5. Start 'background scrubbing'.
 
 # echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background
+
+2. CXL memory device patrol scrubber
+
+2.1 Device based scrubbing
+
+2.1.1. Query what is device default/current scrub cycle setting.
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+43200
+
+2.1.2. Query the range of device supported scrub cycle.
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration
+
+918000
+
+2.1.3. Program scrubbing for a device to repeat every 21600 seconds (quarter of a day).
+
+# echo 21600 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
+
+# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
+
+2.2. Region based scrubbing
+
+CXL memory is exposed to the memory management subsystem and ultimately to
+userspace via CXL regions. These can incorporate one or more parts of multiple
+CXL Type 3 devices with traffic interleaved across them. The user may want to
+control the scrub rate via this more abstract region instead of having to
+figure out the constituent devices and program them separately. The scrub
+rate for each device covers the whole device, so if multiple regions use
+parts of that device, scrub requests made for the other regions may result
+in a higher scrub rate than the one requested for this specific region.
+
+2.2.1 Query what is device default/current scrub cycle setting for a CXL memory region.
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+86400
+
+2.2.2 Query the range of device supported scrub cycle for a CXL memory region.
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration
+
+3600
+
+# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration
+
+918000
+
+2.2.3 Program scrubbing for a region to repeat every 43200 seconds (half a day)
+
+# echo  43200 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
+
+# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 6bb1ffc74956..ac5ad2dc5996 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -161,4 +161,20 @@ config CXL_REGION_INVALIDATION_TEST
 	  If unsure, or if this kernel is meant for production environments,
 	  say N.
 
+config CXL_RAS_FEATURES
+	tristate "CXL: Memory RAS features"
+	depends on CXL_MEM
+	depends on EDAC_SCRUB
+	help
+	  The CXL memory RAS feature control is optional and allows the host
+	  to control the RAS feature configurations of CXL Type 3 devices.
+
+	  It registers with the EDAC device subsystem to expose control
+	  attributes of the CXL memory device's RAS features to the user.
+	  It provides interface functions to support configuring the CXL
+	  memory device's RAS features.
+	  Say 'y/m' if you have an expert need to change the default settings
+	  of a memory RAS feature established by the platform/device (e.g.
+	  scrub rates for the patrol scrub feature). Otherwise say 'n'.
+
 endif
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 73b6348afd67..54baca513ecb 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -17,3 +17,4 @@ cxl_core-y += cdat.o
 cxl_core-y += features.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
+cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o
diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
new file mode 100644
index 000000000000..c13b25a18e1a
--- /dev/null
+++ b/drivers/cxl/core/memfeature.c
@@ -0,0 +1,474 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * CXL memory RAS feature driver.
+ *
+ * Copyright (c) 2024-2025 HiSilicon Limited.
+ *
+ *  - Supports functions to configure RAS features of the
+ *    CXL memory devices.
+ *  - Registers with the EDAC device subsystem driver to expose
+ *    the features sysfs attributes to the user for configuring
+ *    CXL memory RAS feature.
+ */
+
+#include <linux/cleanup.h>
+#include <linux/edac.h>
+#include <linux/limits.h>
+#include <cxl/features.h>
+#include <cxl.h>
+#include <cxlmem.h>
+#include "core.h"
+
+#define CXL_DEV_NUM_RAS_FEATURES	1
+#define CXL_DEV_HOUR_IN_SECS	3600
+
+#define CXL_DEV_NAME_LEN	128
+
+static int cxl_hold_region_and_dpa(void)
+{
+	int rc;
+
+	rc = down_read_interruptible(&cxl_region_rwsem);
+	if (rc)
+		return rc;
+
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc) {
+		up_read(&cxl_region_rwsem);
+		return rc;
+	}
+
+	return 0;
+}
+
+static void cxl_release_region_and_dpa(void)
+{
+	up_read(&cxl_dpa_rwsem);
+	up_read(&cxl_region_rwsem);
+}
+
+/*
+ * CXL memory patrol scrub control functions
+ */
+struct cxl_patrol_scrub_context {
+	u8 instance;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u8 get_version;
+	u8 set_version;
+	u16 effects;
+	struct cxl_memdev *cxlmd;
+	struct cxl_region *cxlr;
+};
+
+/**
+ * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
+ * @enable:     [IN & OUT] enable(1)/disable(0) patrol scrub.
+ * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
+ * @scrub_cycle_hrs:    [IN] Requested patrol scrub cycle in hours.
+ *                      [OUT] Current patrol scrub cycle in hours.
+ * @min_scrub_cycle_hrs: [OUT] minimum patrol scrub cycle in hours supported.
+ */
+struct cxl_memdev_ps_params {
+	bool enable;
+	bool scrub_cycle_changeable;
+	u8 scrub_cycle_hrs;
+	u8 min_scrub_cycle_hrs;
+};
+
+enum cxl_scrub_param {
+	CXL_PS_PARAM_ENABLE,
+	CXL_PS_PARAM_SCRUB_CYCLE,
+};
+
+#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK	BIT(0)
+#define	CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK	BIT(1)
+#define	CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK	GENMASK(7, 0)
+#define	CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK	GENMASK(15, 8)
+#define	CXL_MEMDEV_PS_FLAG_ENABLED_MASK	BIT(0)
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.9.11.1 Table 8-207 Device Patrol Scrub Control
+ * Feature Readable Attributes.
+ */
+struct cxl_memdev_ps_rd_attrs {
+	u8 scrub_cycle_cap;
+	__le16 scrub_cycle_hrs;
+	u8 scrub_flags;
+}  __packed;
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.9.11.1 Table 8-208 Device Patrol Scrub Control
+ * Feature Writable Attributes.
+ */
+struct cxl_memdev_ps_wr_attrs {
+	u8 scrub_cycle_hrs;
+	u8 scrub_flags;
+}  __packed;
+
+static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox,
+				struct cxl_memdev_ps_params *params)
+{
+	size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
+	u16 scrub_cycle_hrs;
+	size_t data_size;
+	struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
+		kzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, NULL);
+	if (!data_size)
+		return -EIO;
+
+	params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
+						   rd_attrs->scrub_cycle_cap);
+	params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+				   rd_attrs->scrub_flags);
+	scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs);
+	params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+					    scrub_cycle_hrs);
+	params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
+						scrub_cycle_hrs);
+
+	return 0;
+}
+
+static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx,
+			    struct cxl_memdev_ps_params *params)
+{
+	struct cxl_mailbox *cxl_mbox;
+	struct cxl_memdev *cxlmd;
+	u16 min_scrub_cycle = 0;
+	int i, ret;
+
+	if (cxl_ps_ctx->cxlr) {
+		struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+		struct cxl_region_params *p = &cxlr->params;
+
+		ret = cxl_hold_region_and_dpa();
+		if (ret)
+			return ret;
+		for (i = p->interleave_ways - 1; i >= 0; i--) {
+			struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+			cxlmd = cxled_to_memdev(cxled);
+			cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+			ret = cxl_mem_ps_get_attrs(cxl_mbox, params);
+			if (ret)
+				break;
+
+			if (params->min_scrub_cycle_hrs > min_scrub_cycle)
+				min_scrub_cycle = params->min_scrub_cycle_hrs;
+		}
+		cxl_release_region_and_dpa();
+
+		params->min_scrub_cycle_hrs = min_scrub_cycle;
+		return ret;
+	}
+	cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
+
+	return cxl_mem_ps_get_attrs(cxl_mbox, params);
+}
+
+static int cxl_mem_ps_set_attrs(struct device *dev,
+				struct cxl_patrol_scrub_context *cxl_ps_ctx,
+				struct cxl_mailbox *cxl_mbox,
+				struct cxl_memdev_ps_params *params,
+				enum cxl_scrub_param param_type)
+{
+	struct cxl_memdev_ps_wr_attrs wr_attrs;
+	struct cxl_memdev_ps_params rd_params;
+	int ret;
+
+	ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params);
+	if (ret) {
+		dev_dbg(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n", ret);
+		return ret;
+	}
+
+	switch (param_type) {
+	case CXL_PS_PARAM_ENABLE:
+		wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+						  params->enable);
+		wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+						      rd_params.scrub_cycle_hrs);
+		break;
+	case CXL_PS_PARAM_SCRUB_CYCLE:
+		if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) {
+			dev_dbg(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
+				params->scrub_cycle_hrs);
+			dev_dbg(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n",
+				rd_params.min_scrub_cycle_hrs);
+			return -EINVAL;
+		}
+		wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
+						      params->scrub_cycle_hrs);
+		wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
+						  rd_params.enable);
+		break;
+	}
+
+	ret = cxl_set_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
+			      cxl_ps_ctx->set_version,
+			      &wr_attrs, sizeof(wr_attrs),
+			      CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET,
+			      0, NULL);
+	if (ret) {
+		dev_dbg(dev, "CXL patrol scrub set feature failed ret=%d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int cxl_ps_set_attrs(struct device *dev,
+			    struct cxl_patrol_scrub_context *cxl_ps_ctx,
+			    struct cxl_memdev_ps_params *params,
+			    enum cxl_scrub_param param_type)
+{
+	struct cxl_mailbox *cxl_mbox;
+	struct cxl_memdev *cxlmd;
+	int ret, i;
+
+	if (cxl_ps_ctx->cxlr) {
+		struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
+		struct cxl_region_params *p = &cxlr->params;
+
+		ret = cxl_hold_region_and_dpa();
+		if (ret)
+			return ret;
+		for (i = p->interleave_ways - 1; i >= 0; i--) {
+			struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+			cxlmd = cxled_to_memdev(cxled);
+			cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+			ret = cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
+						   params, param_type);
+			if (ret)
+				break;
+		}
+		cxl_release_region_and_dpa();
+
+		return ret;
+	}
+	cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
+
+	return cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
+				    params, param_type);
+}
+
+static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*enabled = params.enable;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params = {
+		.enable = enable,
+	};
+
+	return cxl_ps_set_attrs(dev, ctx, &params, CXL_PS_PARAM_ENABLE);
+}
+
+static int cxl_patrol_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
+						 u32 *min)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+	*min = params.min_scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
+						 u32 *max)
+{
+	*max = U8_MAX * CXL_DEV_HOUR_IN_SECS; /* Max set by register size */
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_read_scrub_cycle(struct device *dev, void *drv_data,
+					     u32 *scrub_cycle_secs)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params;
+	int ret;
+
+	ret = cxl_ps_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*scrub_cycle_secs = params.scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
+
+	return 0;
+}
+
+static int cxl_patrol_scrub_write_scrub_cycle(struct device *dev, void *drv_data,
+					      u32 scrub_cycle_secs)
+{
+	struct cxl_patrol_scrub_context *ctx = drv_data;
+	struct cxl_memdev_ps_params params = {
+		.scrub_cycle_hrs = scrub_cycle_secs / CXL_DEV_HOUR_IN_SECS,
+	};
+
+	return cxl_ps_set_attrs(dev, ctx, &params, CXL_PS_PARAM_SCRUB_CYCLE);
+}
+
+static const struct edac_scrub_ops cxl_ps_scrub_ops = {
+	.get_enabled_bg = cxl_patrol_scrub_get_enabled_bg,
+	.set_enabled_bg = cxl_patrol_scrub_set_enabled_bg,
+	.get_min_cycle = cxl_patrol_scrub_read_min_scrub_cycle,
+	.get_max_cycle = cxl_patrol_scrub_read_max_scrub_cycle,
+	.get_cycle_duration = cxl_patrol_scrub_read_scrub_cycle,
+	.set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle,
+};
+
+static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd,
+				 struct edac_dev_feature *ras_feature, u8 scrub_inst)
+{
+	struct cxl_patrol_scrub_context *cxl_ps_ctx;
+	struct cxl_feat_entry *feat_entry;
+
+	feat_entry = cxl_get_feature_entry(cxlmd, &CXL_FEAT_PATROL_SCRUB_UUID);
+	if (IS_ERR(feat_entry))
+		return -EOPNOTSUPP;
+
+	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
+		return -EOPNOTSUPP;
+
+	cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
+	if (!cxl_ps_ctx)
+		return -ENOMEM;
+
+	*cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
+		.get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
+		.set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
+		.get_version = feat_entry->get_feat_ver,
+		.set_version = feat_entry->set_feat_ver,
+		.effects = le16_to_cpu(feat_entry->effects),
+		.instance = scrub_inst,
+		.cxlmd = cxlmd,
+	};
+
+	ras_feature->ft_type = RAS_FEAT_SCRUB;
+	ras_feature->instance = cxl_ps_ctx->instance;
+	ras_feature->scrub_ops = &cxl_ps_scrub_ops;
+	ras_feature->ctx = cxl_ps_ctx;
+
+	return 0;
+}
+
+static int cxl_region_scrub_init(struct cxl_region *cxlr,
+				 struct edac_dev_feature *ras_feature, u8 scrub_inst)
+{
+	struct cxl_patrol_scrub_context *cxl_ps_ctx;
+	struct cxl_region_params *p = &cxlr->params;
+	struct cxl_feat_entry *feat_entry;
+	struct cxl_memdev *cxlmd;
+	int i;
+
+	/*
+	 * The cxl_region_rwsem must be held if the code below is used in a context
+	 * other than when the region is in the probe state, as shown here.
+	 */
+	for (i = p->interleave_ways - 1; i >= 0; i--) {
+		struct cxl_endpoint_decoder *cxled = p->targets[i];
+
+		cxlmd = cxled_to_memdev(cxled);
+		feat_entry = cxl_get_feature_entry(cxlmd, &CXL_FEAT_PATROL_SCRUB_UUID);
+		if (IS_ERR(feat_entry))
+			return -EOPNOTSUPP;
+
+		if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
+			return -EOPNOTSUPP;
+	}
+
+	cxl_ps_ctx = devm_kzalloc(&cxlr->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
+	if (!cxl_ps_ctx)
+		return -ENOMEM;
+
+	*cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
+		.get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
+		.set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
+		.get_version = feat_entry->get_feat_ver,
+		.set_version = feat_entry->set_feat_ver,
+		.effects = le16_to_cpu(feat_entry->effects),
+		.instance = scrub_inst,
+		.cxlr = cxlr,
+	};
+
+	ras_feature->ft_type = RAS_FEAT_SCRUB;
+	ras_feature->instance = cxl_ps_ctx->instance;
+	ras_feature->scrub_ops = &cxl_ps_scrub_ops;
+	ras_feature->ctx = cxl_ps_ctx;
+
+	return 0;
+}
+
+int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
+{
+	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
+	char cxl_dev_name[CXL_DEV_NAME_LEN];
+	int num_ras_features = 0;
+	u8 scrub_inst = 0;
+	int rc;
+
+	rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
+				   scrub_inst);
+	if (rc < 0 && rc != -EOPNOTSUPP)
+		return rc;
+
+	if (rc != -EOPNOTSUPP)
+		num_ras_features++;
+
+	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
+		 "cxl", dev_name(&cxlmd->dev));
+
+	return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
+				 num_ras_features, ras_features);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_edac_register, "CXL");
+
+int devm_cxl_region_edac_register(struct cxl_region *cxlr)
+{
+	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
+	char cxl_dev_name[CXL_DEV_NAME_LEN];
+	int num_ras_features = 0;
+	u8 scrub_inst = 0;
+	int rc;
+
+	rc = cxl_region_scrub_init(cxlr, &ras_features[num_ras_features],
+				   scrub_inst);
+	if (rc < 0)
+		return rc;
+
+	num_ras_features++;
+
+	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
+		 "cxl", dev_name(&cxlr->dev));
+
+	return edac_dev_register(&cxlr->dev, cxl_dev_name, NULL,
+				 num_ras_features, ras_features);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_region_edac_register, "CXL");
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e8d11a988fd9..5339237ced0f 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3443,6 +3443,11 @@ static int cxl_region_probe(struct device *dev)
 	case CXL_DECODER_PMEM:
 		return devm_cxl_add_pmem_region(cxlr);
 	case CXL_DECODER_RAM:
+		rc = devm_cxl_region_edac_register(cxlr);
+		if (rc)
+			dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=%d failed\n",
+				cxlr->id);
+
 		/*
 		 * The region can not be manged by CXL if any portion of
 		 * it is already online as 'System RAM'
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 4a42cdb64b5c..cad0d41b98d0 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -802,6 +802,16 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 
+#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES)
+int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
+int devm_cxl_region_edac_register(struct cxl_region *cxlr);
+#else
+static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
+{ return 0; }
+static inline int devm_cxl_region_edac_register(struct cxl_region *cxlr)
+{ return 0; }
+#endif
+
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 47348a52bc05..349f01788778 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -185,6 +185,10 @@ static int cxl_mem_probe(struct device *dev)
 	if (rc)
 		dev_dbg(dev, "No CXL Features enumerated.\n");
 
+	rc = devm_cxl_memdev_edac_register(cxlmd);
+	if (rc)
+		dev_dbg(dev, "CXL memdev EDAC registration failed rc=%d\n", rc);
+
 	/*
 	 * The kernel may be operating out of CXL memory on this device,
 	 * there is no spec defined way to determine whether this device
-- 
2.43.0




* [PATCH v19 09/15] cxl/memfeature: Add CXL memory device ECS control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (7 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 08/15] cxl/memfeature: Add CXL memory device patrol scrub control feature shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 10/15] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 ECS (Error Check
Scrub) control feature.
Error Check Scrub (ECS) is a feature defined in the JEDEC DDR5 SDRAM
Specification (JESD79-5). It allows the DRAM to internally read, correct
single-bit errors, and write back corrected data bits to the DRAM array
while providing transparency to error counts.

The ECS control allows the requester to change the log entry type, the ECS
threshold count (provided the request falls within the limits specified in
DDR5 mode registers), switch between codeword mode and row count mode, and
reset the ECS counter.

Register with the EDAC device driver, which retrieves the ECS attribute
descriptors from the EDAC ECS and exposes the ECS control attributes to
userspace via sysfs. For example, the ECS control for memory media FRU0 in
the CXL mem0 device is located at /sys/bus/edac/devices/cxl_mem0/ecs_fru0/.
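
For instance, the following sketch switches FRU0 to count codewords with
errors, raises its threshold to 1024 and resets its counter (attribute names
are assumed from the edac_ecs_ops wired up below; paths and values are
illustrative):

# echo 1 > /sys/bus/edac/devices/cxl_mem0/ecs_fru0/mode

# echo 1024 > /sys/bus/edac/devices/cxl_mem0/ecs_fru0/threshold

# echo 1 > /sys/bus/edac/devices/cxl_mem0/ecs_fru0/reset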

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/cxl/Kconfig           |   1 +
 drivers/cxl/core/memfeature.c | 355 +++++++++++++++++++++++++++++++++-
 2 files changed, 355 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index ac5ad2dc5996..892c9a3c5679 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -165,6 +165,7 @@ config CXL_RAS_FEATURES
 	tristate "CXL: Memory RAS features"
 	depends on CXL_MEM
 	depends on EDAC_SCRUB
+	depends on EDAC_ECS
 	help
 	  The CXL memory RAS feature control is optional and allows the host
 	  to control the RAS feature configurations of CXL Type 3 devices.
diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
index c13b25a18e1a..c8540df567f1 100644
--- a/drivers/cxl/core/memfeature.c
+++ b/drivers/cxl/core/memfeature.c
@@ -19,7 +19,7 @@
 #include <cxlmem.h>
 #include "core.h"
 
-#define CXL_DEV_NUM_RAS_FEATURES	1
+#define CXL_DEV_NUM_RAS_FEATURES	2
 #define CXL_DEV_HOUR_IN_SECS	3600
 
 #define CXL_DEV_NAME_LEN	128
@@ -426,6 +426,352 @@ static int cxl_region_scrub_init(struct cxl_region *cxlr,
 	return 0;
 }
 
+/*
+ * CXL DDR5 ECS control definitions.
+ */
+struct cxl_ecs_context {
+	u16 num_media_frus;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u8 get_version;
+	u8 set_version;
+	u16 effects;
+	struct cxl_memdev *cxlmd;
+};
+
+enum {
+	CXL_ECS_PARAM_LOG_ENTRY_TYPE,
+	CXL_ECS_PARAM_THRESHOLD,
+	CXL_ECS_PARAM_MODE,
+	CXL_ECS_PARAM_RESET_COUNTER,
+};
+
+#define CXL_ECS_LOG_ENTRY_TYPE_MASK	GENMASK(1, 0)
+#define CXL_ECS_REALTIME_REPORT_CAP_MASK	BIT(0)
+#define CXL_ECS_THRESHOLD_COUNT_MASK	GENMASK(2, 0)
+#define CXL_ECS_COUNT_MODE_MASK	BIT(3)
+#define CXL_ECS_RESET_COUNTER_MASK	BIT(4)
+#define CXL_ECS_RESET_COUNTER	1
+
+enum {
+	ECS_THRESHOLD_256 = 256,
+	ECS_THRESHOLD_1024 = 1024,
+	ECS_THRESHOLD_4096 = 4096,
+};
+
+enum {
+	ECS_THRESHOLD_IDX_256 = 3,
+	ECS_THRESHOLD_IDX_1024 = 4,
+	ECS_THRESHOLD_IDX_4096 = 5,
+};
+
+static const u16 ecs_supp_threshold[] = {
+	[ECS_THRESHOLD_IDX_256] = 256,
+	[ECS_THRESHOLD_IDX_1024] = 1024,
+	[ECS_THRESHOLD_IDX_4096] = 4096,
+};
+
+enum {
+	ECS_LOG_ENTRY_TYPE_DRAM = 0x0,
+	ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1,
+};
+
+enum cxl_ecs_count_mode {
+	ECS_MODE_COUNTS_ROWS = 0,
+	ECS_MODE_COUNTS_CODEWORDS = 1,
+};
+
+/**
+ * struct cxl_ecs_params - CXL memory DDR5 ECS parameter data structure.
+ * @threshold: ECS threshold count per GB of memory cells.
+ * @log_entry_type: ECS log entry type, per DRAM or per memory media FRU.
+ * @reset_counter: [IN] reset ECC counter to default value.
+ * @count_mode: codeword/row count mode
+ *		0 : ECS counts rows with errors
+ *		1 : ECS counts codeword with errors
+ */
+struct cxl_ecs_params {
+	u16 threshold;
+	u8 log_entry_type;
+	u8 reset_counter;
+	enum cxl_ecs_count_mode count_mode;
+};
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.9.11.2 Table 8-210 DDR5 ECS Control Feature
+ * Readable Attributes.
+ */
+struct cxl_ecs_fru_rd_attrs {
+	u8 ecs_cap;
+	__le16 ecs_config;
+	u8 ecs_flags;
+}  __packed;
+
+struct cxl_ecs_rd_attrs {
+	u8 ecs_log_cap;
+	struct cxl_ecs_fru_rd_attrs fru_attrs[];
+}  __packed;
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.9.11.2 Table 8-211 DDR5 ECS Control Feature
+ * Writable Attributes.
+ */
+struct cxl_ecs_fru_wr_attrs {
+	__le16 ecs_config;
+} __packed;
+
+struct cxl_ecs_wr_attrs {
+	u8 ecs_log_cap;
+	struct cxl_ecs_fru_wr_attrs fru_attrs[];
+}  __packed;
+
+/*
+ * CXL DDR5 ECS control functions.
+ */
+static int cxl_mem_ecs_get_attrs(struct device *dev,
+				 struct cxl_ecs_context *cxl_ecs_ctx,
+				 int fru_id, struct cxl_ecs_params *params)
+{
+	struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
+	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+	struct cxl_ecs_fru_rd_attrs *fru_rd_attrs;
+	size_t rd_data_size;
+	u8 threshold_index;
+	size_t data_size;
+	u16 ecs_config;
+
+	rd_data_size = cxl_ecs_ctx->get_feat_size;
+
+	struct cxl_ecs_rd_attrs *rd_attrs __free(kvfree) =
+		kvzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	params->log_entry_type = 0;
+	params->threshold = 0;
+	params->count_mode = 0;
+	data_size = cxl_get_feature(cxl_mbox, &CXL_FEAT_ECS_UUID,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, NULL);
+	if (!data_size)
+		return -EIO;
+
+	fru_rd_attrs = rd_attrs->fru_attrs;
+	params->log_entry_type = FIELD_GET(CXL_ECS_LOG_ENTRY_TYPE_MASK,
+					   rd_attrs->ecs_log_cap);
+	ecs_config = le16_to_cpu(fru_rd_attrs[fru_id].ecs_config);
+	threshold_index = FIELD_GET(CXL_ECS_THRESHOLD_COUNT_MASK,
+				    ecs_config);
+	params->threshold = ecs_supp_threshold[threshold_index];
+	params->count_mode = FIELD_GET(CXL_ECS_COUNT_MODE_MASK,
+				       ecs_config);
+	return 0;
+}
+
+static int cxl_mem_ecs_set_attrs(struct device *dev,
+				 struct cxl_ecs_context *cxl_ecs_ctx,
+				 int fru_id, struct cxl_ecs_params *params,
+				 u8 param_type)
+{
+	struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd;
+	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+	struct cxl_ecs_fru_rd_attrs *fru_rd_attrs;
+	struct cxl_ecs_fru_wr_attrs *fru_wr_attrs;
+	size_t rd_data_size, wr_data_size;
+	u16 num_media_frus, count;
+	size_t data_size;
+	u16 ecs_config;
+
+	num_media_frus = cxl_ecs_ctx->num_media_frus;
+	rd_data_size = cxl_ecs_ctx->get_feat_size;
+	wr_data_size = cxl_ecs_ctx->set_feat_size;
+	struct cxl_ecs_rd_attrs *rd_attrs __free(kvfree) =
+		kvzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &CXL_FEAT_ECS_UUID,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, NULL);
+	if (!data_size)
+		return -EIO;
+
+	struct cxl_ecs_wr_attrs *wr_attrs __free(kvfree) =
+					kvzalloc(wr_data_size, GFP_KERNEL);
+	if (!wr_attrs)
+		return -ENOMEM;
+
+	/*
+	 * Fill writable attributes from the current attributes read
+	 * for all the media FRUs.
+	 */
+	fru_rd_attrs = rd_attrs->fru_attrs;
+	fru_wr_attrs = wr_attrs->fru_attrs;
+	wr_attrs->ecs_log_cap = rd_attrs->ecs_log_cap;
+	for (count = 0; count < num_media_frus; count++)
+		fru_wr_attrs[count].ecs_config = fru_rd_attrs[count].ecs_config;
+
+	/*
+	 * Fill attribute to be set for the media FRU
+	 */
+	ecs_config = le16_to_cpu(fru_rd_attrs[fru_id].ecs_config);
+	switch (param_type) {
+	case CXL_ECS_PARAM_LOG_ENTRY_TYPE:
+		if (params->log_entry_type != ECS_LOG_ENTRY_TYPE_DRAM &&
+		    params->log_entry_type != ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU)
+			return -EINVAL;
+
+		wr_attrs->ecs_log_cap = FIELD_PREP(CXL_ECS_LOG_ENTRY_TYPE_MASK,
+						   params->log_entry_type);
+		break;
+	case CXL_ECS_PARAM_THRESHOLD:
+		ecs_config &= ~CXL_ECS_THRESHOLD_COUNT_MASK;
+		switch (params->threshold) {
+		case ECS_THRESHOLD_256:
+			ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK,
+						 ECS_THRESHOLD_IDX_256);
+			break;
+		case ECS_THRESHOLD_1024:
+			ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK,
+						 ECS_THRESHOLD_IDX_1024);
+			break;
+		case ECS_THRESHOLD_4096:
+			ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK,
+						 ECS_THRESHOLD_IDX_4096);
+			break;
+		default:
+			dev_dbg(dev,
+				"Invalid CXL ECS scrub threshold count(%d) to set\n",
+				params->threshold);
+			dev_dbg(dev,
+				"Supported scrub threshold counts: %u, %u, %u\n",
+				ECS_THRESHOLD_256, ECS_THRESHOLD_1024, ECS_THRESHOLD_4096);
+			return -EINVAL;
+		}
+		break;
+	case CXL_ECS_PARAM_MODE:
+		if (params->count_mode != ECS_MODE_COUNTS_ROWS &&
+		    params->count_mode != ECS_MODE_COUNTS_CODEWORDS) {
+			dev_dbg(dev,
+				"Invalid CXL ECS scrub mode(%d) to set\n",
+				params->count_mode);
+			dev_dbg(dev,
+				"Supported ECS Modes: 0: ECS counts rows with errors,"
+				" 1: ECS counts codewords with errors\n");
+			return -EINVAL;
+		}
+		ecs_config &= ~CXL_ECS_COUNT_MODE_MASK;
+		ecs_config |= FIELD_PREP(CXL_ECS_COUNT_MODE_MASK, params->count_mode);
+		break;
+	case CXL_ECS_PARAM_RESET_COUNTER:
+		if (params->reset_counter != CXL_ECS_RESET_COUNTER)
+			return -EINVAL;
+
+		ecs_config &= ~CXL_ECS_RESET_COUNTER_MASK;
+		ecs_config |= FIELD_PREP(CXL_ECS_RESET_COUNTER_MASK, params->reset_counter);
+		break;
+	default:
+		return -EINVAL;
+	}
+	fru_wr_attrs[fru_id].ecs_config = cpu_to_le16(ecs_config);
+
+	return cxl_set_feature(cxl_mbox, &CXL_FEAT_ECS_UUID,
+			       cxl_ecs_ctx->set_version,
+			       wr_attrs, wr_data_size,
+			       CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET,
+			       0, NULL);
+}
+
+#define CXL_ECS_GET_ATTR(attrib)						\
+static int cxl_ecs_get_##attrib(struct device *dev, void *drv_data,		\
+				int fru_id, u32 *val)				\
+{										\
+	struct cxl_ecs_context *ctx = drv_data;					\
+	struct cxl_ecs_params params;						\
+	int ret;								\
+										\
+	ret = cxl_mem_ecs_get_attrs(dev, ctx, fru_id, &params);			\
+	if (ret)								\
+		return ret;							\
+										\
+	*val = params.attrib;							\
+										\
+	return 0;								\
+}
+
+CXL_ECS_GET_ATTR(log_entry_type)
+CXL_ECS_GET_ATTR(count_mode)
+CXL_ECS_GET_ATTR(threshold)
+
+#define CXL_ECS_SET_ATTR(attrib, param_type)						\
+static int cxl_ecs_set_##attrib(struct device *dev, void *drv_data,			\
+				int fru_id, u32 val)					\
+{											\
+	struct cxl_ecs_context *ctx = drv_data;						\
+	struct cxl_ecs_params params = {						\
+		.attrib = val,								\
+	};										\
+											\
+	return cxl_mem_ecs_set_attrs(dev, ctx, fru_id, &params, (param_type));		\
+}
+CXL_ECS_SET_ATTR(log_entry_type, CXL_ECS_PARAM_LOG_ENTRY_TYPE)
+CXL_ECS_SET_ATTR(count_mode, CXL_ECS_PARAM_MODE)
+CXL_ECS_SET_ATTR(reset_counter, CXL_ECS_PARAM_RESET_COUNTER)
+CXL_ECS_SET_ATTR(threshold, CXL_ECS_PARAM_THRESHOLD)
+
+static const struct edac_ecs_ops cxl_ecs_ops = {
+	.get_log_entry_type = cxl_ecs_get_log_entry_type,
+	.set_log_entry_type = cxl_ecs_set_log_entry_type,
+	.get_mode = cxl_ecs_get_count_mode,
+	.set_mode = cxl_ecs_set_count_mode,
+	.reset = cxl_ecs_set_reset_counter,
+	.get_threshold = cxl_ecs_get_threshold,
+	.set_threshold = cxl_ecs_set_threshold,
+};
+
+static int cxl_memdev_ecs_init(struct cxl_memdev *cxlmd,
+			       struct edac_dev_feature *ras_feature)
+{
+	struct cxl_ecs_context *cxl_ecs_ctx;
+	struct cxl_feat_entry *feat_entry;
+	int num_media_frus;
+
+	feat_entry = cxl_get_feature_entry(cxlmd, &CXL_FEAT_ECS_UUID);
+	if (IS_ERR(feat_entry))
+		return -EOPNOTSUPP;
+
+	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
+		return -EOPNOTSUPP;
+
+	num_media_frus = (le16_to_cpu(feat_entry->get_feat_size) -
+			  sizeof(struct cxl_ecs_rd_attrs)) /
+			  sizeof(struct cxl_ecs_fru_rd_attrs);
+	if (!num_media_frus)
+		return -EOPNOTSUPP;
+
+	cxl_ecs_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ecs_ctx),
+				   GFP_KERNEL);
+	if (!cxl_ecs_ctx)
+		return -ENOMEM;
+
+	*cxl_ecs_ctx = (struct cxl_ecs_context) {
+		.get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
+		.set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
+		.get_version = feat_entry->get_feat_ver,
+		.set_version = feat_entry->set_feat_ver,
+		.effects = le16_to_cpu(feat_entry->effects),
+		.num_media_frus = num_media_frus,
+		.cxlmd = cxlmd,
+	};
+
+	ras_feature->ft_type = RAS_FEAT_ECS;
+	ras_feature->ecs_ops = &cxl_ecs_ops;
+	ras_feature->ctx = cxl_ecs_ctx;
+	ras_feature->ecs_info.num_media_frus = num_media_frus;
+
+	return 0;
+}
+
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 {
 	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
@@ -442,6 +788,13 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 	if (rc != -EOPNOTSUPP)
 		num_ras_features++;
 
+	rc = cxl_memdev_ecs_init(cxlmd, &ras_features[num_ras_features]);
+	if (rc < 0 && rc != -EOPNOTSUPP)
+		return rc;
+
+	if (rc != -EOPNOTSUPP)
+		num_ras_features++;
+
 	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
 		 "cxl", dev_name(&cxlmd->dev));
 
-- 
2.43.0





* [PATCH v19 10/15] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (8 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 09/15] cxl/memfeature: Add CXL memory device ECS " shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 11/15] cxl/region: Add helper function to determine memory is online shiju.jose
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add support for PERFORM_MAINTENANCE mailbox command.

CXL spec 3.1 section 8.2.9.7.1 describes the Perform Maintenance command.
This command requests the device to execute the maintenance operation
specified by the maintenance operation class and the maintenance operation
subclass.
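
A minimal sketch of a caller (the operation class/subclass values and the
payload layout are hypothetical; a real caller passes the operation-specific
input defined by the spec for the selected maintenance operation):

	u8 op_data[16];	/* hypothetical operation-specific input payload */
	int rc;

	/* op_class/op_subclass select the maintenance operation to run */
	rc = cxl_do_maintenance(&cxlmd->cxlds->cxl_mbox, op_class, op_subclass,
				op_data, sizeof(op_data));
	if (rc)
		dev_dbg(dev, "PERFORM_MAINTENANCE failed: %d\n", rc);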

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/cxl/core/mbox.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h    | 17 +++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d38a5fc1384f..93ef94c57092 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -829,6 +829,40 @@ static const uuid_t log_uuid[] = {
 	[VENDOR_DEBUG_UUID] = DEFINE_CXL_VENDOR_DEBUG_UUID,
 };
 
+int cxl_do_maintenance(struct cxl_mailbox *cxl_mbox,
+		       u8 class, u8 subclass,
+		       void *data_in, size_t data_in_size)
+{
+	struct cxl_memdev_maintenance_pi {
+		struct cxl_mbox_do_maintenance_hdr hdr;
+		u8 data[];
+	}  __packed;
+	struct cxl_mbox_cmd mbox_cmd;
+	size_t hdr_size;
+
+	struct cxl_memdev_maintenance_pi *pi __free(kfree) =
+					kmalloc(cxl_mbox->payload_size, GFP_KERNEL);
+	if (!pi)
+		return -ENOMEM;
+
+	pi->hdr.op_class = class;
+	pi->hdr.op_subclass = subclass;
+	hdr_size = sizeof(pi->hdr);
+	/* Check the mbox payload has room for the maintenance data transfer */
+	if (hdr_size + data_in_size > cxl_mbox->payload_size)
+		return -ENOMEM;
+
+	memcpy(pi->data, data_in, data_in_size);
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_DO_MAINTENANCE,
+		.size_in = hdr_size + data_in_size,
+		.payload_in = pi,
+	};
+
+	return cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_do_maintenance, "CXL");
+
 /**
  * cxl_enumerate_cmds() - Enumerate commands for a device.
  * @mds: The driver data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index cad0d41b98d0..19c6ab45860c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -495,6 +495,7 @@ enum cxl_opcode {
 	CXL_MBOX_OP_GET_SUPPORTED_FEATURES	= 0x0500,
 	CXL_MBOX_OP_GET_FEATURE		= 0x0501,
 	CXL_MBOX_OP_SET_FEATURE		= 0x0502,
+	CXL_MBOX_OP_DO_MAINTENANCE	= 0x0600,
 	CXL_MBOX_OP_IDENTIFY		= 0x4000,
 	CXL_MBOX_OP_GET_PARTITION_INFO	= 0x4100,
 	CXL_MBOX_OP_SET_PARTITION_INFO	= 0x4101,
@@ -778,6 +779,19 @@ enum {
 	CXL_PMEM_SEC_PASS_USER,
 };
 
+/*
+ * Perform Maintenance CXL 3.1 Spec 8.2.9.7.1
+ */
+
+/*
+ * Perform Maintenance input payload
+ * CXL rev 3.1 section 8.2.9.7.1 Table 8-102
+ */
+struct cxl_mbox_do_maintenance_hdr {
+	u8 op_class;
+	u8 op_subclass;
+}  __packed;
+
 int cxl_internal_send_cmd(struct cxl_mailbox *cxl_mbox,
 			  struct cxl_mbox_cmd *cmd);
 int cxl_dev_state_identify(struct cxl_memdev_state *mds);
@@ -847,4 +861,7 @@ struct cxl_hdm {
 struct seq_file;
 struct dentry *cxl_debugfs_create_dir(const char *dir);
 void cxl_dpa_debug(struct seq_file *file, struct cxl_dev_state *cxlds);
+int cxl_do_maintenance(struct cxl_mailbox *cxl_mbox,
+		       u8 class, u8 subclass,
+		       void *data_in, size_t data_in_size);
 #endif /* __CXL_MEM_H__ */
-- 
2.43.0





* [PATCH v19 11/15] cxl/region: Add helper function to determine memory is online
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (9 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 10/15] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 12/15] cxl: Support for finding memory operation attributes from the current boot shiju.jose
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Add a helper function to determine whether a CXL memory device's memory
is online.

Use case: certain memory operations, for example some memory repair
operations, are permitted only when the memory is offline.
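
A sketch of the intended gating in a caller (context assumed; an
offline-only operation is refused while the memory may be online):

	/* Committed decoders imply the capacity may be mapped and online */
	if (cxl_are_decoders_committed(cxlmd))
		return -EBUSY;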

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/cxl/core/core.h   |  9 +++++++++
 drivers/cxl/core/region.c | 10 ++++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 23761340e65c..fb3b5f1229e7 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -32,8 +32,17 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port);
 struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
 u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
 		   u64 dpa);
+bool cxl_are_decoders_committed(const struct cxl_memdev *cxlmd);
 
 #else
+static inline bool cxl_are_decoders_committed(const struct cxl_memdev *cxlmd)
+{
+	/*
+	 * With no driver there is no way to check, so assume the decoders are committed.
+	 */
+	return true;
+}
+
 static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
 				 const struct cxl_memdev *cxlmd, u64 dpa)
 {
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5339237ced0f..33a3ef839f6a 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2851,6 +2851,16 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
 	return ctx.cxlr;
 }
 
+bool cxl_are_decoders_committed(const struct cxl_memdev *cxlmd)
+{
+	struct cxl_port *port = cxlmd->endpoint;
+
+	if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
+		return true;
+
+	return false;
+}
+
 static bool cxl_is_hpa_in_chunk(u64 hpa, struct cxl_region *cxlr, int pos)
 {
 	struct cxl_region_params *p = &cxlr->params;
-- 
2.43.0




* [PATCH v19 12/15] cxl: Support for finding memory operation attributes from the current boot
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (10 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 11/15] cxl/region: Add helper function to determine memory is online shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 13/15] cxl/memfeature: Add CXL memory device soft PPR control feature shiju.jose
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Certain operations on memory, such as memory repair, are permitted
only when the address and other attributes for the operation are
from the current boot. This is determined by checking whether the
memory attributes for the operation match those in the CXL gen_media
or CXL DRAM memory event records reported during the current boot.

The CXL event records must be backed up because they are cleared
in the hardware after being processed by the kernel.

Support is added for storing CXL gen_media or CXL DRAM memory event
records in xarrays. Additionally, helper functions are implemented
to find a matching record in the xarray storage based on the memory
attributes and repair type.

When matching attributes for sparing, add a validity check using the
validity flags in the DRAM event record to ensure that all attributes
required for the requested repair operation are valid and set.

This is presently supported only when CONFIG_CXL_RAS_FEATURES is enabled,
as it is currently used only for the CXL RAS functionality.
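
A sketch of the intended lookup before issuing a repair (field values are
illustrative; the structure and helper are the ones added in this patch):

	struct cxl_mem_repair_attrbs attrbs = {
		.repair_type = CXL_BANK_SPARING,
		.dpa = dpa,
		.channel = channel,
		.rank = rank,
		.bank_group = bank_group,
		.bank = bank,
	};

	/* Refuse the repair if no matching DRAM event was seen this boot */
	if (!cxl_find_rec_dram(cxlmd, &attrbs))
		return -EINVAL;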

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 drivers/cxl/core/Makefile |   2 +-
 drivers/cxl/core/mbox.c   |  11 ++-
 drivers/cxl/core/memdev.c |   9 +++
 drivers/cxl/core/ras.c    | 151 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h      |  55 ++++++++++++++
 drivers/cxl/pci.c         |   3 +
 6 files changed, 228 insertions(+), 3 deletions(-)
 create mode 100644 drivers/cxl/core/ras.c

diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 54baca513ecb..2fbdd6cd1357 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -17,4 +17,4 @@ cxl_core-y += cdat.o
 cxl_core-y += features.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
-cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o
+cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeature.o ras.o
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 93ef94c57092..435e9c0aff78 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -956,11 +956,18 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 		if (cxlr)
 			hpa = cxl_dpa_to_hpa(cxlr, cxlmd, dpa);
 
-		if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+		if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
+			if (cxl_store_rec_gen_media((struct cxl_memdev *)cxlmd, evt))
+				dev_warn(&cxlmd->dev, "CXL store rec_gen_media failed\n");
+
 			trace_cxl_general_media(cxlmd, type, cxlr, hpa,
 						&evt->gen_media);
-		else if (event_type == CXL_CPER_EVENT_DRAM)
+		} else if (event_type == CXL_CPER_EVENT_DRAM) {
+			if (cxl_store_rec_dram((struct cxl_memdev *)cxlmd, evt))
+				dev_warn(&cxlmd->dev, "CXL store rec_dram failed\n");
+
 			trace_cxl_dram(cxlmd, type, cxlr, hpa, &evt->dram);
+		}
 	}
 }
 EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, "CXL");
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 2e2e035abdaa..99cfc7c70876 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -25,8 +25,17 @@ static DEFINE_IDA(cxl_memdev_ida);
 static void cxl_memdev_release(struct device *dev)
 {
 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_event_gen_media *rec_gen_media;
+	struct cxl_event_dram *rec_dram;
+	unsigned long index;
 
 	ida_free(&cxl_memdev_ida, cxlmd->id);
+	xa_for_each(&cxlmd->rec_dram, index, rec_dram)
+		kfree(rec_dram);
+	xa_destroy(&cxlmd->rec_dram);
+	xa_for_each(&cxlmd->rec_gen_media, index, rec_gen_media)
+		kfree(rec_gen_media);
+	xa_destroy(&cxlmd->rec_gen_media);
 	kfree(cxlmd);
 }
 
diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
new file mode 100644
index 000000000000..65994eec1037
--- /dev/null
+++ b/drivers/cxl/core/ras.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * CXL RAS driver.
+ *
+ * Copyright (c) 2025 HiSilicon Limited.
+ *
+ */
+
+#include <linux/unaligned.h>
+#include <cxlmem.h>
+
+#include "trace.h"
+
+struct cxl_event_gen_media *cxl_find_rec_gen_media(struct cxl_memdev *cxlmd,
+						   struct cxl_mem_repair_attrbs *attrbs)
+{
+	struct cxl_event_gen_media *rec;
+
+	rec = xa_load(&cxlmd->rec_gen_media, attrbs->dpa);
+	if (!rec)
+		return NULL;
+
+	if (attrbs->repair_type == CXL_PPR)
+		return rec;
+
+	return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_find_rec_gen_media, "CXL");
+
+struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd,
+					 struct cxl_mem_repair_attrbs *attrbs)
+{
+	struct cxl_event_dram *rec;
+	u16 validity_flags;
+
+	rec = xa_load(&cxlmd->rec_dram, attrbs->dpa);
+	if (!rec)
+		return NULL;
+
+	validity_flags = get_unaligned_le16(rec->media_hdr.validity_flags);
+	if (!(validity_flags & CXL_DER_VALID_CHANNEL) ||
+	    !(validity_flags & CXL_DER_VALID_RANK))
+		return NULL;
+
+	switch (attrbs->repair_type) {
+	case CXL_PPR:
+		if (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		    get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask)
+			return rec;
+		break;
+	case CXL_CACHELINE_SPARING:
+		if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) ||
+		    !(validity_flags & CXL_DER_VALID_BANK) ||
+		    !(validity_flags & CXL_DER_VALID_ROW) ||
+		    !(validity_flags & CXL_DER_VALID_COLUMN))
+			return NULL;
+
+		if (rec->media_hdr.channel == attrbs->channel &&
+		    rec->media_hdr.rank == attrbs->rank &&
+		    rec->bank_group == attrbs->bank_group &&
+		    rec->bank == attrbs->bank &&
+		    get_unaligned_le24(rec->row) == attrbs->row &&
+		    get_unaligned_le16(rec->column) == attrbs->column &&
+		    (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask) &&
+		    (!(validity_flags & CXL_DER_VALID_SUB_CHANNEL) ||
+		     rec->sub_channel == attrbs->sub_channel))
+			return rec;
+		break;
+	case CXL_ROW_SPARING:
+		if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) ||
+		    !(validity_flags & CXL_DER_VALID_BANK) ||
+		    !(validity_flags & CXL_DER_VALID_ROW))
+			return NULL;
+
+		if (rec->media_hdr.channel == attrbs->channel &&
+		    rec->media_hdr.rank == attrbs->rank &&
+		    rec->bank_group == attrbs->bank_group &&
+		    rec->bank == attrbs->bank &&
+		    get_unaligned_le24(rec->row) == attrbs->row &&
+		    (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask))
+			return rec;
+		break;
+	case CXL_BANK_SPARING:
+		if (!(validity_flags & CXL_DER_VALID_BANK_GROUP) ||
+		    !(validity_flags & CXL_DER_VALID_BANK))
+			return NULL;
+
+		if (rec->media_hdr.channel == attrbs->channel &&
+		    rec->media_hdr.rank == attrbs->rank &&
+		    rec->bank_group == attrbs->bank_group &&
+		    rec->bank == attrbs->bank &&
+		    (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask))
+			return rec;
+		break;
+	case CXL_RANK_SPARING:
+		if (rec->media_hdr.channel == attrbs->channel &&
+		    rec->media_hdr.rank == attrbs->rank &&
+		    (!(validity_flags & CXL_DER_VALID_NIBBLE) ||
+		     get_unaligned_le24(rec->nibble_mask) == attrbs->nibble_mask))
+			return rec;
+		break;
+	default:
+		return NULL;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_find_rec_dram, "CXL");
+
+int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{
+	void *old_rec;
+	struct cxl_event_gen_media *rec = kmemdup(&evt->gen_media,
+						  sizeof(*rec), GFP_KERNEL);
+	if (!rec)
+		return -ENOMEM;
+
+	old_rec = xa_store(&cxlmd->rec_gen_media,
+			   le64_to_cpu(rec->media_hdr.phys_addr),
+			   rec, GFP_KERNEL);
+	if (xa_is_err(old_rec))
+		return xa_err(old_rec);
+
+	kfree(old_rec);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_store_rec_gen_media, "CXL");
+
+int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{
+	void *old_rec;
+	struct cxl_event_dram *rec = kmemdup(&evt->dram, sizeof(*rec), GFP_KERNEL);
+
+	if (!rec)
+		return -ENOMEM;
+
+	old_rec = xa_store(&cxlmd->rec_dram,
+			   le64_to_cpu(rec->media_hdr.phys_addr),
+			   rec, GFP_KERNEL);
+	if (xa_is_err(old_rec))
+		return xa_err(old_rec);
+
+	kfree(old_rec);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_store_rec_dram, "CXL");
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 19c6ab45860c..8af0a32fe20f 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -34,6 +34,41 @@
 	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
 	 CXLMDEV_RESET_NEEDED_NOT)
 
+enum cxl_mem_repair_type {
+	CXL_PPR,
+	CXL_CACHELINE_SPARING,
+	CXL_ROW_SPARING,
+	CXL_BANK_SPARING,
+	CXL_RANK_SPARING,
+	CXL_REPAIR_MAX,
+};
+
+/**
+ * struct cxl_mem_repair_attrbs - CXL memory repair attributes
+ * @dpa: DPA of memory to repair
+ * @nibble_mask: nibble mask, identifies one or more nibbles on the memory bus
+ * @row: row of memory to repair
+ * @column: column of memory to repair
+ * @channel: channel of memory to repair
+ * @sub_channel: sub channel of memory to repair
+ * @rank: rank of memory to repair
+ * @bank_group: bank group of memory to repair
+ * @bank: bank of memory to repair
+ * @repair_type: repair type. For eg. PPR, memory sparing etc.
+ */
+struct cxl_mem_repair_attrbs {
+	u64 dpa;
+	u32 nibble_mask;
+	u32 row;
+	u16 column;
+	u8 channel;
+	u8 sub_channel;
+	u8 rank;
+	u8 bank_group;
+	u8 bank;
+	enum cxl_mem_repair_type repair_type;
+};
+
 /**
  * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
  * @dev: driver core device object
@@ -46,6 +81,8 @@
  * @endpoint: connection to the CXL port topology for this memory device
  * @id: id number of this memdev instance.
  * @depth: endpoint port depth
+ * @rec_gen_media: xarray to store CXL general media records
+ * @rec_dram: xarray to store CXL DRAM records
  */
 struct cxl_memdev {
 	struct device dev;
@@ -58,6 +95,8 @@ struct cxl_memdev {
 	struct cxl_port *endpoint;
 	int id;
 	int depth;
+	struct xarray rec_gen_media;
+	struct xarray rec_dram;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
@@ -819,11 +858,27 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 #if IS_ENABLED(CONFIG_CXL_RAS_FEATURES)
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
 int devm_cxl_region_edac_register(struct cxl_region *cxlr);
+struct cxl_event_gen_media *
+cxl_find_rec_gen_media(struct cxl_memdev *cxlmd, struct cxl_mem_repair_attrbs *attrbs);
+struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd,
+					 struct cxl_mem_repair_attrbs *attrbs);
+int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt);
+int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt);
 #else
 static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 { return 0; }
 static inline int devm_cxl_region_edac_register(struct cxl_region *cxlr)
 { return 0; }
+static inline struct cxl_event_gen_media *
+cxl_find_rec_gen_media(struct cxl_memdev *cxlmd, struct cxl_mem_repair_attrbs *attrbs)
+{ return 0; }
+static inline struct cxl_event_dram *cxl_find_rec_dram(struct cxl_memdev *cxlmd,
+						       struct cxl_mem_repair_attrbs *attrbs)
+{ return 0; }
+static inline int cxl_store_rec_gen_media(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{ return 0; }
+static inline int cxl_store_rec_dram(struct cxl_memdev *cxlmd, union cxl_event *evt)
+{ return 0; }
 #endif
 
 #ifdef CONFIG_CXL_SUSPEND
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index a96e54c6259e..a895ab75b7ec 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1044,6 +1044,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	pci_save_state(pdev);
 
+	xa_init(&cxlmd->rec_gen_media);
+	xa_init(&cxlmd->rec_dram);
+
 	return rc;
 }
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 13/15] cxl/memfeature: Add CXL memory device soft PPR control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (11 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 12/15] cxl: Support for finding memory operation attributes from the current boot shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 14/15] EDAC: Update memory repair control interface for memory sparing feature shiju.jose
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Post Package Repair (PPR) maintenance operations may be supported by CXL
devices that implement CXL.mem protocol. A PPR maintenance operation
requests the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support PPR features
may implement PPR Maintenance operations. DRAM components may support two
types of PPR, hard PPR (hPPR), for a permanent row repair, and Soft PPR
(sPPR), for a temporary row repair. Soft PPR is much faster than hPPR,
but the repair is lost with a power cycle.

During the execution of a PPR Maintenance operation, a CXL memory device:
- May or may not retain data
- May or may not be able to process CXL.mem requests correctly, including
the ones that target the DPA involved in the repair.
These CXL Memory Device capabilities are specified by Restriction Flags
in the sPPR Feature and hPPR Feature.
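
For reference, the Restriction Flags decode added below in
cxl_mem_ppr_get_attrs() reduces to the following (a set bit means the
restriction applies, hence the inversion; this is a condensed
restatement of the code in this patch, not additional code):

  restriction_flags = le16_to_cpu(rd_attrs->restriction_flags);
  /* BIT(0) set: media is not accessible while the PPR operation runs */
  params->media_accessible =
      FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_MEDIA_ACCESSIBLE_MASK,
                restriction_flags) ^ 1;
  /* BIT(2) set: data is not retained across the PPR operation */
  params->data_retained =
      FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_DATA_RETAINED_MASK,
                restriction_flags) ^ 1;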

A soft PPR maintenance operation may be executed at runtime, if data is
retained and CXL.mem requests are correctly processed. For CXL devices
with DRAM components, an hPPR maintenance operation may be executed only
at boot because data is typically not retained across an hPPR
maintenance operation.

When a CXL device identifies an error on a memory component, the device
may inform the host about the need for a PPR maintenance operation by
using an Event Record, where the Maintenance Needed flag is set. The
Event Record specifies the DPA that should be repaired. A CXL device may
not keep track of the requests that have already been sent and the
information on which DPA should be repaired may be lost upon power cycle.
The userspace tool requests a maintenance operation if the number of
corrected errors reported on a CXL.mem media exceeds an error threshold.

CXL spec 3.1 section 8.2.9.7.1.2 describes the device's sPPR (soft PPR)
maintenance operation and section 8.2.9.7.1.3 describes the device's
hPPR (hard PPR) maintenance operation feature.

CXL spec 3.1 section 8.2.9.7.2.1 describes the sPPR feature discovery and
configuration.

CXL spec 3.1 section 8.2.9.7.2.2 describes the hPPR feature discovery and
configuration.

Add support for controlling the CXL memory device soft PPR (sPPR)
feature. Register with the EDAC driver, which gets the memory repair
attribute descriptors from the EDAC memory repair driver and exposes
sysfs repair control attributes for PPR to userspace. For example, CXL
PPR control for the CXL mem0 device is exposed in
/sys/bus/edac/devices/cxl_mem0/mem_repairX/
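
As an illustration only (not part of this patch), a userspace tool could
drive that interface roughly as follows; the cxl_mem0/mem_repair0
instance name and the values are taken from the documentation example
added below, and error handling is omitted:

  #include <stdio.h>

  #define PPR_DIR "/sys/bus/edac/devices/cxl_mem0/mem_repair0/"

  /* write one value to a repair control attribute */
  static int ppr_write(const char *attr, const char *val)
  {
          char path[256];
          FILE *f;

          snprintf(path, sizeof(path), PPR_DIR "%s", attr);
          f = fopen(path, "w");
          if (!f)
                  return -1;
          fprintf(f, "%s\n", val);
          return fclose(f);
  }

  int main(void)
  {
          /* DPA and nibble mask as reported in the device error record */
          ppr_write("nibble_mask", "0x8a2d");
          ppr_write("dpa", "0x300000");
          /* 1 == EDAC_DO_MEM_REPAIR: request the sPPR operation */
          return ppr_write("repair", "1");
  }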

Add checks to ensure the memory to be repaired is offline or, if online,
that it originates from a CXL DRAM or CXL gen_media error record reported
in the current boot, before requesting a PPR operation on the device.

Tested with QEMU patch for CXL PPR feature.
https://lore.kernel.org/all/20240730045722.71482-1-dave@stgolabs.net/

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/memory_repair.rst |  46 ++++
 drivers/cxl/Kconfig                  |   1 +
 drivers/cxl/core/memfeature.c        | 345 ++++++++++++++++++++++++++-
 include/linux/edac.h                 |   5 +
 4 files changed, 396 insertions(+), 1 deletion(-)

diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
index 7ccca02632f5..d7826a5dc2bf 100644
--- a/Documentation/edac/memory_repair.rst
+++ b/Documentation/edac/memory_repair.rst
@@ -104,3 +104,49 @@ sysfs
 
 Sysfs files are documented in
 `Documentation/ABI/testing/sysfs-edac-memory-repair`.
+
+Examples
+--------
+
+The memory repair usage takes the form shown in this example:
+
+1. CXL memory device Soft Post Package Repair (Soft PPR)
+
+1.1. Read device supported capabilities for the Soft PPR.
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/persist_mode
+
+0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_type
+
+ppr
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_safe_when_in_use
+
+1
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/min_dpa
+
+0x0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/max_dpa
+
+0xfffffff
+
+ Soft PPR that is safe to use with ongoing accesses to the memory
+
+ and applies to 4GiB of DPA space.
+
+1.2. Set attributes for a soft PPR for a DPA=0x300000
+
+# echo 0x8a2d > /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask
+
+# echo 0x300000 >  /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa
+
+1.3. Start soft PPR operation for the DPA=0x300000
+
+Note: The repair command returns an error if the operation is unsupported
+or if resources are not available for the repair operation.
+
+# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 892c9a3c5679..77baef31cf3c 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -166,6 +166,7 @@ config CXL_RAS_FEATURES
 	depends on CXL_MEM
 	depends on EDAC_SCRUB
 	depends on EDAC_ECS
+	depends on EDAC_MEM_REPAIR
 	help
 	  The CXL memory RAS feature control is optional and allows host to
 	  control the RAS features configurations of CXL Type 3 devices.
diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
index c8540df567f1..a97a1c5f66e2 100644
--- a/drivers/cxl/core/memfeature.c
+++ b/drivers/cxl/core/memfeature.c
@@ -14,12 +14,13 @@
 #include <linux/cleanup.h>
 #include <linux/edac.h>
 #include <linux/limits.h>
+#include <linux/unaligned.h>
 #include <cxl/features.h>
 #include <cxl.h>
 #include <cxlmem.h>
 #include "core.h"
 
-#define CXL_DEV_NUM_RAS_FEATURES	2
+#define CXL_DEV_NUM_RAS_FEATURES	3
 #define CXL_DEV_HOUR_IN_SECS	3600
 
 #define CXL_DEV_NAME_LEN	128
@@ -772,12 +773,344 @@ static int cxl_memdev_ecs_init(struct cxl_memdev *cxlmd,
 	return 0;
 }
 
+/*
+ * CXL memory soft PPR & hard PPR control definitions
+ */
+struct cxl_ppr_context {
+	uuid_t repair_uuid;
+	u8 instance;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u8 get_version;
+	u8 set_version;
+	u16 effects;
+	struct cxl_memdev *cxlmd;
+	enum edac_mem_repair_type repair_type;
+	bool persist_mode;
+	u64 dpa;
+	u32 nibble_mask;
+};
+
+/**
+ * struct cxl_memdev_ppr_params - CXL memory PPR parameter data structure.
+ * @dpa: device physical address.
+ * @op_class: PPR operation class.
+ * @op_subclass: PPR operation subclass.
+ * @media_accessible: memory media is accessible or not during PPR operation.
+ * @data_retained: data is retained or not during PPR operation.
+ */
+struct cxl_memdev_ppr_params {
+	u64 dpa;
+	u8 op_class;
+	u8 op_subclass;
+	bool media_accessible;
+	bool data_retained;
+};
+
+/*
+ * See CXL rev 3.1 @8.2.9.7.2.1 Table 8-113 sPPR Feature Readable Attributes
+ *
+ * See CXL rev 3.1 @8.2.9.7.2.2 Table 8-116 hPPR Feature Readable Attributes
+ */
+#define CXL_MEMDEV_PPR_QUERY_RESOURCE_FLAG	BIT(0)
+
+#define CXL_MEMDEV_PPR_DEVICE_INITIATED_MASK	BIT(0)
+#define CXL_MEMDEV_PPR_FLAG_DPA_SUPPORT_MASK	BIT(0)
+#define CXL_MEMDEV_PPR_FLAG_NIBBLE_SUPPORT_MASK	BIT(1)
+#define CXL_MEMDEV_PPR_FLAG_MEM_SPARING_EV_REC_SUPPORT_MASK	BIT(2)
+
+#define CXL_MEMDEV_PPR_RESTRICTION_FLAG_MEDIA_ACCESSIBLE_MASK	BIT(0)
+#define CXL_MEMDEV_PPR_RESTRICTION_FLAG_DATA_RETAINED_MASK	BIT(2)
+
+#define CXL_MEMDEV_PPR_SPARING_EV_REC_EN_MASK	BIT(0)
+
+struct cxl_memdev_repair_rd_attrs_hdr {
+	u8 max_op_latency;
+	__le16 op_cap;
+	__le16 op_mode;
+	u8 op_class;
+	u8 op_subclass;
+	u8 rsvd[9];
+}  __packed;
+
+struct cxl_memdev_ppr_rd_attrs {
+	struct cxl_memdev_repair_rd_attrs_hdr hdr;
+	u8 ppr_flags;
+	__le16 restriction_flags;
+	u8 ppr_op_mode;
+}  __packed;
+
+/*
+ * See CXL rev 3.1 @8.2.9.7.1.2 Table 8-103 sPPR Maintenance Input Payload
+ *
+ * See CXL rev 3.1 @8.2.9.7.1.3 Table 8-104 hPPR Maintenance Input Payload
+ */
+struct cxl_memdev_ppr_maintenance_attrs {
+	u8 flags;
+	__le64 dpa;
+	u8 nibble_mask[3];
+}  __packed;
+
+static int cxl_mem_ppr_get_attrs(struct cxl_ppr_context *cxl_ppr_ctx,
+				 struct cxl_memdev_ppr_params *params)
+{
+	size_t rd_data_size = sizeof(struct cxl_memdev_ppr_rd_attrs);
+	struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd;
+	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+	u16 restriction_flags;
+	size_t data_size;
+	u16 return_code;
+
+	struct cxl_memdev_ppr_rd_attrs *rd_attrs __free(kfree) =
+				kmalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &cxl_ppr_ctx->repair_uuid,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, &return_code);
+	if (!data_size)
+		return -EIO;
+
+	params->op_class = rd_attrs->hdr.op_class;
+	params->op_subclass = rd_attrs->hdr.op_subclass;
+	restriction_flags = le16_to_cpu(rd_attrs->restriction_flags);
+	params->media_accessible = FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_MEDIA_ACCESSIBLE_MASK,
+					     restriction_flags) ^ 1;
+	params->data_retained = FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_DATA_RETAINED_MASK,
+					  restriction_flags) ^ 1;
+
+	return 0;
+}
+
+static int cxl_mem_do_ppr_op(struct device *dev,
+			     struct cxl_ppr_context *cxl_ppr_ctx,
+			     struct cxl_memdev_ppr_params *rd_params)
+{
+	struct cxl_memdev_ppr_maintenance_attrs maintenance_attrs;
+	struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd;
+	struct cxl_mem_repair_attrbs attrbs = { 0 };
+
+	if (!rd_params->media_accessible || !rd_params->data_retained) {
+		/* Memory to repair must be offline */
+		if (cxl_are_decoders_committed(cxlmd))
+			return -EBUSY;
+		/* offline, so good for repair */
+	} else {
+		/* If offline all good, otherwise check for match with record */
+		if (cxl_are_decoders_committed(cxlmd)) {
+			/* Check memory to repair is from the current boot */
+			attrbs.repair_type = CXL_PPR;
+			attrbs.dpa = cxl_ppr_ctx->dpa;
+			attrbs.nibble_mask = cxl_ppr_ctx->nibble_mask;
+			if (!cxl_find_rec_dram(cxlmd, &attrbs) &&
+			    !cxl_find_rec_gen_media(cxlmd, &attrbs))
+				return -EINVAL;
+			/* Record matched, so even though online good for repair */
+		}
+	}
+
+	memset(&maintenance_attrs, 0, sizeof(maintenance_attrs));
+	maintenance_attrs.flags = 0;
+	maintenance_attrs.dpa = cpu_to_le64(cxl_ppr_ctx->dpa);
+	put_unaligned_le24(cxl_ppr_ctx->nibble_mask, maintenance_attrs.nibble_mask);
+
+	return cxl_do_maintenance(&cxlmd->cxlds->cxl_mbox, rd_params->op_class,
+				  rd_params->op_subclass, &maintenance_attrs,
+				  sizeof(maintenance_attrs));
+}
+
+static int cxl_mem_ppr_set_attrs(struct device *dev,
+				 struct cxl_ppr_context *cxl_ppr_ctx)
+{
+	struct cxl_memdev_ppr_params rd_params;
+	int ret;
+
+	ret = cxl_mem_ppr_get_attrs(cxl_ppr_ctx, &rd_params);
+	if (ret)
+		return ret;
+
+	ret = cxl_hold_region_and_dpa();
+	if (ret)
+		return ret;
+
+	ret = cxl_mem_do_ppr_op(dev, cxl_ppr_ctx, &rd_params);
+	cxl_release_region_and_dpa();
+
+	return ret;
+}
+
+static int cxl_ppr_get_repair_type(struct device *dev, void *drv_data,
+				   const char **repair_type)
+{
+	*repair_type = edac_repair_type[EDAC_PPR];
+
+	return 0;
+}
+
+static int cxl_ppr_get_persist_mode(struct device *dev, void *drv_data,
+				    bool *persist_mode)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+
+	*persist_mode = cxl_ppr_ctx->persist_mode;
+
+	return 0;
+}
+
+static int cxl_get_ppr_safe_when_in_use(struct device *dev, void *drv_data,
+					bool *safe)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+	struct cxl_memdev_ppr_params params;
+	int ret;
+
+	ret = cxl_mem_ppr_get_attrs(cxl_ppr_ctx, &params);
+	if (ret)
+		return ret;
+
+	*safe = params.media_accessible & params.data_retained;
+
+	return 0;
+}
+
+static int cxl_ppr_get_min_dpa(struct device *dev, void *drv_data,
+			       u64 *min_dpa)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+	struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*min_dpa = cxlds->dpa_res.start;
+
+	return 0;
+}
+
+static int cxl_ppr_get_max_dpa(struct device *dev, void *drv_data,
+			       u64 *max_dpa)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+	struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*max_dpa = cxlds->dpa_res.end;
+
+	return 0;
+}
+
+static int cxl_ppr_get_dpa(struct device *dev, void *drv_data,
+			   u64 *dpa)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+
+	*dpa = cxl_ppr_ctx->dpa;
+
+	return 0;
+}
+
+static int cxl_ppr_set_dpa(struct device *dev, void *drv_data, u64 dpa)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+	struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	if (dpa < cxlds->dpa_res.start || dpa > cxlds->dpa_res.end)
+		return -EINVAL;
+
+	cxl_ppr_ctx->dpa = dpa;
+
+	return 0;
+}
+
+static int cxl_ppr_get_nibble_mask(struct device *dev, void *drv_data,
+				   u32 *nibble_mask)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+
+	*nibble_mask = cxl_ppr_ctx->nibble_mask;
+
+	return 0;
+}
+
+static int cxl_ppr_set_nibble_mask(struct device *dev, void *drv_data, u32 nibble_mask)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+
+	cxl_ppr_ctx->nibble_mask = nibble_mask;
+
+	return 0;
+}
+
+static int cxl_do_ppr(struct device *dev, void *drv_data, u32 val)
+{
+	struct cxl_ppr_context *cxl_ppr_ctx = drv_data;
+
+	if (!cxl_ppr_ctx->dpa || val != EDAC_DO_MEM_REPAIR)
+		return -EINVAL;
+
+	return cxl_mem_ppr_set_attrs(dev, cxl_ppr_ctx);
+}
+
+static const struct edac_mem_repair_ops cxl_sppr_ops = {
+	.get_repair_type = cxl_ppr_get_repair_type,
+	.get_persist_mode = cxl_ppr_get_persist_mode,
+	.get_repair_safe_when_in_use = cxl_get_ppr_safe_when_in_use,
+	.get_min_dpa = cxl_ppr_get_min_dpa,
+	.get_max_dpa = cxl_ppr_get_max_dpa,
+	.get_dpa = cxl_ppr_get_dpa,
+	.set_dpa = cxl_ppr_set_dpa,
+	.get_nibble_mask = cxl_ppr_get_nibble_mask,
+	.set_nibble_mask = cxl_ppr_set_nibble_mask,
+	.do_repair = cxl_do_ppr,
+};
+
+static int cxl_memdev_soft_ppr_init(struct cxl_memdev *cxlmd,
+				    struct edac_dev_feature *ras_feature,
+				    u8 repair_inst)
+{
+	struct cxl_ppr_context *cxl_sppr_ctx;
+	struct cxl_feat_entry *feat_entry;
+
+	feat_entry = cxl_get_feature_entry(cxlmd, &CXL_FEAT_SPPR_UUID);
+	if (IS_ERR(feat_entry))
+		return -EOPNOTSUPP;
+
+	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
+		return -EOPNOTSUPP;
+
+	cxl_sppr_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_sppr_ctx),
+				    GFP_KERNEL);
+	if (!cxl_sppr_ctx)
+		return -ENOMEM;
+
+	*cxl_sppr_ctx = (struct cxl_ppr_context) {
+		.get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
+		.set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
+		.get_version = feat_entry->get_feat_ver,
+		.set_version = feat_entry->set_feat_ver,
+		.effects = le16_to_cpu(feat_entry->effects),
+		.cxlmd = cxlmd,
+		.repair_type = EDAC_PPR,
+		.persist_mode = 0,
+		.instance = repair_inst,
+	};
+	uuid_copy(&cxl_sppr_ctx->repair_uuid, &CXL_FEAT_SPPR_UUID);
+
+	ras_feature->ft_type = RAS_FEAT_MEM_REPAIR;
+	ras_feature->instance = cxl_sppr_ctx->instance;
+	ras_feature->mem_repair_ops = &cxl_sppr_ops;
+	ras_feature->ctx = cxl_sppr_ctx;
+
+	return 0;
+}
+
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 {
 	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
 	char cxl_dev_name[CXL_DEV_NAME_LEN];
 	int num_ras_features = 0;
 	u8 scrub_inst = 0;
+	u8 repair_inst = 0;
 	int rc;
 
 	rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
@@ -795,6 +1128,16 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 	if (rc != -EOPNOTSUPP)
 		num_ras_features++;
 
+	rc = cxl_memdev_soft_ppr_init(cxlmd, &ras_features[num_ras_features],
+				      repair_inst);
+	if (rc < 0 && rc != -EOPNOTSUPP)
+		return rc;
+
+	if (rc != -EOPNOTSUPP) {
+		repair_inst++;
+		num_ras_features++;
+	}
+
 	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
 		 "cxl", dev_name(&cxlmd->dev));
 
diff --git a/include/linux/edac.h b/include/linux/edac.h
index cfb2ef41ab95..060f79a7f72a 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -745,9 +745,14 @@ static inline int edac_ecs_get_desc(struct device *ecs_dev,
 #endif /* CONFIG_EDAC_ECS */
 
 enum edac_mem_repair_type {
+	EDAC_PPR,
 	EDAC_REPAIR_MAX
 };
 
+static const char * const edac_repair_type[] = {
+	[EDAC_PPR] = "ppr",
+};
+
 enum edac_mem_repair_cmd {
 	EDAC_DO_MEM_REPAIR = 1,
 };
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 14/15] EDAC: Update memory repair control interface for memory sparing feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (12 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 13/15] cxl/memfeature: Add CXL memory device soft PPR control feature shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-07 14:44 ` [PATCH v19 15/15] cxl/memfeature: Add CXL memory device memory sparing control feature shiju.jose
  2025-02-10 17:52 ` [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers Fan Ni
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Update the memory repair control interface for memory sparing features.
CXL memory sparing is one example of such a feature.

A CXL memory device may support soft and hard memory sparing at cacheline,
row, bank and rank granularities. Memory sparing is defined as
a repair function that replaces a portion of memory with a portion of
functional memory at that same granularity.

When a CXL device detects an error in memory, it may inform the host
of the need for a repair maintenance operation by using an event record
where the "maintenance needed" flag is set. The event record contains
the device physical address (DPA) and other attributes of the memory to
repair (such as bank group, bank, rank, row, column, channel, etc.).
The kernel will report the corresponding CXL general media or DRAM
trace event to userspace, and userspace tools (e.g. rasdaemon) will
initiate a repair operation in response to the device request via the
sysfs repair control.
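
For illustration (not part of this patch), a driver supporting only
row-granularity repair would populate just the callbacks it implements;
as with the existing attributes, an attribute without a get callback is
not exposed in sysfs and one without a set callback is read-only. The
"foo_" names below are hypothetical:

  struct foo_repair_ctx {
          u32 row;
  };

  static int foo_get_row(struct device *dev, void *drv_data, u32 *val)
  {
          struct foo_repair_ctx *ctx = drv_data;

          *val = ctx->row;
          return 0;
  }

  static int foo_set_row(struct device *dev, void *drv_data, u32 val)
  {
          struct foo_repair_ctx *ctx = drv_data;

          ctx->row = val;
          return 0;
  }

  static int foo_do_repair(struct device *dev, void *drv_data, u32 val)
  {
          if (val != EDAC_DO_MEM_REPAIR)
                  return -EINVAL;
          /* issue the device specific repair command here */
          return 0;
  }

  static const struct edac_mem_repair_ops foo_row_repair_ops = {
          .get_row = foo_get_row,
          .set_row = foo_set_row,
          .do_repair = foo_do_repair,
  };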

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 .../ABI/testing/sysfs-edac-memory-repair      | 57 +++++++++++++
 drivers/edac/mem_repair.c                     | 85 +++++++++++++++++++
 include/linux/edac.h                          | 28 ++++++
 3 files changed, 170 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-edac-memory-repair b/Documentation/ABI/testing/sysfs-edac-memory-repair
index c54f59e4497b..d0d70b094651 100644
--- a/Documentation/ABI/testing/sysfs-edac-memory-repair
+++ b/Documentation/ABI/testing/sysfs-edac-memory-repair
@@ -42,6 +42,14 @@ Description:
 
 		- ppr - Post package repair.
 
+		- cacheline-sparing
+
+		- row-sparing
+
+		- bank-sparing
+
+		- rank-sparing
+
 		- All other values are reserved.
 
 What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
@@ -134,6 +142,55 @@ Description:
 		related error records and trace events, for eg. CXL DRAM
 		and CXL general media error records in CXL memory devices.
 
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank_group
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/bank
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/rank
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/row
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/column
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/channel
+What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/sub_channel
+Date:		March 2025
+KernelVersion:	6.15
+Contact:	linux-edac@vger.kernel.org
+Description:
+		(RW) The control attributes for the memory to be repaired.
+		The specific values of the attributes to use depend on the
+		portion of memory to repair and will be reported to the host
+		in related error records and may be available to userspace
+		in trace events, such as CXL DRAM and CXL general media
+		error records of CXL memory devices.
+
+		When reading back these attributes, they return the current
+		values set for the memory to be repaired.
+
+		bank_group - The bank group of the memory to repair.
+
+		bank - The bank number of the memory to repair.
+
+		rank - The rank of the memory to repair. Rank is defined as a
+		set of memory devices on a channel that together execute a
+		transaction.
+
+		row - The row number of the memory to repair.
+
+		column - The column number of the memory to repair.
+
+		channel - The channel of the memory to repair. Channel is
+		defined as an interface that can be independently accessed
+		for a transaction.
+
+		sub_channel - The subchannel of the memory to repair.
+
+		The requirement to set these attributes varies with the
+		repair function. The attributes are not present in sysfs
+		unless required by the repair function.
+
+		For example, per CXL spec ver 3.1, Section 8.2.9.7.1.2 Table
+		8-103 (soft PPR) and Section 8.2.9.7.1.3 Table 8-104 (hard
+		PPR), these attributes do not need to be set, while per
+		Section 8.2.9.7.1.4 Table 8-105 (memory sparing), they must
+		be set according to the memory sparing granularity.
+
 What:		/sys/bus/edac/devices/<dev-name>/mem_repairX/repair
 Date:		March 2025
 KernelVersion:	6.15
diff --git a/drivers/edac/mem_repair.c b/drivers/edac/mem_repair.c
index 9fe310050464..47ed5edffc61 100755
--- a/drivers/edac/mem_repair.c
+++ b/drivers/edac/mem_repair.c
@@ -22,6 +22,13 @@ enum edac_mem_repair_attributes {
 	MEM_REPAIR_MIN_DPA,
 	MEM_REPAIR_MAX_DPA,
 	MEM_REPAIR_NIBBLE_MASK,
+	MEM_REPAIR_BANK_GROUP,
+	MEM_REPAIR_BANK,
+	MEM_REPAIR_RANK,
+	MEM_REPAIR_ROW,
+	MEM_REPAIR_COLUMN,
+	MEM_REPAIR_CHANNEL,
+	MEM_REPAIR_SUB_CHANNEL,
 	MEM_DO_REPAIR,
 	MEM_REPAIR_MAX_ATTRS
 };
@@ -70,6 +77,13 @@ EDAC_MEM_REPAIR_ATTR_SHOW(dpa, get_dpa, u64, "0x%llx\n")
 EDAC_MEM_REPAIR_ATTR_SHOW(min_dpa, get_min_dpa, u64, "0x%llx\n")
 EDAC_MEM_REPAIR_ATTR_SHOW(max_dpa, get_max_dpa, u64, "0x%llx\n")
 EDAC_MEM_REPAIR_ATTR_SHOW(nibble_mask, get_nibble_mask, u32, "0x%x\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(bank_group, get_bank_group, u32, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(bank, get_bank, u32, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(rank, get_rank, u32, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(row, get_row, u32, "0x%x\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(column, get_column, u32, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(channel, get_channel, u32, "%u\n")
+EDAC_MEM_REPAIR_ATTR_SHOW(sub_channel, get_sub_channel, u32, "%u\n")
 
 #define EDAC_MEM_REPAIR_ATTR_STORE(attrib, cb, type, conv_func)			\
 static ssize_t attrib##_store(struct device *ras_feat_dev,			\
@@ -99,6 +113,13 @@ EDAC_MEM_REPAIR_ATTR_STORE(persist_mode, set_persist_mode, unsigned long, kstrto
 EDAC_MEM_REPAIR_ATTR_STORE(hpa, set_hpa, u64, kstrtou64)
 EDAC_MEM_REPAIR_ATTR_STORE(dpa, set_dpa, u64, kstrtou64)
 EDAC_MEM_REPAIR_ATTR_STORE(nibble_mask, set_nibble_mask, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(bank_group, set_bank_group, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(bank, set_bank, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(rank, set_rank, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(row, set_row, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(column, set_column, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(channel, set_channel, unsigned long, kstrtoul)
+EDAC_MEM_REPAIR_ATTR_STORE(sub_channel, set_sub_channel, unsigned long, kstrtoul)
 
 #define EDAC_MEM_REPAIR_DO_OP(attrib, cb)						\
 static ssize_t attrib##_store(struct device *ras_feat_dev,				\
@@ -189,6 +210,62 @@ static umode_t mem_repair_attr_visible(struct kobject *kobj, struct attribute *a
 				return 0444;
 		}
 		break;
+	case MEM_REPAIR_BANK_GROUP:
+		if (ops->get_bank_group) {
+			if (ops->set_bank_group)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_BANK:
+		if (ops->get_bank) {
+			if (ops->set_bank)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_RANK:
+		if (ops->get_rank) {
+			if (ops->set_rank)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_ROW:
+		if (ops->get_row) {
+			if (ops->set_row)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_COLUMN:
+		if (ops->get_column) {
+			if (ops->set_column)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_CHANNEL:
+		if (ops->get_channel) {
+			if (ops->set_channel)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
+	case MEM_REPAIR_SUB_CHANNEL:
+		if (ops->get_sub_channel) {
+			if (ops->set_sub_channel)
+				return a->mode;
+			else
+				return 0444;
+		}
+		break;
 	case MEM_DO_REPAIR:
 		if (ops->do_repair)
 			return a->mode;
@@ -235,6 +312,14 @@ static int mem_repair_create_desc(struct device *dev,
 		[MEM_REPAIR_MAX_DPA] = EDAC_MEM_REPAIR_ATTR_RO(max_dpa, instance),
 		[MEM_REPAIR_NIBBLE_MASK] =
 				EDAC_MEM_REPAIR_ATTR_RW(nibble_mask, instance),
+		[MEM_REPAIR_BANK_GROUP] =
+				EDAC_MEM_REPAIR_ATTR_RW(bank_group, instance),
+		[MEM_REPAIR_BANK] = EDAC_MEM_REPAIR_ATTR_RW(bank, instance),
+		[MEM_REPAIR_RANK] = EDAC_MEM_REPAIR_ATTR_RW(rank, instance),
+		[MEM_REPAIR_ROW] = EDAC_MEM_REPAIR_ATTR_RW(row, instance),
+		[MEM_REPAIR_COLUMN] = EDAC_MEM_REPAIR_ATTR_RW(column, instance),
+		[MEM_REPAIR_CHANNEL] = EDAC_MEM_REPAIR_ATTR_RW(channel, instance),
+		[MEM_REPAIR_SUB_CHANNEL] = EDAC_MEM_REPAIR_ATTR_RW(sub_channel, instance),
 		[MEM_DO_REPAIR] = EDAC_MEM_REPAIR_ATTR_WO(repair, instance)
 	};
 
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 060f79a7f72a..a8fd01271582 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -785,6 +785,20 @@ enum edac_mem_repair_cmd {
  * @get_max_dpa: get the maximum supported device physical address (DPA).
  * @get_nibble_mask: get current nibble mask of memory to repair.
  * @set_nibble_mask: set nibble mask of memory to repair.
+ * @get_bank_group: get current bank group of memory to repair.
+ * @set_bank_group: set bank group of memory to repair.
+ * @get_bank: get current bank of memory to repair.
+ * @set_bank: set bank of memory to repair.
+ * @get_rank: get current rank of memory to repair.
+ * @set_rank: set rank of memory to repair.
+ * @get_row: get current row of memory to repair.
+ * @set_row: set row of memory to repair.
+ * @get_column: get current column of memory to repair.
+ * @set_column: set column of memory to repair.
+ * @get_channel: get current channel of memory to repair.
+ * @set_channel: set channel of memory to repair.
+ * @get_sub_channel: get current subchannel of memory to repair.
+ * @set_sub_channel: set subchannel of memory to repair.
  * @do_repair: Issue memory repair operation for the HPA/DPA and
  *	       other control attributes set for the memory to repair.
  *
@@ -805,6 +819,20 @@ struct edac_mem_repair_ops {
 	int (*get_max_dpa)(struct device *dev, void *drv_data, u64 *dpa);
 	int (*get_nibble_mask)(struct device *dev, void *drv_data, u32 *val);
 	int (*set_nibble_mask)(struct device *dev, void *drv_data, u32 val);
+	int (*get_bank_group)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_bank_group)(struct device *dev, void *drv_data, u32 val);
+	int (*get_bank)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_bank)(struct device *dev, void *drv_data, u32 val);
+	int (*get_rank)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_rank)(struct device *dev, void *drv_data, u32 val);
+	int (*get_row)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_row)(struct device *dev, void *drv_data, u32 val);
+	int (*get_column)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_column)(struct device *dev, void *drv_data, u32 val);
+	int (*get_channel)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_channel)(struct device *dev, void *drv_data, u32 val);
+	int (*get_sub_channel)(struct device *dev, void *drv_data, u32 *val);
+	int (*set_sub_channel)(struct device *dev, void *drv_data, u32 val);
 	int (*do_repair)(struct device *dev, void *drv_data, u32 val);
 };
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v19 15/15] cxl/memfeature: Add CXL memory device memory sparing control feature
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (13 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 14/15] EDAC: Update memory repair control interface for memory sparing feature shiju.jose
@ 2025-02-07 14:44 ` shiju.jose
  2025-02-10 17:52 ` [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers Fan Ni
  15 siblings, 0 replies; 22+ messages in thread
From: shiju.jose @ 2025-02-07 14:44 UTC (permalink / raw)
  To: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, yazen.ghannam, tanxiaofei, prime.zeng, roberto.sassu,
	kangkang.shen, wanghuiqiang, linuxarm, shiju.jose

From: Shiju Jose <shiju.jose@huawei.com>

Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. The subclasses
for this operation vary in terms of the scope of the sparing being
performed. The cacheline sparing subclass refers to a sparing action that
can replace a full cacheline. Row sparing is provided as an alternative to
PPR sparing functions and its scope is that of a single DDR row.
As per CXL r3.2 Table 8-125 footnote 1, memory sparing is preferred over
PPR when possible.
Bank sparing allows an entire bank to be replaced. Rank sparing is defined as
an operation in which an entire DDR rank is replaced.

Memory sparing maintenance operations may be supported by CXL devices
that implement CXL.mem protocol. A sparing maintenance operation requests
the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support memory sparing
features may implement sparing maintenance operations.

The host may issue a query command by setting the query resources flag
in the input payload (CXL spec 3.1 Table 8-105) to determine the
availability of sparing resources for a given address. In response to a
query request, the device shall report the resource availability by
producing the memory sparing event record (CXL spec 3.1 Table 8-48) in
which the Channel, Rank, Nibble Mask, Bank Group, Bank, Row, Column and
Sub-Channel fields are a copy of the values specified in the request.

During the execution of a sparing maintenance operation, a CXL memory
device:
- may not retain data
- may not be able to process CXL.mem requests correctly.
These CXL memory device capabilities are specified by restriction flags
in the memory sparing feature readable attributes.

When a CXL device identifies an error on a memory component, the device
may inform the host about the need for a memory sparing maintenance
operation by using a DRAM event record, where the 'maintenance needed'
flag may be set. The event record contains some of the DPA, Channel,
Rank, Nibble Mask, Bank Group, Bank, Row, Column and Sub-Channel fields
of the memory that should be repaired. The userspace tool requests a
maintenance operation if the 'maintenance needed' flag is set in the
CXL DRAM error record.

CXL spec 3.1 section 8.2.9.7.1.4 describes the device's memory sparing
maintenance operation feature.

CXL spec 3.1 section 8.2.9.7.2.3 describes the memory sparing feature
discovery and configuration.

Add support for controlling the CXL memory device memory sparing
feature. Register with the EDAC driver, which gets the memory repair
attribute descriptors from the EDAC memory repair driver and exposes
sysfs repair control attributes for memory sparing to userspace. For
example, CXL memory sparing control for the CXL mem0 device is exposed
in /sys/bus/edac/devices/cxl_mem0/mem_repairX/

Use case
========
1. CXL device identifies a failure in a memory component and reports it
   to userspace in a CXL DRAM trace event with the DPA and other
   attributes of the memory to repair, such as channel, rank, nibble
   mask, bank group, bank, row, column, sub-channel.

2. Rasdaemon processes the trace event and may issue a query request in
sysfs to check the resources available for memory sparing if any of the
following conditions is met.
 - 'maintenance needed' flag is set in the event record.
 - 'threshold event' flag is set for the CVME threshold feature.
 - If neither of the above applies, for example when the number of
   corrected errors reported on a CXL.mem media to userspace exceeds an
   error threshold set in the userspace policy.

3. Rasdaemon processes the memory sparing trace event and issues a
   repair request for memory sparing (see the sketch below).
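
A minimal userspace sketch of step 3 (illustrative only, not part of
this patch; the cxl_mem0/mem_repair1 instance name and the attribute
values follow the documentation example added below, error handling is
omitted):

  #include <stdio.h>

  #define SPARING_DIR "/sys/bus/edac/devices/cxl_mem0/mem_repair1/"

  /* write one value to a repair control attribute */
  static int sparing_write(const char *attr, const char *val)
  {
          char path[256];
          FILE *f;

          snprintf(path, sizeof(path), SPARING_DIR "%s", attr);
          f = fopen(path, "w");
          if (!f)
                  return -1;
          fprintf(f, "%s\n", val);
          return fclose(f);
  }

  int main(void)
  {
          /* attributes as reported in the CXL DRAM trace event */
          sparing_write("dpa", "0x700000");
          sparing_write("bank_group", "2");
          sparing_write("bank", "4");
          sparing_write("channel", "7");
          sparing_write("sub_channel", "5");
          sparing_write("rank", "9");
          sparing_write("row", "0x240a");
          sparing_write("column", "11");
          sparing_write("nibble_mask", "0x0FF");
          /* 1 == EDAC_DO_MEM_REPAIR: request the sparing operation */
          return sparing_write("repair", "1");
  }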

The kernel CXL driver shall report the memory sparing event record to
userspace with the resource availability so that rasdaemon can process
the event record and issue a repair request in sysfs for the memory
sparing operation in the CXL device.

Note: Based on feedback from the community, the 'query' sysfs attribute
is removed and reporting the memory sparing error record to userspace
is not supported. Instead, userspace issues the sparing operation and
the kernel passes it on to the CXL memory device when the 'maintenance
needed' flag is set in the DRAM event record.

Add checks to ensure the memory to be repaired is offline or, if online,
that it originates from a CXL DRAM error record reported in the current
boot, before requesting a memory sparing operation on the device.

Tested for memory sparing control feature with
   "hw/cxl: Add memory sparing control feature"
   Repository: "https://gitlab.com/shiju.jose/qemu.git"
   Branch: cxl-ras-features-2024-10-24

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 Documentation/edac/memory_repair.rst |  57 +++
 drivers/cxl/core/memfeature.c        | 557 ++++++++++++++++++++++++++-
 include/linux/edac.h                 |   8 +
 3 files changed, 620 insertions(+), 2 deletions(-)

diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst
index d7826a5dc2bf..89effa985193 100644
--- a/Documentation/edac/memory_repair.rst
+++ b/Documentation/edac/memory_repair.rst
@@ -150,3 +150,60 @@ Note: Repair command returns error if unsupported/resources are not
 available for the repair operation.
 
 # echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair
+
+2. CXL memory sparing
+
+2.1. Read device supported capabilities for the cacheline sparing.
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_type
+
+cacheline-sparing
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/persist_mode
+
+0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair_safe_when_in_use
+
+1
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/min_dpa
+
+0x0
+
+# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/max_dpa
+
+0xfffffff
+
+Sparing that is safe to use with ongoing accesses to the memory
+
+and applies to 4GiB of DPA space.
+
+2.2. Set attributes for a cacheline sparing operation for DPA=0x700000,
+     where the device reported the attributes in a CXL DRAM error event record.
+
+# echo 0x700000 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/dpa
+
+# echo 2 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank_group
+
+# echo 4 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/bank
+
+# echo 7 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/channel
+
+# echo 5 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/sub_channel
+
+# echo 9 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/rank
+
+# echo 0x240a > /sys/bus/edac/devices/cxl_mem0/mem_repair1/row
+
+# echo 11 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/column
+
+# echo 0x0FF > /sys/bus/edac/devices/cxl_mem0/mem_repair1/nibble_mask
+
+2.3. Start cacheline sparing operation
+
+Note: The repair command returns an error if the operation is unsupported,
+if resources are not available for the sparing operation, or if the memory
+to repair is online and the attributes were reported in a previous boot.
+
+# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair1/repair
diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
index a97a1c5f66e2..4f4cc76a39c7 100644
--- a/drivers/cxl/core/memfeature.c
+++ b/drivers/cxl/core/memfeature.c
@@ -19,8 +19,9 @@
 #include <cxl.h>
 #include <cxlmem.h>
 #include "core.h"
+#include "trace.h"
 
-#define CXL_DEV_NUM_RAS_FEATURES	3
+#define CXL_DEV_NUM_RAS_FEATURES	7
 #define CXL_DEV_HOUR_IN_SECS	3600
 
 #define CXL_DEV_NAME_LEN	128
@@ -1104,6 +1105,546 @@ static int cxl_memdev_soft_ppr_init(struct cxl_memdev *cxlmd,
 	return 0;
 }
 
+/* CXL memory sparing control definitions */
+enum cxl_mem_sparing_granularity {
+	CXL_MEM_SPARING_CACHELINE,
+	CXL_MEM_SPARING_ROW,
+	CXL_MEM_SPARING_BANK,
+	CXL_MEM_SPARING_RANK,
+	CXL_MEM_SPARING_MAX
+};
+
+struct cxl_mem_sparing_context {
+	struct cxl_memdev *cxlmd;
+	uuid_t repair_uuid;
+	u16 get_feat_size;
+	u16 set_feat_size;
+	u16 effects;
+	u8 instance;
+	u8 get_version;
+	u8 set_version;
+	u8 channel;
+	u8 rank;
+	u8 bank_group;
+	u32 nibble_mask;
+	u64 dpa;
+	u32 row;
+	u16 column;
+	u8 bank;
+	u8 sub_channel;
+	enum edac_mem_repair_type repair_type;
+	bool persist_mode;
+	enum cxl_mem_sparing_granularity granularity;
+};
+
+struct cxl_memdev_sparing_params {
+	u8 op_class;
+	u8 op_subclass;
+	bool cap_safe_when_in_use;
+	bool cap_hard_sparing;
+	bool cap_soft_sparing;
+};
+
+#define CXL_MEMDEV_SPARING_RD_CAP_SAFE_IN_USE_MASK	BIT(0)
+#define CXL_MEMDEV_SPARING_RD_CAP_HARD_SPARING_MASK	BIT(1)
+#define CXL_MEMDEV_SPARING_RD_CAP_SOFT_SPARING_MASK	BIT(2)
+
+#define CXL_MEMDEV_SPARING_WR_DEVICE_INITIATED_MASK	BIT(0)
+
+#define CXL_MEMDEV_SPARING_QUERY_RESOURCE_FLAG	BIT(0)
+#define CXL_MEMDEV_SET_HARD_SPARING_FLAG	BIT(1)
+#define CXL_MEMDEV_SPARING_SUB_CHANNEL_VALID_FLAG	BIT(2)
+#define CXL_MEMDEV_SPARING_NIB_MASK_VALID_FLAG	BIT(3)
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.7.2.3 Table 8-119 Memory Sparing Feature
+ * Readable Attributes.
+ */
+struct cxl_memdev_sparing_rd_attrs {
+	struct cxl_memdev_repair_rd_attrs_hdr hdr;
+	u8 rsvd;
+	__le16 restriction_flags;
+}  __packed;
+
+/*
+ * See CXL spec rev 3.1 @8.2.9.7.1.4 Table 8-105 Memory Sparing Input Payload.
+ */
+struct cxl_memdev_sparing_in_payload {
+	u8 flags;
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	__le16 column;
+	u8 sub_channel;
+}  __packed;
+
+static int cxl_mem_sparing_get_attrs(struct cxl_mem_sparing_context *cxl_sparing_ctx,
+				     struct cxl_memdev_sparing_params *params)
+{
+	size_t rd_data_size = sizeof(struct cxl_memdev_sparing_rd_attrs);
+	struct cxl_memdev *cxlmd = cxl_sparing_ctx->cxlmd;
+	struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox;
+	u16 restriction_flags;
+	size_t data_size;
+	u16 return_code;
+	struct cxl_memdev_sparing_rd_attrs *rd_attrs __free(kfree) =
+				kzalloc(rd_data_size, GFP_KERNEL);
+	if (!rd_attrs)
+		return -ENOMEM;
+
+	data_size = cxl_get_feature(cxl_mbox, &cxl_sparing_ctx->repair_uuid,
+				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
+				    rd_attrs, rd_data_size, 0, &return_code);
+	if (!data_size)
+		return -EIO;
+
+	params->op_class = rd_attrs->hdr.op_class;
+	params->op_subclass = rd_attrs->hdr.op_subclass;
+	restriction_flags = le16_to_cpu(rd_attrs->restriction_flags);
+	params->cap_safe_when_in_use = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_SAFE_IN_USE_MASK,
+						 restriction_flags) ^ 1;
+	params->cap_hard_sparing = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_HARD_SPARING_MASK,
+					     restriction_flags);
+	params->cap_soft_sparing = FIELD_GET(CXL_MEMDEV_SPARING_RD_CAP_SOFT_SPARING_MASK,
+					     restriction_flags);
+
+	return 0;
+}
+
+static struct cxl_event_dram *
+cxl_mem_get_rec_dram(struct cxl_memdev *cxlmd, struct cxl_mem_sparing_context *ctx)
+{
+	struct cxl_mem_repair_attrbs attrbs = { 0 };
+
+	attrbs.dpa = ctx->dpa;
+	attrbs.channel = ctx->channel;
+	attrbs.rank = ctx->rank;
+	attrbs.nibble_mask = ctx->nibble_mask;
+	switch (ctx->repair_type) {
+	case EDAC_CACHELINE_SPARING:
+		attrbs.repair_type = CXL_CACHELINE_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		attrbs.row = ctx->row;
+		attrbs.column = ctx->column;
+		attrbs.sub_channel = ctx->sub_channel;
+		break;
+	case EDAC_ROW_SPARING:
+		attrbs.repair_type = CXL_ROW_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		attrbs.row = ctx->row;
+		break;
+	case EDAC_BANK_SPARING:
+		attrbs.repair_type = CXL_BANK_SPARING;
+		attrbs.bank_group = ctx->bank_group;
+		attrbs.bank = ctx->bank;
+		break;
+	case EDAC_RANK_SPARING:
+		attrbs.repair_type = CXL_RANK_SPARING;
+		break;
+	default:
+		return NULL;
+	}
+
+	/*
+	 * Check memory to repair is from the current boot
+	 */
+	return cxl_find_rec_dram(cxlmd, &attrbs);
+}
+
+static int cxl_mem_do_sparing_op(struct device *dev,
+				 struct cxl_mem_sparing_context *cxl_sparing_ctx,
+				 struct cxl_memdev_sparing_params *rd_params)
+{
+	struct cxl_memdev *cxlmd = cxl_sparing_ctx->cxlmd;
+	struct cxl_memdev_sparing_in_payload sparing_pi;
+	struct cxl_event_dram *rec = NULL;
+	u16 validity_flags = 0;
+
+	if (!rd_params->cap_safe_when_in_use) {
+		/*
+		 * Memory to repair must be offline
+		 */
+		if (cxl_are_decoders_committed(cxlmd))
+			return -EBUSY;
+		/*
+		 * offline, so good for repair
+		 */
+	} else {
+		/*
+		 * If offline all good, otherwise check for match with record
+		 */
+		if (cxl_are_decoders_committed(cxlmd)) {
+			rec = cxl_mem_get_rec_dram(cxlmd, cxl_sparing_ctx);
+			if (!rec)
+				return -EINVAL;
+			/*
+			 * Record matched, so even though online good for repair
+			 */
+			validity_flags = get_unaligned_le16(rec->media_hdr.validity_flags);
+			if (!validity_flags)
+				return -EINVAL;
+		}
+	}
+
+	memset(&sparing_pi, 0, sizeof(sparing_pi));
+	sparing_pi.flags = FIELD_PREP(CXL_MEMDEV_SPARING_QUERY_RESOURCE_FLAG, 0);
+	if (cxl_sparing_ctx->persist_mode)
+		sparing_pi.flags |=
+			FIELD_PREP(CXL_MEMDEV_SET_HARD_SPARING_FLAG, 1);
+
+	switch (cxl_sparing_ctx->repair_type) {
+	case EDAC_CACHELINE_SPARING:
+		sparing_pi.column = cpu_to_le16(cxl_sparing_ctx->column);
+		/*
+		 * Sub-channel is an optional attribute.
+		 */
+		if (!rec || (validity_flags & CXL_DER_VALID_SUB_CHANNEL)) {
+			sparing_pi.flags |=
+				FIELD_PREP(CXL_MEMDEV_SPARING_SUB_CHANNEL_VALID_FLAG, 1);
+			sparing_pi.sub_channel = cxl_sparing_ctx->sub_channel;
+		}
+		fallthrough;
+	case EDAC_ROW_SPARING:
+		put_unaligned_le24(cxl_sparing_ctx->row, sparing_pi.row);
+		fallthrough;
+	case EDAC_BANK_SPARING:
+		sparing_pi.bank_group = cxl_sparing_ctx->bank_group;
+		sparing_pi.bank = cxl_sparing_ctx->bank;
+		fallthrough;
+	case EDAC_RANK_SPARING:
+		sparing_pi.rank = cxl_sparing_ctx->rank;
+		fallthrough;
+	default:
+		sparing_pi.channel = cxl_sparing_ctx->channel;
+		if ((rec && (validity_flags & CXL_DER_VALID_NIBBLE)) ||
+		    (!rec && (!cxl_sparing_ctx->nibble_mask ||
+			     (cxl_sparing_ctx->nibble_mask & 0xFFFFFF)))) {
+			sparing_pi.flags |=
+				FIELD_PREP(CXL_MEMDEV_SPARING_NIB_MASK_VALID_FLAG, 1);
+			put_unaligned_le24(cxl_sparing_ctx->nibble_mask,
+					   sparing_pi.nibble_mask);
+		}
+		break;
+	}
+
+	return cxl_do_maintenance(&cxlmd->cxlds->cxl_mbox, rd_params->op_class,
+				  rd_params->op_subclass, &sparing_pi, sizeof(sparing_pi));
+}
+
+static int cxl_mem_sparing_set_attrs(struct device *dev,
+				     struct cxl_mem_sparing_context *ctx)
+{
+	struct cxl_memdev_sparing_params rd_params;
+	int ret;
+
+	ret = cxl_mem_sparing_get_attrs(ctx, &rd_params);
+	if (ret)
+		return ret;
+
+	ret = cxl_hold_region_and_dpa();
+	if (ret)
+		return ret;
+	ret = cxl_mem_do_sparing_op(dev, ctx, &rd_params);
+	cxl_release_region_and_dpa();
+
+	return ret;
+}
+
+static int cxl_mem_sparing_get_repair_type(struct device *dev, void *drv_data,
+					   const char **repair_type)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+
+	switch (ctx->repair_type) {
+	case EDAC_CACHELINE_SPARING:
+	case EDAC_ROW_SPARING:
+	case EDAC_BANK_SPARING:
+	case EDAC_RANK_SPARING:
+		*repair_type = edac_repair_type[ctx->repair_type];
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+#define CXL_SPARING_GET_ATTR(attrib, data_type)					\
+static int cxl_mem_sparing_get_##attrib(struct device *dev, void *drv_data,	\
+					data_type *val)				\
+{										\
+	struct cxl_mem_sparing_context *ctx = drv_data;				\
+										\
+	*val = ctx->attrib;							\
+										\
+	return 0;								\
+}
+CXL_SPARING_GET_ATTR(persist_mode, bool)
+CXL_SPARING_GET_ATTR(dpa, u64)
+CXL_SPARING_GET_ATTR(nibble_mask, u32)
+CXL_SPARING_GET_ATTR(bank_group, u32)
+CXL_SPARING_GET_ATTR(bank, u32)
+CXL_SPARING_GET_ATTR(rank, u32)
+CXL_SPARING_GET_ATTR(row, u32)
+CXL_SPARING_GET_ATTR(column, u32)
+CXL_SPARING_GET_ATTR(channel, u32)
+CXL_SPARING_GET_ATTR(sub_channel, u32)
+
+#define CXL_SPARING_SET_ATTR(attrib, data_type)					\
+static int cxl_mem_sparing_set_##attrib(struct device *dev, void *drv_data,	\
+					data_type val)				\
+{										\
+	struct cxl_mem_sparing_context *ctx = drv_data;				\
+										\
+	ctx->attrib = val;							\
+										\
+	return 0;								\
+}
+CXL_SPARING_SET_ATTR(nibble_mask, u32)
+CXL_SPARING_SET_ATTR(bank_group, u32)
+CXL_SPARING_SET_ATTR(bank, u32)
+CXL_SPARING_SET_ATTR(rank, u32)
+CXL_SPARING_SET_ATTR(row, u32)
+CXL_SPARING_SET_ATTR(column, u32)
+CXL_SPARING_SET_ATTR(channel, u32)
+CXL_SPARING_SET_ATTR(sub_channel, u32)
+
+static int cxl_mem_sparing_set_persist_mode(struct device *dev, void *drv_data,
+					    bool persist_mode)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev_sparing_params params;
+	int ret;
+
+	ret = cxl_mem_sparing_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	if ((persist_mode && params.cap_hard_sparing) ||
+	    (!persist_mode && params.cap_soft_sparing))
+		ctx->persist_mode = persist_mode;
+	else
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+static int cxl_get_mem_sparing_safe_when_in_use(struct device *dev, void *drv_data,
+						bool *safe)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev_sparing_params params;
+	int ret;
+
+	ret = cxl_mem_sparing_get_attrs(ctx, &params);
+	if (ret)
+		return ret;
+
+	*safe = params.cap_safe_when_in_use;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_get_min_dpa(struct device *dev, void *drv_data,
+				       u64 *min_dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*min_dpa = cxlds->dpa_res.start;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_get_max_dpa(struct device *dev, void *drv_data,
+				       u64 *max_dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	*max_dpa = cxlds->dpa_res.end;
+
+	return 0;
+}
+
+static int cxl_mem_sparing_set_dpa(struct device *dev, void *drv_data, u64 dpa)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+	struct cxl_memdev *cxlmd = ctx->cxlmd;
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	if (dpa < cxlds->dpa_res.start || dpa > cxlds->dpa_res.end)
+		return -EINVAL;
+
+	ctx->dpa = dpa;
+
+	return 0;
+}
+
+static int cxl_do_mem_sparing(struct device *dev, void *drv_data, u32 val)
+{
+	struct cxl_mem_sparing_context *ctx = drv_data;
+
+	if (val != EDAC_DO_MEM_REPAIR)
+		return -EINVAL;
+
+	return cxl_mem_sparing_set_attrs(dev, ctx);
+}
+
+#define RANK_OPS \
+	.get_repair_type = cxl_mem_sparing_get_repair_type, \
+	.get_persist_mode = cxl_mem_sparing_get_persist_mode, \
+	.set_persist_mode = cxl_mem_sparing_set_persist_mode, \
+	.get_repair_safe_when_in_use = cxl_get_mem_sparing_safe_when_in_use, \
+	.get_min_dpa = cxl_mem_sparing_get_min_dpa, \
+	.get_max_dpa = cxl_mem_sparing_get_max_dpa, \
+	.get_dpa = cxl_mem_sparing_get_dpa, \
+	.set_dpa = cxl_mem_sparing_set_dpa, \
+	.get_nibble_mask = cxl_mem_sparing_get_nibble_mask, \
+	.set_nibble_mask = cxl_mem_sparing_set_nibble_mask, \
+	.get_rank = cxl_mem_sparing_get_rank, \
+	.set_rank = cxl_mem_sparing_set_rank, \
+	.get_channel = cxl_mem_sparing_get_channel, \
+	.set_channel = cxl_mem_sparing_set_channel, \
+	.do_repair = cxl_do_mem_sparing
+
+#define BANK_OPS \
+	RANK_OPS, \
+	.get_bank_group = cxl_mem_sparing_get_bank_group, \
+	.set_bank_group = cxl_mem_sparing_set_bank_group, \
+	.get_bank = cxl_mem_sparing_get_bank, \
+	.set_bank = cxl_mem_sparing_set_bank
+
+#define ROW_OPS \
+	BANK_OPS, \
+	.get_row = cxl_mem_sparing_get_row, \
+	.set_row = cxl_mem_sparing_set_row
+
+#define CACHELINE_OPS \
+	ROW_OPS, \
+	.get_column = cxl_mem_sparing_get_column, \
+	.set_column = cxl_mem_sparing_set_column, \
+	.get_sub_channel = cxl_mem_sparing_get_sub_channel, \
+	.set_sub_channel = cxl_mem_sparing_set_sub_channel
+
+static const struct edac_mem_repair_ops cxl_rank_sparing_ops = {
+	RANK_OPS,
+};
+
+static const struct edac_mem_repair_ops cxl_bank_sparing_ops = {
+	BANK_OPS,
+};
+
+static const struct edac_mem_repair_ops cxl_row_sparing_ops = {
+	ROW_OPS,
+};
+
+static const struct edac_mem_repair_ops cxl_cacheline_sparing_ops = {
+	CACHELINE_OPS,
+};
+
+struct cxl_mem_sparing_desc {
+	const uuid_t repair_uuid;
+	enum edac_mem_repair_type repair_type;
+	enum cxl_mem_sparing_granularity granularity;
+	const struct edac_mem_repair_ops *repair_ops;
+};
+
+static const struct cxl_mem_sparing_desc mem_sparing_desc[] = {
+	{
+		.repair_uuid = CXL_FEAT_CACHELINE_SPARING_UUID,
+		.repair_type = EDAC_CACHELINE_SPARING,
+		.granularity = CXL_MEM_SPARING_CACHELINE,
+		.repair_ops = &cxl_cacheline_sparing_ops,
+	},
+	{
+		.repair_uuid = CXL_FEAT_ROW_SPARING_UUID,
+		.repair_type = EDAC_ROW_SPARING,
+		.granularity = CXL_MEM_SPARING_ROW,
+		.repair_ops = &cxl_row_sparing_ops,
+	},
+	{
+		.repair_uuid = CXL_FEAT_BANK_SPARING_UUID,
+		.repair_type = EDAC_BANK_SPARING,
+		.granularity = CXL_MEM_SPARING_BANK,
+		.repair_ops = &cxl_bank_sparing_ops,
+	},
+	{
+		.repair_uuid = CXL_FEAT_RANK_SPARING_UUID,
+		.repair_type = EDAC_RANK_SPARING,
+		.granularity = CXL_MEM_SPARING_RANK,
+		.repair_ops = &cxl_rank_sparing_ops,
+	},
+};
+
+static int cxl_memdev_sparing_init(struct cxl_memdev *cxlmd,
+				   struct edac_dev_feature *ras_feature,
+				   const struct cxl_mem_sparing_desc *desc,
+				   u8 repair_inst)
+{
+	struct cxl_mem_sparing_context *cxl_sparing_ctx;
+	struct cxl_memdev_sparing_params rd_params;
+	struct cxl_feat_entry *feat_entry;
+	int ret;
+
+	feat_entry = cxl_get_feature_entry(cxlmd, &desc->repair_uuid);
+	if (IS_ERR(feat_entry))
+		return -EOPNOTSUPP;
+
+	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
+		return -EOPNOTSUPP;
+
+	cxl_sparing_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_sparing_ctx),
+				       GFP_KERNEL);
+	if (!cxl_sparing_ctx)
+		return -ENOMEM;
+
+	*cxl_sparing_ctx = (struct cxl_mem_sparing_context) {
+		.get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
+		.set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
+		.get_version = feat_entry->get_feat_ver,
+		.set_version = feat_entry->set_feat_ver,
+		.effects = le16_to_cpu(feat_entry->effects),
+		.cxlmd = cxlmd,
+		.repair_type = desc->repair_type,
+		.granularity = desc->granularity,
+		.instance = repair_inst++,
+	};
+	uuid_copy(&cxl_sparing_ctx->repair_uuid, &desc->repair_uuid);
+
+	/*
+	 * Read CXL device's sparing capabilities.
+	 */
+	ret = cxl_mem_sparing_get_attrs(cxl_sparing_ctx, &rd_params);
+	if (ret)
+		return ret;
+
+	/*
+	 * Set default value for persist_mode.
+	 */
+	if ((rd_params.cap_soft_sparing && rd_params.cap_hard_sparing) ||
+	    rd_params.cap_soft_sparing)
+		cxl_sparing_ctx->persist_mode = 0;
+	else if (rd_params.cap_hard_sparing)
+		cxl_sparing_ctx->persist_mode = 1;
+	else
+		return -EOPNOTSUPP;
+
+	ras_feature->ft_type = RAS_FEAT_MEM_REPAIR;
+	ras_feature->instance = cxl_sparing_ctx->instance;
+	ras_feature->mem_repair_ops = desc->repair_ops;
+	ras_feature->ctx = cxl_sparing_ctx;
+
+	return 0;
+}
+
 int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 {
 	struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
@@ -1111,7 +1652,7 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 	int num_ras_features = 0;
 	u8 scrub_inst = 0;
 	u8 repair_inst = 0;
-	int rc;
+	int rc, i;
 
 	rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
 				   scrub_inst);
@@ -1138,6 +1679,18 @@ int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
 		num_ras_features++;
 	}
 
+	for (i = 0; i < CXL_MEM_SPARING_MAX; i++) {
+		rc = cxl_memdev_sparing_init(cxlmd, &ras_features[num_ras_features],
+					     &mem_sparing_desc[i], repair_inst);
+		if (rc == -EOPNOTSUPP)
+			continue;
+		if (rc < 0)
+			return rc;
+
+		repair_inst++;
+		num_ras_features++;
+	}
+
 	snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
 		 "cxl", dev_name(&cxlmd->dev));
 
diff --git a/include/linux/edac.h b/include/linux/edac.h
index a8fd01271582..a80a91fd0167 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -746,11 +746,19 @@ static inline int edac_ecs_get_desc(struct device *ecs_dev,
 
 enum edac_mem_repair_type {
 	EDAC_PPR,
+	EDAC_CACHELINE_SPARING,
+	EDAC_ROW_SPARING,
+	EDAC_BANK_SPARING,
+	EDAC_RANK_SPARING,
 	EDAC_REPAIR_MAX
 };
 
 static const char * const edac_repair_type[] = {
 	[EDAC_PPR] = "ppr",
+	[EDAC_CACHELINE_SPARING] = "cacheline-sparing",
+	[EDAC_ROW_SPARING] = "row-sparing",
+	[EDAC_BANK_SPARING] = "bank-sparing",
+	[EDAC_RANK_SPARING] = "rank-sparing",
 };
 
 enum edac_mem_repair_cmd {
-- 
2.43.0
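
As a side note on the new repair-type strings above: they are what userspace reads
back from a repair instance's sysfs directory under /sys/bus/edac/devices/<dev>/.
A minimal illustrative reader follows; the attribute path passed on the command
line is an assumption (the exact layout is defined by the EDAC memory repair
patches earlier in the series), not something taken from this patch.

/*
 * Illustrative only: print the repair type of an EDAC memory repair
 * instance, e.g. "rank-sparing" or "ppr" as defined in the hunk above.
 * Usage (hypothetical path):
 *   ./repair_type /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_type
 */
#include <stdio.h>

int main(int argc, char **argv)
{
	char type[32] = "";
	FILE *f;

	if (argc < 2)
		return 1;
	f = fopen(argv[1], "r");
	if (!f)
		return 1;
	if (fscanf(f, "%31s", type) == 1)
		printf("repair_type: %s\n", type);
	fclose(f);
	return 0;
}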




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
  2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
                   ` (14 preceding siblings ...)
  2025-02-07 14:44 ` [PATCH v19 15/15] cxl/memfeature: Add CXL memory device memory sparing control feature shiju.jose
@ 2025-02-10 17:52 ` Fan Ni
  2025-02-11 16:55   ` Shiju Jose
  15 siblings, 1 reply; 22+ messages in thread
From: Fan Ni @ 2025-02-10 17:52 UTC (permalink / raw)
  To: shiju.jose
  Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel,
	linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	nifan.cxl, tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
	wanghuiqiang, linuxarm, a.manzanares, nmtadam.samsung,
	anisa.su887

On Fri, Feb 07, 2025 at 02:44:29PM +0000, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
> 
> The CXL patches of this series has dependency on Dave's CXL fwctl
> series [1].
> 
> The code is based on v3 of CXL fwctl series [1] posted by Dave and
> v3 of FWCTL series [2] posted by Jason and rebased on top of
> v6.14-rc1.
> 
> [1]: https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jiang@intel.com/
> [2]: https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@nvidia.com/#r
> 
> 
> Userspace code for CXL memory repair features [3] and
> sample boot-script for CXL memory repair [4].
> 
> [3]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
> [4]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/
>   

Hi Shiju,
Is this series the same as in branch
https://github.com/shijujose4/linux/tree/edac-enhancement-ras-features_v19?

I hit some compile errors when trying to test with the above branch
directly.

Here are two cases where I found the code cannot compile. Please check
if it is a false alarm.

Case 1: CONFIG_CXL_RAS_FEATURES=m

fan:~/cxl/linux-edac$ cat .config | egrep -i "edac|cxl|ras" | grep -v "^#"
CONFIG_ACPI_RAS2=y
CONFIG_ACPI_APEI_EINJ_CXL=y
CONFIG_PCIEAER_CXL=y
CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=y
CONFIG_CXL_MEM_RAW_COMMANDS=y
CONFIG_CXL_ACPI=y
CONFIG_CXL_PMEM=y
CONFIG_CXL_MEM=y
CONFIG_CXL_FWCTL=y
CONFIG_CXL_PORT=y
CONFIG_CXL_SUSPEND=y
CONFIG_CXL_REGION=y
CONFIG_CXL_REGION_INVALIDATION_TEST=y
CONFIG_CXL_RAS_FEATURES=m
CONFIG_MMC_SDHCI_OF_ARASAN=y
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=y
CONFIG_EDAC_LEGACY_SYSFS=y
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_DECODE_MCE=m
CONFIG_EDAC_GHES=m
CONFIG_EDAC_SCRUB=y
CONFIG_EDAC_ECS=y
CONFIG_EDAC_MEM_REPAIR=y
CONFIG_EDAC_IGEN6=m
CONFIG_RAS=y
CONFIG_MEM_ACPI_RAS2=y
CONFIG_DEV_DAX_CXL=m
fan:~/cxl/linux-edac$


fan:~/cxl/linux-edac$ make -j16
mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool 
  CALL    scripts/checksyscalls.sh
  INSTALL libsubcmd_headers
  UPD     include/generated/utsversion.h
  CC      init/version-timestamp.o
  KSYMS   .tmp_vmlinux0.kallsyms.S
  AS      .tmp_vmlinux0.kallsyms.o
  LD      .tmp_vmlinux1
ld: vmlinux.o: in function `cxl_region_probe':
/home/fan/cxl/linux-edac/drivers/cxl/core/region.c:3456:(.text+0x7b296f): undefined reference to `devm_cxl_region_edac_register'
ld: vmlinux.o: in function `cxl_mem_probe':
/home/fan/cxl/linux-edac/drivers/cxl/mem.c:188:(.text+0x7b8ad1): undefined reference to `devm_cxl_memdev_edac_register'
make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
make: *** [Makefile:251: __sub-make] Error 2

With CONFIG_CXL_RAS_FEATURES=y, it compiles fine.
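
For reference, the usual ways to keep built-in callers such as cxl_mem_probe()
linking when the provider can be modular are either a Kconfig constraint or
IS_REACHABLE()-guarded stubs. A minimal sketch of the stub approach, assuming
the prototype lives in a CXL core header (illustrative only, not taken from
this series):

/*
 * Sketch: fall back to a no-op stub when the RAS feature code is not
 * reachable from the caller, e.g. a built-in caller with
 * CONFIG_CXL_RAS_FEATURES=m as in Case 1 above.
 */
#if IS_REACHABLE(CONFIG_CXL_RAS_FEATURES)
int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
#else
static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
{
	return 0;
}
#endif

The trade-off is that the feature is then silently unavailable in that
configuration, which is why constraining the Kconfig option (as done later in
this thread) can be preferable.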


CASE 2: CONFIG_EDAC=m

fan:~/cxl/linux-edac$ cat .config | egrep -i "edac|cxl|ras" | grep -v "^#"
CONFIG_CRASH_RESERVE=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_HOTPLUG=y
CONFIG_CRASH_MAX_MEMORY_RANGES=8192
CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y
CONFIG_ARCH_DEFAULT_CRASH_DUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_HOTPLUG=y
CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION=y
CONFIG_ACPI_RAS2=y
CONFIG_ACPI_APEI_EINJ_CXL=y
CONFIG_PCIEAER_CXL=y
CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=y
CONFIG_CXL_MEM_RAW_COMMANDS=y
CONFIG_CXL_ACPI=y
CONFIG_CXL_PMEM=y
CONFIG_CXL_MEM=y
CONFIG_CXL_FWCTL=y
CONFIG_CXL_PORT=y
CONFIG_CXL_SUSPEND=y
CONFIG_CXL_REGION=y
CONFIG_CXL_REGION_INVALIDATION_TEST=y
CONFIG_CXL_RAS_FEATURES=y
CONFIG_MMC_SDHCI_OF_ARASAN=y
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=m
CONFIG_EDAC_LEGACY_SYSFS=y
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_DECODE_MCE=m
CONFIG_EDAC_GHES=m
CONFIG_EDAC_SCRUB=y
CONFIG_EDAC_ECS=y
CONFIG_EDAC_MEM_REPAIR=y
CONFIG_EDAC_IGEN6=m
CONFIG_RAS=y
CONFIG_MEM_ACPI_RAS2=y
CONFIG_DEV_DAX_CXL=m
fan:~/cxl/linux-edac$ 

fan:~/cxl/linux-edac$ make -j16
mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool 
  CALL    scripts/checksyscalls.sh
  INSTALL libsubcmd_headers
  UPD     include/generated/utsversion.h
  CC      init/version-timestamp.o
  KSYMS   .tmp_vmlinux0.kallsyms.S
  AS      .tmp_vmlinux0.kallsyms.o
  LD      .tmp_vmlinux1
ld: vmlinux.o: in function `devm_cxl_region_edac_register':
/home/fan/cxl/linux-edac/drivers/cxl/core/memfeature.c:1720:(.text+0x7b665d): undefined reference to `edac_dev_register'
ld: vmlinux.o: in function `devm_cxl_memdev_edac_register':
/home/fan/cxl/linux-edac/drivers/cxl/core/memfeature.c:1697:(.text+0x7b7241): undefined reference to `edac_dev_register'
ld: vmlinux.o: in function `ras2_probe':
/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xb0ecc8): undefined reference to `edac_dev_register'
make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
make: *** [Makefile:251: __sub-make] Error 2
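
Case 2 has the same root cause at a different layer: with CONFIG_EDAC=m the
EDAC core symbols cannot be reached from built-in callers. A caller-side guard
is one way to compile the registration path out in that configuration; a
Kconfig rule that forbids the combination is the other. Sketch only, with the
body elided (this is not from the series):

int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
{
	/*
	 * edac_dev_register() lives in the EDAC core; when that is a
	 * module and this code is built in, it is unreachable.
	 */
	if (!IS_REACHABLE(CONFIG_EDAC))
		return 0;

	/* ... assemble ras_features[] and register with the EDAC bus,
	 * as in the hunks quoted earlier in this thread ... */
	return 0;
}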



Fan


-- 
Fan Ni


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry
  2025-02-07 14:44 ` [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry shiju.jose
@ 2025-02-11  2:39   ` Dave Jiang
  2025-02-11 20:46     ` Shiju Jose
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Jiang @ 2025-02-11  2:39 UTC (permalink / raw)
  To: shiju.jose, linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, jonathan.cameron, alison.schofield, vishal.l.verma,
	ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
	rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
	james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
	duenwen, gthelen, wschwartz, dferguson, wbs, nifan.cxl,
	tanxiaofei, prime.zeng, roberto.sassu, kangkang.shen,
	wanghuiqiang, linuxarm



On 2/7/25 7:44 AM, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
> 
> Add helper function to retrieve a feature entry from the supported
> features list, if supported.
> 
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
> ---
>  drivers/cxl/core/features.c | 21 +++++++++++++++++++++
>  include/cxl/features.h      |  2 ++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
> index 5f64185a5c7a..bf175e69cda1 100644
> --- a/drivers/cxl/core/features.c
> +++ b/drivers/cxl/core/features.c
> @@ -43,6 +43,27 @@ bool is_cxl_feature_exclusive(struct cxl_feat_entry *entry)
>  }
>  EXPORT_SYMBOL_NS_GPL(is_cxl_feature_exclusive, "CXL");
>  
> +struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
> +					     const uuid_t *feat_uuid)
> +{
> +	struct cxl_features_state *cxlfs = cxlmd->cxlfs;
> +	struct cxl_feat_entry *feat_entry;
> +	int count;
> +
> +	/*
> +	 * Retrieve the feature entry from the supported features list,
> +	 * if the feature is supported.
> +	 */
> +	feat_entry = cxlfs->entries;
> +	for (count = 0; count < cxlfs->num_features; count++, feat_entry++) {
> +		if (uuid_equal(&feat_entry->uuid, feat_uuid))
> +			return feat_entry;
> +	}
> +
> +	return ERR_PTR(-ENOENT);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_get_feature_entry, "CXL");

You probably don't need this if the memfeature code is in the CXL core.

DJ

> +
>  size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
>  		       enum cxl_get_feat_selection selection,
>  		       void *feat_out, size_t feat_out_size, u16 offset,
> diff --git a/include/cxl/features.h b/include/cxl/features.h
> index e52d0573f504..563d966beee5 100644
> --- a/include/cxl/features.h
> +++ b/include/cxl/features.h
> @@ -68,6 +68,8 @@ struct cxl_features_state {
>  };
>  
>  struct cxl_mailbox;
> +struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
> +					     const uuid_t *feat_uuid);
>  size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
>  		       enum cxl_get_feat_selection selection,
>  		       void *feat_out, size_t feat_out_size, u16 offset,
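
For cross-reference, the memory sparing patch (15/15, quoted further up in this
archive) consumes the helper as below, condensed from cxl_memdev_sparing_init();
that caller also lives under drivers/cxl/core/:

	feat_entry = cxl_get_feature_entry(cxlmd, &desc->repair_uuid);
	if (IS_ERR(feat_entry))
		return -EOPNOTSUPP;

	if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
		return -EOPNOTSUPP;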



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
  2025-02-10 17:52 ` [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers Fan Ni
@ 2025-02-11 16:55   ` Shiju Jose
  2025-02-11 18:43     ` Fan Ni
  0 siblings, 1 reply; 22+ messages in thread
From: Shiju Jose @ 2025-02-11 16:55 UTC (permalink / raw)
  To: Fan Ni
  Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel,
	linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, Jonathan Cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang, Linuxarm,
	a.manzanares, nmtadam.samsung, anisa.su887

>-----Original Message-----
>From: Fan Ni <nifan.cxl@gmail.com>
>Sent: 10 February 2025 17:53
>To: Shiju Jose <shiju.jose@huawei.com>
>Subject: Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS
>control feature driver + CXL/ACPI-RAS2 drivers
>
>On Fri, Feb 07, 2025 at 02:44:29PM +0000, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> The CXL patches of this series has dependency on Dave's CXL fwctl
>> series [1].
>>
>> The code is based on v3 of CXL fwctl series [1] posted by Dave and
>> v3 of FWCTL series [2] posted by Jason and rebased on top of
>> v6.14-rc1.
>>
>> [1]:
>> https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jiang@
>> intel.com/
>> [2]:
>> https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@nvid
>> ia.com/#r
>>
>>
>> Userspace code for CXL memory repair features [3] and sample
>> boot-script for CXL memory repair [4].
>>
>> [3]:
>> https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.c
>> om/
>> [4]:
>> https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.c
>> om/
>>
>
>Hi Shiju,
>Is this series the same as in branch
>https://github.com/shijujose4/linux/tree/edac-enhancement-ras-features_v19?
>
>I hit some compile errors when trying to test with the above branch directly.
>
>Here are two cases where I found the code cannot compile. Please check if it is a
>false alarm.
>
>Case 1: CONFIG_CXL_RAS_FEATURES=m
>
>fan:~/cxl/linux-edac$ cat .config | egrep -i "edac|cxl|ras" | grep -v "^#"
>CONFIG_ACPI_RAS2=y
>CONFIG_ACPI_APEI_EINJ_CXL=y
>CONFIG_PCIEAER_CXL=y
>CONFIG_CXL_BUS=y
>CONFIG_CXL_PCI=y
>CONFIG_CXL_MEM_RAW_COMMANDS=y
>CONFIG_CXL_ACPI=y
>CONFIG_CXL_PMEM=y
>CONFIG_CXL_MEM=y
>CONFIG_CXL_FWCTL=y
>CONFIG_CXL_PORT=y
>CONFIG_CXL_SUSPEND=y
>CONFIG_CXL_REGION=y
>CONFIG_CXL_REGION_INVALIDATION_TEST=y
>CONFIG_CXL_RAS_FEATURES=m
>CONFIG_MMC_SDHCI_OF_ARASAN=y
>CONFIG_EDAC_ATOMIC_SCRUB=y
>CONFIG_EDAC_SUPPORT=y
>CONFIG_EDAC=y
>CONFIG_EDAC_LEGACY_SYSFS=y
>CONFIG_EDAC_DEBUG=y
>CONFIG_EDAC_DECODE_MCE=m
>CONFIG_EDAC_GHES=m
>CONFIG_EDAC_SCRUB=y
>CONFIG_EDAC_ECS=y
>CONFIG_EDAC_MEM_REPAIR=y
>CONFIG_EDAC_IGEN6=m
>CONFIG_RAS=y
>CONFIG_MEM_ACPI_RAS2=y
>CONFIG_DEV_DAX_CXL=m
>fan:~/cxl/linux-edac$
>
>
>fan:~/cxl/linux-edac$ make -j16
>mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
>O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool
>  CALL    scripts/checksyscalls.sh
>  INSTALL libsubcmd_headers
>  UPD     include/generated/utsversion.h
>  CC      init/version-timestamp.o
>  KSYMS   .tmp_vmlinux0.kallsyms.S
>  AS      .tmp_vmlinux0.kallsyms.o
>  LD      .tmp_vmlinux1
>ld: vmlinux.o: in function `cxl_region_probe':
>/home/fan/cxl/linux-edac/drivers/cxl/core/region.c:3456:(.text+0x7b296f):
>undefined reference to `devm_cxl_region_edac_register'
>ld: vmlinux.o: in function `cxl_mem_probe':
>/home/fan/cxl/linux-edac/drivers/cxl/mem.c:188:(.text+0x7b8ad1): undefined
>reference to `devm_cxl_memdev_edac_register'
>make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
>make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
>make: *** [Makefile:251: __sub-make] Error 2
>
>When compiling with CONFIG_CXL_RAS_FEATURES=y, it compiles fine.
>
Hi Fan,

Thanks for checking this and reporting.

This error occurs with CONFIG_CXL_RAS_FEATURES=m, CONFIG_CXL_BUS=y and CONFIG_CXL_MEM=y.
I have now changed CONFIG_CXL_RAS_FEATURES from tristate to bool, since it only implements
the interface functions for the CXL RAS features.
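For illustration only (a rough sketch, not code from this series; the return and
parameter types below are assumptions), a bool option relies on the usual
IS_ENABLED() stub pattern, so built-in callers such as region.c and mem.c never
reference a module symbol:

struct cxl_memdev;
struct cxl_region;

#if IS_ENABLED(CONFIG_CXL_RAS_FEATURES)
int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd);
int devm_cxl_region_edac_register(struct cxl_region *cxlr);
#else
/* no-op stubs when the RAS feature control is not built in */
static inline int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
{
	return 0;
}
static inline int devm_cxl_region_edac_register(struct cxl_region *cxlr)
{
	return 0;
}
#endif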
>
>CASE 2: CONFIG_EDAC=m
>
>fan:~/cxl/linux-edac$ cat .config | egrep -i "edac|cxl|ras" | grep -v "^#"
>CONFIG_CRASH_RESERVE=y
>CONFIG_CRASH_DUMP=y
>CONFIG_CRASH_HOTPLUG=y
>CONFIG_CRASH_MAX_MEMORY_RANGES=8192
>CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y
>CONFIG_ARCH_DEFAULT_CRASH_DUMP=y
>CONFIG_ARCH_SUPPORTS_CRASH_HOTPLUG=y
>CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION=y
>CONFIG_ACPI_RAS2=y
>CONFIG_ACPI_APEI_EINJ_CXL=y
>CONFIG_PCIEAER_CXL=y
>CONFIG_CXL_BUS=y
>CONFIG_CXL_PCI=y
>CONFIG_CXL_MEM_RAW_COMMANDS=y
>CONFIG_CXL_ACPI=y
>CONFIG_CXL_PMEM=y
>CONFIG_CXL_MEM=y
>CONFIG_CXL_FWCTL=y
>CONFIG_CXL_PORT=y
>CONFIG_CXL_SUSPEND=y
>CONFIG_CXL_REGION=y
>CONFIG_CXL_REGION_INVALIDATION_TEST=y
>CONFIG_CXL_RAS_FEATURES=y
>CONFIG_MMC_SDHCI_OF_ARASAN=y
>CONFIG_EDAC_ATOMIC_SCRUB=y
>CONFIG_EDAC_SUPPORT=y
>CONFIG_EDAC=m
>CONFIG_EDAC_LEGACY_SYSFS=y
>CONFIG_EDAC_DEBUG=y
>CONFIG_EDAC_DECODE_MCE=m
>CONFIG_EDAC_GHES=m
>CONFIG_EDAC_SCRUB=y
>CONFIG_EDAC_ECS=y
>CONFIG_EDAC_MEM_REPAIR=y
>CONFIG_EDAC_IGEN6=m
>CONFIG_RAS=y
>CONFIG_MEM_ACPI_RAS2=y
>CONFIG_DEV_DAX_CXL=m
>fan:~/cxl/linux-edac$
>
>fan:~/cxl/linux-edac$ make -j16
>mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
>O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool
>  CALL    scripts/checksyscalls.sh
>  INSTALL libsubcmd_headers
>  UPD     include/generated/utsversion.h
>  CC      init/version-timestamp.o
>  KSYMS   .tmp_vmlinux0.kallsyms.S
>  AS      .tmp_vmlinux0.kallsyms.o
>  LD      .tmp_vmlinux1
>ld: vmlinux.o: in function `devm_cxl_region_edac_register':
>/home/fan/cxl/linux-
>edac/drivers/cxl/core/memfeature.c:1720:(.text+0x7b665d): undefined
>reference to `edac_dev_register'
>ld: vmlinux.o: in function `devm_cxl_memdev_edac_register':
>/home/fan/cxl/linux-
>edac/drivers/cxl/core/memfeature.c:1697:(.text+0x7b7241): undefined
>reference to `edac_dev_register'
>ld: vmlinux.o: in function `ras2_probe':
>/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xb0ecc8):
>undefined reference to `edac_dev_register'
>make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
>make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
>make: *** [Makefile:251: __sub-make] Error 2
>

Here the symbol 'edac_dev_register' cannot be resolved with CONFIG_CXL_BUS=y, CONFIG_CXL_RAS_FEATURES=y and
CONFIG_EDAC=m.
I have modified CXL_RAS_FEATURES to depend on EDAC=y || (CXL_BUS=m && EDAC=m)
to fix this.
>
>
>Fan
>
>
Thanks,
Shiju


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
  2025-02-11 16:55   ` Shiju Jose
@ 2025-02-11 18:43     ` Fan Ni
  2025-02-11 20:42       ` Shiju Jose
  0 siblings, 1 reply; 22+ messages in thread
From: Fan Ni @ 2025-02-11 18:43 UTC (permalink / raw)
  To: Shiju Jose
  Cc: Fan Ni, linux-edac, linux-cxl, linux-acpi, linux-mm,
	linux-kernel, linux-doc, bp, tony.luck, rafael, lenb, mchehab,
	dan.j.williams, dave, Jonathan Cameron, dave.jiang,
	alison.schofield, vishal.l.verma, ira.weiny, david,
	Vilas.Sridharan, leo.duran, Yazen.Ghannam, rientjes, jiaqiyan,
	Jon.Grimm, dave.hansen, naoya.horiguchi, james.morse, jthoughton,
	somasundaram.a, erdemaktas, pgonda, duenwen, gthelen, wschwartz,
	dferguson, wbs, tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang, Linuxarm,
	a.manzanares, nmtadam.samsung, anisa.su887

On Tue, Feb 11, 2025 at 04:55:49PM +0000, Shiju Jose wrote:
> >-----Original Message-----
> >From: Fan Ni <nifan.cxl@gmail.com>
> >Sent: 10 February 2025 17:53
> >To: Shiju Jose <shiju.jose@huawei.com>
> >Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
> >acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org;
> >linux-doc@vger.kernel.org; bp@alien8.de; tony.luck@intel.com;
> >rafael@kernel.org; lenb@kernel.org; mchehab@kernel.org;
> >dan.j.williams@intel.com; dave@stgolabs.net; Jonathan Cameron
> ><jonathan.cameron@huawei.com>; dave.jiang@intel.com;
> >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
> >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
> >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
> >Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
> >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
> >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
> >duenwen@google.com; gthelen@google.com;
> >wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
> >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei
> ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto
> >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com;
> >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm
> ><linuxarm@huawei.com>; a.manzanares@samsung.com;
> >nmtadam.samsung@gmail.com; anisa.su887@gmail.com
> >Subject: Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS
> >control feature driver + CXL/ACPI-RAS2 drivers
> >
> >On Fri, Feb 07, 2025 at 02:44:29PM +0000, shiju.jose@huawei.com wrote:
> >> From: Shiju Jose <shiju.jose@huawei.com>
> >>
> >> The CXL patches of this series has dependency on Dave's CXL fwctl
> >> series [1].
> >>
> >> The code is based on v3 of CXL fwctl series [1] posted by Dave and
> >> v3 of FWCTL series [2] posted by Jason and rebased on top of
> >> v6.14-rc1.
> >>
> >> [1]:
> >> https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jiang@
> >> intel.com/
> >> [2]:
> >> https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@nvid
> >> ia.com/#r
> >>
> >>
> >> Userspace code for CXL memory repair features [3] and sample
> >> boot-script for CXL memory repair [4].
> >>
> >> [3]:
> >> https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.c
> >> om/
> >> [4]:
> >> https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.c
> >> om/
> >>
> >
> >Hi Shiju,
> >Is this series the same as in branch
> >https://github.com/shijujose4/linux/tree/edac-enhancement-ras-features_v19?
> >
> >I hit some compile errors when trying to test with the above branch directly.
> >
> >Here are two cases where I found the code cannot compile. Please check if it is a
> >false alarm.
> >
> >Case 1: CONFIG_CXL_RAS_FEATURES=m
...
> >
> >fan:~/cxl/linux-edac$ make -j16
> >mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
> >O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool
> >  CALL    scripts/checksyscalls.sh
> >  INSTALL libsubcmd_headers
> >  UPD     include/generated/utsversion.h
> >  CC      init/version-timestamp.o
> >  KSYMS   .tmp_vmlinux0.kallsyms.S
> >  AS      .tmp_vmlinux0.kallsyms.o
> >  LD      .tmp_vmlinux1
> >ld: vmlinux.o: in function `cxl_region_probe':
> >/home/fan/cxl/linux-edac/drivers/cxl/core/region.c:3456:(.text+0x7b296f):
> >undefined reference to `devm_cxl_region_edac_register'
> >ld: vmlinux.o: in function `cxl_mem_probe':
> >/home/fan/cxl/linux-edac/drivers/cxl/mem.c:188:(.text+0x7b8ad1): undefined
> >reference to `devm_cxl_memdev_edac_register'
> >make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
> >make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
> >make: *** [Makefile:251: __sub-make] Error 2
> >
> >When compiling with CONFIG_CXL_RAS_FEATURES=y, it compiles fine.
> >
> Hi Fan,
> 
> Thanks for checking this and reporting.
> 
> This error occurs with CONFIG_CXL_RAS_FEATURES=m, CONFIG_CXL_BUS=y and CONFIG_CXL_MEM=y.
> I have now changed CONFIG_CXL_RAS_FEATURES from tristate to bool, since it only implements
> the interface functions for the CXL RAS features.
> >
> >CASE 2: CONFIG_EDAC=m
> >
... 
> >fan:~/cxl/linux-edac$ make -j16
> >mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
> >O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C objtool
> >  CALL    scripts/checksyscalls.sh
> >  INSTALL libsubcmd_headers
> >  UPD     include/generated/utsversion.h
> >  CC      init/version-timestamp.o
> >  KSYMS   .tmp_vmlinux0.kallsyms.S
> >  AS      .tmp_vmlinux0.kallsyms.o
> >  LD      .tmp_vmlinux1
> >ld: vmlinux.o: in function `devm_cxl_region_edac_register':
> >/home/fan/cxl/linux-
> >edac/drivers/cxl/core/memfeature.c:1720:(.text+0x7b665d): undefined
> >reference to `edac_dev_register'
> >ld: vmlinux.o: in function `devm_cxl_memdev_edac_register':
> >/home/fan/cxl/linux-
> >edac/drivers/cxl/core/memfeature.c:1697:(.text+0x7b7241): undefined
> >reference to `edac_dev_register'
> >ld: vmlinux.o: in function `ras2_probe':
> >/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xb0ecc8):
> >undefined reference to `edac_dev_register'
> >make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
> >make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
> >make: *** [Makefile:251: __sub-make] Error 2
> >
> 
> Here the symbol 'edac_dev_register' cannot be resolved with CONFIG_CXL_BUS=y,
> CONFIG_CXL_RAS_FEATURES=y and CONFIG_EDAC=m.
> I have modified CXL_RAS_FEATURES to depend on EDAC=y || (CXL_BUS=m && EDAC=m)
> to fix this.
Hi Shiju,
Did you mean the following fix?

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 77baef31cf3c..8615f329baa2 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -162,11 +162,12 @@ config CXL_REGION_INVALIDATION_TEST
          say N.

 config CXL_RAS_FEATURES
-       tristate "CXL: Memory RAS features"
+       bool "CXL: Memory RAS features"
        depends on CXL_MEM
        depends on EDAC_SCRUB
        depends on EDAC_ECS
        depends on EDAC_MEM_REPAIR
+       depends on EDAC=y || (CXL_BUS=m && EDAC=m)
        help
          The CXL memory RAS feature control is optional and allows host to
          control the RAS features configurations of CXL Type 3 devices.



With the fix, I still see errors with the following config.

fan:~/cxl/linux-edac$ cat .config | egrep "EDAC|CXL|RAS" | grep -v "^#"
CONFIG_ACPI_RAS2=y
CONFIG_ACPI_APEI_EINJ_CXL=y
CONFIG_PCIEAER_CXL=y
CONFIG_CXL_BUS=m
CONFIG_CXL_PCI=m
CONFIG_CXL_MEM_RAW_COMMANDS=y
CONFIG_CXL_ACPI=m
CONFIG_CXL_PMEM=m
CONFIG_CXL_MEM=m
CONFIG_CXL_FWCTL=y
CONFIG_CXL_PORT=m
CONFIG_CXL_SUSPEND=y
CONFIG_CXL_REGION=y
CONFIG_CXL_REGION_INVALIDATION_TEST=y
CONFIG_CXL_RAS_FEATURES=y
CONFIG_MMC_SDHCI_OF_ARASAN=y
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=m
CONFIG_EDAC_LEGACY_SYSFS=y
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_DECODE_MCE=m
CONFIG_EDAC_GHES=m
CONFIG_EDAC_SCRUB=y
CONFIG_EDAC_ECS=y
CONFIG_EDAC_MEM_REPAIR=y
CONFIG_EDAC_IGEN6=m
CONFIG_RAS=y
CONFIG_MEM_ACPI_RAS2=y
CONFIG_DEV_DAX_CXL=m

ld: vmlinux.o: in function `ras2_probe':
/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xaeb5c8): undefined reference to `edac_dev_register'
make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
make: *** [Makefile:251: __sub-make] Error 2

It seems the ACPI RAS2 driver depends on EDAC being built in.
When changing to CONFIG_EDAC=y, it compiles fine.

Fan



> >
> >
> >Fan
> >
> >
> Thanks,
> Shiju

-- 
Fan Ni


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers
  2025-02-11 18:43     ` Fan Ni
@ 2025-02-11 20:42       ` Shiju Jose
  0 siblings, 0 replies; 22+ messages in thread
From: Shiju Jose @ 2025-02-11 20:42 UTC (permalink / raw)
  To: Fan Ni
  Cc: linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel,
	linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, Jonathan Cameron, dave.jiang, alison.schofield,
	vishal.l.verma, ira.weiny, david, Vilas.Sridharan, leo.duran,
	Yazen.Ghannam, rientjes, jiaqiyan, Jon.Grimm, dave.hansen,
	naoya.horiguchi, james.morse, jthoughton, somasundaram.a,
	erdemaktas, pgonda, duenwen, gthelen, wschwartz, dferguson, wbs,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang, Linuxarm,
	a.manzanares, nmtadam.samsung, anisa.su887



>-----Original Message-----
>From: Fan Ni <nifan.cxl@gmail.com>
>Sent: 11 February 2025 18:44
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: Fan Ni <nifan.cxl@gmail.com>; linux-edac@vger.kernel.org; linux-
>cxl@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-
>kernel@vger.kernel.org; linux-doc@vger.kernel.org; bp@alien8.de;
>tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan
>Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com;
>Jon.Grimm@amd.com; dave.hansen@linux.intel.com;
>naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com;
>duenwen@google.com; gthelen@google.com;
>wschwartz@amperecomputing.com; dferguson@amperecomputing.com;
>wbs@os.amperecomputing.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao
>(B) <prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>;
>Linuxarm <linuxarm@huawei.com>; a.manzanares@samsung.com;
>nmtadam.samsung@gmail.com; anisa.su887@gmail.com
>Subject: Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS
>control feature driver + CXL/ACPI-RAS2 drivers
>
>On Tue, Feb 11, 2025 at 04:55:49PM +0000, Shiju Jose wrote:
>> >-----Original Message-----
>> >From: Fan Ni <nifan.cxl@gmail.com>
>> >Sent: 10 February 2025 17:53
>> >To: Shiju Jose <shiju.jose@huawei.com>
>> >Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux-
>> >acpi@vger.kernel.org; linux-mm@kvack.org;
>> >linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org;
>> >bp@alien8.de; tony.luck@intel.com; rafael@kernel.org;
>> >lenb@kernel.org; mchehab@kernel.org; dan.j.williams@intel.com;
>> >dave@stgolabs.net; Jonathan Cameron <jonathan.cameron@huawei.com>;
>> >dave.jiang@intel.com; alison.schofield@intel.com;
>> >vishal.l.verma@intel.com; ira.weiny@intel.com; david@redhat.com;
>> >Vilas.Sridharan@amd.com; leo.duran@amd.com;
>Yazen.Ghannam@amd.com;
>> >rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>> >dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>> >james.morse@arm.com; jthoughton@google.com;
>somasundaram.a@hpe.com;
>> >erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>> >gthelen@google.com; wschwartz@amperecomputing.com;
>> >dferguson@amperecomputing.com; wbs@os.amperecomputing.com;
>> >nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
>> ><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>> >kangkang.shen@futurewei.com; wanghuiqiang
><wanghuiqiang@huawei.com>;
>> >Linuxarm <linuxarm@huawei.com>; a.manzanares@samsung.com;
>> >nmtadam.samsung@gmail.com; anisa.su887@gmail.com
>> >Subject: Re: [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC
>> >RAS control feature driver + CXL/ACPI-RAS2 drivers
>> >
>> >On Fri, Feb 07, 2025 at 02:44:29PM +0000, shiju.jose@huawei.com wrote:
>> >> From: Shiju Jose <shiju.jose@huawei.com>
>> >>
>> >> The CXL patches of this series has dependency on Dave's CXL fwctl
>> >> series [1].
>> >>
>> >> The code is based on v3 of CXL fwctl series [1] posted by Dave and
>> >> v3 of FWCTL series [2] posted by Jason and rebased on top of
>> >> v6.14-rc1.
>> >>
>> >> [1]:
>> >> https://lore.kernel.org/linux-cxl/20250204220430.4146187-1-dave.jia
>> >> ng@
>> >> intel.com/
>> >> [2]:
>> >> https://lore.kernel.org/linux-cxl/0-v3-960f17f90f17+516-fwctl_jgg@n
>> >> vid
>> >> ia.com/#r
>> >>
>> >>
>> >> Userspace code for CXL memory repair features [3] and sample
>> >> boot-script for CXL memory repair [4].
>> >>
>> >> [3]:
>> >> https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawe
>> >> i.c
>> >> om/
>> >> [4]:
>> >> https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawe
>> >> i.c
>> >> om/
>> >>
>> >
>> >Hi Shiju,
>> >Is this series the same as in branch
>> >https://github.com/shijujose4/linux/tree/edac-enhancement-ras-
>features_v19?
>> >
>> >I hit some compile errors when trying to test with the above branch directly.
>> >
>> >Here are two cases where I found the code cannot compile. Please
>> >check if it is a false alarm.
>> >
>> >Case 1: CONFIG_CXL_RAS_FEATURES=m
>...
>> >
>> >fan:~/cxl/linux-edac$ make -j16
>> >mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
>> >O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C
>objtool
>> >  CALL    scripts/checksyscalls.sh
>> >  INSTALL libsubcmd_headers
>> >  UPD     include/generated/utsversion.h
>> >  CC      init/version-timestamp.o
>> >  KSYMS   .tmp_vmlinux0.kallsyms.S
>> >  AS      .tmp_vmlinux0.kallsyms.o
>> >  LD      .tmp_vmlinux1
>> >ld: vmlinux.o: in function `cxl_region_probe':
>> >/home/fan/cxl/linux-edac/drivers/cxl/core/region.c:3456:(.text+0x7b296f):
>> >undefined reference to `devm_cxl_region_edac_register'
>> >ld: vmlinux.o: in function `cxl_mem_probe':
>> >/home/fan/cxl/linux-edac/drivers/cxl/mem.c:188:(.text+0x7b8ad1):
>> >undefined reference to `devm_cxl_memdev_edac_register'
>> >make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
>> >make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error
>> >2
>> >make: *** [Makefile:251: __sub-make] Error 2
>> >
>> >When compiling with CONFIG_CXL_RAS_FEATURES=y, it compiles fine.
>> >
>> Hi Fan,
>>
>> Thanks for checking this and reporting.
>>
>> This error occurs with CONFIG_CXL_RAS_FEATURES=m, CONFIG_CXL_BUS=y
>> and CONFIG_CXL_MEM=y.
>> I have now changed CONFIG_CXL_RAS_FEATURES from tristate to bool, since it
>> only implements the interface functions for the CXL RAS features.
>> >
>> >CASE 2: CONFIG_EDAC=m
>> >
>...
>> >fan:~/cxl/linux-edac$ make -j16
>> >mkdir -p /home/fan/cxl/linux-edac/tools/objtool && make
>> >O=/home/fan/cxl/linux-edac subdir=tools/objtool --no-print-directory -C
>objtool
>> >  CALL    scripts/checksyscalls.sh
>> >  INSTALL libsubcmd_headers
>> >  UPD     include/generated/utsversion.h
>> >  CC      init/version-timestamp.o
>> >  KSYMS   .tmp_vmlinux0.kallsyms.S
>> >  AS      .tmp_vmlinux0.kallsyms.o
>> >  LD      .tmp_vmlinux1
>> >ld: vmlinux.o: in function `devm_cxl_region_edac_register':
>> >/home/fan/cxl/linux-
>> >edac/drivers/cxl/core/memfeature.c:1720:(.text+0x7b665d): undefined
>> >reference to `edac_dev_register'
>> >ld: vmlinux.o: in function `devm_cxl_memdev_edac_register':
>> >/home/fan/cxl/linux-
>> >edac/drivers/cxl/core/memfeature.c:1697:(.text+0x7b7241): undefined
>> >reference to `edac_dev_register'
>> >ld: vmlinux.o: in function `ras2_probe':
>> >/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xb0ecc8):
>> >undefined reference to `edac_dev_register'
>> >make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
>> >make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error
>> >2
>> >make: *** [Makefile:251: __sub-make] Error 2
>> >
>>
>> Here the symbol 'edac_dev_register' cannot be resolved with CONFIG_CXL_BUS=y,
>> CONFIG_CXL_RAS_FEATURES=y and CONFIG_EDAC=m.
>> I have modified CXL_RAS_FEATURES to depend on EDAC=y || (CXL_BUS=m &&
>> EDAC=m) to fix this.
>Hi Shiju,
>Did you mean the following fix?
>
>diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>index 77baef31cf3c..8615f329baa2 100644
>--- a/drivers/cxl/Kconfig
>+++ b/drivers/cxl/Kconfig
>@@ -162,11 +162,12 @@ config CXL_REGION_INVALIDATION_TEST
>          say N.
>
> config CXL_RAS_FEATURES
>-       tristate "CXL: Memory RAS features"
>+       bool "CXL: Memory RAS features"
>        depends on CXL_MEM
>        depends on EDAC_SCRUB
>        depends on EDAC_ECS
>        depends on EDAC_MEM_REPAIR
>+       depends on EDAC=y || (CXL_BUS=m && EDAC=m)
>        help
>          The CXL memory RAS feature control is optional and allows host to
>          control the RAS features configurations of CXL Type 3 devices.

Hi Fan,

Yes. 
>
>
>
>With the fix, I still see errors with the following config.

Sorry, I did not share the RAS2 Kconfig change, assuming you were checking only the CXL part.
Please see the RAS2 change below, which still requires some improvement for the case CONFIG_MEM_ACPI_RAS2=m (one possible direction is sketched after the diff).

diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
index c907e44360c8..881d6a642457 100644
--- a/drivers/ras/Kconfig
+++ b/drivers/ras/Kconfig
@@ -49,7 +49,7 @@ config RAS_FMPM
 config MEM_ACPI_RAS2
        tristate "Memory ACPI RAS2 driver"
        depends on ACPI_RAS2
-       depends on EDAC_SCRUB
+       depends on EDAC=y && EDAC_SCRUB
        help
          The driver binds to the platform device added by the ACPI RAS2
          table parser. Use a PCC channel subspace for communicating with  
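
One possible refinement for the modular case (a sketch only, not the fix posted
above): depending directly on the tristate EDAC symbol forces a built-in RAS2
driver to have EDAC=y, while still allowing MEM_ACPI_RAS2=m together with EDAC=m:

# sketch of an alternative dependency, not the diff above
config MEM_ACPI_RAS2
	tristate "Memory ACPI RAS2 driver"
	depends on ACPI_RAS2
	depends on EDAC
	depends on EDAC_SCRUB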


Thanks,
Shiju
>
>fan:~/cxl/linux-edac$ cat .config | egrep "EDAC|CXL|RAS" | grep -v "^#"
>CONFIG_ACPI_RAS2=y
>CONFIG_ACPI_APEI_EINJ_CXL=y
>CONFIG_PCIEAER_CXL=y
>CONFIG_CXL_BUS=m
>CONFIG_CXL_PCI=m
>CONFIG_CXL_MEM_RAW_COMMANDS=y
>CONFIG_CXL_ACPI=m
>CONFIG_CXL_PMEM=m
>CONFIG_CXL_MEM=m
>CONFIG_CXL_FWCTL=y
>CONFIG_CXL_PORT=m
>CONFIG_CXL_SUSPEND=y
>CONFIG_CXL_REGION=y
>CONFIG_CXL_REGION_INVALIDATION_TEST=y
>CONFIG_CXL_RAS_FEATURES=y
>CONFIG_MMC_SDHCI_OF_ARASAN=y
>CONFIG_EDAC_ATOMIC_SCRUB=y
>CONFIG_EDAC_SUPPORT=y
>CONFIG_EDAC=m
>CONFIG_EDAC_LEGACY_SYSFS=y
>CONFIG_EDAC_DEBUG=y
>CONFIG_EDAC_DECODE_MCE=m
>CONFIG_EDAC_GHES=m
>CONFIG_EDAC_SCRUB=y
>CONFIG_EDAC_ECS=y
>CONFIG_EDAC_MEM_REPAIR=y
>CONFIG_EDAC_IGEN6=m
>CONFIG_RAS=y
>CONFIG_MEM_ACPI_RAS2=y
>CONFIG_DEV_DAX_CXL=m
>
>ld: vmlinux.o: in function `ras2_probe':
>/home/fan/cxl/linux-edac/drivers/ras/acpi_ras2.c:363:(.text+0xaeb5c8):
>undefined reference to `edac_dev_register'
>make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
>make[1]: *** [/home/fan/cxl/linux-edac/Makefile:1226: vmlinux] Error 2
>make: *** [Makefile:251: __sub-make] Error 2
>
>It seems the ACPI RAS2 driver depends on EDAC being built in.
>When changing to CONFIG_EDAC=y, it compiles fine.
>
>Fan
[...]



^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry
  2025-02-11  2:39   ` Dave Jiang
@ 2025-02-11 20:46     ` Shiju Jose
  0 siblings, 0 replies; 22+ messages in thread
From: Shiju Jose @ 2025-02-11 20:46 UTC (permalink / raw)
  To: Dave Jiang, linux-edac, linux-cxl, linux-acpi, linux-mm, linux-kernel
  Cc: linux-doc, bp, tony.luck, rafael, lenb, mchehab, dan.j.williams,
	dave, Jonathan Cameron, alison.schofield, vishal.l.verma,
	ira.weiny, david, Vilas.Sridharan, leo.duran, Yazen.Ghannam,
	rientjes, jiaqiyan, Jon.Grimm, dave.hansen, naoya.horiguchi,
	james.morse, jthoughton, somasundaram.a, erdemaktas, pgonda,
	duenwen, gthelen, wschwartz, dferguson, wbs, nifan.cxl,
	tanxiaofei, Zengtao (B),
	Roberto Sassu, kangkang.shen, wanghuiqiang, Linuxarm

>-----Original Message-----
>From: Dave Jiang <dave.jiang@intel.com>
>Sent: 11 February 2025 02:40
>To: Shiju Jose <shiju.jose@huawei.com>; linux-edac@vger.kernel.org; linux-
>cxl@vger.kernel.org; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-
>kernel@vger.kernel.org
>Cc: linux-doc@vger.kernel.org; bp@alien8.de; tony.luck@intel.com;
>rafael@kernel.org; lenb@kernel.org; mchehab@kernel.org;
>dan.j.williams@intel.com; dave@stgolabs.net; Jonathan Cameron
><jonathan.cameron@huawei.com>; alison.schofield@intel.com;
>vishal.l.verma@intel.com; ira.weiny@intel.com; david@redhat.com;
>Vilas.Sridharan@amd.com; leo.duran@amd.com; Yazen.Ghannam@amd.com;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>dferguson@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>;
>Linuxarm <linuxarm@huawei.com>
>Subject: Re: [PATCH v19 07/15] cxl: Add helper function to retrieve a feature
>entry
>
>
>
>On 2/7/25 7:44 AM, shiju.jose@huawei.com wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> Add helper function to retrieve a feature entry from the supported
>> features list, if supported.
>>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>> ---
>>  drivers/cxl/core/features.c | 21 +++++++++++++++++++++
>>  include/cxl/features.h      |  2 ++
>>  2 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
>> index 5f64185a5c7a..bf175e69cda1 100644
>> --- a/drivers/cxl/core/features.c
>> +++ b/drivers/cxl/core/features.c
>> @@ -43,6 +43,27 @@ bool is_cxl_feature_exclusive(struct cxl_feat_entry
>> *entry)  }  EXPORT_SYMBOL_NS_GPL(is_cxl_feature_exclusive, "CXL");
>>
>> +struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
>> +					     const uuid_t *feat_uuid)
>> +{
>> +	struct cxl_features_state *cxlfs = cxlmd->cxlfs;
>> +	struct cxl_feat_entry *feat_entry;
>> +	int count;
>> +
>> +	/*
>> +	 * Retrieve the feature entry from the supported features list,
>> +	 * if the feature is supported.
>> +	 */
>> +	feat_entry = cxlfs->entries;
>> +	for (count = 0; count < cxlfs->num_features; count++, feat_entry++) {
>> +		if (uuid_equal(&feat_entry->uuid, feat_uuid))
>> +			return feat_entry;
>> +	}
>> +
>> +	return ERR_PTR(-ENOENT);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_get_feature_entry, "CXL");
>
>You probably don't need this if the memfeature code are in CXL core.

Hi Dave,

You are right. At present, EXPORT_SYMBOL_NS_GPL(cxl_get_feature_entry) is not required.
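
For reference, a core-internal caller would use the helper roughly as follows
(a minimal sketch; 'cxlmd' is assumed to be in scope and the UUID variable name
is made up for illustration):

	struct cxl_feat_entry *feat_entry;

	/* look up the feature in the cached supported-features list */
	feat_entry = cxl_get_feature_entry(cxlmd, &example_feat_uuid);
	if (IS_ERR(feat_entry))
		return PTR_ERR(feat_entry);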

>
>DJ
>
>> +
>>  size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
>>  		       enum cxl_get_feat_selection selection,
>>  		       void *feat_out, size_t feat_out_size, u16 offset, diff --git
>> a/include/cxl/features.h b/include/cxl/features.h index
>> e52d0573f504..563d966beee5 100644
>> --- a/include/cxl/features.h
>> +++ b/include/cxl/features.h
>> @@ -68,6 +68,8 @@ struct cxl_features_state {  };
>>
>>  struct cxl_mailbox;
>> +struct cxl_feat_entry *cxl_get_feature_entry(struct cxl_memdev *cxlmd,
>> +					     const uuid_t *feat_uuid);
>>  size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
>>  		       enum cxl_get_feat_selection selection,
>>  		       void *feat_out, size_t feat_out_size, u16 offset,
>
>

Thanks,
Shiju

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-02-11 20:46 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-07 14:44 [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers shiju.jose
2025-02-07 14:44 ` [PATCH v19 01/15] EDAC: Add support for EDAC device features control shiju.jose
2025-02-07 14:44 ` [PATCH v19 02/15] EDAC: Add scrub control feature shiju.jose
2025-02-07 14:44 ` [PATCH v19 03/15] EDAC: Add ECS " shiju.jose
2025-02-07 14:44 ` [PATCH v19 04/15] EDAC: Add memory repair " shiju.jose
2025-02-07 14:44 ` [PATCH v19 05/15] ACPI:RAS2: Add ACPI RAS2 driver shiju.jose
2025-02-07 14:44 ` [PATCH v19 06/15] ras: mem: Add memory " shiju.jose
2025-02-07 14:44 ` [PATCH v19 07/15] cxl: Add helper function to retrieve a feature entry shiju.jose
2025-02-11  2:39   ` Dave Jiang
2025-02-11 20:46     ` Shiju Jose
2025-02-07 14:44 ` [PATCH v19 08/15] cxl/memfeature: Add CXL memory device patrol scrub control feature shiju.jose
2025-02-07 14:44 ` [PATCH v19 09/15] cxl/memfeature: Add CXL memory device ECS " shiju.jose
2025-02-07 14:44 ` [PATCH v19 10/15] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
2025-02-07 14:44 ` [PATCH v19 11/15] cxl/region: Add helper function to determine memory is online shiju.jose
2025-02-07 14:44 ` [PATCH v19 12/15] cxl: Support for finding memory operation attributes from the current boot shiju.jose
2025-02-07 14:44 ` [PATCH v19 13/15] cxl/memfeature: Add CXL memory device soft PPR control feature shiju.jose
2025-02-07 14:44 ` [PATCH v19 14/15] EDAC: Update memory repair control interface for memory sparing feature shiju.jose
2025-02-07 14:44 ` [PATCH v19 15/15] cxl/memfeature: Add CXL memory device memory sparing control feature shiju.jose
2025-02-10 17:52 ` [PATCH v19 00/15] EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers Fan Ni
2025-02-11 16:55   ` Shiju Jose
2025-02-11 18:43     ` Fan Ni
2025-02-11 20:42       ` Shiju Jose

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox