Message-ID: <67ae6587.170a0220.2d3544.9687@mx.google.com>
From: Fan Ni
Date: Thu, 13 Feb 2025 13:34:49 -0800
To: shiju.jose@huawei.com
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bp@alien8.de, tony.luck@intel.com, rafael@kernel.org, lenb@kernel.org, mchehab@kernel.org, dan.j.williams@intel.com, dave@stgolabs.net, jonathan.cameron@huawei.com, dave.jiang@intel.com, alison.schofield@intel.com, vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com, Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com, rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com, dave.hansen@linux.intel.com, naoya.horiguchi@nec.com, james.morse@arm.com, jthoughton@google.com, somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com, duenwen@google.com, gthelen@google.com, wschwartz@amperecomputing.com, dferguson@amperecomputing.com, wbs@os.amperecomputing.com, nifan.cxl@gmail.com, tanxiaofei@huawei.com, prime.zeng@hisilicon.com, roberto.sassu@huawei.com, kangkang.shen@futurewei.com, wanghuiqiang@huawei.com, linuxarm@huawei.com
Subject: Re: [PATCH v20 02/15] EDAC: Add scrub control feature
References: <20250212143654.1893-1-shiju.jose@huawei.com> <20250212143654.1893-3-shiju.jose@huawei.com>
In-Reply-To: <20250212143654.1893-3-shiju.jose@huawei.com>

On Wed, Feb 12, 2025 at 02:36:40PM +0000, shiju.jose@huawei.com wrote:
> From: Shiju Jose
>
> Add a generic EDAC scrub
control to manage memory scrubbers in the system.
> Devices with a scrub feature register with the EDAC device driver, which
> retrieves the scrub descriptor from the EDAC scrub driver and exposes the
> sysfs scrub control attributes for a scrub instance to userspace at
> /sys/bus/edac/devices//scrubX/.
>
> The common sysfs scrub control interface abstracts the control of
> arbitrary scrubbing functionality into a common set of functions. The
> sysfs scrub attribute nodes are only present if the client driver has
> implemented the corresponding attribute callback function and passed the
> operations(ops) to the EDAC device driver during registration.
>
> Co-developed-by: Jonathan Cameron
> Signed-off-by: Jonathan Cameron
> Tested-by: Daniel Ferguson
> Signed-off-by: Shiju Jose
> ---
>  Documentation/ABI/testing/sysfs-edac-scrub | 69 ++++++
>  Documentation/edac/features.rst | 6 +
>  Documentation/edac/index.rst | 1 +
>  Documentation/edac/scrub.rst | 259 +++++++++++++++++++++
>  drivers/edac/Kconfig | 9 +
>  drivers/edac/Makefile | 2 +
>  drivers/edac/edac_device.c | 41 +++-
>  drivers/edac/scrub.c | 209 +++++++++++++++++
>  include/linux/edac.h | 43 ++++
>  9 files changed, 635 insertions(+), 4 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
>  create mode 100644 Documentation/edac/scrub.rst
>  create mode 100755 drivers/edac/scrub.c

LGTM. Just one question: for the min/max/current_cycle_duration attributes,
is there a reason why seconds are used instead of hours, the unit the spec
uses directly? That confused me a little when I tested writing
current_cycle_duration with a value that is not a multiple of 3600 and found
that the value read back was not the same as the one just written.
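
A minimal sketch of how such a readback mismatch can arise. It assumes the
seconds written via sysfs are converted to whole hours with a truncating
integer division and converted back on read; the helper names are
hypothetical and are not taken from the patch, and the actual CXL driver may
round differently.

```c
#include <stdint.h>

#define SECS_PER_HOUR 3600u

/* Hypothetical helper: seconds written via sysfs -> hours kept by the device.
 * Assumption: truncating division; the real driver may round differently. */
static uint32_t secs_to_hours(uint32_t secs)
{
	return secs / SECS_PER_HOUR;
}

/* Hypothetical helper: hours kept by the device -> seconds shown via sysfs. */
static uint32_t hours_to_secs(uint32_t hours)
{
	return hours * SECS_PER_HOUR;
}

/* Value a reader would see after writing 'secs' under the assumptions above. */
static uint32_t readback_secs(uint32_t secs)
{
	return hours_to_secs(secs_to_hours(secs));
}
```

Under these assumptions, writing 5000 reads back as 3600, while writing 7200
reads back unchanged, which matches the behaviour described above for values
that are not a multiple of 3600.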
With that in mind, Tested-by: Fan Ni > > diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub > new file mode 100644 > index 000000000000..a3c0ad40b2b0 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-edac-scrub > @@ -0,0 +1,69 @@ > +What: /sys/bus/edac/devices//scrubX > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + The sysfs EDAC bus devices //scrubX subdirectory > + belongs to an instance of memory scrub control feature, > + where directory corresponds to a device/memory > + region registered with the EDAC device driver for the > + scrub control feature. > + > + The sysfs scrub attr nodes are only present if the parent > + driver has implemented the corresponding attr callback > + function and provided the necessary operations to the EDAC > + device driver during registration. > + > +What: /sys/bus/edac/devices//scrubX/addr > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The base address of the memory region to be scrubbed > + for on-demand scrubbing. Setting address starts scrubbing. > + The size must be set before that. > + > + The readback addr value is non-zero if the requested > + on-demand scrubbing is in progress, zero otherwise. > + > +What: /sys/bus/edac/devices//scrubX/size > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The size of the memory region to be scrubbed > + (on-demand scrubbing). > + > +What: /sys/bus/edac/devices//scrubX/enable_background > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) Start/Stop background(patrol) scrubbing if supported. 
> + > +What: /sys/bus/edac/devices//scrubX/min_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported minimum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices//scrubX/max_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported maximum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices//scrubX/current_cycle_duration > +Date: March 2025 > +KernelVersion: 6.15 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The current scrub cycle duration in seconds and must be > + within the supported range by the memory scrubber. > + > + Scrub has an overhead when running and that may want to be > + reduced by taking longer to do it. > diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst > index 6b0fdc6f5d6e..942d7a92b8d7 100644 > --- a/Documentation/edac/features.rst > +++ b/Documentation/edac/features.rst > @@ -92,3 +92,9 @@ High level design is illustrated in the following diagram:: > 3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for > dynamic scrub/repair control to issue scrubbing/repair when excess number > of corrected memory errors are reported in a short span of time. > + > +RAS features > +------------ > +1. Memory Scrub > + > +Memory scrub features are documented in `Documentation/edac/scrub.rst`. > diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst > index de4a3aa452cb..0a00c23838b6 100644 > --- a/Documentation/edac/index.rst > +++ b/Documentation/edac/index.rst > @@ -8,3 +8,4 @@ EDAC Subsystem > :maxdepth: 1 > > features > + scrub > diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst > new file mode 100644 > index 000000000000..50bb44b126fa > --- /dev/null > +++ b/Documentation/edac/scrub.rst > @@ -0,0 +1,259 @@ > +.. 
SPDX-License-Identifier: GPL-2.0 OR GFDL-1.2-no-invariants-or-later > + > +=================== > +EDAC Scrub Control > +=================== > + > +Copyright (c) 2024-2025 HiSilicon Limited. > + > +:Author: Shiju Jose > +:License: The GNU Free Documentation License, Version 1.2 without > + Invariant Sections, Front-Cover Texts nor Back-Cover Texts. > + (dual licensed under the GPL v2) > + > +- Written for: 6.15 > + > +Introduction > +------------ > +Increasing DRAM size and cost have made memory subsystem reliability an > +important concern. These modules are used where potentially corrupted data > +could cause expensive or fatal issues. Memory errors are among the top > +hardware failures that cause server and workload crashes. > + > +Memory scrubbing is a feature where an ECC (Error-Correcting Code) engine > +reads data from each memory media location, corrects with an ECC if > +necessary and writes the corrected data back to the same memory media > +location. > + > +The memory DIMMs can be scrubbed at a configurable rate to detect > +uncorrected memory errors and attempt recovery from detected errors, > +providing the following benefits. > + > +1. Proactively scrubbing memory DIMMs reduces the chance of a correctable > + error becoming uncorrectable. > + > +2. When detected, uncorrected errors caught in unallocated memory pages are > + isolated and prevented from being allocated to an application or the OS. > + > +3. This reduces the likelihood of software or hardware products encountering > + memory errors. > + > +4. The additional data on failures in memory may be used to build up statistics > + that are later used to decide whether to use memory repair technologies > + such as Post Package Repair or Sparing. > + > +There are 2 types of memory scrubbing: > + > +1. Background (patrol) scrubbing of the RAM while the RAM is otherwise > + idle. > + > +2. On-demand scrubbing for a specific address range or region of memory. 
> + > +Several types of interfaces to hardware memory scrubbers have been > +identified, such as CXL memory device patrol scrub, CXL DDR5 ECS, ACPI > +RAS2 memory scrubbing, and ACPI NVDIMM ARS (Address Range Scrub). > + > +The control mechanisms vary across different memory scrubbers. To enable > +standardized userspace tooling, there is a need to present these controls > +through a standardized ABI. > + > +Introduce a generic memory EDAC scrub control that allows users to manage > +underlying scrubbers in the system through a standardized sysfs scrub > +control interface. This common sysfs scrub control interface abstracts the > +management of various scrubbing functionalities into a unified set of > +functions. > + > +Use cases of common scrub control feature > +----------------------------------------- > +1. Several types of interfaces for hardware (HW) memory scrubbers have > + been identified, including the CXL memory device patrol scrub, CXL DDR5 > + ECS, ACPI RAS2 memory scrubbing features, ACPI NVDIMM ARS (Address Range > + Scrub), and software-based memory scrubbers. Of the identified interfaces > + to hardware memory scrubbers some support control over patrol (background) > + scrubbing (e.g., ACPI RAS2, CXL) and/or on-demand scrubbing (e.g., ACPI RAS2, > + ACPI ARS). However, the scrub control interfaces vary between memory > + scrubbers, highlighting the need for a standardized, generic sysfs scrub > + control interface that is accessible to userspace for administration and use > + by scripts/tools. > + > +2. User-space scrub controls allow users to disable scrubbing if necessary, > + for example, to disable background patrol scrubbing or adjust the scrub > + rate for performance-aware operations where background activities need to > + be minimized or disabled. > + > +3. User-space tools enable on-demand scrubbing for specific address ranges, > + provided that the scrubber supports this functionality. > + > +4. 
User-space tools can also control memory DIMM scrubbing at a configurable > + scrub rate via sysfs scrub controls. This approach offers several benefits: > + > + 4.1. Detects uncorrectable memory errors early, before user access to affected > + memory, helping facilitate recovery. > + > + 4.2. Reduces the likelihood of correctable errors developing into uncorrectable > + errors. > + > +5. Policy control for hotplugged memory is necessary because there may not > + be a system-wide BIOS or similar control to manage scrub settings for a CXL > + device added after boot. Determining these settings is a policy decision, > + balancing reliability against performance, so userspace should control it. > + Therefore, a unified interface is recommended for handling this function in > + a way that aligns with other similar interfaces, rather than creating a > + separate one. > + > +Scrubbing features > +------------------ > + > +CXL Memory Scrubbing features > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +CXL spec r3.1 [1]_ section 8.2.9.9.11.1 describes the memory device patrol > +scrub control feature. The device patrol scrub proactively locates and makes > +corrections to errors in regular cycle. The patrol scrub control allows the > +userspace request to change CXL patrol scrubber's configurations. > + > +The patrol scrub control allows the requester to specify the number of > +hours in which the patrol scrub cycles must be completed, provided that > +the requested scrub rate must be within the supported range of the > +scrub rate that the device is capable of. In the CXL driver, the > +number of seconds per scrub cycles, which user requests via sysfs, is > +rescaled to hours per scrub cycles. In addition, the patrol scrub controls > +allow the host to disable and enable the feature in case disabling of the > +feature is needed for other purposes such as performance-aware operations > +which require the background operations to be turned off. 
> + > +Error Check Scrub (ECS) > +~~~~~~~~~~~~~~~~~~~~~~~ > +CXL spec r3.1 [1]_ section 8.2.9.9.11.2 describes the Error Check Scrub (ECS) > +is a feature defined in JEDEC DDR5 SDRAM Specification (JESD79-5) and > +allows the DRAM to internally read, correct single-bit errors, and write > +back corrected data bits to the DRAM array while providing transparency > +to error counts. > + > +The DDR5 device contains number of memory media FRUs per device. The > +DDR5 ECS feature and thus the ECS control driver supports configuring > +the ECS parameters per FRU. > + > +ACPI RAS2 Hardware-based Memory Scrubbing > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +ACPI spec 6.5 [2]_ section 5.2.21 ACPI RAS2 describes ACPI RAS2 table > +provides interfaces for platform RAS features and supports independent > +RAS controls and capabilities for a given RAS feature for multiple > +instances of the same component in a given system. > +Memory RAS features apply to RAS capabilities, controls and operations > +that are specific to memory. RAS2 PCC sub-spaces for memory-specific RAS > +features have a Feature Type of 0x00 (Memory). > + > +The platform can use the hardware-based memory scrubbing feature to expose > +controls and capabilities associated with hardware-based memory scrub > +engines. The RAS2 memory scrubbing feature supports following as per spec, > + > +1. Independent memory scrubbing controls for each NUMA domain, identified > + using its proximity domain. > + > +2. Provision for background (patrol) scrubbing of the entire memory system, > + as well as on-demand scrubbing for a specific region of memory. > + > +ACPI Address Range Scrubbing(ARS) > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > +ACPI spec 6.5 [2]_ section 9.19.7.2 describes Address Range Scrubbing(ARS). > +ARS allows the platform to communicate memory errors to system software. > +This capability allows system software to prevent accesses to addresses > +with uncorrectable errors in memory. 
ARS functions manage all NVDIMMs > +present in the system. Only one scrub can be in progress system wide > +at any given time. > +Following functions are supported as per the specification. > + > +1. Query ARS Capabilities for a given address range, indicates platform > + supports the ACPI NVDIMM Root Device Unconsumed Error Notification. > + > +2. Start ARS triggers an Address Range Scrub for the given memory range. > + Address scrubbing can be done for volatile memory, persistent memory, > + or both. > + > +3. Query ARS Status command allows software to get the status of ARS, > + including the progress of ARS and ARS error record. > + > +4. Clear Uncorrectable Error. > + > +5. Translate SPA > + > +6. ARS Error Inject etc. > + > +The kernel supports an existing control for ARS and ARS is currently not > +supported in EDAC. > + > +.. [1] https://computeexpresslink.org/cxl-specification/ > + > +.. [2] https://uefi.org/specs/ACPI/6.5/ > + > +Comparison of various scrubbing features > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > + +--------------+-----------+-----------+-----------+-----------+ > + | | ACPI | CXL patrol| CXL ECS | ARS | > + | Name | RAS2 | scrub | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | On-demand | Supported | No | No | Supported | > + | Scrubbing | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Background | Supported | Supported | Supported | No | > + | scrubbing | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Mode of | Scrub ctrl| per device| per memory| Unknown | > + | scrubbing | per NUMA | | media | | > + | | domain. 
| | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Query scrub | Supported | Supported | Supported | Supported | > + | capabilities | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Setting | Supported | No | No | Supported | > + | address range| | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Setting | Supported | Supported | No | No | > + | scrub rate | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Unit for | Not | in hours | No | No | > + | scrub rate | Defined | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | | Supported | | | | > + | Scrub | on-demand | No | No | Supported | > + | status/ | scrubbing | | | | > + | Completion | only | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + | UC error | |CXL general|CXL general| ACPI UCE | > + | reporting | Exception |media/DRAM |media/DRAM | notify and| > + | | |event/media|event/media| query | > + | | |scan? |scan? | ARS status| > + +--------------+-----------+-----------+-----------+-----------+ > + | | | | | | > + | Support for | Supported | Supported | Supported | No | > + | EDAC control | | | | | > + | | | | | | > + +--------------+-----------+-----------+-----------+-----------+ > + > +The File System > +--------------- > + > +The control attributes of a registered scrubber instance could be > +accessed in the > + > +/sys/bus/edac/devices//scrubX/ > + > +sysfs > +----- > + > +Sysfs files are documented in > +`Documentation/ABI/testing/sysfs-edac-scrub` > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig > index 2051a7c944a5..175d706168ab 100644 > --- a/drivers/edac/Kconfig > +++ b/drivers/edac/Kconfig > @@ -75,6 +75,15 @@ config EDAC_GHES > > In doubt, say 'Y'. 
> > +config EDAC_SCRUB > + bool "EDAC scrub feature" > + help > + The EDAC scrub feature is optional and is designed to control the > + memory scrubbers in the system. The common sysfs scrub interface > + abstracts the control of various arbitrary scrubbing functionalities > + into a unified set of functions. > + Say 'y/n' to enable/disable EDAC scrub feature. > + > config EDAC_AMD64 > tristate "AMD64 (Opteron, Athlon64)" > depends on AMD_NB && EDAC_DECODE_MCE > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index 89789ba8275f..f2a86ed997b7 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -13,6 +13,8 @@ edac_core-y += edac_module.o edac_device_sysfs.o wq.o > > edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o > > +edac_core-$(CONFIG_EDAC_SCRUB) += scrub.o > + > ifdef CONFIG_PCI > edac_core-y += edac_pci.o edac_pci_sysfs.o > endif > diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c > index 142a661ff543..40407f0ee600 100644 > --- a/drivers/edac/edac_device.c > +++ b/drivers/edac/edac_device.c > @@ -575,6 +575,7 @@ static void edac_dev_release(struct device *dev) > { > struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); > > + kfree(ctx->scrub); > kfree(ctx->dev.groups); > kfree(ctx); > } > @@ -610,8 +611,10 @@ int edac_dev_register(struct device *parent, char *name, > const struct edac_dev_feature *ras_features) > { > const struct attribute_group **ras_attr_groups; > + struct edac_dev_data *dev_data; > struct edac_dev_feat_ctx *ctx; > int attr_gcnt = 0; > + int scrub_cnt = 0; > int ret, feat; > > if (!parent || !name || !num_features || !ras_features) > @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name, > /* Double parse to make space for attributes */ > for (feat = 0; feat < num_features; feat++) { > switch (ras_features[feat].ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + attr_gcnt++; > + scrub_cnt++; > + break; > default: > return 
-EINVAL; > } > @@ -636,13 +642,38 @@ int edac_dev_register(struct device *parent, char *name, > goto ctx_free; > } > > + if (scrub_cnt) { > + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL); > + if (!ctx->scrub) { > + ret = -ENOMEM; > + goto groups_free; > + } > + } > + > attr_gcnt = 0; > + scrub_cnt = 0; > for (feat = 0; feat < num_features; feat++, ras_features++) { > switch (ras_features->ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + if (!ras_features->scrub_ops || > + scrub_cnt != ras_features->instance) > + goto data_mem_free; > + > + dev_data = &ctx->scrub[scrub_cnt]; > + dev_data->instance = scrub_cnt; > + dev_data->scrub_ops = ras_features->scrub_ops; > + dev_data->private = ras_features->ctx; > + ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt], > + ras_features->instance); > + if (ret) > + goto data_mem_free; > + > + scrub_cnt++; > + attr_gcnt++; > + break; > default: > ret = -EINVAL; > - goto groups_free; > + goto data_mem_free; > } > } > > @@ -655,7 +686,7 @@ int edac_dev_register(struct device *parent, char *name, > > ret = dev_set_name(&ctx->dev, name); > if (ret) > - goto groups_free; > + goto data_mem_free; > > ret = device_register(&ctx->dev); > if (ret) { > @@ -665,6 +696,8 @@ int edac_dev_register(struct device *parent, char *name, > > return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); > > +data_mem_free: > + kfree(ctx->scrub); > groups_free: > kfree(ras_attr_groups); > ctx_free: > diff --git a/drivers/edac/scrub.c b/drivers/edac/scrub.c > new file mode 100755 > index 000000000000..e421d3ebd959 > --- /dev/null > +++ b/drivers/edac/scrub.c > @@ -0,0 +1,209 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * The generic EDAC scrub driver controls the memory scrubbers in the > + * system. The common sysfs scrub interface abstracts the control of > + * various arbitrary scrubbing functionalities into a unified set of > + * functions. 
> + * > + * Copyright (c) 2024-2025 HiSilicon Limited. > + */ > + > +#include > + > +enum edac_scrub_attributes { > + SCRUB_ADDRESS, > + SCRUB_SIZE, > + SCRUB_ENABLE_BACKGROUND, > + SCRUB_MIN_CYCLE_DURATION, > + SCRUB_MAX_CYCLE_DURATION, > + SCRUB_CUR_CYCLE_DURATION, > + SCRUB_MAX_ATTRS > +}; > + > +struct edac_scrub_dev_attr { > + struct device_attribute dev_attr; > + u8 instance; > +}; > + > +struct edac_scrub_context { > + char name[EDAC_FEAT_NAME_LEN]; > + struct edac_scrub_dev_attr scrub_dev_attr[SCRUB_MAX_ATTRS]; > + struct attribute *scrub_attrs[SCRUB_MAX_ATTRS + 1]; > + struct attribute_group group; > +}; > + > +#define TO_SCRUB_DEV_ATTR(_dev_attr) \ > + container_of(_dev_attr, struct edac_scrub_dev_attr, dev_attr) > + > +#define EDAC_SCRUB_ATTR_SHOW(attrib, cb, type, format) \ > +static ssize_t attrib##_show(struct device *ras_feat_dev, \ > + struct device_attribute *attr, char *buf) \ > +{ \ > + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ > + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ > + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \ > + type data; \ > + int ret; \ > + \ > + ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, &data); \ > + if (ret) \ > + return ret; \ > + \ > + return sysfs_emit(buf, format, data); \ > +} > + > +EDAC_SCRUB_ATTR_SHOW(addr, read_addr, u64, "0x%llx\n") > +EDAC_SCRUB_ATTR_SHOW(size, read_size, u64, "0x%llx\n") > +EDAC_SCRUB_ATTR_SHOW(enable_background, get_enabled_bg, bool, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(min_cycle_duration, get_min_cycle, u32, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(max_cycle_duration, get_max_cycle, u32, "%u\n") > +EDAC_SCRUB_ATTR_SHOW(current_cycle_duration, get_cycle_duration, u32, "%u\n") > + > +#define EDAC_SCRUB_ATTR_STORE(attrib, cb, type, conv_func) \ > +static ssize_t attrib##_store(struct device *ras_feat_dev, \ > + struct device_attribute *attr, \ > + const char *buf, size_t len) \ > +{ \ > + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ > + struct 
edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \
> +	const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \
> +	type data; \
> +	int ret; \
> + \
> +	ret = conv_func(buf, 0, &data); \
> +	if (ret < 0) \
> +		return ret; \
> + \
> +	ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, data); \
> +	if (ret) \
> +		return ret; \
> + \
> +	return len; \
> +}
> +
> +EDAC_SCRUB_ATTR_STORE(addr, write_addr, u64, kstrtou64)
> +EDAC_SCRUB_ATTR_STORE(size, write_size, u64, kstrtou64)
> +EDAC_SCRUB_ATTR_STORE(enable_background, set_enabled_bg, unsigned long, kstrtoul)
> +EDAC_SCRUB_ATTR_STORE(current_cycle_duration, set_cycle_duration, unsigned long, kstrtoul)
> +
> +static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id)
> +{
> +	struct device *ras_feat_dev = kobj_to_dev(kobj);
> +	struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr);
> +	u8 inst = TO_SCRUB_DEV_ATTR(dev_attr)->instance;
> +	struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> +	const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> +
> +	switch (attr_id) {
> +	case SCRUB_ADDRESS:
> +		if (ops->read_addr) {
> +			if (ops->write_addr)
> +				return a->mode;
> +			else
> +				return 0444;
> +		}
> +		break;
> +	case SCRUB_SIZE:
> +		if (ops->read_size) {
> +			if (ops->write_size)
> +				return a->mode;
> +			else
> +				return 0444;
> +		}
> +		break;
> +	case SCRUB_ENABLE_BACKGROUND:
> +		if (ops->get_enabled_bg) {
> +			if (ops->set_enabled_bg)
> +				return a->mode;
> +			else
> +				return 0444;
> +		}
> +		break;
> +	case SCRUB_MIN_CYCLE_DURATION:
> +		if (ops->get_min_cycle)
> +			return a->mode;
> +		break;
> +	case SCRUB_MAX_CYCLE_DURATION:
> +		if (ops->get_max_cycle)
> +			return a->mode;
> +		break;
> +	case SCRUB_CUR_CYCLE_DURATION:
> +		if (ops->get_cycle_duration) {
> +			if (ops->set_cycle_duration)
> +				return a->mode;
> +			else
> +				return 0444;
> +		}
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +#define
EDAC_SCRUB_ATTR_RO(_name, _instance) \
> +	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RO(_name), \
> +					.instance = _instance })
> +
> +#define EDAC_SCRUB_ATTR_WO(_name, _instance) \
> +	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_WO(_name), \
> +					.instance = _instance })
> +
> +#define EDAC_SCRUB_ATTR_RW(_name, _instance) \
> +	((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RW(_name), \
> +					.instance = _instance })
> +
> +static int scrub_create_desc(struct device *scrub_dev,
> +			     const struct attribute_group **attr_groups, u8 instance)
> +{
> +	struct edac_scrub_context *scrub_ctx;
> +	struct attribute_group *group;
> +	int i;
> +	struct edac_scrub_dev_attr dev_attr[] = {
> +		[SCRUB_ADDRESS] = EDAC_SCRUB_ATTR_RW(addr, instance),
> +		[SCRUB_SIZE] = EDAC_SCRUB_ATTR_RW(size, instance),
> +		[SCRUB_ENABLE_BACKGROUND] = EDAC_SCRUB_ATTR_RW(enable_background, instance),
> +		[SCRUB_MIN_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance),
> +		[SCRUB_MAX_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance),
> +		[SCRUB_CUR_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance)
> +	};
> +
> +	scrub_ctx = devm_kzalloc(scrub_dev, sizeof(*scrub_ctx), GFP_KERNEL);
> +	if (!scrub_ctx)
> +		return -ENOMEM;
> +
> +	group = &scrub_ctx->group;
> +	for (i = 0; i < SCRUB_MAX_ATTRS; i++) {
> +		memcpy(&scrub_ctx->scrub_dev_attr[i], &dev_attr[i], sizeof(dev_attr[i]));
> +		scrub_ctx->scrub_attrs[i] = &scrub_ctx->scrub_dev_attr[i].dev_attr.attr;
> +	}
> +	sprintf(scrub_ctx->name, "%s%d", "scrub", instance);
> +	group->name = scrub_ctx->name;
> +	group->attrs = scrub_ctx->scrub_attrs;
> +	group->is_visible = scrub_attr_visible;
> +
> +	attr_groups[0] = group;
> +
> +	return 0;
> +}
> +
> +/**
> + * edac_scrub_get_desc - get EDAC scrub descriptors
> + * @scrub_dev: client device, with scrub support
> + * @attr_groups: pointer to attribute group container
> + * @instance: device's scrub instance number.
> + *
> + * Return:
> + * * %0 - Success.
> + * * %-EINVAL - Invalid parameters passed.
> + * * %-ENOMEM - Dynamic memory allocation failed.
> + */
> +int edac_scrub_get_desc(struct device *scrub_dev,
> +			const struct attribute_group **attr_groups, u8 instance)
> +{
> +	if (!scrub_dev || !attr_groups)
> +		return -EINVAL;
> +
> +	return scrub_create_desc(scrub_dev, attr_groups, instance);
> +}
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 8c4b6ca2a994..1cbab08720df 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -662,13 +662,54 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
>  	return mci->dimms[index];
>  }
>
> +#define EDAC_FEAT_NAME_LEN	128
> +
>  /* RAS feature type */
>  enum edac_dev_feat {
> +	RAS_FEAT_SCRUB,
>  	RAS_FEAT_MAX
>  };
>
> +/**
> + * struct edac_scrub_ops - scrub device operations (all elements optional)
> + * @read_addr: read base address of scrubbing range.
> + * @read_size: read offset of scrubbing range.
> + * @write_addr: set base address of the scrubbing range.
> + * @write_size: set offset of the scrubbing range.
> + * @get_enabled_bg: check if currently performing background scrub.
> + * @set_enabled_bg: start or stop a bg-scrub.
> + * @get_min_cycle: get minimum supported scrub cycle duration in seconds.
> + * @get_max_cycle: get maximum supported scrub cycle duration in seconds.
> + * @get_cycle_duration: get current scrub cycle duration in seconds.
> + * @set_cycle_duration: set current scrub cycle duration in seconds.
> + */
> +struct edac_scrub_ops {
> +	int (*read_addr)(struct device *dev, void *drv_data, u64 *base);
> +	int (*read_size)(struct device *dev, void *drv_data, u64 *size);
> +	int (*write_addr)(struct device *dev, void *drv_data, u64 base);
> +	int (*write_size)(struct device *dev, void *drv_data, u64 size);
> +	int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable);
> +	int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable);
> +	int (*get_min_cycle)(struct device *dev, void *drv_data, u32 *min);
> +	int (*get_max_cycle)(struct device *dev, void *drv_data, u32 *max);
> +	int (*get_cycle_duration)(struct device *dev, void *drv_data, u32 *cycle);
> +	int (*set_cycle_duration)(struct device *dev, void *drv_data, u32 cycle);
> +};
> +
> +#if IS_ENABLED(CONFIG_EDAC_SCRUB)
> +int edac_scrub_get_desc(struct device *scrub_dev,
> +			const struct attribute_group **attr_groups,
> +			u8 instance);
> +#else
> +static inline int edac_scrub_get_desc(struct device *scrub_dev,
> +				      const struct attribute_group **attr_groups,
> +				      u8 instance)
> +{ return -EOPNOTSUPP; }
> +#endif /* CONFIG_EDAC_SCRUB */
> +
>  /* EDAC device feature information structure */
>  struct edac_dev_data {
> +	const struct edac_scrub_ops *scrub_ops;
>  	u8 instance;
>  	void *private;
>  };
> @@ -676,11 +717,13 @@ struct edac_dev_data {
>  struct edac_dev_feat_ctx {
>  	struct device dev;
>  	void *private;
> +	struct edac_dev_data *scrub;
>  };
>
>  struct edac_dev_feature {
>  	enum edac_dev_feat ft_type;
>  	u8 instance;
> +	const struct edac_scrub_ops *scrub_ops;
>  	void *ctx;
>  };
>
> --
> 2.43.0
>
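
For anyone following the is_visible logic in the quoted scrub_attr_visible(), here is a minimal user-space sketch of the rule it applies to each getter/setter pair: the attribute is hidden when the getter is missing, read-only (0444) when only the getter is present, and keeps its declared mode when both callbacks exist. The names mock_pair, pair_mode(), mock_get(), mock_set() and self_check() are hypothetical and exist only for this illustration; the 0644 value is assumed as the declared mode of a read-write attribute. This is not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one getter/setter pair of struct edac_scrub_ops. */
struct mock_pair {
	int (*get)(unsigned int *val);
	int (*set)(unsigned int val);
};

/*
 * Same decision the quoted scrub_attr_visible() makes per attribute:
 *   no getter          -> 0     (attribute hidden)
 *   getter only        -> 0444  (read-only)
 *   getter and setter  -> declared mode, unchanged
 */
static unsigned int pair_mode(const struct mock_pair *p, unsigned int declared)
{
	if (!p->get)
		return 0;
	return p->set ? declared : 0444;
}

/* Dummy callbacks so the pairs below can be populated. */
static int mock_get(unsigned int *val) { *val = 0; return 0; }
static int mock_set(unsigned int val) { (void)val; return 0; }

/* Walk the four combinations; 0644 is the assumed declared mode. */
static void self_check(void)
{
	struct mock_pair none = { NULL, NULL };
	struct mock_pair ro = { mock_get, NULL };
	struct mock_pair rw = { mock_get, mock_set };
	struct mock_pair wo = { NULL, mock_set };

	assert(pair_mode(&none, 0644) == 0);
	assert(pair_mode(&ro, 0644) == 0444);
	assert(pair_mode(&rw, 0644) == 0644);
	assert(pair_mode(&wo, 0644) == 0);	/* setter alone stays hidden */
}
```

Calling self_check() from a small main() exercises all four cases. Note that a setter without a getter leaves the attribute hidden, which matches the quoted switch statement falling through to "return 0".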