Date: Mon, 14 Oct 2024 15:33:00 +0100
From: Jonathan Cameron
Subject: Re: [PATCH v13 03/18] EDAC: Add ECS control feature
Message-ID: <20241014153300.00002942@Huawei.com>
In-Reply-To: <20241009124120.1124-4-shiju.jose@huawei.com>
References: <20241009124120.1124-1-shiju.jose@huawei.com> <20241009124120.1124-4-shiju.jose@huawei.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.

On Wed, 9 Oct 2024
13:41:04 +0100 wrote:

> From: Shiju Jose
>
> Add EDAC ECS (Error Check Scrub) control in order to control a memory
> device's ECS feature.
>
> The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM
> Specification (JESD79-5) and allows the DRAM to internally read, correct
> single-bit errors, and write back corrected data bits to the DRAM array
> while providing transparency to error counts.
>
> The DDR5 device contains number of memory media FRUs per device. The
> DDR5 ECS feature and thus the ECS control driver supports configuring
> the ECS parameters per FRU.
>
> The memory devices support ECS feature register with EDAC device
> driver, which retrieves the ECS descriptor from EDAC ECS driver and
> exposes the sysfs ECS control attributes to userspace in
> /sys/bus/edac/devices//ecs_fruX/.
>
> The common sysfs ECS control interface abstracts the control of an
> arbitrary ECS functionality to a common set of functions.
>
> The support for ECS feature is added separately because the DDR5 ECS
> features control attributes are dissimilar from those of the scrub
> feature.
>
> The sysfs ECS attr nodes would be present only if the client driver
> has implemented the corresponding attr callback function and passed
> in ops to the EDAC RAS feature driver during registration.
>
> Co-developed-by: Jonathan Cameron
> Signed-off-by: Jonathan Cameron
> Signed-off-by: Shiju Jose

Hi Shiju

A few minor bits and bobs inline.

> +What: /sys/bus/edac/devices//ecs_fruX/mode_counts_codewords
> +Date: Oct 2024
> +KernelVersion: 6.12

These all need updating to 6.13 given we missed on 6.12.

> +Contact: linux-edac@vger.kernel.org
> +Description:
> +		(RO) True if current mode is ECS counts codewords with errors.
> +
> +What: /sys/bus/edac/devices//ecs_fruX/reset
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> +		(WO) ECS reset ECC counter.
> +		0 - normal, ECC counter running actively.
For a write only parameter, maybe just reject anything that isn't 1?

> +		1 - reset ECC counter to the default value.
> +
> +What: /sys/bus/edac/devices//ecs_fruX/threshold
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@vger.kernel.org
> +Description:
> +		(RW) ECS threshold count per GB of memory cells.

In the CXL spec it is Gb of memory cells (bits, not bytes).
I'm assuming this is meant to match that. Maybe safer to spell it out.

> diff --git a/drivers/edac/ecs.c b/drivers/edac/ecs.c
> new file mode 100755
> index 000000000000..a2b64d7bf6b6
> --- /dev/null
> +++ b/drivers/edac/ecs.c

> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 5344e2cf6808..20bdb08c7626 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
>
> +/**
> + * struct ecs_ops - ECS device operations (all elements optional)
> + * @get_log_entry_type: read the log entry type value.
> + * @set_log_entry_type: set the log entry type value.
> + * @get_log_entry_type_per_dram: read the log entry type per dram value.
> + * @get_log_entry_type_memory_media: read the log entry type per memory media value.
> + * @get_mode: read the mode value.
> + * @set_mode: set the mode value.
> + * @get_mode_counts_rows: read the mode counts rows value.
> + * @get_mode_counts_codewords: read the mode counts codewords value.
> + * @reset: reset the ECS counter.
> + * @get_threshold: read the threshold value.
> + * @set_threshold: set the threshold value.

Maybe it's worth duplicating the ABI docs statement on units?
> + */
> +struct edac_ecs_ops {
> +	int (*get_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u32 *val);
> +	int (*set_log_entry_type)(struct device *dev, void *drv_data, int fru_id, u32 val);
> +	int (*get_log_entry_type_per_dram)(struct device *dev, void *drv_data,
> +					   int fru_id, u32 *val);
> +	int (*get_log_entry_type_per_memory_media)(struct device *dev, void *drv_data,
> +						   int fru_id, u32 *val);
> +	int (*get_mode)(struct device *dev, void *drv_data, int fru_id, u32 *val);
> +	int (*set_mode)(struct device *dev, void *drv_data, int fru_id, u32 val);
> +	int (*get_mode_counts_rows)(struct device *dev, void *drv_data, int fru_id, u32 *val);
> +	int (*get_mode_counts_codewords)(struct device *dev, void *drv_data, int fru_id, u32 *val);
> +	int (*reset)(struct device *dev, void *drv_data, int fru_id, u32 val);
> +	int (*get_threshold)(struct device *dev, void *drv_data, int fru_id, u32 *threshold);
> +	int (*set_threshold)(struct device *dev, void *drv_data, int fru_id, u32 threshold);
> +};
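
On the write-only reset attribute, a rough userspace sketch of what "reject anything that isn't 1" could look like. This is only an illustration, not code from the patch: ecs_reset_store_sketch(), ecs_reset_fn and demo_reset() are hypothetical names, and strtoul() stands in for whatever parsing helper the real sysfs store callback uses.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's reset callback (not from the patch). */
typedef int (*ecs_reset_fn)(int fru_id, unsigned long val);

/*
 * Sketch of a write-only "reset" store: parse the user buffer and
 * reject every value other than 1 with -EINVAL.
 * Returns the callback's result on success, or a negative errno.
 */
static int ecs_reset_store_sketch(const char *buf, int fru_id, ecs_reset_fn reset)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno || end == buf)
		return -EINVAL;

	/* Tolerate a single trailing newline, as "echo 1 > reset" adds one. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* Only 1 is meaningful for a write-only reset. */
	if (val != 1)
		return -EINVAL;

	return reset(fru_id, val);
}

/* Trivial callback used only to exercise the sketch. */
static int demo_reset(int fru_id, unsigned long val)
{
	(void)fru_id;
	(void)val;
	return 0;
}
```

With this shape, writing "1" reaches the callback, while "0", "2", an empty write or trailing junk all fail with -EINVAL, so the ABI text would only need to document the value 1.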