From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 14 Nov 2024 14:32:49 +0100
From: Borislav Petkov
To: Shiju Jose
Cc: linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, tony.luck@intel.com, rafael@kernel.org,
	lenb@kernel.org, mchehab@kernel.org, dan.j.williams@intel.com,
	dave@stgolabs.net, Jonathan Cameron, gregkh@linuxfoundation.org,
	sudeep.holla@arm.com, jassisinghbrar@gmail.com, dave.jiang@intel.com,
	alison.schofield@intel.com, vishal.l.verma@intel.com,
	ira.weiny@intel.com, david@redhat.com, Vilas.Sridharan@amd.com,
	leo.duran@amd.com, Yazen.Ghannam@amd.com, rientjes@google.com,
	jiaqiyan@google.com, Jon.Grimm@amd.com, dave.hansen@linux.intel.com,
	naoya.horiguchi@nec.com, james.morse@arm.com, jthoughton@google.com,
	somasundaram.a@hpe.com, erdemaktas@google.com, pgonda@google.com,
	duenwen@google.com, gthelen@google.com, wschwartz@amperecomputing.com,
	dferguson@amperecomputing.com, wbs@os.amperecomputing.com,
	nifan.cxl@gmail.com, tanxiaofei, "Zengtao (B)", Roberto Sassu,
	kangkang.shen@futurewei.com, wanghuiqiang, Linuxarm
Subject: Re: [PATCH v15 11/15] EDAC: Add memory repair control feature
Message-ID: <20241114133249.GEZzX8ATNyc_Xw1L52@fat_crate.local>
References: <20241101091735.1465-1-shiju.jose@huawei.com>
	<20241101091735.1465-12-shiju.jose@huawei.com>
	<20241104061554.GOZyhmmo9melwI0c6q@fat_crate.local>
	<1ac30acc16ab42c98313c20c79988349@huawei.com>
	<20241111112819.GCZzHqUz1Sz-vcW09c@fat_crate.local>
	<7fd81b442ba3477787f5342e69adbb96@huawei.com>
In-Reply-To: <7fd81b442ba3477787f5342e69adbb96@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

On Mon,
Nov 11, 2024 at 04:54:48PM +0000, Shiju Jose wrote:
> Presently, 0 (soft memory repair) and 1 (hard memory repair), depends on
> which mode/s a memory device is supported.

What if the device supports more than one mode?

> However for CXL memory sparing feature, the persistent mode is configurable at runtime
> for a memory sparing instance, thus both soft and hard sparing are supported.
> Example given for CXL memory sparing feature in Documentation/edac/memory_repair.rst,
> root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair1/persist_mode_avail
> 0,1

Ok, and how is the user supposed to know what those mean?

> Kernel sysfs doc mentioned about array of values as follows, though not seen much examples.
> https://docs.kernel.org/filesystems/sysfs.html
> "Attributes should be ASCII text files, preferably with only one value per file. It is noted that
> it may not be efficient to contain only one value per file, so it is socially acceptable to express
> an array of values of the same type."

True story. Ok, so there's an exception to that rule.

> The values of these attributes are specific to device and portion of the memory to repair.
> For example, In CXL repair features,
> CXL memory device identifies a failure on a memory component, device provides the corresponding
> values of the attributes (DPA, channel, rank, nibble mask, bank group, bank, row, column or sub-channel etc)
> in an event record to the host and to the userspace in the corresponding trace event.
> Userspace shall use these values for the query resource availability and repair operations.

I don't think you're answering my question. Lemme try again:

I am on a machine with such an interface. I do

echo 0xdeadbeef > /sys/devices...

-EINVAL

echo 0xface > ...

-EINVAL

How do I know what the allowed ranges are?

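To make the question above concrete, here is a minimal userspace sketch in C.
It is an illustration only and not part of the patch set: write_attr() is a
hypothetical helper name, and the attribute path is left to the caller because
the paths above are elided. It does roughly what "echo VAL > PATH" does and
hands back the negative errno, which is all the feedback a caller gets.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a string to an attribute file, roughly what
 * "echo VAL > PATH" does. Returns 0 on success or a negative errno value.
 * A result of -EINVAL only says the value was rejected; it carries no
 * information about which values would have been accepted.
 */
static int write_attr(const char *path, const char *val)
{
	ssize_t n;
	int ret;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, val, strlen(val));
	ret = (n < 0) ? -errno : 0;

	close(fd);
	return ret;
}
```

A caller that sees -EINVAL from write_attr() can only retry with another
guess, unless the allowed values are published somewhere it can read them.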
> This will work for the CXL PPR feature where the result of the query operation for resources availability
> return to the command, however for the CXL memory sparing features, the result of the query resources
> availability command returned later in a Memory Sparing Event Record from the device.
> Userspace shall issue repair operation with the attributes values received on the Memory Sparing trace event.
> Thus for the CXL memory sparing feature, query for resources availability and repair operation
> cannot be combined.

What happens if the resources availability changes between the query and the start
of the repair operation?

The cat catches fire?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette