Date: Wed, 19 Feb 2025 19:45:33 +0100
From: Borislav Petkov
To: Jonathan Cameron
Cc: Shiju Jose, linux-edac@vger.kernel.org, linux-cxl@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, tony.luck@intel.com, rafael@kernel.org,
	lenb@kernel.org, mchehab@kernel.org, dan.j.williams@intel.com,
	dave@stgolabs.net, dave.jiang@intel.com, alison.schofield@intel.com,
	vishal.l.verma@intel.com, ira.weiny@intel.com, david@redhat.com,
	Vilas.Sridharan@amd.com, leo.duran@amd.com, Yazen.Ghannam@amd.com,
	rientjes@google.com, jiaqiyan@google.com, Jon.Grimm@amd.com,
	dave.hansen@linux.intel.com, naoya.horiguchi@nec.com,
	james.morse@arm.com, jthoughton@google.com, somasundaram.a@hpe.com,
	erdemaktas@google.com, pgonda@google.com, duenwen@google.com,
	gthelen@google.com, wschwartz@amperecomputing.com,
	dferguson@amperecomputing.com, wbs@os.amperecomputing.com,
	nifan.cxl@gmail.com, tanxiaofei, "Zengtao (B)", Roberto Sassu,
	kangkang.shen@futurewei.com, wanghuiqiang, Linuxarm, Vandana Salve,
	Steven Rostedt
Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
Message-ID: <20250219184533.GCZ7YmzTDk5B4p-C7e@fat_crate.local>
References: <20250109161902.GDZ3_29rH-sQMV4n0N@fat_crate.local>
	<20250109183448.000059ec@huawei.com>
	<20250111171243.GCZ4Kmi5xMtY2ktCHm@fat_crate.local>
	<20250113110740.00003a7c@huawei.com>
	<20250121161653.GAZ4_IdYDQ9_-QoEvn@fat_crate.local>
	<20250121181632.0000637c@huawei.com>
	<20250122190917.GDZ5FCXetp9--djyQ6@fat_crate.local>
	<20250206133949.00006dd6@huawei.com>
	<20250217132322.GCZ7M4Somf2VYvbwHb@fat_crate.local>
	<20250218165125.00007065@huawei.com>
In-Reply-To: <20250218165125.00007065@huawei.com>

On Tue, Feb 18, 2025 at 04:51:25PM +0000, Jonathan Cameron wrote:
> As a side note, if you are in the situation where the device can do
> memory repair without any disruption of memory access then my
> assumption is in the case where the device would set the maintenance
> needed + where it is considering soft repair (so no long term cost
> to a wrong decision) then the device would probably just do it
> autonomously and at most we might get a notification.

And this is basically what I'm trying to hint at: if you can do recovery
action without userspace involvement, then please, by all means. There's no
need to noodle information back'n'forth through userspace if the kernel, or
even the device itself, can handle it on its own.

More involved stuff should obviously rely on userspace to do more involved
"pondering."

> So I think that if we see this there will be some disruption.
> Latency spikes for soft repair or we are looking at hard repair.
> In that case we'd need policy on whether to repair at all.
> In general the rasdaemon handling in that series is intentionally
> simplistic. Real solutions will take time to refine but they
> don't need changes to the kernel interface, just when to poke it.

I hope so.

> The error record comes out as a trace point. Is there any precedent for
> injecting those back into the kernel?

I'm just questioning the whole interface and its usability. Not saying it
doesn't make sense - we're simply weighing all options here.

> That policy question is a long term one but I can suggest 'possible' policies
> that might help motivate the discussion
>
> 1. Repair may be very disruptive to memory latency.
>    Delay until a maintenance window when latency spike is accepted by the
>    customer; until then rely on maintenance needed still representing a
>    relatively low chance of failure.

So during the maintenance window, the operator is supposed to do

	rasdaemon --start-expensive-repair-operations

?

> 2. Hard repair uses known limited resources - e.g. those are known to match up
>    to a particular number of rows in each module. That is not discoverable under
>    the CXL spec so would have to come from another source of metadata.
>    Apply some sort of fall off function so that we repair only the very worst
>    cases as we run out. Alternative is to always soft offline the memory in the
>    OS, aim is to reduce chance of having to do that in a somewhat optimal fashion.
>    I'm not sure on the appropriate stats, maybe assume a given granule failure
>    rate follows a Poisson distribution and attempt to estimate lambda? Would
>    need an expert in appropriate failure modes or a lot of data to define
>    this!

I have no clue what you're saying here. :-)

> It is the simplest interface that we have come up with so far. I'm fully open
> to alternatives that provide a clean way to get this data back into the
> kernel and play well with existing logging tooling (e.g. rasdaemon)
>
> Some things we could do,
> * Store binary of trace event and reinject. As above + we would have to be
>   very careful that any changes to the event are made with knowledge that
>   we need to handle this path. Little or no marshaling / formatting code
>   in userspace, but new logging infrastructure needed + a chardev / ioctl
>   to inject the data and a bit of userspace glue to talk to it.
> * Reinject a binary representation we define, via an ioctl on some
>   chardev we create for the purpose. Userspace code has to take
>   key value pairs and process them into this form. So similar amount
>   of marshaling code to what we have for sysfs.
> * Or what we currently propose, write set of key value pairs to a simple
>   (though multifile) sysfs interface. As you've noted marshaling is needed.

... and the advantage of having such a sysfs interface: it is human readable
and usable vs having to use a tool to create a binary blob in a certain
format...

Ok, then. Let's give that API a try... I guess I need to pick up the EDAC
patches from here:

https://lore.kernel.org/r/20250212143654.1893-1-shiju.jose@huawei.com

If so, there's an EDAC patch 14 which is not together with the first 4. And
I was thinking of taking the first 4 or 5 and then giving other folks an
immutable branch in the EDAC tree which they can use to base the CXL stuff on
top.

What's up?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
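
Editorial aside on the quoted policy 2 ("assume a given granule failure rate
follows a Poisson distribution and attempt to estimate lambda"): the sketch
below is one possible reading of that suggestion, not anything proposed in the
thread or implemented in rasdaemon. It assumes errors per granule arrive as a
Poisson process, estimates lambda by the usual maximum-likelihood formula
(observed errors divided by observation time), and spends a limited number of
spare rows on the granules most likely to fail again. The function names, the
granule labels and the ranking rule are all hypothetical.

```python
import math


def estimate_lambda(error_count, exposure_hours):
    """Maximum-likelihood Poisson rate: observed events per hour of observation."""
    if exposure_hours <= 0:
        raise ValueError("exposure_hours must be positive")
    return error_count / exposure_hours


def failure_probability(lam, window_hours):
    """P(at least one error in the window) for a Poisson process with rate lam."""
    return 1.0 - math.exp(-lam * window_hours)


def pick_repairs(granules, spare_rows, window_hours):
    """Return the names of the granules to repair, worst first.

    granules is a list of (name, error_count, exposure_hours) tuples
    (hypothetical layout). Only as many granules as there are spare rows
    are returned; the rest would be left for soft offlining.
    """
    scored = [
        (failure_probability(estimate_lambda(count, hours), window_hours), name)
        for name, count, hours in granules
    ]
    scored.sort(reverse=True)  # highest estimated failure probability first
    return [name for _, name in scored[:spare_rows]]
```

With three granules observed for the same time and a single spare row left,
only the one with the highest error count would be picked. A real policy would
need the failure-mode expertise or field data the quoted text asks for; the
ranking here is just the simplest rule consistent with "repair only the very
worst cases as we run out".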