From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Shiju Jose <shiju.jose@huawei.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	"Roberto Sassu" <roberto.sassu@huawei.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>, Vandana Salve <vsalve@micron.com>,
	"Steven Rostedt" <rostedt@goodmis.org>
Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
Date: Thu, 20 Feb 2025 12:19:15 +0000
Message-ID: <20250220121915.00001391@huawei.com>
In-Reply-To: <20250219184533.GCZ7YmzTDk5B4p-C7e@fat_crate.local>

On Wed, 19 Feb 2025 19:45:33 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Feb 18, 2025 at 04:51:25PM +0000, Jonathan Cameron wrote:
> > As a side note, if you are in the situation where the device can do
> > memory repair without any disruption of memory access then my
> > assumption is in the case where the device would set the maintenance
> > needed + where it is considering soft repair (so no long term cost
> > to a wrong decision) then the device would probably just do it
> > autonomously and at most we might get a notification.  
> 
> And this is basically what I'm trying to hint at: if you can do recovery
> action without userspace involvement, then please, by all means. There's no
> need to noodle information back'n'forth through user if the kernel or the
> device itself even, can handle it on its own.
> 
> More involved stuff should obviously rely on userspace to do more involved
> "pondering."

Let's explore this further as a follow-up. A policy switch to let the kernel
do the 'easy' stuff (assuming the device didn't do it) makes sense if this
particular combination is common.

> 
> > So I think that if we see this there will be some disruption.
> > Latency spikes for soft repair or we are looking at hard repair.
> > In that case we'd need policy on whether to repair at all.
> > In general the rasdaemon handling in that series is intentionally
> > simplistic. Real solutions will take time to refine but they
> > don't need changes to the kernel interface, just when to poke it.  
> 
> I hope so.
> 
> > The error record comes out as a trace point. Is there any precedent for
> > injecting those back into the kernel?   
> 
> I'm just questioning the whole interface and its usability. Not saying it
> doesn't make sense - we're simply weighing all options here.
> 
> > That policy question is a long term one but I can suggest 'possible' policies
> > that might help motivate the discussion
> >
> > 1. Repair may be very disruptive to memory latency. Delay until a maintenance
> >    window in which the customer accepts the latency spike; until then, rely
> >    on 'maintenance needed' still representing a relatively low chance of failure.  
> 
> So during the maintenance window, the operator is supposed to do
> 
> rasdaemon --start-expensive-repair-operations

Yes, it would be something along those lines.  Or a script very similar to
the boot one Shiju wrote.  Scan the DB, find what needs repairing, and do so.
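
To make the "scan the DB and repair" idea concrete, here is a minimal
sketch of what such a maintenance-window script might look like. Everything
specific in it is an assumption for illustration only: the table name
mem_errors, its columns (dpa, maintenance_needed, repaired) and the repair
callback are hypothetical and do not reflect the actual rasdaemon schema or
the script in the series.

```python
import sqlite3


def pending_repairs(conn):
    """Return addresses flagged as needing maintenance and not yet repaired.

    The mem_errors table and its columns are hypothetical.
    """
    rows = conn.execute(
        "SELECT DISTINCT dpa FROM mem_errors "
        "WHERE maintenance_needed = 1 AND repaired = 0 ORDER BY dpa"
    )
    return [row[0] for row in rows]


def run_maintenance_window(conn, repair):
    """Call repair(dpa) for each pending address and record the successes.

    `repair` is a placeholder for whatever actually pokes the kernel
    interface; it must return True when the repair succeeded.
    """
    done = []
    for dpa in pending_repairs(conn):
        if repair(dpa):
            conn.execute(
                "UPDATE mem_errors SET repaired = 1 WHERE dpa = ?", (dpa,)
            )
            done.append(dpa)
    conn.commit()
    return done


def demo():
    """Build a small in-memory DB and run one maintenance pass over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mem_errors "
        "(dpa INTEGER, maintenance_needed INTEGER, repaired INTEGER)"
    )
    conn.executemany(
        "INSERT INTO mem_errors VALUES (?, ?, ?)",
        [(0x1000, 1, 0), (0x2000, 0, 0), (0x3000, 1, 1), (0x1000, 1, 0)],
    )
    # Pretend every repair attempt succeeds.
    return run_maintenance_window(conn, repair=lambda dpa: True)
```

In this sketch only the records that are both flagged and still unrepaired
are acted on, so running it twice in a row does nothing the second time.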

> 
> ?
> 
> > 2. Hard repair uses known limited resources - e.g. those are known to match up
> >    to a particular number of rows in each module. That is not discoverable under
> >    the CXL spec so would have to come from another source of metadata.
> >    Apply some sort of fall off function so that we repair only the very worst
> >    cases as we run out. The alternative is always to soft offline the memory in
> >    the OS; the aim is to reduce the chance of having to do that in a somewhat
> >    optimal fashion. I'm not sure on the appropriate stats, maybe assume a given
> >    granule failure rate follows a Poisson distribution and attempt to estimate
> >    lambda?  Would need an expert in appropriate failure modes or a lot of data
> >    to define this!  
> 
> I have no clue what you're saying here. :-)

I'll write something up at some point as it's definitely a complex
topic and I need to find a statistician + hardware folk with error models to
help flesh it out. 
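
As a very rough illustration of the Poisson idea only (not a worked-out
policy), the sketch below estimates lambda per granule as the mean error
count per observation interval, which is the maximum-likelihood estimate
for a Poisson rate, and then spends a limited number of spare rows on the
granules with the highest estimates. The function names, the per-granule
history layout and the plain "take the top N" selection are all assumptions
made for the example; a real fall-off function and error model would need
the expert input mentioned above.

```python
import math


def estimate_lambda(counts):
    """Maximum-likelihood Poisson rate: mean count per observation interval."""
    return sum(counts) / len(counts)


def prob_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the CDF."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below


def pick_for_repair(history, spare_rows):
    """Pick the granules with the highest estimated rate.

    history maps a (hypothetical) granule id to its list of per-interval
    error counts; at most spare_rows granules are returned, worst first.
    """
    rates = {granule: estimate_lambda(c) for granule, c in history.items()}
    ranked = sorted(rates, key=rates.get, reverse=True)
    return ranked[:spare_rows]
```

For example, pick_for_repair({"a": [0, 0], "b": [3, 5], "c": [1, 1]}, 2)
selects "b" and "c", and prob_at_least(1, lam) gives the chance of seeing
at least one further error in the next interval under the same model.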

There is another topic to look at, which is what to do with synchronous poison
if we can repair the memory and bring it back into use.
I can't find the thread, but last time I asked about recovering from that, the
mm folk said they'd need to see the code + use cases (fair enough!).

> 
> > It is the simplest interface that we have come up with so far. I'm fully open
> > to alternatives that provide a clean way to get this data back into the
> > kernel and play well with existing logging tooling (e.g. rasdaemon)
> > 
> > Some things we could do,
> > * Store binary of trace event and reinject. As above + we would have to be
> >   very careful that any changes to the event are made with knowledge that
> >   we need to handle this path.  Little or no marshaling / formatting code
> >   in userspace, but new logging infrastructure needed + a chardev /ioctl
> >   to inject the data and a bit of userspace glue to talk to it.
> > * Reinject a binary representation we define, via an ioctl on some
> >   chardev we create for the purpose.  Userspace code has to take
> >   key value pairs and process them into this form.  So similar amount
> >   of marshaling code to what we have for sysfs.
> > * Or what we currently propose, write a set of key-value pairs to a simple
> >   (though multifile) sysfs interface. As you've noted marshaling is needed.  
> 
> ... and the advantage of having such a sysfs interface: it is human readable
> and usable vs having to use a tool to create a binary blob in a certain
> format...
> 
> Ok, then. Let's give that API a try... I guess I need to pick up the EDAC
> patches from here:
> 
> https://lore.kernel.org/r/20250212143654.1893-1-shiju.jose@huawei.com
> 
> If so, there's an EDAC patch 14 which is not together with the first 4. And
> I was thinking of taking the first 4 or 5 and then giving other folks an
> immutable branch in the EDAC tree which they can use to base the CXL stuff on
> top.
> 
> What's up?

My fault. I asked Shiju to split the more complex ABI for sparing out
to build the complexity up rather than having it all in one patch.

Should be fine for you to take 1-4 and 14 which is all the EDAC parts.

For 5 and 6 Rafael acked the ACPI part (5), and the ACPI ras2 scrub driver
has no other dependencies so I think that should go through your
tree as well, though no need to be in the immutable branch.

Dave Jiang can work his magic on the CXL stuff on top of a merge of your
immutable branch.

Thanks!

Jonathan