linux-mm.kvack.org archive mirror
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Shiju Jose <shiju.jose@huawei.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"mike.malvestuto@intel.com" <mike.malvestuto@intel.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	Jean Delvare <jdelvare@suse.com>,
	Guenter Roeck <linux@roeck-us.net>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>
Subject: Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem
Date: Wed, 22 May 2024 10:40:17 +0100
Message-ID: <20240522104017.00003904@Huawei.com>
In-Reply-To: <20240521080621.GBZkxV_ZWnbbrq-yV_@fat_crate.local>

On Tue, 21 May 2024 10:06:21 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Fri, May 17, 2024 at 12:44:18PM +0100, Jonathan Cameron wrote:
> > Given we are talking about something new, maybe this is an opportunity
> > to not perpetuate this?
> > 
> > If we add scrub in here I'd prefer to just use the normal bus registration
> > handling rather than creating a nest of additional nodes.  So perhaps we
> > could consider
> > /sys/bus/edac/device/scrub0 (or whatever name makes sense, as per the
> > earlier discussion of cxl_scrub0 or similar).  
> 
> Yes, my main worry is how this RAS functionality is going to be all
> organized in the tree. Yes, EDAC legacy methods can die but the
> user-visible part can't so we might as well use it to concentrate stuff
> there.

Understood.

> 
> > Could consider moving the bus location of mc0 etc in future to there with
> > symlinks to /sys/bus/edac/device/mc/* for backwards compatibility either
> > via setting their parents or more explicit link creation.  
> 
> You can ignore the mc - that's the memory controller representation EDAC
> does and that's also kind of semi-legacy considering how heterogeneous
> devices are becoming. Nowadays, scrubbing functionality can be on
> anything that has memory and that's not only a memory controller.
> 
> So it would actually be the better thing to abstract that differently
> and use .../edac/device/ for the different RAS functionalities. I.e.,
> have the "device" organize it all.

I'm not sure I follow this. It is definitely worth ensuring we are
thinking the same thing wrt layout before we go further.

Do you mean keep it similar to the existing device/mc device/pci
structure so /sys/bus/edac/devices/scrub/cxl_mem0_scrub etc?
This would rely on symlinks to paper over the dev->parent not being
the normal parent. Hence would be similar to /sys/bus/edac/devices/pci in
edac_pci_create_sysfs() or equivalent in edac_device_create_sysfs().

Or is the ../edac/device bit about putting an extra device under edac/devices/?
e.g.
/sys/bus/edac/devices/cxl_memX/scrub
/sys/bus/edac/devices/cxl_memX/other_ras_thing
which would be fairly standard driver model stuff.

This would sit alongside 'legacy'
/sys/bus/edac/devices/mc/mcX
/sys/bus/edac/devices/pci/pciX etc

I'd prefer this second model as it's very standard, but the grouping is
per providing parent device rather than per functionality. However, it
is rather different from the existing edac structure.
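
To make the two candidate layouts concrete, here is a rough userspace
sketch in Python that only builds the path strings for each option. The
function names are made up for illustration, and the paths are the
proposals discussed above, not existing ABI.

```python
from pathlib import PurePosixPath

# Base directory taken from the examples in this mail.
EDAC_DEVICES = PurePosixPath("/sys/bus/edac/devices")


def by_function_layout(parent, features):
    """Option 1: group by functionality, e.g. devices/scrub/cxl_mem0_scrub."""
    return [str(EDAC_DEVICES / feat / f"{parent}_{feat}") for feat in features]


def by_parent_layout(parent, features):
    """Option 2: group per providing device, e.g. devices/cxl_memX/scrub."""
    return [str(EDAC_DEVICES / parent / feat) for feat in features]


if __name__ == "__main__":
    # Hypothetical device and feature names, matching the examples above.
    for path in by_function_layout("cxl_mem0", ["scrub"]):
        print(path)
    for path in by_parent_layout("cxl_mem0", ["scrub", "other_ras_thing"]):
        print(path)
```

With the second option, adding another RAS feature for the same device
just adds one more entry under that device's directory.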

Where I've used the symlink approach in the past, it has always
been about keeping a legacy interface in place, not where I'd start
with something new. Hence I think this is a question of how far
we break away from the existing edac structure.



> 
> > These scrub0 would have their dev->parent set to who ever actually
> > registered them providing that reference cleanly and letting all the
> > normal device model stuff work more simply.  
> 
> Ack.

This suggests the second option above, but I wanted to confirm as Shiju
and I read this differently.
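
As a rough illustration of why a correctly set dev->parent is useful to
userspace: entries under /sys/bus/<bus>/devices/ are symlinks into the
device tree, so the providing device can be found by resolving the link
and taking its parent directory. The Python sketch below does that on a
throwaway directory tree that only mimics the shape of sysfs. The names
scrub0 and cxl_mem0 and both helper functions are hypothetical.

```python
import os
import tempfile


def providing_parent(bus_link):
    """Resolve a bus 'devices' symlink and return its parent device directory."""
    return os.path.dirname(os.path.realpath(bus_link))


def demo():
    # Build a temporary tree shaped like sysfs; real sysfs is not touched.
    root = tempfile.mkdtemp()
    dev = os.path.join(root, "devices", "cxl_mem0", "scrub0")
    bus = os.path.join(root, "bus", "edac", "devices")
    os.makedirs(dev)
    os.makedirs(bus)
    link = os.path.join(bus, "scrub0")
    os.symlink(dev, link)
    # The parent directory name identifies who registered the scrub device.
    return os.path.basename(providing_parent(link))


if __name__ == "__main__":
    print(demo())
```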

> 
> > If we did that with the scrub nodes, the only substantial change from
> > a separate subsystem as seen in this patch set would be to register
> > them on the edac bus rather than a separate class.
> > 
> > As you pointed out, there is a simple scrub interface in the existing
> > edac memory controller code. How would you suggest handling that?
> > Have them all register an additional device on the bus (as a child
> > of the mcX devices) perhaps?  Seems an easy step forwards and should
> > be no backwards compatibility concerns.  
> 
> Well, you guys want to control that scrubbing from userspace and those
> old things probably do not fit that model? We could just not convert
> them for now and add them later if really needed. I.e., leave sleeping
> dogs lie.

OK. There is an existing minimal sysfs interface, but I'm
fine with ignoring it for now.
 
> 
> > It absolutely doesn't as long as we can do it fairly cleanly within
> > existing code. I wasn't sure that was possible, but you know edac
> > a lot better than me and so I'll defer to you on that!  
> 
> Meh, I'm simply maintaining it because no one else wants to. :)

*much sympathy!*  As we ramp up more on this stuff, we'll try and
help out where we can.

> 
> > Several options for that, but fair question - bringing (at least some of)
> > the RAS mess together will focus reviewer bandwidth etc better.  
> 
> Review is more than appreciated, as always.
> 
> > I'm definitely keen on unifying things as I agree, this mixture of different
> > RAS functionality is a ever worsening mess.  
> 
> Yap, it needs to be unified and reigned into something more
> user-friendly and manageable.

Hopefully we all agree on a unified solution being the target.

Feels like we are converging. Now we are down to the details :)

Thanks,

Jonathan

> 
> Thx.
> 


