linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Shiju Jose <shiju.jose@huawei.com>
To: Dan Williams <dan.j.williams@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"Vilas.Sridharan@amd.com" <Vilas.Sridharan@amd.com>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	"lenb@kernel.org" <lenb@kernel.org>,
	"mchehab@kernel.org" <mchehab@kernel.org>,
	"leo.duran@amd.com" <leo.duran@amd.com>,
	"Yazen.Ghannam@amd.com" <Yazen.Ghannam@amd.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"jiaqiyan@google.com" <jiaqiyan@google.com>,
	"Jon.Grimm@amd.com" <Jon.Grimm@amd.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"naoya.horiguchi@nec.com" <naoya.horiguchi@nec.com>,
	"james.morse@arm.com" <james.morse@arm.com>,
	"jthoughton@google.com" <jthoughton@google.com>,
	"somasundaram.a@hpe.com" <somasundaram.a@hpe.com>,
	"erdemaktas@google.com" <erdemaktas@google.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"duenwen@google.com" <duenwen@google.com>,
	"gthelen@google.com" <gthelen@google.com>,
	"wschwartz@amperecomputing.com" <wschwartz@amperecomputing.com>,
	"dferguson@amperecomputing.com" <dferguson@amperecomputing.com>,
	"wbs@os.amperecomputing.com" <wbs@os.amperecomputing.com>,
	"nifan.cxl@gmail.com" <nifan.cxl@gmail.com>,
	tanxiaofei <tanxiaofei@huawei.com>,
	"Zengtao (B)" <prime.zeng@hisilicon.com>,
	Roberto Sassu <roberto.sassu@huawei.com>,
	"kangkang.shen@futurewei.com" <kangkang.shen@futurewei.com>,
	wanghuiqiang <wanghuiqiang@huawei.com>,
	Linuxarm <linuxarm@huawei.com>
Subject: RE: [PATCH v2 3/8] cxl/edac: Add CXL memory device patrol scrub control feature
Date: Fri, 28 Mar 2025 10:18:20 +0000	[thread overview]
Message-ID: <d75c574e26d94aa9b398acfca0ecac9d@huawei.com> (raw)
In-Reply-To: <67e4ae1cb0cdd_152c29462@dwillia2-mobl3.amr.corp.intel.com.notmuch>

>-----Original Message-----
>From: Dan Williams <dan.j.williams@intel.com>
>Sent: 27 March 2025 01:47
>To: Shiju Jose <shiju.jose@huawei.com>; linux-cxl@vger.kernel.org;
>dan.j.williams@intel.com; dave@stgolabs.net; Jonathan Cameron
><jonathan.cameron@huawei.com>; dave.jiang@intel.com;
>alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com;
>david@redhat.com; Vilas.Sridharan@amd.com
>Cc: linux-edac@vger.kernel.org; linux-acpi@vger.kernel.org; linux-
>mm@kvack.org; linux-kernel@vger.kernel.org; bp@alien8.de;
>tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org;
>mchehab@kernel.org; leo.duran@amd.com; Yazen.Ghannam@amd.com;
>rientjes@google.com; jiaqiyan@google.com; Jon.Grimm@amd.com;
>dave.hansen@linux.intel.com; naoya.horiguchi@nec.com;
>james.morse@arm.com; jthoughton@google.com; somasundaram.a@hpe.com;
>erdemaktas@google.com; pgonda@google.com; duenwen@google.com;
>gthelen@google.com; wschwartz@amperecomputing.com;
>dferguson@amperecomputing.com; wbs@os.amperecomputing.com;
>nifan.cxl@gmail.com; tanxiaofei <tanxiaofei@huawei.com>; Zengtao (B)
><prime.zeng@hisilicon.com>; Roberto Sassu <roberto.sassu@huawei.com>;
>kangkang.shen@futurewei.com; wanghuiqiang <wanghuiqiang@huawei.com>;
>Linuxarm <linuxarm@huawei.com>; Shiju Jose <shiju.jose@huawei.com>
>Subject: Re: [PATCH v2 3/8] cxl/edac: Add CXL memory device patrol scrub
>control feature
>
>shiju.jose@ wrote:
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
[...]
>> diff --git a/drivers/cxl/core/edac.c b/drivers/cxl/core/edac.c new
>> file mode 100644 index 000000000000..5ec3535785e1
>> --- /dev/null
>> +++ b/drivers/cxl/core/edac.c
>> @@ -0,0 +1,474 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * CXL EDAC memory feature driver.
>> + *
>> + * Copyright (c) 2024-2025 HiSilicon Limited.
>> + *
>> + *  - Supports functions to configure EDAC features of the
>> + *    CXL memory devices.
>> + *  - Registers with the EDAC device subsystem driver to expose
>> + *    the features sysfs attributes to the user for configuring
>> + *    CXL memory RAS feature.
>> + */
>> +
>> +#include <linux/cleanup.h>
>> +#include <linux/edac.h>
>> +#include <linux/limits.h>
>> +#include <cxl/features.h>
>> +#include <cxl.h>
>> +#include <cxlmem.h>
>> +#include "core.h"
>> +
>> +#define CXL_NR_EDAC_DEV_FEATURES 1
>> +
>> +static struct rw_semaphore *cxl_acquire(struct rw_semaphore *rwsem) {
>> +	if (down_read_interruptible(rwsem))
>> +		return NULL;
>> +
>> +	return rwsem;
>> +}
>> +
>> +DEFINE_FREE(cxl_unlock, struct rw_semaphore *, if (_T) up_read(_T))
>
>I know I suggested cxl_acquire() and cxl_unlock(), but this really is a generic
>facility.
>
>Let's call it rwsem_read_intr_acquire() and rwsem_read_release(), and I'll
>follow up later with Peter to see if he wants this to graduate from CXL.
>
>Also, go ahead and define it in cxl.h for now as I think other places in the
>subsystem could benefit from this approach.

Hi Dan,

Thanks for the comments.
Sure. should these definitions in cxl.h  require in a separate patch?

>
>> +
>> +/*
>> + * CXL memory patrol scrub control
>> + */
>> +struct cxl_patrol_scrub_context {
>
>I like "patrol_scrub" spelled out here compared to "ps" used everywhere else.
Will change.

>
>> +	u8 instance;
>> +	u16 get_feat_size;
>> +	u16 set_feat_size;
>> +	u8 get_version;
>> +	u8 set_version;
>> +	u16 effects;
>> +	struct cxl_memdev *cxlmd;
>> +	struct cxl_region *cxlr;
>> +};
>> +
>> +/**
>> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data
>structure.
>> + * @enable:     [IN & OUT] enable(1)/disable(0) patrol scrub.
>> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is
>changeable.
>> + * @scrub_cycle_hrs:    [IN] Requested patrol scrub cycle in hours.
>> + *                      [OUT] Current patrol scrub cycle in hours.
>> + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours
>supported.
>> + */
>> +struct cxl_memdev_ps_params {
>> +	bool enable;
>> +	bool scrub_cycle_changeable;
>
>This is set but unused. Even if it were to be used I would expect it to be set in the
>cxl_patrol_scrub_context.
I will add  to cxl_patrol_scrub_context and will add an extra check against this when
user request to change the scrub rate.
>
>> +	u8 scrub_cycle_hrs;
>> +	u8 min_scrub_cycle_hrs;
>> +};
>
>I do not understand the point of this extra object and would prefer to keep
>intermediate data structures to a minimum.
>
>It looks like all this does is provide for short lived parsed caching of the raw
>hardware patrol scrube attributes. Just pass those raw objects around and
>provide helpers to do the conversion.

Sure. Will do. Was added to avoid more number of function parameters. 
>
>The less data structures the less confusion for the next person that has to read
>this code a few years down the road.
>
>> +
>> +enum cxl_scrub_param {
>> +	CXL_PS_PARAM_ENABLE,
>> +	CXL_PS_PARAM_SCRUB_CYCLE,
>> +};
>
>This seems unforuntate, why not make non-zero scrub rate an implied enable
>and zero to disable? A non-zero sentinel value like U32_MAX can indicate "keep
>scrub rate unchanged".

These  enums can be removed with remove using cxl_memdev_ps_params.

>
>> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
>
>This CXL_MEMDEV_PS prefix is awkward due to overload with 'struct
>cxl_memdev'. Minimize it to something like:
>
>CXL_SCRUB_CONTROL_CHANGEABLE
>CXL_SCRUB_CONTROL_REALTIME
>CXL_SCRUB_CONTROL_CYCLE_MASK
>CXL_SCRUB_CONTROL_MIN_CYCLE_MASK

Will change.
>
>> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK
>BIT(1)
>> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
>#define
>> +CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8) #define
>> +CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
>
>CXL_SCRUB_CONTROL_ENABLE
>
>...no need to call it a mask when it is just a single-bit, and when it is both the
>status and the control just call it "enable".

Sure. Will change.
>
>> +
>> +/*
>> + * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-222 Device Patrol
>> +Scrub Control
>> + * Feature Readable Attributes.
>> + */
>> +struct cxl_memdev_ps_rd_attrs {
>> +	u8 scrub_cycle_cap;
>> +	__le16 scrub_cycle_hrs;
>
>"hours" is just 2 more characters than "hrs", I think we can afford the extra
>bytes.

Will change.
>
>[..]
>> +int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd) {
>> +	struct edac_dev_feature ras_features[CXL_NR_EDAC_DEV_FEATURES];
>> +	int num_ras_features = 0;
>> +	u8 scrub_inst = 0;
>> +	int rc;
>> +
>> +	rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
>> +				   scrub_inst);
>> +	if (rc < 0 && rc != -EOPNOTSUPP)
>> +		return rc;
>> +
>> +	if (rc != -EOPNOTSUPP)
>> +		num_ras_features++;
>> +
>> +	char *cxl_dev_name __free(kfree) =
>> +		kasprintf(GFP_KERNEL, "cxl_%s", dev_name(&cxlmd->dev));
>
>if (!cxl_dev_name)
>	return -ENOMEM;

Will add.
>
>> +
>> +	return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
>> +				 num_ras_features, ras_features); }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_edac_register, "CXL");
>> +
>> +int devm_cxl_region_edac_register(struct cxl_region *cxlr) {
>> +	struct edac_dev_feature ras_features[CXL_NR_EDAC_DEV_FEATURES];
>> +	int num_ras_features = 0;
>> +	u8 scrub_inst = 0;
>> +	int rc;
>> +
>> +	rc = cxl_region_scrub_init(cxlr, &ras_features[num_ras_features],
>> +				   scrub_inst);
>> +	if (rc < 0)
>> +		return rc;
>> +
>> +	num_ras_features++;
>> +
>> +	char *cxl_dev_name __free(kfree) =
>> +		kasprintf(GFP_KERNEL, "cxl_%s", dev_name(&cxlr->dev));
>> +
>> +	return edac_dev_register(&cxlr->dev, cxl_dev_name, NULL,
>> +				 num_ras_features, ras_features); }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_region_edac_register, "CXL");
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index b3260d433ec7..2aa6eb675fdf 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3542,6 +3542,11 @@ static int cxl_region_probe(struct device *dev)
>>  	case CXL_PARTMODE_PMEM:
>>  		return devm_cxl_add_pmem_region(cxlr);
>>  	case CXL_PARTMODE_RAM:
>> +		rc = devm_cxl_region_edac_register(cxlr);
>
>Why do only volatile regions get EDAC support? PMEM patrol scrub seems
>equally valid.

Will add devm_cxl_region_edac_register () for PMEM as well.

Thanks,
Shiju


  reply	other threads:[~2025-03-28 10:18 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20 18:04 [PATCH v2 0/8] cxl: support CXL memory RAS features shiju.jose
2025-03-20 18:04 ` [PATCH v2 1/8] cxl: Add helper function to retrieve a feature entry shiju.jose
2025-03-26 21:32   ` Dan Williams
2025-03-27 16:59     ` Shiju Jose
2025-03-31 23:35       ` Dan Williams
2025-03-20 18:04 ` [PATCH v2 2/8] EDAC: Update documentation for the CXL memory patrol scrub control feature shiju.jose
2025-03-21 10:03   ` Jonathan Cameron
2025-03-24 10:37     ` Shiju Jose
2025-03-26 21:46     ` Dan Williams
2025-03-20 18:04 ` [PATCH v2 3/8] cxl/edac: Add CXL memory device " shiju.jose
2025-03-27  1:47   ` Dan Williams
2025-03-28 10:18     ` Shiju Jose [this message]
2025-03-20 18:04 ` [PATCH v2 4/8] cxl/edac: Add CXL memory device ECS " shiju.jose
2025-03-27 17:12   ` Dan Williams
2025-03-20 18:04 ` [PATCH v2 5/8] cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command shiju.jose
2025-03-27 17:23   ` Dan Williams
2025-03-20 18:04 ` [PATCH v2 6/8] cxl: Support for finding memory operation attributes from the current boot shiju.jose
2025-03-27 17:43   ` Dan Williams
2025-03-20 18:04 ` [PATCH v2 7/8] cxl/memfeature: Add CXL memory device soft PPR control feature shiju.jose
2025-03-27 17:01   ` Borislav Petkov
2025-03-27 17:08     ` Borislav Petkov
2025-03-28 13:05       ` Shiju Jose
2025-03-20 18:04 ` [PATCH v2 8/8] cxl/memfeature: Add CXL memory device memory sparing " shiju.jose
2025-03-21  7:39 ` [PATCH v2 0/8] cxl: support CXL memory RAS features Christophe Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d75c574e26d94aa9b398acfca0ecac9d@huawei.com \
    --to=shiju.jose@huawei.com \
    --cc=Jon.Grimm@amd.com \
    --cc=Vilas.Sridharan@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dferguson@amperecomputing.com \
    --cc=duenwen@google.com \
    --cc=erdemaktas@google.com \
    --cc=gthelen@google.com \
    --cc=ira.weiny@intel.com \
    --cc=james.morse@arm.com \
    --cc=jiaqiyan@google.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=jthoughton@google.com \
    --cc=kangkang.shen@futurewei.com \
    --cc=lenb@kernel.org \
    --cc=leo.duran@amd.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxarm@huawei.com \
    --cc=mchehab@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=nifan.cxl@gmail.com \
    --cc=pgonda@google.com \
    --cc=prime.zeng@hisilicon.com \
    --cc=rafael@kernel.org \
    --cc=rientjes@google.com \
    --cc=roberto.sassu@huawei.com \
    --cc=somasundaram.a@hpe.com \
    --cc=tanxiaofei@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=vishal.l.verma@intel.com \
    --cc=wanghuiqiang@huawei.com \
    --cc=wbs@os.amperecomputing.com \
    --cc=wschwartz@amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox