linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] cxl/pci: Set default timeout for background operations
@ 2024-02-07 10:53 sthanneeru.opensrc
  2024-02-15  3:31 ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: sthanneeru.opensrc @ 2024-02-07 10:53 UTC (permalink / raw)
  To: linux-cxl, linux-mm, sthanneeru.opensrc
  Cc: Jonathan.Cameron, dan.j.williams, john, emirakhur, ajayjoshi,
	Ravis.OpenSrc, sthanneeru

From: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>

The CXL 3.0 specification outlines background operations,
and support for handling these operations was added in following patch.

Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/

Mailbox commands like ‘Log Populate’ use background operations
to complete the execution of the command.
This can lead to a timeout, since there is currently no option
in the ioctl cxl_send_command structure to specify
a timeout value. The default values being zero can lead
to the driver reporting false negatives to the application.

This patch aims to establish default values, enabling mailbox commands
that operate in the background to continue functioning even
if a timeout is not set in the userspace application.

Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
---
 drivers/cxl/pci.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4fd1f207c84e..82bdff55bb8d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -40,6 +40,10 @@
 /* CXL 2.0 - 8.2.8.4 */
 #define CXL_MAILBOX_TIMEOUT_MS (2 * HZ)
 
+/* Default timeout for background operations */
+#define CXL_RC_BG_POLL_CNT		40
+#define CXL_RC_BG_POLL_INTERVAL_MS	1000
+
 /*
  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
  * dictate how long to wait for the mailbox to become ready. The new
@@ -312,6 +316,13 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
 
 		dev_dbg(dev, "Mailbox background operation (0x%04x) started\n",
 			mbox_cmd->opcode);
+		/*
+		 * Set default background operation timeout, if not set already.
+		 */
+		if (!mbox_cmd->poll_count)
+			mbox_cmd->poll_count = CXL_RC_BG_POLL_CNT;
+		if (!mbox_cmd->poll_interval_ms)
+			mbox_cmd->poll_interval_ms = CXL_RC_BG_POLL_INTERVAL_MS;
 
 		timeout = mbox_cmd->poll_interval_ms;
 		for (i = 0; i < mbox_cmd->poll_count; i++) {
-- 
2.43.0



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] cxl/pci: Set default timeout for background operations
  2024-02-07 10:53 [RFC PATCH] cxl/pci: Set default timeout for background operations sthanneeru.opensrc
@ 2024-02-15  3:31 ` Dan Williams
  2024-02-15 12:34   ` Jonathan Cameron
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2024-02-15  3:31 UTC (permalink / raw)
  To: sthanneeru.opensrc, linux-cxl, linux-mm
  Cc: Jonathan.Cameron, dan.j.williams, john, emirakhur, ajayjoshi,
	Ravis.OpenSrc, sthanneeru

sthanneeru.opensrc@ wrote:
> From: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
> 
> The CXL 3.0 specification outlines background operations,
> and support for handling these operations was added in following patch.
> 
> Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/
> 
> Mailbox commands like ‘Log Populate’ use background operations
> to complete the execution of the command.
> This can lead to a timeout, since there is currently no option
> in the ioctl cxl_send_command structure to specify
> a timeout value. The default values being zero can lead
> to the driver reporting false negatives to the application.
> 
> This patch aims to establish default values, enabling mailbox commands
> that operate in the background to continue functioning even
> if a timeout is not set in the userspace application.

The reason there are no defaults is because userspace is not allowed to
issue background commands. The CXL background command definition is
awkward in that it allows a single command to monopolize the mailbox for
an indefinite amount of time.

Instead, the approach taken with Firmware Update and Sanitize is that a
kernel sysfs ABI mediates access to the mailbox and facilitates bounded 
timeslices between command submissions. It effectively allows the kernel
to manage fairness and more importantly preempt userspace if it needs to
issue its own commands.

I assume you are only seeing this lack of a default due to building with
CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel
is mostly taken out of the loop of saving userspace from itself.

All that said, ugh, "Log Populate" has no facility to time bound the
population of the log. I do not think it is tenable for Linux to
surrender mailbox access for an indefinite uninterruptible amount of
time... unless you want to handle "Log Populate" like Sanitize where the
unbounded background operation is tolerated because the device is taken
offline?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] cxl/pci: Set default timeout for background operations
  2024-02-15  3:31 ` Dan Williams
@ 2024-02-15 12:34   ` Jonathan Cameron
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Cameron @ 2024-02-15 12:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: sthanneeru.opensrc, linux-cxl, linux-mm, john, emirakhur,
	ajayjoshi, Ravis.OpenSrc, sthanneeru

On Wed, 14 Feb 2024 19:31:23 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> sthanneeru.opensrc@ wrote:
> > From: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
> > 
> > The CXL 3.0 specification outlines background operations,
> > and support for handling these operations was added in following patch.
> > 
> > Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/
> > 
> > Mailbox commands like ‘Log Populate’ use background operations
> > to complete the execution of the command.
> > This can lead to a timeout, since there is currently no option
> > in the ioctl cxl_send_command structure to specify
> > a timeout value. The default values being zero can lead
> > to the driver reporting false negatives to the application.
> > 
> > This patch aims to establish default values, enabling mailbox commands
> > that operate in the background to continue functioning even
> > if a timeout is not set in the userspace application.  
> 
> The reason there are no defaults is because userspace is not allowed to
> issue background commands. The CXL background command definition is
> awkward in that it allows a single command to monopolize the mailbox for
> an indefinite amount of time.
> 
> Instead, the approach taken with Firmware Update and Sanitize is that a
> kernel sysfs ABI mediates access to the mailbox and facilitates bounded 
> timeslices between command submissions. It effectively allows the kernel
> to manage fairness and more importantly preempt userspace if it needs to
> issue its own commands.
> 
> I assume you are only seeing this lack of a default due to building with
> CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel
> is mostly taken out of the loop of saving userspace from itself.
> 
> All that said, ugh, "Log Populate" has no facility to time bound the
> population of the log. I do not think it is tenable for Linux to
> surrender mailbox access for an indefinite uninterruptible amount of
> time... unless you want to handle "Log Populate" like Sanitize where the
> unbounded background operation is tolerated because the device is taken
> offline?

It may be pointless to do a component state dump only on an offline device.
My assumption is this one is hardware debug only.  Patches out of
tree or behind a really scary sounding config variable perhaps?
Other than vendor log I don't think populate log applies to the other
log types yet (they don't mention it anyway!)

Jonathan





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] cxl/pci: Set default timeout for background operations
  2024-09-02  6:42 Shinas Rasheed
@ 2024-09-13 11:52 ` Jonathan Cameron
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Cameron @ 2024-09-13 11:52 UTC (permalink / raw)
  To: Shinas Rasheed
  Cc: dan.j.williams, Ravis.OpenSrc, ajayjoshi, emirakhur, john,
	linux-cxl, linux-mm, sthanneeru.opensrc, sthanneeru, Haseeb Gani,
	Sathesh B Edara, Srinivasa Rao Jampala, Shariful Alam,
	Vimlesh Kumar, Dharmesh Pitroda

On Mon, 2 Sep 2024 06:42:34 +0000
Shinas Rasheed <srasheed@marvell.com> wrote:

> Hi,

Hi Shinas,

Note this is my view. Opinions may differ!

> 
> Trying to revive this discussion about background commands. I'm sorry if more on the topic has been discussed elsewhere, 
> but this is a patch that I could find which related to the same topic.
> 
> The requirement for default timeouts (or atleast a mechanism to provide timeouts to the command structure) seems to be needed
> since the CXL spec doesn't seem to explicitly deny userspace access to background commands.

The CXL spec (quite rightly) doesn't speak at all about providing userspace or other
access to anything. This is purely an OS implementation question so
this is the right place to address it for Linux.

It would be perfectly valid for an OS to mediate fully every command
including all vendor defined ones. If there is a command it doesn't
know about, default to blocking it.

There is a path to enable 'some' background commands from userspace
but only the ones that support abort.


> This seems to constrain a lot of
> functionality with respect to Vendor Defined commands as well, if a command needs to be implemented as background.

Indeed. If someone has a vendor defined command that locks the device up for
a long period, then the kernel may well have a quirk in future to
block that command (if we enable vendor commands from userspace
in general - see the fwctl discussion).  That may be a necessary precaution
to avoid the kernel locking up for substantial periods.



> 
> >> sthanneeru.opensrc@ wrote:  
> >> > From: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
> >> > 
> >> > The CXL 3.0 specification outlines background operations,
> >> > and support for handling these operations was added in following patch.
> >> > 
> >> > Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/
> >> > 
> >> > Mailbox commands like ‘Log Populate’ use background operations
> >> > to complete the execution of the command.
> >> > This can lead to a timeout, since there is currently no option
> >> > in the ioctl cxl_send_command structure to specify
> >> > a timeout value. The default values being zero can lead
> >> > to the driver reporting false negatives to the application.
> >> > 
> >> > This patch aims to establish default values, enabling mailbox commands
> >> > that operate in the background to continue functioning even
> >> > if a timeout is not set in the userspace application.    
> >> 
> >> The reason there are no defaults is because userspace is not allowed to
> >> issue background commands. The CXL background command definition is
> >> awkward in that it allows a single command to monopolize the mailbox for
> >> an indefinite amount of time.  
> 
> I understand this is a design adopted by the kernel, and not actually by the CXL spec (at least I don't see it
> specifying explicitly that background commands are not allowed in band).

(I see you touch on this below)
The background commands are 1 at a time.  Given some background commands
are used by the kernel, the ability to run non background commands at the
same time doesn't help us in the general case.  Hence proposal
is to only allow userspace use of commands that may be cancelled.
If the kernel wants to use the mailbox for a background command and
one is already being processed by the device, the kernel can just cancel it.

> 
> >> 
> >> Instead, the approach taken with Firmware Update and Sanitize is that a
> >> kernel sysfs ABI mediates access to the mailbox and facilitates bounded 
> >> timeslices between command submissions. It effectively allows the kernel
> >> to manage fairness and more importantly preempt userspace if it needs to
> >> issue its own commands.
> >> 
> >> I assume you are only seeing this lack of a default due to building with
> >> CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel
> >> is mostly taken out of the loop of saving userspace from itself.  
> 
> This seems to imply that no vendor defined command can be background commands. Please correct me if
> I'm wrong, since Vendor defined commands have to be sent as raw commands, and the absence of default timeouts
> indicate that the command will be immediately timed out.

The fwctl discussion may provide a path for this, but I think
that will still only support commands we can abort.

> 
> Also, I understand the point that background commands (since there can only be one) monopolizes the
> mailbox, and hence cannot be exposed to userspace and as I see the background command handling goes synchronously
> with the usual command handling in the kernel (ie; foreground command has to wait for the background command to complete or timeout)
> but doesn't this defeat the purpose of the spec defined ability to run a foreground command while a background command is
> underway?

It is a valid optimization to run non background commands. If there
is a clear use case then I think we'd be happy to see patches.
As you say though, that doesn't enable userspace background comamnds
though as some of the commands the kernel uses are background as well
so we still need to be able to abort the ones userspace issued.

> 
> >> 
> >> All that said, ugh, "Log Populate" has no facility to time bound the
> >> population of the log. I do not think it is tenable for Linux to
> >> surrender mailbox access for an indefinite uninterruptible amount of
> >> time... unless you want to handle "Log Populate" like Sanitize where the
> >> unbounded background operation is tolerated because the device is taken
> >> offline?  
> >
> >It may be pointless to do a component state dump only on an offline device.
> >My assumption is this one is hardware debug only.  Patches out of
> >tree or behind a really scary sounding config variable perhaps?
> >Other than vendor log I don't think populate log applies to the other
> >log types yet (they don't mention it anyway!)
> >
> >Jonathan  
> 
> Am I wrong to understand that this design decision is still in the open right now?
> 
> Another concern was that right now, a WARN_ONCE is issued when a raw command is ensued. Maybe this belongs
> on a different thread, but are raw commands to be depreciated (if then how can Vendor Defined commands be sent?)
See fwctl discussion for Vendor Defined command.
Right now Dave has focused on Get / Set feature but we'll expand to others.
The FWCTL area in general is probably going to be discussed at the various
conferences in Vienna next week, so for now this is a 'watch this space'.

We did discuss the option of possibly supporting vendor defined commands explicitly
if some or all of these conditions were met.
1) There was clear documentation.
2) Possibly a proposal for similar feature in the CXL specification or another
   industry spec.  (to ensure we don't end supporting a never ending sequence of
   similar vendor specific commands)

In a sense (1) means they are a defacto standard so we can be sure
of side effects etc and things like how long they take.

Jonathan

> 
> 



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH] cxl/pci: Set default timeout for background operations
@ 2024-09-02  6:42 Shinas Rasheed
  2024-09-13 11:52 ` Jonathan Cameron
  0 siblings, 1 reply; 5+ messages in thread
From: Shinas Rasheed @ 2024-09-02  6:42 UTC (permalink / raw)
  To: jonathan.cameron, dan.j.williams
  Cc: Ravis.OpenSrc, ajayjoshi, emirakhur, john, linux-cxl, linux-mm,
	sthanneeru.opensrc, sthanneeru, Haseeb Gani, Sathesh B Edara,
	Srinivasa Rao Jampala, Shariful Alam, Vimlesh Kumar,
	Dharmesh Pitroda

Hi,

Trying to revive this discussion about background commands. I'm sorry if more on the topic has been discussed elsewhere, 
but this is a patch that I could find which related to the same topic.

The requirement for default timeouts (or atleast a mechanism to provide timeouts to the command structure) seems to be needed
since the CXL spec doesn't seem to explicitly deny userspace access to background commands. This seems to constrain a lot of
functionality with respect to Vendor Defined commands as well, if a command needs to be implemented as background.

>> sthanneeru.opensrc@ wrote:
>> > From: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com>
>> > 
>> > The CXL 3.0 specification outlines background operations,
>> > and support for handling these operations was added in following patch.
>> > 
>> > Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@stgolabs.net/
>> > 
>> > Mailbox commands like ‘Log Populate’ use background operations
>> > to complete the execution of the command.
>> > This can lead to a timeout, since there is currently no option
>> > in the ioctl cxl_send_command structure to specify
>> > a timeout value. The default values being zero can lead
>> > to the driver reporting false negatives to the application.
>> > 
>> > This patch aims to establish default values, enabling mailbox commands
>> > that operate in the background to continue functioning even
>> > if a timeout is not set in the userspace application.  
>> 
>> The reason there are no defaults is because userspace is not allowed to
>> issue background commands. The CXL background command definition is
>> awkward in that it allows a single command to monopolize the mailbox for
>> an indefinite amount of time.

I understand this is a design adopted by the kernel, and not actually by the CXL spec (at least I don't see it
specifying explicitly that background commands are not allowed in band).

>> 
>> Instead, the approach taken with Firmware Update and Sanitize is that a
>> kernel sysfs ABI mediates access to the mailbox and facilitates bounded 
>> timeslices between command submissions. It effectively allows the kernel
>> to manage fairness and more importantly preempt userspace if it needs to
>> issue its own commands.
>> 
>> I assume you are only seeing this lack of a default due to building with
>> CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel
>> is mostly taken out of the loop of saving userspace from itself.

This seems to imply that no vendor defined command can be background commands. Please correct me if
I'm wrong, since Vendor defined commands have to be sent as raw commands, and the absence of default timeouts
indicate that the command will be immediately timed out.

Also, I understand the point that background commands (since there can only be one) monopolizes the
mailbox, and hence cannot be exposed to userspace and as I see the background command handling goes synchronously
with the usual command handling in the kernel (ie; foreground command has to wait for the background command to complete or timeout)
but doesn't this defeat the purpose of the spec defined ability to run a foreground command while a background command is
underway?

>> 
>> All that said, ugh, "Log Populate" has no facility to time bound the
>> population of the log. I do not think it is tenable for Linux to
>> surrender mailbox access for an indefinite uninterruptible amount of
>> time... unless you want to handle "Log Populate" like Sanitize where the
>> unbounded background operation is tolerated because the device is taken
>> offline?
>
>It may be pointless to do a component state dump only on an offline device.
>My assumption is this one is hardware debug only.  Patches out of
>tree or behind a really scary sounding config variable perhaps?
>Other than vendor log I don't think populate log applies to the other
>log types yet (they don't mention it anyway!)
>
>Jonathan

Am I wrong to understand that this design decision is still in the open right now?

Another concern was that right now, a WARN_ONCE is issued when a raw command is ensued. Maybe this belongs
on a different thread, but are raw commands to be depreciated (if then how can Vendor Defined commands be sent?)



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-09-13 11:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-07 10:53 [RFC PATCH] cxl/pci: Set default timeout for background operations sthanneeru.opensrc
2024-02-15  3:31 ` Dan Williams
2024-02-15 12:34   ` Jonathan Cameron
2024-09-02  6:42 Shinas Rasheed
2024-09-13 11:52 ` Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox