From: "Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com>
To: Adam Manzanares <a.manzanares@samsung.com>
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
Dan Williams <dan.j.williams@intel.com>,
Cong Wang <cong.wang@bytedance.com>,
Viacheslav Dubeyko <slava@dubeyko.com>
Subject: Re: [External] [LSF/MM/BPF TOPIC] CXL Fabric Manager (FM) architecture
Date: Wed, 8 Feb 2023 10:03:57 -0800
Message-ID: <7E864E85-A36F-487B-8B70-C8C49FBECD73@bytedance.com>
In-Reply-To: <20230208163844.GA407917@bgt-140510-bm01>
> On Feb 8, 2023, at 8:38 AM, Adam Manzanares <a.manzanares@samsung.com> wrote:
>
> On Thu, Feb 02, 2023 at 09:54:02AM +0000, Jonathan Cameron wrote:
>> On Wed, 1 Feb 2023 12:04:56 -0800
>> "Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com> wrote:
>>
>>>>
<skipped>
>>>
>>> Most probably, we will have multiple FM implementations in firmware.
>>> Yes, an FM on the host could be important for debugging and for verifying the
>>> correctness of firmware-based implementations. But an FM daemon on the host could
>>> also be important for receiving notifications and reacting to these events. Also,
>>> journalling of events/messages could be an important responsibility of the FM daemon
>>> on the host.
>>
>> I agree with an FM daemon somewhere (potentially running on the BMC type chip
>> that also has the lower level FM-API access). I think it is somewhat
>> separate from the rest of this on basis it may well just be talking redfish
>> to the FM and there are lots of tools for that sort of handling already.
>>
>
> I would be interested in participating in a BOF about this topic. I wonder what
> happens when we have multiple switches with multiple FMs each on a separate BMC.
> In this case, does it make more sense to have an owner of the global FM state
> be a user space application? Is this the job of the orchestrator?
>
> The BMC based FM seems to have scalability issues, but will we hit them in
> practice any time soon?

I had a discussion about this recently, and a few interesting points came up:
(1) If we have multiple CXL switches (especially in a complex hierarchy), then fabric
management becomes a very compute-intensive activity. So, potentially, an FM on the
firmware side might not be capable of handling all of its responsibilities without
performance degradation.
(2) However, if we have the FM on the host side, then there are security concerns because
the FM sees everything, including all details of multiple hosts and subsystems.
(3) Technically speaking, there is one potential capability: a user-space FM daemon
could run on the host side as well as on the CXL switch side. I mean that if we implement
a user-space FM daemon, then it could also be used to execute FM functionality on the
CXL switch side (maybe????). :)

<skipped>
>>>>> - Manage surprise removal of devices
>>>>
>>>> Likewise, beyond reporting I wouldn't expect the FM daemon to have any idea
>>>> what to do in the way of managing this. Scream loudly?
>>>>
>>>
>>> Maybe it could require notifying the application(s). Let’s imagine that an application
>>> uses some resources from the removed device. Maybe the FM can manage correction of
>>> kernel-space metadata and help to handle application requests to non-existent
>>> entities.
>>
>> Notifications for the host are likely to come via inband means - so type3 driver
>> handling rather than related to FM. As far as the host is concerned this is the
>> same as case where there is no FM and someone ripped a device out.
>>
>> There might indeed be meta data to manage, but doubt it will have anything to
>> do with kernel.
>>
>
> I've also had similar thoughts, I think the OS responds to notifications that
> are generated in-band after changes to the state of the FM are made through
> OOB means.
>
> I envision the host sends REDFISH requests to a switch BMC that has an FM
> implementation. Once the changes are implemented by the FM it would show up
> as changes to the PCIe hierarchy on a host, which is capable of responding to
> such changes.
>
I think I don't completely follow your point. :) First of all, I assume that if the host
sends a Redfish request, then it will expect a confirmation that the request was executed.
To me, this means that the host needs to receive some packet reporting whether the request
succeeded or failed. It also means that some subsystem or application requested
this change, and the requested capabilities can be used only after the confirmation arrives.
And if the FM is on the CXL switch side, then how will the FM show the changes? It sounds to me
like some FM subsystem should be on the host side to receive the confirmation/notification
and to execute the real changes in the PCIe hierarchy. Am I missing something here?

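To make the question concrete, here is a minimal sketch of the request/confirmation
flow described above. It is only an illustration under stated assumptions: the names
SwitchFM, HostAgent, Confirmation and request_memory are hypothetical, the capacity
accounting is invented for the example, and nothing here models a real Redfish or
FM-API interface.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OK = "ok"
    FAILED = "failed"


@dataclass
class Confirmation:
    """Hypothetical reply sent back to the requester."""
    request_id: int
    status: Status


class SwitchFM:
    """Hypothetical stand-in for an FM on the CXL switch/BMC side."""

    def __init__(self, free_capacity_gb: int):
        self.free_capacity_gb = free_capacity_gb

    def handle(self, request_id: int, capacity_gb: int) -> Confirmation:
        # The FM either applies the whole change or rejects it.
        if capacity_gb <= self.free_capacity_gb:
            self.free_capacity_gb -= capacity_gb
            return Confirmation(request_id, Status.OK)
        return Confirmation(request_id, Status.FAILED)


@dataclass
class HostAgent:
    """Hypothetical host-side component that waits for the confirmation."""
    fm: SwitchFM
    usable_gb: int = 0
    next_id: int = 1

    def request_memory(self, capacity_gb: int) -> bool:
        request_id = self.next_id
        self.next_id += 1
        conf = self.fm.handle(request_id, capacity_gb)
        # Only a matching, successful confirmation makes the capacity usable.
        if conf.request_id == request_id and conf.status is Status.OK:
            self.usable_gb += capacity_gb
            return True
        return False


fm = SwitchFM(free_capacity_gb=16)
host = HostAgent(fm)
print(host.request_memory(8), host.usable_gb)   # request fits, capacity becomes usable
print(host.request_memory(16), host.usable_gb)  # request rejected, nothing changes
```

The point of the sketch is only that something on the host side has to match the
confirmation to the original request before the new capacity is treated as usable.
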
Thanks,
Slava.