From: Chris Mason <clm@meta.com>
To: Jan Kara <jack@suse.cz>, Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: "chrisl@kernel.org" <chrisl@kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Pavan Rallabhandi <Pavan.Rallabhandi@ibm.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Machine Learning (ML) library in Linux kernel
Date: Tue, 10 Feb 2026 09:20:53 -0500
Message-ID: <a1bc8ccc-730c-4076-82ec-20bf86dd100b@meta.com>
In-Reply-To: <6ek3nhulz72niscw2iz2n5xhczz4ta6a6hvyrlneuyk2d36ngx@4ymlemzifugr>
On 2/10/26 8:47 AM, Jan Kara wrote:
> On Mon 09-02-26 22:28:59, Viacheslav Dubeyko via Lsf-pc wrote:
>> On Mon, 2026-02-09 at 02:03 -0800, Chris Li wrote:
>>> On Fri, Feb 6, 2026 at 11:38 AM Viacheslav Dubeyko
>>> <Slava.Dubeyko@ibm.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Machine Learning (ML) is an approach to learning from data,
>>>> finding patterns, and making predictions without developers
>>>> implementing the algorithms explicitly. The number of areas where ML
>>>> is applied grows every day. Generally speaking, ML could introduce
>>>> self-evolving and self-learning capabilities into the Linux kernel.
>>>> There are already research works and industry efforts to employ ML
>>>> approaches for configuring and optimizing the Linux kernel. However,
>>>> introducing ML approaches into the Linux kernel is neither simple
>>>> nor straightforward. There are multiple problems and unanswered
>>>> questions on this road. First of all, any ML model requires
>>>> floating-point operations to run, but there is no direct use of the
>>>> FPU in kernel space. Also, an ML model requires a training phase,
>>>> which can cause significant performance degradation of the Linux
>>>> kernel. Even the inference phase could be problematic from a
>>>> performance point of view on the kernel side. Using ML approaches in
>>>> the Linux kernel is an inevitable step. But how can we use ML
>>>> approaches in the Linux kernel? Which infrastructure do we need to
>>>> adopt ML models in the Linux kernel?
>>>
>>> I think there are two different things here; I think you want the
>>> latter, but I am not sure:
>>>
>>> 1) Using an ML model to help kernel development: code reviews,
>>> generating patches from descriptions, etc. For example, Chris Mason
>>> has a kernel review repo on GitHub and he is sharing his review
>>> findings on the mailing list:
>>> https://github.com/masoncl/review-prompts/tree/main
>>> It is kernel-development related, but the ML agent code runs in
>>> user space. The actual ML computation might run on GPUs/TPUs. That
>>> does not seem to be what you have in mind.
>>>
>>> 2) Run the ML model computation in kernel space.
>>> Can you clarify whether this is what you have in mind? You mention
>>> FPU usage in the kernel for the ML model. That is only relevant if
>>> you need to run the floating-point math as kernel CPU instructions.
>>> Most ML computations do not run as CPU instructions; they run on
>>> GPUs/TPUs. Why not keep the ML program (PyTorch/agents) in user
>>> space and pass the data to the GPU/TPU driver to run? There will be
>>> some kernel infrastructure like VFIO/IOMMU involved with the GPU/TPU
>>> driver. For the most part, the kernel is just facilitating the data
>>> passing to/from the GPU/TPU driver and then to the GPU/TPU hardware.
>>> The ML hardware does the heavy lifting.
>>
>> The idea is to have the ML model running in user space so that a
>> kernel subsystem can interact with it. As the next step, I am
>> considering two real-life use cases: (1) the GC subsystem of an LFS
>> file system, and (2) an ML-based DAMON approach. So, for example, GC
>> can be represented by an ML model in user space. The GC can request
>> data (segment state) from kernel space, and the ML model in user space
>> can do training and/or inference. As a result, the ML model in user
>> space can select victim segments and instruct the kernel-space logic
>> to move valid data from the victim segment(s) into clean/current
>> one(s).
>
> To be honest, I'm skeptical about how generic this can be. Essentially
> you're describing a generic interface to offload arbitrary kernel
> decisions to userspace. ML is a userspace business here and not really
> relevant to the concept AFAICT. And we already have several ways for
> the kernel to ask userspace to do something for it, and unless that is
> very restricted and well defined, it is rather painful, prone to
> deadlocks, security issues, etc.
>
> So by all means, if you want to make GC decisions for your filesystem
> in userspace via ML, be my guest; it does make some sense. Although I'd
> be wary of situations where we need to write back dirty pages to free
> memory, which may now depend on your userspace helper making a
> decision, which may itself need memory to make that decision... But I
> don't see why you need all the ML fluff around it when it seems like
> just another way to call a userspace helper, and why some of the
> existing methods would not suffice.

Looking through the description (not the code, apologies), it really
feels like we're reinventing BPF here:

- introspection into what the kernel is currently doing
- communications channel with applications
- a mechanism to override specific kernel functionality
- fancy applications arbitrating decisions

My feedback during Plumbers, and also today, is that you can get 99% of
what you're looking for with some BPF code.

It may or may not be perfect for your needs, but it's a much faster path
to generating community and collaboration around the goals. After that,
it's a lot easier to justify larger changes in the kernel.

If this becomes an LSF/MM topic, my bar for discussion would be:

- extensive data collected about some kernel component (DAMON,
  scheduling, etc.)
- a working proof of concept that improved on decisions made in the
  kernel
- discussion of changes needed to improve or enable the proof of concept

In other words, I don't think we need a list of ways ML might be used.
I think we need specific examples of a way that ML was used and why it's
better than what the kernel is already doing.
-chris