Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Machine Learning (ML) library in Linux kernel

From: Jan Kara <jack@suse.cz>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: "jack@suse.cz" <jack@suse.cz>,
	 "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"chrisl@kernel.org" <chrisl@kernel.org>,
	 "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	Pavan Rallabhandi <Pavan.Rallabhandi@ibm.com>,
	 "clm@meta.com" <clm@meta.com>
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Machine Learning (ML) library in Linux kernel
Date: Wed, 11 Feb 2026 10:55:34 +0100	[thread overview]
Message-ID: <kw4qco6aq4bq55nmb4c5ibicmj7ga77vtgzlj65jtdhzowks5m@buhefb6m4eqx> (raw)
In-Reply-To: <11f659fd88f887b9fe4c88a386f1a5c2157968a6.camel@ibm.com>

On Tue 10-02-26 21:02:12, Viacheslav Dubeyko wrote:
> On Tue, 2026-02-10 at 14:47 +0100, Jan Kara wrote:
> > On Mon 09-02-26 22:28:59, Viacheslav Dubeyko via Lsf-pc wrote:
> > > The idea is to have an ML model running in user-space and let a kernel
> > > subsystem interact with it. As the next step, I am considering two
> > > real-life use-cases: (1) the GC subsystem of an LFS file system, (2) an
> > > ML-based DAMON approach. So, for example, GC can be represented by an ML
> > > model in user-space. GC can request data (segment state) from kernel-space
> > > and the ML model in user-space can do training and/or inference. As a
> > > result, the ML model in user-space can select victim segments and instruct
> > > the kernel-space logic to move valid data from the victim segment(s) into
> > > clean/current one(s).
> > 
> > To be honest I'm skeptical about how generic this can be. Essentially
> > you're describing a generic interface to offload arbitrary kernel decision
> > to userspace. ML is a userspace business here and not really relevant for
> > the concept AFAICT. And we already have several ways of kernel asking
> > userspace to do something for it and unless it is very restricted and well
> > defined it is rather painful, prone to deadlocks, security issues etc.
> 
> Scepticism is a normal reaction. :) So, there is nothing wrong with being
> sceptical.
> 
> I believe it can be pretty generic from the data flow point of view. Probably,
> different kernel subsystems could require different ways of interacting with
> user-space. However, if we are talking about data flow but NOT execution flow,
> then it could be generic enough. And if it can be generic, then we can suggest
> a generic way of extending any kernel subsystem with ML support.
> 
> I don't think that we need to treat the ML library approach as "kernel
> asking userspace to do something". Rather, the model to consider is
> "kernel shares data with user-space and user-space recommends something to
> the kernel". So, the user-space agent (ML model) can request data from kernel
> space, or a kernel subsystem can notify the user-space agent that data is
> available. And it's up to the kernel subsystem implementation which data is
> shared with user-space. So, the ML model can be trained in user-space and then
> share recommendations (or eBPF code, for example) with kernel space. Finally,
> it's up to the kernel subsystem how and when to apply these recommendations on
> the kernel side.

I guess I have to see some examples. Because so far it sounds so generic
that I'm failing to see the value in this :)

> > So by all means if you want to make GC decisions for your filesystem in
> > userspace by ML, be my guest, it does make some sense although I'd be wary
> > of issues where we need to write back dirty pages to free memory, which may
> > now depend on your userspace helper to make a decision, which may need the
> > memory to make the decision... But I don't see why you need all the ML fluff
> > around it when it seems like just another way to call a userspace helper and
> > why some of the existing methods would not suffice.
> > 
> 
> OK. I see. :) You understood GC as a subsystem that helps the kernel
> memory subsystem manage writeback of dirty memory pages. :) It's a
> potential direction and I like your suggestion. :) But I meant something
> different because I am considering an LFS file system's GC subsystem. So, if
> we are using a Copy-On-Write (COW) policy, then we have segments or erase
> blocks with a mixture of valid and invalid logical blocks after update
> operations. And we need a GC subsystem to clean old segments by means of
> moving valid logical blocks from exhausted segments into clean/current
> ones. The problem here is to find an efficient algorithm for selecting
> victim segments with the smallest number of valid blocks, with the goal of
> decreasing write amplification. So, the file system needs to share the
> metadata details (segment state, for example), the ML model can share its
> recommendations, and the file system's kernel code can finally move valid
> blocks in the background.

No, I actually meant the LFS file system GC as you talk about it. But I was
just too terse about my concerns: As you said, an LFS with COW needs to
select a new position to write each block. When there is no free block
available, it has to select a partially used erase block (some logical blocks
in it became invalid) to reuse. And for this selection you want to use ML
AFAIU. Hence we have a dependency folio writeback -> COW block allocation ->
GC to make some block free -> ML decision. And now you have to be really
careful so that "ML decision" doesn't even indirectly depend on folio
writeback to complete. And bear in mind that e.g. if the code doing "ML
decision" dirties some mmapped file pages it *will* block waiting for page
writeback to complete to get the system below the limit of dirty pages.
This is the kind of deadlock I'm talking about that is hard to avoid when
offloading kernel decisions to userspace (and yes, I've seen this kind of
deadlock in practice in various shapes and forms with various methods when
the kernel depended on userspace to make forward progress).
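
To illustrate the dependency chain above, here is a minimal Python sketch
(purely illustrative, not kernel code and not taken from any patch in this
thread). It models the "waits for" relations as a directed graph and looks
for a cycle with a depth-first search. The node names and the find_cycle()
helper are hypothetical; the extra edge from "ML decision" back to "folio
writeback" stands for the case where the userspace helper dirties mmapped
pages and gets throttled on the dirty limit.

```python
def find_cycle(waits_for):
    """Return a list of nodes forming a cycle, or None if there is none.

    waits_for maps each node to the nodes it must wait on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on current path / finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# The chain as described: each step waits for the next one.
chain = {
    "folio writeback": ["COW block allocation"],
    "COW block allocation": ["GC frees a block"],
    "GC frees a block": ["ML decision"],
    "ML decision": [],
}

# Assumption for illustration: the helper dirties mmapped file pages and is
# throttled until writeback brings the system below the dirty limit.
with_dirty_helper = dict(chain)
with_dirty_helper["ML decision"] = ["folio writeback"]

print(find_cycle(chain))              # no cycle in the plain chain
print(find_cycle(with_dirty_helper))  # the cycle that means deadlock
```

The point of the sketch is only that a single indirect edge from the helper
back to writeback is enough to close the loop; in a real system such edges
are rarely this easy to spot.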

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
