From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: ksummit@lists.linux.dev
Subject: Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
Date: Tue, 5 Aug 2025 18:55:29 +0100 [thread overview]
Message-ID: <c8daa784-4c51-4d65-b134-244194dce300@lucifer.local> (raw)
In-Reply-To: <56e85d392471beea3322d19bde368920ba6323b6.camel@HansenPartnership.com>
On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > Unavoidably, LLMs are the hot topic in tech right now, and are here
> > to stay.
> >
> > This poses unique problems:
> >
> > * Never before have people been able to generate so much content that
> > may, on a surface reading, seem valid whilst in reality being quite
> > the opposite.
> >
> > * Equally, LLMs can introduce very subtle mistakes that humans find
> > difficult to pick up on - humans implicitly assume that the classes
> > of errors they will encounter are the kinds other humans would make -
> > AI defeats that instinct.
>
> Do you have any examples of this? I've found the opposite to be true:
Sure - Steven encountered this in [1].
As he says there:
"If I had known, I would have examined the patch a little more thoroughly,
and would have discovered a very minor mistake in the patch."
The algorithm determines likely output based on statistics, and therefore
on the density of its input. Since one can in reality write infinitely
many programs, it's mathematically inevitable that an LLM will have to
'infer' answers.
That inference has no basis in dynamics, that is, a model of reality it
can use to derive answers; rather it will, in essence, produce a random
result.
If there is a great deal of input (e.g. C programs), then that inference
is likely to manifest as very subtle errors. See [2] for a thoughtful
exploration from an AI expert on the topic of statistics vs. dynamics, and
[3] for a broader exploration of the topic from the same author.
[1]:https://lore.kernel.org/workflows/20250724194556.105803db@gandalf.local.home/
[2]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/
[3]:https://blog.piekniewski.info/2023/04/09/ai-reflections/
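To make the 'very subtle errors' point concrete, here is a purely
illustrative sketch - hypothetical code, not taken from Steven's patch or
any real submission - of the class of mistake that reads as perfectly
correct on a surface pass:

  /* Hypothetical example - not from any real patch. */
  #include <stdlib.h>
  #include <string.h>

  struct record {
          char name[64];
          unsigned long flags;
  };

  static struct record *dup_record(const struct record *src)
  {
          /* Subtle: sizeof(dst) is the size of a pointer, not the struct. */
          struct record *dst = malloc(sizeof(dst));

          if (!dst)
                  return NULL;
          /* ...so this copy overruns the undersized allocation. */
          memcpy(dst, src, sizeof(*dst));
          return dst;
  }

Nothing in that function looks out of place at a glance, which is exactly
the kind of thing a reviewer primed for human-style mistakes can walk
straight past.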
> AI is capable of really big stupid mistakes when it hasn't seen enough
> of the pattern, but I can't recall seeing it make something you'd
> classify as a subtle mistake (I assume it could copy subtle mistakes
> from wrong training data, so I'm not saying it can't, just that I
> haven't seen any).
It's not from incorrect training data; it's fundamental to how LLMs
work.
>
> I think the big mistakes could possibly be avoided by asking people who
> submit patches to also append the AI confidence score:
>
> https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0
That's interesting, though I don't know how reliable this might be.
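Purely as a sketch of what it might even look like mechanically, such a
score would presumably have to be recorded in the patch itself so tooling
can see it, e.g. via trailers (names entirely made up for illustration,
not an existing convention):

  X-AI-Model: <model name and version>
  X-AI-Confidence: 0.87
  Signed-off-by: A Developer <dev@example.com>

Whether that number would actually mean anything for generated kernel C
is another matter.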
However, it's for exactly this kind of input that I proposed the topic :)
>
> So we know how much similar training the model has seen before coming
> to any conclusion about the value of the output.
>
> > * The kernel is uniquely sensitive to erroneous (especially subtly
> > erroneous) code - even small errors can be highly consequential. We
> > use a programming language that can almost be defined by its lack of
> > any kind of safety, and in some subsystems patches are simply taken
> > if no obvious problems exist, making us rather vulnerable to this.
>
> I think that's really overlooking the fact that if properly trained (a
> somewhat big *if* depending on the model) AI should be very good at
> writing safe code in unsafe languages. However it takes C specific
I fundamentally disagree.
The consequences of even extremely small mistakes can be very serious in C,
as the language does little to nothing for you.
No matter how much data it absorbs, it cannot span the entire space of
all possible programs, or even come anywhere close.
Again, the arguments above are why I feel this is _fundamental_ to the
approach.
Kernel code is also very specific and has characteristics that render it
different from userland. We must consider a great many more things that
would be handled for us in userland - interrupts, the context we are in,
locks of all varieties, and so on.
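As a purely illustrative sketch - a hypothetical driver fragment, not
taken from any real submission - of the sort of context-sensitivity I
mean:

  /* Hypothetical driver fragment, purely for illustration. */
  #include <linux/spinlock.h>
  #include <linux/interrupt.h>

  struct dev_state {
          int count;
  };

  static DEFINE_SPINLOCK(dev_lock);

  /* The same lock is taken from the device's interrupt handler. */
  static irqreturn_t dev_irq(int irq, void *data)
  {
          struct dev_state *s = data;

          spin_lock(&dev_lock);
          s->count++;
          spin_unlock(&dev_lock);
          return IRQ_HANDLED;
  }

  static void dev_update(struct dev_state *s)
  {
          /*
           * Looks perfectly reasonable in isolation, but because
           * dev_lock is also taken in dev_irq(), process context must
           * use spin_lock_irqsave() here: an interrupt arriving on
           * this CPU while the lock is held will deadlock in the
           * handler.
           */
          spin_lock(&dev_lock);
          s->count++;
          spin_unlock(&dev_lock);
  }

Nothing there fails to compile, nothing looks odd locally, and nothing in
a userland-heavy training corpus would have taught a model otherwise.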
While there's a lot of kernel code (tens of millions of lines), for an
LLM that is a very small corpus, and we simply cannot generate more.
Yes, it can ingest all the C code it can find, but that isn't quite the
same thing.
> training to do this, so any LLM that's absorbed a load of rust, python
> and javascript from the internet will be correspondingly bad at writing
> safe C code. Hence the origin of the LLM and its training corpus would
> be a key factor in deciding to trust it.
>
> > * On the other hand, there are use cases which are useful - test
> > data/code generation, summarisation, smart auto-complete - so it'd
> > perhaps be foolish to entirely dismiss AI.
>
> Patch backporting is another such nice use.
As long as it's carefully checked :)
>
> > A very important non-technical point we must consider is that, the
> > second we even appear to be open to AI submission of _any_ kind, the
> > press will inevitably report on it gleefully, likely with
> > oversimplified headlines like 'Linux accepts AI patches'.
>
> Oh, I think simply accepting AI patches is old news:
>
> https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-as-30percent-of-microsoft-code-is-written-by-ai.html
That doesn't pertain to the kernel specifically.
Of course code being written by AI is old news, but there's no doubt that
tech publications would JUMP on anything even suggesting we are open in
some broad way to AI submissions.
Given Linus's rather neutral public position on AI, it'd certainly mark
what _would be perceived_, in my view, as a sea change on this.
>
> > The moment that happens, we are likely to see a significant uptick in
> > AI submissions whether we like it or not.
> >
> > I propose that we establish the broad rules as they pertain to the
> > kernel, and would like to bring the discussion to the Maintainer's
> > Summit so we can determine what those should be.
> >
> > It's important to get a sense of how maintainers feel about this -
> > whether what is proposed is opt-in or opt-out - and how we actually
> > implement this.
> >
> > There has been discussion on-list about this (see [0]), with many
> > suggestions made including a 'traffic light' system per-subsystem,
> > however many open questions remain - the devil is in the details.
> >
> > [0]: https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.org/
>
> We're already getting AI generated bug reports from what I can tell.
> It would be really helpful to see the AI confidence score for them as
> well.
That is definitely an interesting additional data point that could
potentially be helpful here! I wasn't aware of this, so thanks for that :)
>
> Regards,
>
> James
>
>
Cheers, Lorenzo