[MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process

ksummit.lists.linux.dev archive mirror
 help / color / mirror / Atom feed

* [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
@ 2025-08-05 16:03 Lorenzo Stoakes
  2025-08-05 16:43 ` James Bottomley
                   ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 16:03 UTC (permalink / raw)
  To: ksummit

Unavoidably, LLMs are the hot topic in tech right now, and are here to
stay.

This poses unique problems:

* Never before have people been able to generate as much content that may,
  on a surface reading, seem valid whilst in reality being quite the
  opposite.

* Equally, LLM's can introduce very subtle mistakes that humans find
  difficult to pick up upon - humans implicitly assume that the classes of
  errors they will encounter are the kinds other humans would make - AI
  defeats that instinct.

* The kernel is uniquely sensitive to erroneous (especially subtly
  erroneous) code - even small errors can be highly consequential. We use a
  programming language that can almost be defined by its lack of any kind
  of safety, and in some subsystems patches are simply taken if no obvious
  problems exist, making us rather vulnerable to this.

* On the other hand, there are use cases which are useful - test data/code
  generation, summarisation, smart auto-complete - so it'd perhaps be
  foolish to entirely dismiss AI.

A very important non-technical point we must consider is that, the second
we even appear to be open to AI submission of _any_ kind, the press will
inevitably report on it gleefully, likely with oversimplified headlines
like 'Linux accepts AI patches'.

The moment that happens, we are likely to see a significant uptick in AI
submissions whether we like it or not.

I propose that we establish the broad rules as they pertain to the kernel,
and would like to bring the discussion to the Maintainer's Summit so we can
determine what those should be.

It's important to get a sense of how maintainers feel about this - whether
what is proposed is opt-in or opt-out - and how we actually implement this.

There has been discussion on-list about this (see [0]), with many
suggestions made including a 'traffic light' system per-subsystem, however
many open questions remain - the devil is in the details.

[0]:https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.org/

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:03 [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process Lorenzo Stoakes
@ 2025-08-05 16:43 ` James Bottomley
  2025-08-05 17:11   ` Mark Brown
                     ` (3 more replies)
  2025-08-05 18:10 ` H. Peter Anvin
  2025-12-08  1:12 ` Sasha Levin
  2 siblings, 4 replies; 56+ messages in thread
From: James Bottomley @ 2025-08-05 16:43 UTC (permalink / raw)
  To: Lorenzo Stoakes, ksummit

On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> Unavoidably, LLMs are the hot topic in tech right now, and are here
> to stay.
> 
> This poses unique problems:
> 
> * Never before have people been able to generate as much content that
> may, on a surface reading, seem valid whilst in reality being quite
> the opposite.
> 
> * Equally, LLM's can introduce very subtle mistakes that humans find
> difficult to pick up upon - humans implicitly assume that the classes
> of errors they will encounter are the kinds other humans would make -
> AI defeats that instinct.

Do you have any examples of this?  I've found the opposite to be true:
AI is capable of really big stupid mistakes when it hasn't seen enough
of the pattern, but I can't recall seeing it make something you'd
classify as a subtle mistake (I assume it could copy subtle mistakes
from wrong training data, so I'm not saying it can't, just that I
haven't seen any).

I think the big mistakes could possibly be avoided by asking people who
submit patches to also append the AI confidence score:

https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0

So we know how much similar training the model has seen before coming
to any conclusion about the value of the output.

> * The kernel is uniquely sensitive to erroneous (especially subtly
> erroneous) code - even small errors can be highly consequential. We
> use a programming language that can almost be defined by its lack of
> any kind   of safety, and in some subsystems patches are simply taken
> if no obvious problems exist, making us rather vulnerable to this.

I think that's really overlooking the fact that if properly trained (a
somewhat big *if* depending on the model) AI should be very good at
writing safe code in unsafe languages.  However it takes C specific
training to do this, so any LLM that's absorbed a load of rust, python
and javascript from the internet will be correspondingly bad at writing
safe C code.  Hence the origin of the LLM and its training corpus would
be a key factor in deciding to trust it.

> * On the other hand, there are use cases which are useful - test
> data/code generation, summarisation, smart auto-complete - so it'd
> perhaps be foolish to entirely dismiss AI.

Patch backporting is another such nice use.

> A very important non-technical point we must consider is that, the
> second we even appear to be open to AI submission of _any_ kind, the
> press will inevitably report on it gleefully, likely with
> oversimplified headlines like 'Linux accepts AI patches'.

Oh, I think simply accepting AI patches is old news:

https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-as-30percent-of-microsoft-code-is-written-by-ai.html

> The moment that happens, we are likely to see a significant uptick in
> AI submissions whether we like it or not.
> 
> I propose that we establish the broad rules as they pertain to the
> kernel, and would like to bring the discussion to the Maintainer's
> Summit so we can determine what those should be.
> 
> It's important to get a sense of how maintainers feel about this -
> whether what is proposed is opt-in or opt-out - and how we actually
> implement this.
> 
> There has been discussion on-list about this (see [0]), with many
> suggestions made including a 'traffic light' system per-subsystem,
> however many open questions remain - the devil is in the details.
> 
> [0]:
> https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.or
> g/

We're already getting AI generated bug reports from what I can tell. 
It would be really helpful to see the AI confidence score for them as
well.

Regards,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:43 ` James Bottomley
@ 2025-08-05 17:11   ` Mark Brown
  2025-08-05 17:23     ` James Bottomley
  2025-08-05 18:01     ` Lorenzo Stoakes
  2025-08-05 17:17   ` Stephen Hemminger
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 56+ messages in thread
From: Mark Brown @ 2025-08-05 17:11 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, ksummit

[-- Attachment #1: Type: text/plain, Size: 1045 bytes --]

On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:

> > * On the other hand, there are use cases which are useful - test
> > data/code generation, summarisation, smart auto-complete - so it'd
> > perhaps be foolish to entirely dismiss AI.

> Patch backporting is another such nice use.

Patch backporting sounds pretty scary to me, it's the sort of thing
where extra context that needs to be accounted for is very likely to
come up (eg, assumptions you can make about existing state or
santisation).  That trips up humans often enough and doesn't seem like
it's playing to the strengths advertised for LLMs.

TBH I'm not thrilled about the general test code is trivial assumption
either, unstable test code or test code that doesn't cover what people
think it covers are both problems.  The issues when things go wrong are
less severe than the kernel itself but things still need to be
maintained and we already have issues with people being dismissive of
the selftests.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:43 ` James Bottomley
  2025-08-05 17:11   ` Mark Brown
@ 2025-08-05 17:17   ` Stephen Hemminger
  2025-08-05 17:55   ` Lorenzo Stoakes
  2025-08-05 21:58   ` Jiri Kosina
  3 siblings, 0 replies; 56+ messages in thread
From: Stephen Hemminger @ 2025-08-05 17:17 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, ksummit

On Tue, 05 Aug 2025 12:43:38 -0400
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > Unavoidably, LLMs are the hot topic in tech right now, and are here
> > to stay.
> > 
> > This poses unique problems:
> > 
> > * Never before have people been able to generate as much content that
> > may, on a surface reading, seem valid whilst in reality being quite
> > the opposite.
> > 
> > * Equally, LLM's can introduce very subtle mistakes that humans find
> > difficult to pick up upon - humans implicitly assume that the classes
> > of errors they will encounter are the kinds other humans would make -
> > AI defeats that instinct.  
> 
> Do you have any examples of this?  I've found the opposite to be true:
> AI is capable of really big stupid mistakes when it hasn't seen enough
> of the pattern, but I can't recall seeing it make something you'd
> classify as a subtle mistake (I assume it could copy subtle mistakes
> from wrong training data, so I'm not saying it can't, just that I
> haven't seen any).
> 
> I think the big mistakes could possibly be avoided by asking people who
> submit patches to also append the AI confidence score:
> 
> https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0
> 
> So we know how much similar training the model has seen before coming
> to any conclusion about the value of the output.
> 
> > * The kernel is uniquely sensitive to erroneous (especially subtly
> > erroneous) code - even small errors can be highly consequential. We
> > use a programming language that can almost be defined by its lack of
> > any kind   of safety, and in some subsystems patches are simply taken
> > if no obvious problems exist, making us rather vulnerable to this.  
> 
> I think that's really overlooking the fact that if properly trained (a
> somewhat big *if* depending on the model) AI should be very good at
> writing safe code in unsafe languages.  However it takes C specific
> training to do this, so any LLM that's absorbed a load of rust, python
> and javascript from the internet will be correspondingly bad at writing
> safe C code.  Hence the origin of the LLM and its training corpus would
> be a key factor in deciding to trust it.
> 
> > * On the other hand, there are use cases which are useful - test
> > data/code generation, summarisation, smart auto-complete - so it'd
> > perhaps be foolish to entirely dismiss AI.  
> 
> Patch backporting is another such nice use.
> 
> > A very important non-technical point we must consider is that, the
> > second we even appear to be open to AI submission of _any_ kind, the
> > press will inevitably report on it gleefully, likely with
> > oversimplified headlines like 'Linux accepts AI patches'.  
> 
> Oh, I think simply accepting AI patches is old news:
> 
> https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-as-30percent-of-microsoft-code-is-written-by-ai.html
> 
> > The moment that happens, we are likely to see a significant uptick in
> > AI submissions whether we like it or not.
> > 
> > I propose that we establish the broad rules as they pertain to the
> > kernel, and would like to bring the discussion to the Maintainer's
> > Summit so we can determine what those should be.
> > 
> > It's important to get a sense of how maintainers feel about this -
> > whether what is proposed is opt-in or opt-out - and how we actually
> > implement this.
> > 
> > There has been discussion on-list about this (see [0]), with many
> > suggestions made including a 'traffic light' system per-subsystem,
> > however many open questions remain - the devil is in the details.
> > 
> > [0]:
> > https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.or
> > g/  
> 
> We're already getting AI generated bug reports from what I can tell. 
> It would be really helpful to see the AI confidence score for them as
> well.
> 
> Regards,
> 
> James
> 
> 
> 

I have been toying with AI for some things that got stuck on my TODO
list for years (like writing tests). Have to agree with James, the errors
in that code are not subtle. Mostly they are things that never work because
AI has no context about what it is doing. Especially when porting between environments.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:11   ` Mark Brown
@ 2025-08-05 17:23     ` James Bottomley
  2025-08-05 17:43       ` Sasha Levin
  2025-08-05 18:16       ` Mark Brown
  2025-08-05 18:01     ` Lorenzo Stoakes
  1 sibling, 2 replies; 56+ messages in thread
From: James Bottomley @ 2025-08-05 17:23 UTC (permalink / raw)
  To: Mark Brown; +Cc: Lorenzo Stoakes, ksummit

[-- Attachment #1: Type: text/plain, Size: 2129 bytes --]

On Tue, 2025-08-05 at 18:11 +0100, Mark Brown wrote:
> On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> 
> > > * On the other hand, there are use cases which are useful - test
> > > data/code generation, summarisation, smart auto-complete - so
> > > it'd perhaps be foolish to entirely dismiss AI.
> 
> > Patch backporting is another such nice use.
> 
> Patch backporting sounds pretty scary to me, it's the sort of thing
> where extra context that needs to be accounted for is very likely to
> come up (eg, assumptions you can make about existing state or
> santisation).

If you think about it, the git history contains the exact patch path
between where the patch was applied and where you want to apply it. 
That's a finite data set which LLMs can be trained to work nicely with.

>   That trips up humans often enough and doesn't seem like it's
> playing to the strengths advertised for LLMs.

Humans don't look at the patch path (or use something broad like a
range scan).  The AI can be patient enough to actually go over it all.

> TBH I'm not thrilled about the general test code is trivial
> assumption either,

I don't think anyone who trains AI thinks testing is trivial.  It does
take special training for AI to be good at test writing.

>  unstable test code or test code that doesn't cover what people think
> it covers are both problems.

Test coverage and constructing tests for coverage is another place AI
can help.  Especially given coverage is a measurable quantity which
makes training easier.

>   The issues when things go wrong are less severe than the kernel
> itself but things still need to be maintained and we already have
> issues with people being dismissive of the selftests.

Well our selftests, having just spent ages figuring out how to run a
subset of the bpf tests, are very eccentric ... in that each test set
runs in a completely different way from any of the others and knowledge
from one selftest area doesn't apply to a different one.

Regards,

James

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:23     ` James Bottomley
@ 2025-08-05 17:43       ` Sasha Levin
  2025-08-05 17:58         ` Lorenzo Stoakes
  2025-08-05 18:16       ` Mark Brown
  1 sibling, 1 reply; 56+ messages in thread
From: Sasha Levin @ 2025-08-05 17:43 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mark Brown, Lorenzo Stoakes, ksummit

On Tue, Aug 05, 2025 at 01:23:18PM -0400, James Bottomley wrote:
>On Tue, 2025-08-05 at 18:11 +0100, Mark Brown wrote:
>> On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
>> > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
>>
>> > > * On the other hand, there are use cases which are useful - test
>> > > data/code generation, summarisation, smart auto-complete - so
>> > > it'd perhaps be foolish to entirely dismiss AI.
>>
>> > Patch backporting is another such nice use.
>>
>> Patch backporting sounds pretty scary to me, it's the sort of thing
>> where extra context that needs to be accounted for is very likely to
>> come up (eg, assumptions you can make about existing state or
>> santisation).
>
>If you think about it, the git history contains the exact patch path
>between where the patch was applied and where you want to apply it.
>That's a finite data set which LLMs can be trained to work nicely with.

Most of the patches that fail to backport and get a "FAILED:" mail as a
result are really just either a trivial context conflict ar a missing
dependency.

Resolving those is within the realm of a "junior engineer" which I
suspect AI would tackle fairly well.

The other ~10-20% are indeed something more complex due to things like a
major rewrite/refactor/etc of a subsystem, where I wouldn't trust the
current state of the art AI to tackle. But mayble a few years from now?

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:43 ` James Bottomley
  2025-08-05 17:11   ` Mark Brown
  2025-08-05 17:17   ` Stephen Hemminger
@ 2025-08-05 17:55   ` Lorenzo Stoakes
  2025-08-05 18:23     ` Lorenzo Stoakes
                       ` (2 more replies)
  2025-08-05 21:58   ` Jiri Kosina
  3 siblings, 3 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 17:55 UTC (permalink / raw)
  To: James Bottomley; +Cc: ksummit

On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > Unavoidably, LLMs are the hot topic in tech right now, and are here
> > to stay.
> >
> > This poses unique problems:
> >
> > * Never before have people been able to generate as much content that
> > may, on a surface reading, seem valid whilst in reality being quite
> > the opposite.
> >
> > * Equally, LLM's can introduce very subtle mistakes that humans find
> > difficult to pick up upon - humans implicitly assume that the classes
> > of errors they will encounter are the kinds other humans would make -
> > AI defeats that instinct.
>
> Do you have any examples of this?  I've found the opposite to be true:

Sure - Steven encountered this in [1].

As he says there:

"If I had known, I would have examined the patch a little more thoroughly,
 and would have discovered a very minor mistake in the patch."

The algorithm is determining likely output based on statistics, and
therefore density of input. Since in reality one can write infinite
programs, it's mathematically inevitable that an LLM will have to 'infer'
answers.

That inference has no basis in dynamics, that is a model of reality that it
can use to determine answers, rather it will, in essence, provide a random
result.

If there is a great deal of input (e.g. C programs), then that inference is
likely to manifest in very subtle errors. See [2] for a thoughtful
exploration from an AI expert on the topic of statistics vs. dynamics, and
[3] for a broader exploration of the topic from the same author.

[1]:https://lore.kernel.org/workflows/20250724194556.105803db@gandalf.local.home/
[2]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/
[3]:https://blog.piekniewski.info/2023/04/09/ai-reflections/

> AI is capable of really big stupid mistakes when it hasn't seen enough
> of the pattern, but I can't recall seeing it make something you'd
> classify as a subtle mistake (I assume it could copy subtle mistakes
> from wrong training data, so I'm not saying it can't, just that I
> haven't seen any).

It's not from incorrect training data, it's fundamental to how LLMs
work.

>
> I think the big mistakes could possibly be avoided by asking people who
> submit patches to also append the AI confidence score:
>
> https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0

That's interesting, though I don't know how reliable this might be.

However it's for exactly this kind of input that I proposed the topic :)

>
> So we know how much similar training the model has seen before coming
> to any conclusion about the value of the output.
>
> > * The kernel is uniquely sensitive to erroneous (especially subtly
> > erroneous) code - even small errors can be highly consequential. We
> > use a programming language that can almost be defined by its lack of
> > any kind   of safety, and in some subsystems patches are simply taken
> > if no obvious problems exist, making us rather vulnerable to this.
>
> I think that's really overlooking the fact that if properly trained (a
> somewhat big *if* depending on the model) AI should be very good at
> writing safe code in unsafe languages.  However it takes C specific

I fundamentally disagree.

The consequences of even extremely small mistakes can be very serious in C,
as the language does little to nothing for you.

No matter how much data it absorbs it cannot span the entire space of all
possible programs or even anywhere close.

I mean again, I apply the arguments above as to why I feel this is
_fundamental_ to the approach.

Kernel code is also very specific and has characteristics that render it
different from userland. We must consider a great many more things that
would be handled for us were we userland - interrupts, the context we are
in, locks of all varieties, etc. etc.

While there's a lot of kernel code (~10's of millions of line), for an LLM
that is very small, and we simply cannot generate more.

Yes it can eat up all the C it can, but that isn't quite the same.

> training to do this, so any LLM that's absorbed a load of rust, python
> and javascript from the internet will be correspondingly bad at writing
> safe C code.  Hence the origin of the LLM and its training corpus would
> be a key factor in deciding to trust it.
>
> > * On the other hand, there are use cases which are useful - test
> > data/code generation, summarisation, smart auto-complete - so it'd
> > perhaps be foolish to entirely dismiss AI.
>
> Patch backporting is another such nice use.

As long as carefully checked :)

>
> > A very important non-technical point we must consider is that, the
> > second we even appear to be open to AI submission of _any_ kind, the
> > press will inevitably report on it gleefully, likely with
> > oversimplified headlines like 'Linux accepts AI patches'.
>
> Oh, I think simply accepting AI patches is old news:
>
> https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-as-30percent-of-microsoft-code-is-written-by-ai.html

That doesn't pertain to the kernel specifically.

Of course code being written by AI is old news, but there's no doubt that
tech publications would JUMP on anything even suggesting we are open in
some broad way to AI submissions.

Given Linus's rather neutral public position on AI, it'd certainly mark
what _would be perceived_, in my view, as a sea change on this.

>
> > The moment that happens, we are likely to see a significant uptick in
> > AI submissions whether we like it or not.
> >
> > I propose that we establish the broad rules as they pertain to the
> > kernel, and would like to bring the discussion to the Maintainer's
> > Summit so we can determine what those should be.
> >
> > It's important to get a sense of how maintainers feel about this -
> > whether what is proposed is opt-in or opt-out - and how we actually
> > implement this.
> >
> > There has been discussion on-list about this (see [0]), with many
> > suggestions made including a 'traffic light' system per-subsystem,
> > however many open questions remain - the devil is in the details.
> >
> > [0]:
> > https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.or
> > g/
>
> We're already getting AI generated bug reports from what I can tell.
> It would be really helpful to see the AI confidence score for them as
> well.

That is definitely an interesting additional data point that could
potentially be helpful here! I wasn't aware of this so thanks for that :)

>
> Regards,
>
> James
>
>

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:43       ` Sasha Levin
@ 2025-08-05 17:58         ` Lorenzo Stoakes
  0 siblings, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 17:58 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Mark Brown, ksummit

On Tue, Aug 05, 2025 at 01:43:00PM -0400, Sasha Levin wrote:
> On Tue, Aug 05, 2025 at 01:23:18PM -0400, James Bottomley wrote:
> > On Tue, 2025-08-05 at 18:11 +0100, Mark Brown wrote:
> > > On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > > > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > >
> > > > > * On the other hand, there are use cases which are useful - test
> > > > > data/code generation, summarisation, smart auto-complete - so
> > > > > it'd perhaps be foolish to entirely dismiss AI.
> > >
> > > > Patch backporting is another such nice use.
> > >
> > > Patch backporting sounds pretty scary to me, it's the sort of thing
> > > where extra context that needs to be accounted for is very likely to
> > > come up (eg, assumptions you can make about existing state or
> > > santisation).
> >
> > If you think about it, the git history contains the exact patch path
> > between where the patch was applied and where you want to apply it.
> > That's a finite data set which LLMs can be trained to work nicely with.
>
> Most of the patches that fail to backport and get a "FAILED:" mail as a
> result are really just either a trivial context conflict ar a missing
> dependency.

Yeah I'm not surprised by that!

>
> Resolving those is within the realm of a "junior engineer" which I
> suspect AI would tackle fairly well.

Right.

>
> The other ~10-20% are indeed something more complex due to things like a
> major rewrite/refactor/etc of a subsystem, where I wouldn't trust the
> current state of the art AI to tackle. But mayble a few years from now?

Well, I have my doubts :) I think it's always a trade-off of scrutiny +
class of error vs. time gains.

But this is, of course, a point of debate.

>
> --
> Thanks,
> Sasha

By the way Sasha, I was planning to cc- you as this is obviously entirely
related to your series and I'm more than happy to propose a join discussion
or something like this, - I just followed the rules as per the maintainers
summit email strictly and didn't know whether I _could_ cc :)

I would enjoy having a friendly discussion about it in person regardless :)

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:11   ` Mark Brown
  2025-08-05 17:23     ` James Bottomley
@ 2025-08-05 18:01     ` Lorenzo Stoakes
  2025-08-05 18:46       ` Mark Brown
  1 sibling, 1 reply; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 18:01 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, ksummit

On Tue, Aug 05, 2025 at 06:11:31PM +0100, Mark Brown wrote:
> TBH I'm not thrilled about the general test code is trivial assumption
> either, unstable test code or test code that doesn't cover what people
> think it covers are both problems.  The issues when things go wrong are
> less severe than the kernel itself but things still need to be
> maintained and we already have issues with people being dismissive of
> the selftests.

Yeah I didn't mean to downplay this or suggest it's not important, as you've
(possibly :P) seen I write a lot of it myself and consider it vital.

However the point is that we can:

- Generate scenarios relatively quickly that way.
- Resulatant mistakes are at least _less_ impactful. And you'd hope, would come
  out in the wash more easily.

However, I wonder whether it might be better simply to restrict LLMs to test
data generation or the like.

These are again all exactly the kinds of things that make this an important
topic to discuss I believe :)

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:03 [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process Lorenzo Stoakes
  2025-08-05 16:43 ` James Bottomley
@ 2025-08-05 18:10 ` H. Peter Anvin
  2025-08-05 18:19   ` Lorenzo Stoakes
  2025-08-06  5:49   ` Julia Lawall
  2025-12-08  1:12 ` Sasha Levin
  2 siblings, 2 replies; 56+ messages in thread
From: H. Peter Anvin @ 2025-08-05 18:10 UTC (permalink / raw)
  To: Lorenzo Stoakes, ksummit

On August 5, 2025 9:03:18 AM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>Unavoidably, LLMs are the hot topic in tech right now, and are here to
>stay.
>
>This poses unique problems:
>
>* Never before have people been able to generate as much content that may,
>  on a surface reading, seem valid whilst in reality being quite the
>  opposite.
>
>* Equally, LLM's can introduce very subtle mistakes that humans find
>  difficult to pick up upon - humans implicitly assume that the classes of
>  errors they will encounter are the kinds other humans would make - AI
>  defeats that instinct.
>
>* The kernel is uniquely sensitive to erroneous (especially subtly
>  erroneous) code - even small errors can be highly consequential. We use a
>  programming language that can almost be defined by its lack of any kind
>  of safety, and in some subsystems patches are simply taken if no obvious
>  problems exist, making us rather vulnerable to this.
>
>* On the other hand, there are use cases which are useful - test data/code
>  generation, summarisation, smart auto-complete - so it'd perhaps be
>  foolish to entirely dismiss AI.
>
>A very important non-technical point we must consider is that, the second
>we even appear to be open to AI submission of _any_ kind, the press will
>inevitably report on it gleefully, likely with oversimplified headlines
>like 'Linux accepts AI patches'.
>
>The moment that happens, we are likely to see a significant uptick in AI
>submissions whether we like it or not.
>
>I propose that we establish the broad rules as they pertain to the kernel,
>and would like to bring the discussion to the Maintainer's Summit so we can
>determine what those should be.
>
>It's important to get a sense of how maintainers feel about this - whether
>what is proposed is opt-in or opt-out - and how we actually implement this.
>
>There has been discussion on-list about this (see [0]), with many
>suggestions made including a 'traffic light' system per-subsystem, however
>many open questions remain - the devil is in the details.
>
>[0]:https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.org/
>

It would seem to me to be far more useful  to get AI to help find hard-to-see but perhaps common problems, or tidying up code (like Coccinelle on steroids.)

Things that are easy to see post hoc are correct, but can be hard or laborious to do. 

Another genuinely good usage for AI is for especially non-English speakers to tidy up their patch comments and other documentation.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:23     ` James Bottomley
  2025-08-05 17:43       ` Sasha Levin
@ 2025-08-05 18:16       ` Mark Brown
  1 sibling, 0 replies; 56+ messages in thread
From: Mark Brown @ 2025-08-05 18:16 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, ksummit

[-- Attachment #1: Type: text/plain, Size: 3286 bytes --]

On Tue, Aug 05, 2025 at 01:23:18PM -0400, James Bottomley wrote:
> On Tue, 2025-08-05 at 18:11 +0100, Mark Brown wrote:

> > Patch backporting sounds pretty scary to me, it's the sort of thing
> > where extra context that needs to be accounted for is very likely to
> > come up (eg, assumptions you can make about existing state or
> > santisation).

> If you think about it, the git history contains the exact patch path
> between where the patch was applied and where you want to apply it. 
> That's a finite data set which LLMs can be trained to work nicely with.

> >   That trips up humans often enough and doesn't seem like it's
> > playing to the strengths advertised for LLMs.

> Humans don't look at the patch path (or use something broad like a
> range scan).  The AI can be patient enough to actually go over it all.

The things humans are usually doing in a situation like that is
remembering that someone changed something and why, and of course the
new dependencies that came in.  I see what you're saying, but I'm rather
nervous as to what people would actually do and how effective the
results would be especially where things get complicated and there's
landmines.

> > TBH I'm not thrilled about the general test code is trivial
> > assumption either,

> I don't think anyone who trains AI thinks testing is trivial.  It does
> take special training for AI to be good at test writing.

I think a lot of the people saying "oh, we can just churn that out with
AI" kind of things do have that sort of attitude.  This thread is far
from the first time I've seen people saying tests are a great
application, and it's more usually as a contrast to the complicated
stuff in the kernel rather than with a consideration of the reasons the
specific benefits these tools might offer in this applciation.

> >  unstable test code or test code that doesn't cover what people think
> > it covers are both problems.

> Test coverage and constructing tests for coverage is another place AI
> can help.  Especially given coverage is a measurable quantity which
> makes training easier.

There's definitely some opportunity for specialist stuff there,
especially if you're just looking at measurable metrics like you're
mentioning.  Other tools in this area are also available of course!

> >   The issues when things go wrong are less severe than the kernel
> > itself but things still need to be maintained and we already have
> > issues with people being dismissive of the selftests.

> Well our selftests, having just spent ages figuring out how to run a
> subset of the bpf tests, are very eccentric ... in that each test set
> runs in a completely different way from any of the others and knowledge
> from one selftest area doesn't apply to a different one.

They should all run from the selftests harness so the simple running
them bit should at least should be standard?  We do have some suites
that were thrown into the kernel with marginal integration with the
frameworks but they're generally fairly obvious as soon as you go in via
the standard interfaces.  I'm not saying the overall picture is amazing,
but I see a big part of it being a social problem with getting people to
take what we've got seriously.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:10 ` H. Peter Anvin
@ 2025-08-05 18:19   ` Lorenzo Stoakes
  2025-08-06  5:49   ` Julia Lawall
  1 sibling, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 18:19 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: ksummit

On Tue, Aug 05, 2025 at 11:10:51AM -0700, H. Peter Anvin wrote:

>
> It would seem to me to be far more useful  to get AI to help find hard-to-see but perhaps common problems, or tidying up code (like Coccinelle on steroids.)
>
> Things that are easy to see post hoc are correct, but can be hard or laborious to do.

Right, as long as we get the RoI right on the efforts to check the results,
that's a great suggestion.

>
> Another genuinely good usage for AI is for especially non-English speakers to tidy up their patch comments and other documentation.

Yes, I have a lot of empathy for this and I gather non-native speakers find
it extraordinarily useful for this task in general.

This is a very good point as to why we have to be careful not to have a
broad-brush approach like 'no AI', as there are uses that do not cause
issues of concern but are of great use to people using them.

It would be rather egregious to demand that non-native speakers using LLMs
for this purpose disclose the fact that would label the patch in the same
fashion as, say, a very broken LLM kernel patch let's say.

There really has to be a fine-grained approach to this - the devil truly is
in the details here.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:55   ` Lorenzo Stoakes
@ 2025-08-05 18:23     ` Lorenzo Stoakes
  2025-08-12 13:44       ` Steven Rostedt
  2025-08-05 18:34     ` James Bottomley
  2025-08-05 18:39     ` Sasha Levin
  2 siblings, 1 reply; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 18:23 UTC (permalink / raw)
  To: James Bottomley; +Cc: ksummit, Steven Rostedt

+cc Steven

Apologies, I meant to +cc you here as I mention an issue you encountered,
and forgot of course :>)

Cheers, Lorenzo

On Tue, Aug 05, 2025 at 06:55:29PM +0100, Lorenzo Stoakes wrote:
> On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > > Unavoidably, LLMs are the hot topic in tech right now, and are here
> > > to stay.
> > >
> > > This poses unique problems:
> > >
> > > * Never before have people been able to generate as much content that
> > > may, on a surface reading, seem valid whilst in reality being quite
> > > the opposite.
> > >
> > > * Equally, LLM's can introduce very subtle mistakes that humans find
> > > difficult to pick up upon - humans implicitly assume that the classes
> > > of errors they will encounter are the kinds other humans would make -
> > > AI defeats that instinct.
> >
> > Do you have any examples of this?  I've found the opposite to be true:
>
> Sure - Steven encountered this in [1].
>
> As he says there:
>
> "If I had known, I would have examined the patch a little more thoroughly,
>  and would have discovered a very minor mistake in the patch."
>
> The algorithm is determining likely output based on statistics, and
> therefore density of input. Since in reality one can write infinite
> programs, it's mathematically inevitable that an LLM will have to 'infer'
> answers.
>
> That inference has no basis in dynamics, that is a model of reality that it
> can use to determine answers, rather it will, in essence, provide a random
> result.
>
> If there is a great deal of input (e.g. C programs), then that inference is
> likely to manifest in very subtle errors. See [2] for a thoughtful
> exploration from an AI expert on the topic of statistics vs. dynamics, and
> [3] for a broader exploration of the topic from the same author.
>
> [1]:https://lore.kernel.org/workflows/20250724194556.105803db@gandalf.local.home/
> [2]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/
> [3]:https://blog.piekniewski.info/2023/04/09/ai-reflections/
>
>
> > AI is capable of really big stupid mistakes when it hasn't seen enough
> > of the pattern, but I can't recall seeing it make something you'd
> > classify as a subtle mistake (I assume it could copy subtle mistakes
> > from wrong training data, so I'm not saying it can't, just that I
> > haven't seen any).
>
> It's not from incorrect training data, it's fundamental to how LLMs
> work.
>
> >
> > I think the big mistakes could possibly be avoided by asking people who
> > submit patches to also append the AI confidence score:
> >
> > https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0
>
> That's interesting, though I don't know how reliable this might be.
>
> However it's for exactly this kind of input that I proposed the topic :)
>
> >
> > So we know how much similar training the model has seen before coming
> > to any conclusion about the value of the output.
> >
> > > * The kernel is uniquely sensitive to erroneous (especially subtly
> > > erroneous) code - even small errors can be highly consequential. We
> > > use a programming language that can almost be defined by its lack of
> > > any kind   of safety, and in some subsystems patches are simply taken
> > > if no obvious problems exist, making us rather vulnerable to this.
> >
> > I think that's really overlooking the fact that if properly trained (a
> > somewhat big *if* depending on the model) AI should be very good at
> > writing safe code in unsafe languages.  However it takes C specific
>
> I fundamentally disagree.
>
> The consequences of even extremely small mistakes can be very serious in C,
> as the language does little to nothing for you.
>
> No matter how much data it absorbs it cannot span the entire space of all
> possible programs or even anywhere close.
>
> I mean again, I apply the arguments above as to why I feel this is
> _fundamental_ to the approach.
>
> Kernel code is also very specific and has characteristics that render it
> different from userland. We must consider a great many more things that
> would be handled for us were we userland - interrupts, the context we are
> in, locks of all varieties, etc. etc.
>
> While there's a lot of kernel code (~10's of millions of line), for an LLM
> that is very small, and we simply cannot generate more.
>
> Yes it can eat up all the C it can, but that isn't quite the same.
>
> > training to do this, so any LLM that's absorbed a load of rust, python
> > and javascript from the internet will be correspondingly bad at writing
> > safe C code.  Hence the origin of the LLM and its training corpus would
> > be a key factor in deciding to trust it.
> >
> > > * On the other hand, there are use cases which are useful - test
> > > data/code generation, summarisation, smart auto-complete - so it'd
> > > perhaps be foolish to entirely dismiss AI.
> >
> > Patch backporting is another such nice use.
>
> As long as carefully checked :)
>
> >
> > > A very important non-technical point we must consider is that, the
> > > second we even appear to be open to AI submission of _any_ kind, the
> > > press will inevitably report on it gleefully, likely with
> > > oversimplified headlines like 'Linux accepts AI patches'.
> >
> > Oh, I think simply accepting AI patches is old news:
> >
> > https://www.cnbc.com/2025/04/29/satya-nadella-says-as-much-as-30percent-of-microsoft-code-is-written-by-ai.html
>
> That doesn't pertain to the kernel specifically.
>
> Of course code being written by AI is old news, but there's no doubt that
> tech publications would JUMP on anything even suggesting we are open in
> some broad way to AI submissions.
>
> Given Linus's rather neutral public position on AI, it'd certainly mark
> what _would be perceived_, in my view, as a sea change on this.
>
> >
> > > The moment that happens, we are likely to see a significant uptick in
> > > AI submissions whether we like it or not.
> > >
> > > I propose that we establish the broad rules as they pertain to the
> > > kernel, and would like to bring the discussion to the Maintainer's
> > > Summit so we can determine what those should be.
> > >
> > > It's important to get a sense of how maintainers feel about this -
> > > whether what is proposed is opt-in or opt-out - and how we actually
> > > implement this.
> > >
> > > There has been discussion on-list about this (see [0]), with many
> > > suggestions made including a 'traffic light' system per-subsystem,
> > > however many open questions remain - the devil is in the details.
> > >
> > > [0]:
> > > https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.or
> > > g/
> >
> > We're already getting AI generated bug reports from what I can tell.
> > It would be really helpful to see the AI confidence score for them as
> > well.
>
> That is definitely an interesting additional data point that could
> potentially be helpful here! I wasn't aware of this so thanks for that :)
>
> >
> > Regards,
> >
> > James
> >
> >
>
> Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:55   ` Lorenzo Stoakes
  2025-08-05 18:23     ` Lorenzo Stoakes
@ 2025-08-05 18:34     ` James Bottomley
  2025-08-05 18:55       ` Lorenzo Stoakes
  2025-08-12 13:50       ` Steven Rostedt
  2025-08-05 18:39     ` Sasha Levin
  2 siblings, 2 replies; 56+ messages in thread
From: James Bottomley @ 2025-08-05 18:34 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: ksummit

On Tue, 2025-08-05 at 18:55 +0100, Lorenzo Stoakes wrote:
> On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > > Unavoidably, LLMs are the hot topic in tech right now, and are
> > > here to stay.
> > > 
> > > This poses unique problems:
> > > 
> > > * Never before have people been able to generate as much content
> > > that may, on a surface reading, seem valid whilst in reality
> > > being quite the opposite.
> > > 
> > > * Equally, LLM's can introduce very subtle mistakes that humans
> > > find difficult to pick up upon - humans implicitly assume that
> > > the classes of errors they will encounter are the kinds other
> > > humans would make - AI defeats that instinct.
> > 
> > Do you have any examples of this?  I've found the opposite to be
> > true:
> 
> Sure - Steven encountered this in [1].
> 
> As he says there:
> 
> "If I had known, I would have examined the patch a little more
> thoroughly,  and would have discovered a very minor mistake in the
> patch."

Heh, well now you make me look it seems that the minor mistake is
adding at tail instead of head?  That seems to be because the hash list
API doesn't have a head add ...

I wouldn't really call that a subtle problem because the LLM would have
picked up the head to tail conversion if we'd had an at head API for it
to learn from.

> The algorithm is determining likely output based on statistics, and
> therefore density of input. Since in reality one can write infinite
> programs, it's mathematically inevitable that an LLM will have to
> 'infer' answers.
> 
> That inference has no basis in dynamics, that is a model of reality
> that it can use to determine answers, rather it will, in essence,
> provide a random result.
> 
> If there is a great deal of input (e.g. C programs), then that
> inference is
> likely to manifest in very subtle errors. See [2] for a thoughtful
> exploration from an AI expert on the topic of statistics vs.
> dynamics, and [3] for a broader exploration of the topic from the
> same author.

Amazingly enough when you're trying to sell a new thing, you become
very down on what you see as the old thing (bcachefs vs btrfs ...?)

> 
> [1]:
> https://lore.kernel.org/workflows/20250724194556.105803db@gandalf.loc
> al.home/
> [2]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/
> [3]:https://blog.piekniewski.info/2023/04/09/ai-reflections/
> 
[...]
> > > * The kernel is uniquely sensitive to erroneous (especially
> > > subtly erroneous) code - even small errors can be highly
> > > consequential. We use a programming language that can almost be
> > > defined by its lack of any kind   of safety, and in some
> > > subsystems patches are simply taken if no obvious problems exist,
> > > making us rather vulnerable to this.
> > 
> > I think that's really overlooking the fact that if properly trained
> > (a somewhat big *if* depending on the model) AI should be very good
> > at writing safe code in unsafe languages.  However it takes C
> > specific
> 
> I fundamentally disagree.
> 
> The consequences of even extremely small mistakes can be very serious
> in C, as the language does little to nothing for you.
> 
> No matter how much data it absorbs it cannot span the entire space of
> all possible programs or even anywhere close.

Neither can a human and we get by on mostly pattern matching ourselves
...

> I mean again, I apply the arguments above as to why I feel this is
> _fundamental_ to the approach.
> 
> Kernel code is also very specific and has characteristics that render
> it different from userland. We must consider a great many more things
> that would be handled for us were we userland - interrupts, the
> context we are in, locks of all varieties, etc. etc.
> 
> While there's a lot of kernel code (~10's of millions of line), for
> an LLM that is very small, and we simply cannot generate more.
> 
> Yes it can eat up all the C it can, but that isn't quite the same.

You seem to be assuming training is simply dump the data corpus and let
the model fend for itself.  It isn't it's a more painstaking process
that finds the mistakes in the output and gets the model to improve
itself ... it is more like human teaching.

I'm not saying current AI is perfect, but I am saying that most of the
issues with current AI can be traced to training problems which can be
corrected in the model if anyone cares enough to do it.  The useful
signal is that in all badly trained models I've seen the AI confidence
score is really low because of the multiple matches in different areas
that proper training would separate.  THat's why I think AI confidence
score should be the first thing we ask for.

Regards,

James


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 17:55   ` Lorenzo Stoakes
  2025-08-05 18:23     ` Lorenzo Stoakes
  2025-08-05 18:34     ` James Bottomley
@ 2025-08-05 18:39     ` Sasha Levin
  2025-08-05 19:15       ` Lorenzo Stoakes
  2025-08-06  4:04       ` Alexey Dobriyan
  2 siblings, 2 replies; 56+ messages in thread
From: Sasha Levin @ 2025-08-05 18:39 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: James Bottomley, ksummit

On Tue, Aug 05, 2025 at 06:55:29PM +0100, Lorenzo Stoakes wrote:
>On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
>> I think that's really overlooking the fact that if properly trained (a
>> somewhat big *if* depending on the model) AI should be very good at
>> writing safe code in unsafe languages.  However it takes C specific
>
>I fundamentally disagree.
>
>The consequences of even extremely small mistakes can be very serious in C,
>as the language does little to nothing for you.
>
>No matter how much data it absorbs it cannot span the entire space of all
>possible programs or even anywhere close.

Neither can a human :)

I think that this is where we see things differently: I don't think that
AI needs to be perfect, I just want it to be at the same lever (or
better) than a human.

Humans aren't great at writing C code. There's a reason we're looking at
using Rust for the kernel, and there's a reason that LTS trees exist -
they're living evidence of just how many mistakes humans make.

Look at the contents of LTS trees or the CVEs that get assigned: most of
them are fairly simple memory safety issues, off-by-one, use-after-free,
etc...

I don't think we should expect a bar for AI that is higher than the one
we set for humans.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:01     ` Lorenzo Stoakes
@ 2025-08-05 18:46       ` Mark Brown
  2025-08-05 19:18         ` Lorenzo Stoakes
  0 siblings, 1 reply; 56+ messages in thread
From: Mark Brown @ 2025-08-05 18:46 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: James Bottomley, ksummit

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

On Tue, Aug 05, 2025 at 07:01:01PM +0100, Lorenzo Stoakes wrote:

> However the point is that we can:

> - Generate scenarios relatively quickly that way.
> - Resulatant mistakes are at least _less_ impactful. And you'd hope, would come
>   out in the wash more easily.

> However, I wonder whether it might be better simply to restrict LLMs to test
> data generation or the like.

Yeah, test data seems much safer than test code since it's much more
constrained.  I'm definitely not 100% against trying things with code,
I'm more worried about people throwing something out there without
really looking at it properly since it's just test code.  With the
scenario generation for example the question is always if we'd be
happier with the test program being a scenario generator - sometimes the
answer is no (eg, because it's too expensive to run and you really need
to pick scenarios), sometimes the answer is yes even though the
resulting test program is more involved.

> These are again all exactly the kinds of things that make this an important
> topic to discuss I believe :)

Indeed, and I'm going to guess that it's going to be as much about
establishing taste as firm rules.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:34     ` James Bottomley
@ 2025-08-05 18:55       ` Lorenzo Stoakes
  2025-08-12 13:50       ` Steven Rostedt
  1 sibling, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 18:55 UTC (permalink / raw)
  To: James Bottomley; +Cc: ksummit, Steven Rostedt

(remembering to +cc Steven this time)

On Tue, Aug 05, 2025 at 02:34:40PM -0400, James Bottomley wrote:
> On Tue, 2025-08-05 at 18:55 +0100, Lorenzo Stoakes wrote:
> > On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > > On Tue, 2025-08-05 at 17:03 +0100, Lorenzo Stoakes wrote:
> > > > Unavoidably, LLMs are the hot topic in tech right now, and are
> > > > here to stay.
> > > >
> > > > This poses unique problems:
> > > >
> > > > * Never before have people been able to generate as much content
> > > > that may, on a surface reading, seem valid whilst in reality
> > > > being quite the opposite.
> > > >
> > > > * Equally, LLM's can introduce very subtle mistakes that humans
> > > > find difficult to pick up upon - humans implicitly assume that
> > > > the classes of errors they will encounter are the kinds other
> > > > humans would make - AI defeats that instinct.
> > >
> > > Do you have any examples of this?  I've found the opposite to be
> > > true:
> >
> > Sure - Steven encountered this in [1].
> >
> > As he says there:
> >
> > "If I had known, I would have examined the patch a little more
> > thoroughly,  and would have discovered a very minor mistake in the
> > patch."
>
> Heh, well now you make me look it seems that the minor mistake is
> adding at tail instead of head?  That seems to be because the hash list
> API doesn't have a head add ...
>
> I wouldn't really call that a subtle problem because the LLM would have
> picked up the head to tail conversion if we'd had an at head API for it
> to learn from.

You see, I feel like whatever example I provide would provoke a response
like this :)

I also encountered an LLM insisting that MAINTAINERS contained a section
that doesn't exist, but subtly incorrect. 'It' insisted that it was true
and I could check the file (it was wrong).

I've asked for explanations of concepts that it's got confidently,
misleadingly wrong.

https://rdel.substack.com/p/rdel-57-what-are-the-most-common

Is an article referencing common bugs generated by code-generating LLM
machinery.

In interacting with chat bots I've encountered _very confidently_ stated
stuff that is convincing, were you not to be expert enough to determine
otherwise.

I could go and try to gather a bunch of examples (hey, this is a proposal
right? If it were accepted then I'd be able to spend time firming stuff up
like this ;)

But I come back to the fundamental point that we are statistically
inferring information against an infinite number of possibilities. It is
simply mathematically inevitable there will be gaps, and errors can very
conceivably be subtle as well as glaring.

Either are problematic.

>
> > The algorithm is determining likely output based on statistics, and
> > therefore density of input. Since in reality one can write infinite
> > programs, it's mathematically inevitable that an LLM will have to
> > 'infer' answers.
> >
> > That inference has no basis in dynamics, that is a model of reality
> > that it can use to determine answers, rather it will, in essence,
> > provide a random result.
> >
> > If there is a great deal of input (e.g. C programs), then that
> > inference is
> > likely to manifest in very subtle errors. See [2] for a thoughtful
> > exploration from an AI expert on the topic of statistics vs.
> > dynamics, and [3] for a broader exploration of the topic from the
> > same author.
>
> Amazingly enough when you're trying to sell a new thing, you become
> very down on what you see as the old thing (bcachefs vs btrfs ...?)

Come on James, ;) I think this is rather an unfair dismissal of those
articles that are well-reasoned and thoughtful.

I think the discussion around statistical inference vs. dynamic modelling
is fairly profound and insightful.

Also that comparison... ;)

>
> >
> > [1]:
> > https://lore.kernel.org/workflows/20250724194556.105803db@gandalf.loc
> > al.home/
> > [2]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/
> > [3]:https://blog.piekniewski.info/2023/04/09/ai-reflections/
> >
> [...]
> > > > * The kernel is uniquely sensitive to erroneous (especially
> > > > subtly erroneous) code - even small errors can be highly
> > > > consequential. We use a programming language that can almost be
> > > > defined by its lack of any kind   of safety, and in some
> > > > subsystems patches are simply taken if no obvious problems exist,
> > > > making us rather vulnerable to this.
> > >
> > > I think that's really overlooking the fact that if properly trained
> > > (a somewhat big *if* depending on the model) AI should be very good
> > > at writing safe code in unsafe languages.  However it takes C
> > > specific
> >
> > I fundamentally disagree.
> >
> > The consequences of even extremely small mistakes can be very serious
> > in C, as the language does little to nothing for you.
> >
> > No matter how much data it absorbs it cannot span the entire space of
> > all possible programs or even anywhere close.
>
> Neither can a human and we get by on mostly pattern matching ourselves
> ...

This is a very typical counterargument made. The problem is that humans are
not able to generate these kinds of errors at this kind of scale in the
same way LLMs can*, and humans implicitly expect 'human-like' errors, that
we cannot assume will arise in this output.

We tend to have a fairly constrained set of errors that we make, which you
can usually reason about and really - maintainers pattern match on errors
made as much as patch writers pattern match on writing them.

Breaking these assumptions in unusual ways is likely to be problematic.

*Excepting certain coccinelle contributors of course...

>
> > I mean again, I apply the arguments above as to why I feel this is
> > _fundamental_ to the approach.
> >
> > Kernel code is also very specific and has characteristics that render
> > it different from userland. We must consider a great many more things
> > that would be handled for us were we userland - interrupts, the
> > context we are in, locks of all varieties, etc. etc.
> >
> > While there's a lot of kernel code (~10's of millions of line), for
> > an LLM that is very small, and we simply cannot generate more.
> >
> > Yes it can eat up all the C it can, but that isn't quite the same.
>
> You seem to be assuming training is simply dump the data corpus and let
> the model fend for itself.  It isn't it's a more painstaking process
> that finds the mistakes in the output and gets the model to improve
> itself ... it is more like human teaching.

No, I assume that statistical inference cannot be established for an
effectively infinite problem space, which I think in reasonable.

>
> I'm not saying current AI is perfect, but I am saying that most of the
> issues with current AI can be traced to training problems which can be
> corrected in the model if anyone cares enough to do it.  The useful
> signal is that in all badly trained models I've seen the AI confidence
> score is really low because of the multiple matches in different areas
> that proper training would separate.  THat's why I think AI confidence
> score should be the first thing we ask for.

Again, I've no issue with this confidence score as a data point, though we
do need to assess how reliable it is.

>
> Regards,
>
> James
>

I think we're diverging a little from the broader point being made here -
we need a clear policy on this - to details as to what kinds of problems
LLMs pose.

So whether we agree to disagree on some of these details, I feel we can
(probably? :) agree on the need for a coherent approach and a clear policy
on this.

And to be clear, I'm not opposing LLMs per se, I'm simply underlying the
kinds of issues we ought to be cautious of.

Ultimately I think we ought to let individual maintainers decide what they
will/won't accept (within reason).

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:39     ` Sasha Levin
@ 2025-08-05 19:15       ` Lorenzo Stoakes
  2025-08-05 20:02         ` James Bottomley
  2025-08-06  4:04       ` Alexey Dobriyan
  1 sibling, 1 reply; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 19:15 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, ksummit

On Tue, Aug 05, 2025 at 02:39:20PM -0400, Sasha Levin wrote:
> On Tue, Aug 05, 2025 at 06:55:29PM +0100, Lorenzo Stoakes wrote:
> > On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > > I think that's really overlooking the fact that if properly trained (a
> > > somewhat big *if* depending on the model) AI should be very good at
> > > writing safe code in unsafe languages.  However it takes C specific
> >
> > I fundamentally disagree.
> >
> > The consequences of even extremely small mistakes can be very serious in C,
> > as the language does little to nothing for you.
> >
> > No matter how much data it absorbs it cannot span the entire space of all
> > possible programs or even anywhere close.
>
> Neither can a human :)
>
> I think that this is where we see things differently: I don't think that
> AI needs to be perfect, I just want it to be at the same lever (or
> better) than a human.

Not at all, none of my objections are about perfection. I use LLMs myself,
in appropriate circumstances where the expected failings are not
problematic.

My objections are to do with the the kinds of errors one can encounter with
statistical inference like this.

Humans do not confidently hallucinate in the absence of concrete data,
rather we infer and model. This is dynamics vs. statistics (I genuinely
recommend the article I linked to James, it's a fascinating insight - [0]).

It's the _nature_ of these errors that I am concerned about in conjunction
with unsafe development tooling and highly consequential results of even
subtle errors that makes the kernel especially problematic in my view.

>
> Humans aren't great at writing C code. There's a reason we're looking at
> using Rust for the kernel, and there's a reason that LTS trees exist -
> they're living evidence of just how many mistakes humans make.

Humans make human-like errors. and not at industrial scale :)

>
> Look at the contents of LTS trees or the CVEs that get assigned: most of
> them are fairly simple memory safety issues, off-by-one, use-after-free,
> etc...

Absolutely.

>
> I don't think we should expect a bar for AI that is higher than the one
> we set for humans.

I'm not, rather I'm saying let's be aware of the kinds of issues we might
encounter from LLMs and take them into account when establishing policy.

>
> --
> Thanks,
> Sasha

Regardless of where we feel the issues may/may not be, I think we can
certainly agree that we ought to have policy on how we approach
LLM-generated content, and determine broadly what maintainers want to do
with it.

Which I think is the _key_ point of the proposal more than anything! :)

Cheers, Lorenzo

[0]:https://blog.piekniewski.info/2016/11/01/statistics-and-dynamics/

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:46       ` Mark Brown
@ 2025-08-05 19:18         ` Lorenzo Stoakes
  0 siblings, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-05 19:18 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, ksummit

On Tue, Aug 05, 2025 at 07:46:24PM +0100, Mark Brown wrote:
> On Tue, Aug 05, 2025 at 07:01:01PM +0100, Lorenzo Stoakes wrote:
>
> > However the point is that we can:
>
> > - Generate scenarios relatively quickly that way.
> > - Resulatant mistakes are at least _less_ impactful. And you'd hope, would come
> >   out in the wash more easily.
>
> > However, I wonder whether it might be better simply to restrict LLMs to test
> > data generation or the like.
>
> Yeah, test data seems much safer than test code since it's much more
> constrained.  I'm definitely not 100% against trying things with code,
> I'm more worried about people throwing something out there without
> really looking at it properly since it's just test code.  With the
> scenario generation for example the question is always if we'd be
> happier with the test program being a scenario generator - sometimes the
> answer is no (eg, because it's too expensive to run and you really need
> to pick scenarios), sometimes the answer is yes even though the
> resulting test program is more involved.

And equally, those reviewing test code such as yourself are a finite
resource, and a sudden onset of patches that might result is a
consideration also.

>
> > These are again all exactly the kinds of things that make this an important
> > topic to discuss I believe :)
>
> Indeed, and I'm going to guess that it's going to be as much about
> establishing taste as firm rules.

Absolutely agree :)

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 19:15       ` Lorenzo Stoakes
@ 2025-08-05 20:02         ` James Bottomley
  2025-08-05 20:48           ` Al Viro
  2025-08-06 19:26           ` Lorenzo Stoakes
  0 siblings, 2 replies; 56+ messages in thread
From: James Bottomley @ 2025-08-05 20:02 UTC (permalink / raw)
  To: Lorenzo Stoakes, Sasha Levin; +Cc: ksummit

On Tue, 2025-08-05 at 20:15 +0100, Lorenzo Stoakes wrote:
> On Tue, Aug 05, 2025 at 02:39:20PM -0400, Sasha Levin wrote:
> > On Tue, Aug 05, 2025 at 06:55:29PM +0100, Lorenzo Stoakes wrote:
> > > On Tue, Aug 05, 2025 at 12:43:38PM -0400, James Bottomley wrote:
> > > > I think that's really overlooking the fact that if properly
> > > > trained (a somewhat big *if* depending on the model) AI should
> > > > be very good at writing safe code in unsafe languages.  However
> > > > it takes C specific
> > > 
> > > I fundamentally disagree.
> > > 
> > > The consequences of even extremely small mistakes can be very
> > > serious in C, as the language does little to nothing for you.
> > > 
> > > No matter how much data it absorbs it cannot span the entire
> > > space of all possible programs or even anywhere close.
> > 
> > Neither can a human :)
> > 
> > I think that this is where we see things differently: I don't think
> > that AI needs to be perfect, I just want it to be at the same lever
> > (or better) than a human.
> 
> Not at all, none of my objections are about perfection. I use LLMs
> myself, in appropriate circumstances where the expected failings are
> not problematic.
> 
> My objections are to do with the the kinds of errors one can
> encounter with statistical inference like this.
> 
> Humans do not confidently hallucinate in the absence of concrete
> data, rather we infer and model.

Might I refer you to pretty much any white house press briefing for
counter examples ...

>  This is dynamics vs. statistics (I genuinely recommend the article I
> linked to James, it's a fascinating insight - [0]).
> 
> It's the _nature_ of these errors that I am concerned about in
> conjunction with unsafe development tooling and highly consequential
> results of even subtle errors that makes the kernel especially
> problematic in my view.

You know that's an argument for not allowing teenagers to learn to
drive (at any age), or over operate heavy machinery or ...

The point being that with enough training human society thinks the
probability of error is remote enough (for some value of enough) to
become an acceptable risk.

> > Humans aren't great at writing C code. There's a reason we're
> > looking at using Rust for the kernel, and there's a reason that LTS
> > trees exist - they're living evidence of just how many mistakes
> > humans make.
> 
> Humans make human-like errors. and not at industrial scale :)

I've an infinite number of monkeys^WSet of Far Eastern Call centres for
appliance repair that would beg to disagree.

> > Look at the contents of LTS trees or the CVEs that get assigned:
> > most of them are fairly simple memory safety issues, off-by-one,
> > use-after-free, etc...
> 
> Absolutely.
> 
> > 
> > I don't think we should expect a bar for AI that is higher than the
> > one we set for humans.
> 
> I'm not, rather I'm saying let's be aware of the kinds of issues we
> might encounter from LLMs and take them into account when
> establishing policy.

Well, if we set a policy, it should be flexible enough to adapt as the
AI does and not be locked to what would prevent the AI mistakes I can
find today from happening.  If we're going to codify this rigidly we
could arguably have a policy not to accept patches from humans who
might be (and often are) wrong as well.

I think we should stick to indicators of trustworthiness that AI is
already generating at let that guide maintainer taste without
necessarily having something more detailed.

Regards,

James


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 20:02         ` James Bottomley
@ 2025-08-05 20:48           ` Al Viro
  2025-08-06 19:26           ` Lorenzo Stoakes
  1 sibling, 0 replies; 56+ messages in thread
From: Al Viro @ 2025-08-05 20:48 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, Sasha Levin, ksummit

On Tue, Aug 05, 2025 at 04:02:02PM -0400, James Bottomley wrote:

> > My objections are to do with the the kinds of errors one can
> > encounter with statistical inference like this.
> > 
> > Humans do not confidently hallucinate in the absence of concrete
> > data, rather we infer and model.
> 
> Might I refer you to pretty much any white house press briefing for
> counter examples ...

	Thank you.  Do we really want their kind anywhere near the
codebase, though?  How would you treat someone ever caught pulling that
kind of shit?  Personally I'd classify them as indistinguishable from
scammers and that would be it, reducing the problem to recognizing their
sock puppets if they ever get such.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:43 ` James Bottomley
                     ` (2 preceding siblings ...)
  2025-08-05 17:55   ` Lorenzo Stoakes
@ 2025-08-05 21:58   ` Jiri Kosina
  2025-08-06  6:58     ` Hannes Reinecke
  2025-08-06 19:35     ` Lorenzo Stoakes
  3 siblings, 2 replies; 56+ messages in thread
From: Jiri Kosina @ 2025-08-05 21:58 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, ksummit

On Tue, 5 Aug 2025, James Bottomley wrote:

> > * Equally, LLM's can introduce very subtle mistakes that humans find
> > difficult to pick up upon - humans implicitly assume that the classes
> > of errors they will encounter are the kinds other humans would make -
> > AI defeats that instinct.
> 
> Do you have any examples of this?  I've found the opposite to be true:
> AI is capable of really big stupid mistakes when it hasn't seen enough
> of the pattern, but I can't recall seeing it make something you'd
> classify as a subtle mistake (I assume it could copy subtle mistakes
> from wrong training data, so I'm not saying it can't, just that I
> haven't seen any).

Few months ago (so state of matters has probably changed a lot since 
then, I have no idea), I've seen a GPIO driver fully written by ChatGPT.

It was functionally correct and worked well. But it was missing error 
handle here-and-there. Is it subtle or big?

Thanks,

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:39     ` Sasha Levin
  2025-08-05 19:15       ` Lorenzo Stoakes
@ 2025-08-06  4:04       ` Alexey Dobriyan
  2025-08-06 20:36         ` Sasha Levin
  1 sibling, 1 reply; 56+ messages in thread
From: Alexey Dobriyan @ 2025-08-06  4:04 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Lorenzo Stoakes, James Bottomley, ksummit

On Tue, Aug 05, 2025 at 02:39:20PM -0400, Sasha Levin wrote:
> I don't think we should expect a bar for AI that is higher than the one
> we set for humans.

We absolutely should.

There is a thing with computers is that they are perfect. They are so
infinitely perfect compared to humans that anything downgrading computers
levels of perfection to humans level of perfection must be rejected
because it can't be scaled back and thus will be permanent loss.

Would you install calculator application which you know reports wrong
results 1% of the time?

Would you uninstall calculator after realizing it has 1% bug rate?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:10 ` H. Peter Anvin
  2025-08-05 18:19   ` Lorenzo Stoakes
@ 2025-08-06  5:49   ` Julia Lawall
  2025-08-06  9:25     ` Dan Carpenter
  1 sibling, 1 reply; 56+ messages in thread
From: Julia Lawall @ 2025-08-06  5:49 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Lorenzo Stoakes, ksummit



On Tue, 5 Aug 2025, H. Peter Anvin wrote:

> On August 5, 2025 9:03:18 AM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> >Unavoidably, LLMs are the hot topic in tech right now, and are here to
> >stay.
> >
> >This poses unique problems:
> >
> >* Never before have people been able to generate as much content that may,
> >  on a surface reading, seem valid whilst in reality being quite the
> >  opposite.
> >
> >* Equally, LLM's can introduce very subtle mistakes that humans find
> >  difficult to pick up upon - humans implicitly assume that the classes of
> >  errors they will encounter are the kinds other humans would make - AI
> >  defeats that instinct.
> >
> >* The kernel is uniquely sensitive to erroneous (especially subtly
> >  erroneous) code - even small errors can be highly consequential. We use a
> >  programming language that can almost be defined by its lack of any kind
> >  of safety, and in some subsystems patches are simply taken if no obvious
> >  problems exist, making us rather vulnerable to this.
> >
> >* On the other hand, there are use cases which are useful - test data/code
> >  generation, summarisation, smart auto-complete - so it'd perhaps be
> >  foolish to entirely dismiss AI.
> >
> >A very important non-technical point we must consider is that, the second
> >we even appear to be open to AI submission of _any_ kind, the press will
> >inevitably report on it gleefully, likely with oversimplified headlines
> >like 'Linux accepts AI patches'.
> >
> >The moment that happens, we are likely to see a significant uptick in AI
> >submissions whether we like it or not.
> >
> >I propose that we establish the broad rules as they pertain to the kernel,
> >and would like to bring the discussion to the Maintainer's Summit so we can
> >determine what those should be.
> >
> >It's important to get a sense of how maintainers feel about this - whether
> >what is proposed is opt-in or opt-out - and how we actually implement this.
> >
> >There has been discussion on-list about this (see [0]), with many
> >suggestions made including a 'traffic light' system per-subsystem, however
> >many open questions remain - the devil is in the details.
> >
> >[0]:https://lore.kernel.org/all/20250727195802.2222764-1-sashal@kernel.org/
> >
>
> It would seem to me to be far more useful  to get AI to help find hard-to-see but perhaps common problems, or tidying up code (like Coccinelle on steroids.)
>
> Things that are easy to see post hoc are correct, but can be hard or laborious to do.
>
> Another genuinely good usage for AI is for especially non-English speakers to tidy up their patch comments and other documentation.

There are also some parts of the kernel that are supposed to have
comments, such as memory barriers.  AI could help ensure that the comments
are actually meaningful, and perhaps suggest to the developer how they
could be improved (sort of a 0-day for comment quality).

julia

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 21:58   ` Jiri Kosina
@ 2025-08-06  6:58     ` Hannes Reinecke
  2025-08-06 19:36       ` Lorenzo Stoakes
  2025-08-06 19:35     ` Lorenzo Stoakes
  1 sibling, 1 reply; 56+ messages in thread
From: Hannes Reinecke @ 2025-08-06  6:58 UTC (permalink / raw)
  To: Jiri Kosina, James Bottomley; +Cc: Lorenzo Stoakes, ksummit

On 8/5/25 23:58, Jiri Kosina wrote:
> On Tue, 5 Aug 2025, James Bottomley wrote:
> 
>>> * Equally, LLM's can introduce very subtle mistakes that humans find
>>> difficult to pick up upon - humans implicitly assume that the classes
>>> of errors they will encounter are the kinds other humans would make -
>>> AI defeats that instinct.
>>
>> Do you have any examples of this?  I've found the opposite to be true:
>> AI is capable of really big stupid mistakes when it hasn't seen enough
>> of the pattern, but I can't recall seeing it make something you'd
>> classify as a subtle mistake (I assume it could copy subtle mistakes
>> from wrong training data, so I'm not saying it can't, just that I
>> haven't seen any).
> 
> Few months ago (so state of matters has probably changed a lot since
> then, I have no idea), I've seen a GPIO driver fully written by ChatGPT.
> 
> It was functionally correct and worked well. But it was missing error
> handle here-and-there. Is it subtle or big?
> 
Which (curiously enough) raises another point: Procrastination.
We (as developers) are _terrible_ at things which we _should_ do,
but for some reason or other never get around to do properly
or at all. To wit: Error handling. Or: documentation.

Yet the code base we have written will be used as training material
for LLM. So I am not surprised that LLM lack in the very same
areas (how could they otherwise? There is not enough training
material...). And I guess the last thing we want is LLMs inventing
error handling.

But anyway. LLMs are great in producing something where 90% accuracy
is good enough. But that's not how coding works; so one always will
have to review any LLM generated code.
And that at a time where we already have issues finding enough reviewers
for code written by humans.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.com                               +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06  5:49   ` Julia Lawall
@ 2025-08-06  9:25     ` Dan Carpenter
  2025-08-06  9:39       ` Julia Lawall
  2025-08-06 19:30       ` Lorenzo Stoakes
  0 siblings, 2 replies; 56+ messages in thread
From: Dan Carpenter @ 2025-08-06  9:25 UTC (permalink / raw)
  To: Julia Lawall; +Cc: H. Peter Anvin, Lorenzo Stoakes, ksummit

On Wed, Aug 06, 2025 at 07:49:24AM +0200, Julia Lawall wrote:
> 
> 
> On Tue, 5 Aug 2025, H. Peter Anvin wrote:
> >
> > Another genuinely good usage for AI is for especially non-English
> > speakers to tidy up their patch comments and other documentation.
> 
> There are also some parts of the kernel that are supposed to have
> comments, such as memory barriers.  AI could help ensure that the comments
> are actually meaningful, and perhaps suggest to the developer how they
> could be improved (sort of a 0-day for comment quality).
> 

I feel like I have seen patches where people have generated AI
documentation for locking.  The problem is that if you ask AI to write
something it always has a very confident answer but normally it's vague
and slightly wrong.  It takes no time to generate these patches but it
takes a while to review them.

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06  9:25     ` Dan Carpenter
@ 2025-08-06  9:39       ` Julia Lawall
  2025-08-06 19:30       ` Lorenzo Stoakes
  1 sibling, 0 replies; 56+ messages in thread
From: Julia Lawall @ 2025-08-06  9:39 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: H. Peter Anvin, Lorenzo Stoakes, ksummit



On Wed, 6 Aug 2025, Dan Carpenter wrote:

> On Wed, Aug 06, 2025 at 07:49:24AM +0200, Julia Lawall wrote:
> >
> >
> > On Tue, 5 Aug 2025, H. Peter Anvin wrote:
> > >
> > > Another genuinely good usage for AI is for especially non-English
> > > speakers to tidy up their patch comments and other documentation.
> >
> > There are also some parts of the kernel that are supposed to have
> > comments, such as memory barriers.  AI could help ensure that the comments
> > are actually meaningful, and perhaps suggest to the developer how they
> > could be improved (sort of a 0-day for comment quality).
> >
>
> I feel like I have seen patches where people have generated AI
> documentation for locking.  The problem is that if you ask AI to write
> something it always has a very confident answer but normally it's vague
> and slightly wrong.  It takes no time to generate these patches but it
> takes a while to review them.

I would have hoped for some tool that would somehow be designed to
understand comments in a good way.  Random individual efforts to generate
comments using AI would probably indeed produce something innocuous and
useless.

julia

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 20:02         ` James Bottomley
  2025-08-05 20:48           ` Al Viro
@ 2025-08-06 19:26           ` Lorenzo Stoakes
  2025-08-07 12:25             ` Mark Brown
  1 sibling, 1 reply; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-06 19:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: Sasha Levin, ksummit

On Tue, Aug 05, 2025 at 04:02:02PM -0400, James Bottomley wrote:
> > >
> > > I don't think we should expect a bar for AI that is higher than the
> > > one we set for humans.
> >
> > I'm not, rather I'm saying let's be aware of the kinds of issues we
> > might encounter from LLMs and take them into account when
> > establishing policy.
>
> Well, if we set a policy, it should be flexible enough to adapt as the
> AI does and not be locked to what would prevent the AI mistakes I can
> find today from happening.  If we're going to codify this rigidly we
> could arguably have a policy not to accept patches from humans who
> might be (and often are) wrong as well.

Sure, I think any policy should be broad and reasonable.

Probably we want something simple and practical to begin with,
e.g. categorising by:

1. Was most or all of this patch generated by an LLM? (>=90%)
2. Was a large part of this patch generated by an LLM? (>30%)
3. Was a small part of this patch generated by an LLM? (<30%)

In addition to:

- Was the commit message of this patch generated in large part by an LLM
  (excluding non-native speakers using an LLM to simply assist writing it
  in english)?

All of which could have tags, and each entry in MAINTAINERS could have an
opt-in entry indicating which will be acceptable.

We could then explicitly indicate that we're fine with and no need to
disclose uses that are simple day-to-day use of LLM tools such as:

- Simple, supervised use of LLM-based 'smart' autocomplete features.
- Research being assisted by an LLM.
- Any use of an LLM for non-upstreamed code used in development of the
  series.

etc.

Then we can leave the decision as to what's acceptable to individual
maintainers.

>
> I think we should stick to indicators of trustworthiness that AI is
> already generating at let that guide maintainer taste without
> necessarily having something more detailed.

Well, it's an interesting data point but I'm not sure asking the LLM to
rate its own trustworthiness is a reliable measure, and at any rate I think
we need to keep things simple to begin with.

>
> Regards,
>
> James
>

A really key thing to consider here too is maintainer resource. We're
already strained on this with human submissions, so perhaps we want to make
very clear in AI policy document that this is emphatically not an
invitation for pointing automated tools at the kernel and generating tonnes
of patches, and trying to do so might result in your patches being ignored.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06  9:25     ` Dan Carpenter
  2025-08-06  9:39       ` Julia Lawall
@ 2025-08-06 19:30       ` Lorenzo Stoakes
  2025-08-12 14:37         ` Steven Rostedt
  1 sibling, 1 reply; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-06 19:30 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Julia Lawall, H. Peter Anvin, ksummit

On Wed, Aug 06, 2025 at 12:25:59PM +0300, Dan Carpenter wrote:
> On Wed, Aug 06, 2025 at 07:49:24AM +0200, Julia Lawall wrote:
> >
> >
> > On Tue, 5 Aug 2025, H. Peter Anvin wrote:
> > >
> > > Another genuinely good usage for AI is for especially non-English
> > > speakers to tidy up their patch comments and other documentation.
> >
> > There are also some parts of the kernel that are supposed to have
> > comments, such as memory barriers.  AI could help ensure that the comments
> > are actually meaningful, and perhaps suggest to the developer how they
> > could be improved (sort of a 0-day for comment quality).
> >
>
> I feel like I have seen patches where people have generated AI
> documentation for locking.  The problem is that if you ask AI to write
> something it always has a very confident answer but normally it's vague
> and slightly wrong.  It takes no time to generate these patches but it
> takes a while to review them.

The async relationship between effort to generate vs. effort to review is a
very serious aspect of all this.

It's why I think it's aboslutely key to make all this _opt-in_ and to
empower individual maintainers to decide how to handle these kinds of
patches.

>
> regards,
> dan carpenter

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 21:58   ` Jiri Kosina
  2025-08-06  6:58     ` Hannes Reinecke
@ 2025-08-06 19:35     ` Lorenzo Stoakes
  1 sibling, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-06 19:35 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: James Bottomley, ksummit

On Tue, Aug 05, 2025 at 11:58:19PM +0200, Jiri Kosina wrote:
> On Tue, 5 Aug 2025, James Bottomley wrote:
>
> > > * Equally, LLM's can introduce very subtle mistakes that humans find
> > > difficult to pick up upon - humans implicitly assume that the classes
> > > of errors they will encounter are the kinds other humans would make -
> > > AI defeats that instinct.
> >
> > Do you have any examples of this?  I've found the opposite to be true:
> > AI is capable of really big stupid mistakes when it hasn't seen enough
> > of the pattern, but I can't recall seeing it make something you'd
> > classify as a subtle mistake (I assume it could copy subtle mistakes
> > from wrong training data, so I'm not saying it can't, just that I
> > haven't seen any).
>
> Few months ago (so state of matters has probably changed a lot since
> then, I have no idea), I've seen a GPIO driver fully written by ChatGPT.
>
> It was functionally correct and worked well. But it was missing error
> handle here-and-there. Is it subtle or big?

Right exactly :)

In any case we can all agree that LLMs hallucinate and create errors.

And therefore being able to categorise the degree to which LLMs
participated in a patch easily empowers maintainers to determine how much
effort they're willing to put in to try to track these down.

So either way we need this I think.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06  6:58     ` Hannes Reinecke
@ 2025-08-06 19:36       ` Lorenzo Stoakes
  0 siblings, 0 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-06 19:36 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Jiri Kosina, James Bottomley, ksummit

On Wed, Aug 06, 2025 at 08:58:20AM +0200, Hannes Reinecke wrote:
> But anyway. LLMs are great in producing something where 90% accuracy
> is good enough. But that's not how coding works; so one always will
> have to review any LLM generated code.
> And that at a time where we already have issues finding enough reviewers
> for code written by humans.

Yes, precisely. Maintainer resource is strained as it is (he writes at
8.35pm :), we simply do not have the bandwidth to handle an uncontrolled
influx where there's an async relationship between ability-to-generate and
ability-to-review.

So empowering maintainers to _decide_ what's acceptable per-subsystem is
key I think.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06  4:04       ` Alexey Dobriyan
@ 2025-08-06 20:36         ` Sasha Levin
  0 siblings, 0 replies; 56+ messages in thread
From: Sasha Levin @ 2025-08-06 20:36 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: Lorenzo Stoakes, James Bottomley, ksummit

On Wed, Aug 06, 2025 at 07:04:02AM +0300, Alexey Dobriyan wrote:
>On Tue, Aug 05, 2025 at 02:39:20PM -0400, Sasha Levin wrote:
>> I don't think we should expect a bar for AI that is higher than the one
>> we set for humans.
>
>We absolutely should.
>
>There is a thing with computers is that they are perfect. They are so
>infinitely perfect compared to humans that anything downgrading computers
>levels of perfection to humans level of perfection must be rejected
>because it can't be scaled back and thus will be permanent loss.
>
>Would you install calculator application which you know reports wrong
>results 1% of the time?

Let's look at the last release and see :)

$ git log --oneline --no-merges -i --grep ^fixes: --grep cc:.*stable v6.15..v6.16 | wc -l
2734
$ git log --oneline --no-merges v6.15..v6.16 | wc -l
14639

So at least ~20% of the commits that go in are fixes for issues - one
can make the argument that 20% of the kernel is "broken" :)

In your calculator example, if the calculator is making an error 1% of
the time doing simple addition, I would not buy it. If it's making an
error 1% of the time in an edgecase of something I probably would never
use? Sure, I'd happily take the calculator.

>Would you uninstall calculator after realizing it has 1% bug rate?

I'm assuming we both agree that the kernel has over 1% "bug rate"
throughout all of the various usecases it's being used in, why haven't
you uninstalled it yet?

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06 19:26           ` Lorenzo Stoakes
@ 2025-08-07 12:25             ` Mark Brown
  2025-08-07 13:00               ` Lorenzo Stoakes
  0 siblings, 1 reply; 56+ messages in thread
From: Mark Brown @ 2025-08-07 12:25 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: James Bottomley, Sasha Levin, ksummit

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]

On Wed, Aug 06, 2025 at 08:26:41PM +0100, Lorenzo Stoakes wrote:

> - Was the commit message of this patch generated in large part by an LLM
>   (excluding non-native speakers using an LLM to simply assist writing it
>   in english)?

Easiest way to say that is probably to say "written by" and "translated
by".  I think for all these things we should just talk about tooling
rather than specifically LLMs, as well as avoiding any rules lawyering
about how precisely a given tool is implemented it's probably useful to
know about tools whatever the technology.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-07 12:25             ` Mark Brown
@ 2025-08-07 13:00               ` Lorenzo Stoakes
  2025-08-11 21:26                 ` Luis Chamberlain
  2025-08-12 14:19                 ` Steven Rostedt
  0 siblings, 2 replies; 56+ messages in thread
From: Lorenzo Stoakes @ 2025-08-07 13:00 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, Sasha Levin, ksummit

On Thu, Aug 07, 2025 at 01:25:23PM +0100, Mark Brown wrote:
> On Wed, Aug 06, 2025 at 08:26:41PM +0100, Lorenzo Stoakes wrote:
>
> > - Was the commit message of this patch generated in large part by an LLM
> >   (excluding non-native speakers using an LLM to simply assist writing it
> >   in english)?
>
> Easiest way to say that is probably to say "written by" and "translated
> by".  I think for all these things we should just talk about tooling
> rather than specifically LLMs, as well as avoiding any rules lawyering
> about how precisely a given tool is implemented it's probably useful to
> know about tools whatever the technology.

That's a great idea!!

And agreed on language/rules lawyering, I think we have to have something
_simple_ and robust at least to begin with.

There are inevitably blurry lines everywhere with this stuff so I think we
have to be pragmatic and as clear as we possibly can. We can always refine
things later as we learn.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-07 13:00               ` Lorenzo Stoakes
@ 2025-08-11 21:26                 ` Luis Chamberlain
  2025-08-12 14:19                 ` Steven Rostedt
  1 sibling, 0 replies; 56+ messages in thread
From: Luis Chamberlain @ 2025-08-11 21:26 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: Mark Brown, James Bottomley, Sasha Levin, ksummit

On Thu, Aug 07, 2025 at 02:00:56PM +0100, Lorenzo Stoakes wrote:
> On Thu, Aug 07, 2025 at 01:25:23PM +0100, Mark Brown wrote:
> > On Wed, Aug 06, 2025 at 08:26:41PM +0100, Lorenzo Stoakes wrote:
> >
> > > - Was the commit message of this patch generated in large part by an LLM
> > >   (excluding non-native speakers using an LLM to simply assist writing it
> > >   in english)?
> >
> > Easiest way to say that is probably to say "written by" and "translated
> > by".  I think for all these things we should just talk about tooling
> > rather than specifically LLMs, as well as avoiding any rules lawyering
> > about how precisely a given tool is implemented it's probably useful to
> > know about tools whatever the technology.
> 
> That's a great idea!!
> 
> And agreed on language/rules lawyering, I think we have to have something
> _simple_ and robust at least to begin with.

I've been using for years the tag "Generated-by" starting wit Coccinelle:

git log --oneline --author="mcgrof" --grep "Generated-by"| wc -l
31

And it seems like I'm not the only one:

git log --oneline --grep "Generated-by"| wc -l
49

For other projects such as kdevops where I *only* use LLMs to write new
code now, we have been using:

Generated-by: ChatGPT Codex
Generated-by: Claude AI

We use this even if it was partially AI. I think that gives
maintainers sufficient information to make judgement calls.

Other than this practice, if we're to slowly and carefully welcome LLM
generated code on the kernel I'd recommend we evaluate a context
intialization file.  For Claude that's CLAUDE.md, you can look at
kdevops's file for an example [0]. Having one makes adoption easier, and you
can provide strict rules. The context is limited though, you want about
~40 KiB. However I'm not sure if a generic one may be so easily
agreeable, so fortunatley the bots can also look for your ~/CLAUDE.md.
But I can vouch for the fact that its proven useful for kdevops.

Other than this, another best pratice we've adopted on kdevops is to
to grade commits based on LLM prompts so to keep tabs on how well LLMs improve
overtime with example prompts, and to track them with full prompts, we have
PROMPTS.md [1]. These can help LLMs as well.

The grammatical evolution on kdevops is what makes LLMs adoption today
easily possible [2]. I don't think its as solid yet for kernel development in
agreement with recent findings [3], however its only getting better so
best is we prepare for it and learn from existing projects' use cases.

So for testing -- clearly its a win.  For other things, here's a few
things we can evaluate future success over time:

  * fix syzbot bugs
  * take on maintenance for orphaned drivers
  * help maintainers with patch review / testing

[0] https://github.com/linux-kdevops/kdevops/blob/main/CLAUDE.md
[1] https://github.com/linux-kdevops/kdevops/blob/main/PROMPTS.md
[2] https://github.com/linux-kdevops/kdevops?tab=readme-ov-file#generative-ai-usage-on-kdevops
[3] https://neurips.cc/virtual/2024/poster/97426

  Luis

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:23     ` Lorenzo Stoakes
@ 2025-08-12 13:44       ` Steven Rostedt
  0 siblings, 0 replies; 56+ messages in thread
From: Steven Rostedt @ 2025-08-12 13:44 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: James Bottomley, ksummit

On Tue, 5 Aug 2025 19:23:47 +0100
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> +cc Steven
> 
> Apologies, I meant to +cc you here as I mention an issue you encountered,
> and forgot of course :>)

Just got back from PTO and now catching up through my massive heap of emails.

No worries, I'm subscribed to the ksummit list and already read your email
before I even saw this.

-- Steve

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 18:34     ` James Bottomley
  2025-08-05 18:55       ` Lorenzo Stoakes
@ 2025-08-12 13:50       ` Steven Rostedt
  1 sibling, 0 replies; 56+ messages in thread
From: Steven Rostedt @ 2025-08-12 13:50 UTC (permalink / raw)
  To: James Bottomley; +Cc: Lorenzo Stoakes, ksummit

On Tue, 05 Aug 2025 14:34:40 -0400
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> > Sure - Steven encountered this in [1].
> > 
> > As he says there:
> > 
> > "If I had known, I would have examined the patch a little more
> > thoroughly,  and would have discovered a very minor mistake in the
> > patch."  
> 
> Heh, well now you make me look it seems that the minor mistake is
> adding at tail instead of head?  That seems to be because the hash list
> API doesn't have a head add ...
> 
> I wouldn't really call that a subtle problem because the LLM would have
> picked up the head to tail conversion if we'd had an at head API for it
> to learn from.

That wasn't the minor mistake I was referring to. The mistake was that it
dropped a __read_mostly annotation. Which after further review, was
actually OK and something that should have been done in a separate patch. I
have other tags that are required for alignment and such otherwise things
may break mysteriously. It could easily have dropped a different tag that
would have made a difference.

The point here is that it most definitely will introduce very subtle
mistakes. More subtle than what a human may produce and much harder to spot.

-- Steve

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-07 13:00               ` Lorenzo Stoakes
  2025-08-11 21:26                 ` Luis Chamberlain
@ 2025-08-12 14:19                 ` Steven Rostedt
  1 sibling, 0 replies; 56+ messages in thread
From: Steven Rostedt @ 2025-08-12 14:19 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: Mark Brown, James Bottomley, Sasha Levin, ksummit

On Thu, 7 Aug 2025 14:00:56 +0100
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> On Thu, Aug 07, 2025 at 01:25:23PM +0100, Mark Brown wrote:
> > On Wed, Aug 06, 2025 at 08:26:41PM +0100, Lorenzo Stoakes wrote:
> >  
> > > - Was the commit message of this patch generated in large part by an LLM
> > >   (excluding non-native speakers using an LLM to simply assist writing it
> > >   in english)?  
> >
> > Easiest way to say that is probably to say "written by" and "translated
> > by".  I think for all these things we should just talk about tooling
> > rather than specifically LLMs, as well as avoiding any rules lawyering
> > about how precisely a given tool is implemented it's probably useful to
> > know about tools whatever the technology.  
> 
> That's a great idea!!
> 
> And agreed on language/rules lawyering, I think we have to have something
> _simple_ and robust at least to begin with.
> 
> There are inevitably blurry lines everywhere with this stuff so I think we
> have to be pragmatic and as clear as we possibly can. We can always refine
> things later as we learn.

I honestly don't care who or what writes the change log. As long as the
reason for the change is clear and I can read it and then know what the
patch is doing, and more importantly why it is doing it. Then I can be the
judge.

If I don't understand the change log, I simply send the patch back and say,
"Please explain what your are doing and why your are doing it, better".

For the longest time, I have rewritten change logs myself from other people
if they don't explain it properly. Mostly from non-native speakers.
Depending on how much of a rewrite I do, I'll get a confirmation from the
submitter to make sure it's what they intended. If it's just fixing grammar
mistakes I don't even ask.

I use to say "Can you write something like this:" and explain it better,
and the next version of the patch would always have what I suggested
written verbatim, so I stopped doing that :-p

-- Steve


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-06 19:30       ` Lorenzo Stoakes
@ 2025-08-12 14:37         ` Steven Rostedt
  2025-08-12 15:02           ` Sasha Levin
  0 siblings, 1 reply; 56+ messages in thread
From: Steven Rostedt @ 2025-08-12 14:37 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: Dan Carpenter, Julia Lawall, H. Peter Anvin, ksummit

On Wed, 6 Aug 2025 20:30:25 +0100
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> > I feel like I have seen patches where people have generated AI
> > documentation for locking.  The problem is that if you ask AI to write
> > something it always has a very confident answer but normally it's vague
> > and slightly wrong.  It takes no time to generate these patches but it
> > takes a while to review them.  
> 
> The async relationship between effort to generate vs. effort to review is a
> very serious aspect of all this.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          THIS!

My biggest concern with AI is that it may cause me to spend more time
reviewing what it produced than it would take me to write the code myself.

When I finally got people to help me with my work, my productivity dropped
substantially, as I found I was spending more time helping them with tasks
than it would have taken me to do it myself. But over time, that changed.

This could be the same with AI if you had a model that would learn from
what you taught it. But I suspect that AI generators will not be taking my
input from what I find and I'll likely be telling the same AI generator how
to fix the mistake over and over again.

One thing I look forward to with AI is that it will likely help me debug my
own code. I suspect it would be really good at reading my code and say
"Hey Stupid! You forgot to free this value over here". And things like that.
In other words, I trust AI to debug my code more than I can debug its code.

What's that saying? "To debug code requires someone that is smarter than
the one who wrote it."

-- Steve

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-12 14:37         ` Steven Rostedt
@ 2025-08-12 15:02           ` Sasha Levin
  2025-08-12 15:24             ` Paul E. McKenney
  0 siblings, 1 reply; 56+ messages in thread
From: Sasha Levin @ 2025-08-12 15:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lorenzo Stoakes, Dan Carpenter, Julia Lawall, H. Peter Anvin, ksummit

On Tue, Aug 12, 2025 at 10:37:10AM -0400, Steven Rostedt wrote:
>One thing I look forward to with AI is that it will likely help me debug my
>own code. I suspect it would be really good at reading my code and say
>"Hey Stupid! You forgot to free this value over here". And things like that.
>In other words, I trust AI to debug my code more than I can debug its code.

I'm surprised that this isn't too common of a practice at this point.
Sure, AI will make mistakes, but I think that:

1. It's much better at reviewing rather than writing.
2. Mistakes aren't critical (you'd review it yourself anyway).
3. No legal issues.

As an example:

$ git ol 391dda1bd7c56de62b96126214f040fe8965561b
391dda1bd7c5 ("tracing: Use hashtable.h for event_hash")

$ git checkout 391dda1bd7c5^
HEAD is now at 7f81f27b1093 tracing: Fix use-after-free in print_graph_function_flags during tracer switching

$ git format-patch -1 391dda1bd7c5
0001-tracing-Use-hashtable.h-for-event_hash.patch

$ echo "I want to send @0001-tracing-Use-hashtable.h-for-event_hash.patch upstream. Carefully review the patch, identify any issues, and write a report." | claude -p
# Patch Review Report: tracing: Use hashtable.h for event_hash

## Summary
This patch converts the manually implemented hash table in `kernel/trace/trace_output.c` to use the generic hashtable API from `<linux/hashtable.h>`. The conversion is technically correct and simplifies the code, but there is one notable issue that should be addressed.

## Review Findings

### ✅ Positive Aspects

1. **Correct API Usage**: The patch properly uses the hashtable.h API:
    - `DEFINE_HASHTABLE()` for static initialization
    - `hash_add()` for insertion
    - `hash_del()` for deletion
    - `hash_for_each_possible()` for lookup

2. **Proper Constant Renaming**: The rename from `EVENT_HASHSIZE` to `EVENT_HASH_BITS` correctly reflects the semantic change (from size to bit count).

3. **Functionally Equivalent**: The new implementation maintains the same hash table size (2^7 = 128 entries) and hashing behavior.

4. **Code Simplification**: Removes manual bucket calculation logic, making the code cleaner and more maintainable.

### ❌ Issue Found

**Loss of `__read_mostly` Annotation**

The original implementation declares the hash table with `__read_mostly`:
```c
static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly;
```

The patch changes this to:
```c
static DEFINE_HASHTABLE(event_hash, EVENT_HASH_BITS);
```

The `DEFINE_HASHTABLE` macro does not preserve the `__read_mostly` annotation. This could impact performance as:
- The event hash table is primarily read during event lookups
- It's only written during event registration/unregistration (rare operations)
- Without `__read_mostly`, the hash table may not be optimally placed in cache-friendly memory sections

### Recommended Fix

The hashtable.h header provides `DEFINE_READ_MOSTLY_HASHTABLE` macro specifically for this use case. The patch should use:
```c
static DEFINE_READ_MOSTLY_HASHTABLE(event_hash, EVENT_HASH_BITS);
```

This would preserve the performance optimization of the original code.

## Verification Checklist

- [x] Patch applies cleanly
- [x] Hash table size remains unchanged (128 entries)
- [x] Hash function behavior is preserved
- [x] Locking semantics remain the same
- [x] All hash table operations correctly converted
- [ ] Performance annotations preserved (`__read_mostly`)

## Recommendation

**Hold for revision** - The patch should be updated to use `DEFINE_READ_MOSTLY_HASHTABLE` instead of `DEFINE_HASHTABLE` to maintain the performance optimization present in the original code.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-12 15:02           ` Sasha Levin
@ 2025-08-12 15:24             ` Paul E. McKenney
  2025-08-12 15:25               ` Sasha Levin
  0 siblings, 1 reply; 56+ messages in thread
From: Paul E. McKenney @ 2025-08-12 15:24 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Steven Rostedt, Lorenzo Stoakes, Dan Carpenter, Julia Lawall,
	H. Peter Anvin, ksummit

On Tue, Aug 12, 2025 at 11:02:50AM -0400, Sasha Levin wrote:
> On Tue, Aug 12, 2025 at 10:37:10AM -0400, Steven Rostedt wrote:
> > One thing I look forward to with AI is that it will likely help me debug my
> > own code. I suspect it would be really good at reading my code and say
> > "Hey Stupid! You forgot to free this value over here". And things like that.
> > In other words, I trust AI to debug my code more than I can debug its code.
> 
> I'm surprised that this isn't too common of a practice at this point.
> Sure, AI will make mistakes, but I think that:
> 
> 1. It's much better at reviewing rather than writing.
> 2. Mistakes aren't critical (you'd review it yourself anyway).
> 3. No legal issues.

Ahem.  Please do us all a favor, yourself included, and go count the
number of lawsuits currently in flight on this topic.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-12 15:24             ` Paul E. McKenney
@ 2025-08-12 15:25               ` Sasha Levin
  2025-08-12 15:28                 ` Paul E. McKenney
  0 siblings, 1 reply; 56+ messages in thread
From: Sasha Levin @ 2025-08-12 15:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, Lorenzo Stoakes, Dan Carpenter, Julia Lawall,
	H. Peter Anvin, ksummit

On Tue, Aug 12, 2025 at 08:24:11AM -0700, Paul E. McKenney wrote:
>On Tue, Aug 12, 2025 at 11:02:50AM -0400, Sasha Levin wrote:
>> On Tue, Aug 12, 2025 at 10:37:10AM -0400, Steven Rostedt wrote:
>> > One thing I look forward to with AI is that it will likely help me debug my
>> > own code. I suspect it would be really good at reading my code and say
>> > "Hey Stupid! You forgot to free this value over here". And things like that.
>> > In other words, I trust AI to debug my code more than I can debug its code.
>>
>> I'm surprised that this isn't too common of a practice at this point.
>> Sure, AI will make mistakes, but I think that:
>>
>> 1. It's much better at reviewing rather than writing.
>> 2. Mistakes aren't critical (you'd review it yourself anyway).
>> 3. No legal issues.
>
>Ahem.  Please do us all a favor, yourself included, and go count the
>number of lawsuits currently in flight on this topic.

For using AI for review, where it doesn't generate any code? I'd love a
reference please :)

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-12 15:25               ` Sasha Levin
@ 2025-08-12 15:28                 ` Paul E. McKenney
  0 siblings, 0 replies; 56+ messages in thread
From: Paul E. McKenney @ 2025-08-12 15:28 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Steven Rostedt, Lorenzo Stoakes, Dan Carpenter, Julia Lawall,
	H. Peter Anvin, ksummit

On Tue, Aug 12, 2025 at 11:25:49AM -0400, Sasha Levin wrote:
> On Tue, Aug 12, 2025 at 08:24:11AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 12, 2025 at 11:02:50AM -0400, Sasha Levin wrote:
> > > On Tue, Aug 12, 2025 at 10:37:10AM -0400, Steven Rostedt wrote:
> > > > One thing I look forward to with AI is that it will likely help me debug my
> > > > own code. I suspect it would be really good at reading my code and say
> > > > "Hey Stupid! You forgot to free this value over here". And things like that.
> > > > In other words, I trust AI to debug my code more than I can debug its code.
> > > 
> > > I'm surprised that this isn't too common of a practice at this point.
> > > Sure, AI will make mistakes, but I think that:
> > > 
> > > 1. It's much better at reviewing rather than writing.
> > > 2. Mistakes aren't critical (you'd review it yourself anyway).
> > > 3. No legal issues.
> > 
> > Ahem.  Please do us all a favor, yourself included, and go count the
> > number of lawsuits currently in flight on this topic.
> 
> For using AI for review, where it doesn't generate any code? I'd love a
> reference please :)

Code is not the only thing that is subject to copyright.  But please don't
take my word for it, go ask Napster.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-08-05 16:03 [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process Lorenzo Stoakes
  2025-08-05 16:43 ` James Bottomley
  2025-08-05 18:10 ` H. Peter Anvin
@ 2025-12-08  1:12 ` Sasha Levin
  2025-12-08  1:25   ` H. Peter Anvin
                     ` (2 more replies)
  2 siblings, 3 replies; 56+ messages in thread
From: Sasha Levin @ 2025-12-08  1:12 UTC (permalink / raw)
  To: ksummit

This (and parallel) threads have generated substantial discussion across
several related topics.  In preperation for the Maintainer's summit, here's a
summary of where we appear to have consensus, where we don't, and some
questions to consider before the summit.

Where We Have Consensus:

1. Human accountability is non-negotiable:

The From: line must always be a human who takes full responsibility for the
patch. No "but the AI wrote that part" excuses. This maps cleanly to our
existing DCO requirements and approach to tooling.

2. Some form of disclosure is needed:

Whether it's a trailer tag, a note below the cut line, or something else,
there's broad agreement that AI involvement should be disclosed. The exact
mechanism is debatable, but the principle is not.

3. Maintainer autonomy matters:

Individual subsystem maintainers should be empowered to set their own policies.
An opt-in approach per-subsystem seems preferred over a kernel-wide mandate
that doesn't account for different subsystem needs.

4. This isn't going away:

Industry is already using AI extensively. We're already receiving AI-generated
bug reports. Ignoring this won't make it disappear; better to have a thoughtful
policy than no policy.

5. Language assistance for non-native speakers is legitimate

Using AI to improve documentation and commit messages should not be stigmatized
or treated the same as AI-generated code.

Where We Don't Have Consensus:

1. The nature of AI errors:

Some argue AI makes fundamentally different errors than humans - subtle
mistakes that slip past review because we're trained to spot human-pattern
errors. Others argue AI errors are obvious when the model is under-trained, and
that better training can address most issues. This affects how much scrutiny
AI-assisted patches need.

2. Same bar or higher bar?

The kernel already has a significant bug rate - roughly 20% of commits in a
release cycle are fixes. Should we hold AI to the same standard we hold humans,
or does the kernel's criticality demand a higher bar for AI? There's genuine
disagreement here.

3. Legal risk tolerance:

DCO clause (a) requires certifying "I have the right to submit it under the
open source license." With AI training data provenance unclear and litigation
ongoing, how cautious should we be? Some advocate waiting for legal clarity;
others argue the legal concerns are overblown and we should focus on practical
guardrails.

4. The asymmetric effort problem:

AI can generate patches in seconds; review takes hours. Unlike human
contributors who learn from feedback and improve, AI models will repeat the
same mistakes. How do we prevent maintainer overload? There's no clear answer
yet.

Questions for the Summit:

1. Policy scope: Should we establish a kernel-wide minimum policy, or
    simply document that subsystem maintainers set their own rules?

2. Disclosure format: What should disclosure look like? Options discussed
    include:

    - Trailer tag (e.g., `Assisted-by:`, `Generated-by:`)
    - Below-the-cut note
    - Verbose commit log explanation
    - Technology-agnostic "tooling" terminology vs. AI-specific

3. Generation vs. review: AI for code review and debugging seems less
    controversial than AI for code generation. Should we treat these
    differently in policy?

4. What requires disclosure?: Where's the line? Clearly, wholesale
    AI-generated patches need disclosure. What about:

    - AI-suggested fixes that a human then implements?
    - Using AI to understand an API before writing code?
    - AI assistance with commit message wording?

5. Legal stance: Should we take a position on AI-generated code and DCO
    compliance, or leave that to individual contributors to assess?

6. Enforcement reality: We can't even get everyone to run checkpatch.
    Whatever policy we adopt, how do we think about enforcement?

Looking forward to the discussion.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  1:12 ` Sasha Levin
@ 2025-12-08  1:25   ` H. Peter Anvin
  2025-12-08  1:59     ` Jonathan Corbet
  2025-12-08  4:15   ` Laurent Pinchart
  2025-12-08  7:00   ` Jiri Kosina
  2 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2025-12-08  1:25 UTC (permalink / raw)
  To: Sasha Levin, ksummit

On December 7, 2025 5:12:51 PM PST, Sasha Levin <sashal@kernel.org> wrote:
>This (and parallel) threads have generated substantial discussion across
>several related topics.  In preperation for the Maintainer's summit, here's a
>summary of where we appear to have consensus, where we don't, and some
>questions to consider before the summit.
>
>Where We Have Consensus:
>
>1. Human accountability is non-negotiable:
>
>The From: line must always be a human who takes full responsibility for the
>patch. No "but the AI wrote that part" excuses. This maps cleanly to our
>existing DCO requirements and approach to tooling.
>
>2. Some form of disclosure is needed:
>
>Whether it's a trailer tag, a note below the cut line, or something else,
>there's broad agreement that AI involvement should be disclosed. The exact
>mechanism is debatable, but the principle is not.
>
>3. Maintainer autonomy matters:
>
>Individual subsystem maintainers should be empowered to set their own policies.
>An opt-in approach per-subsystem seems preferred over a kernel-wide mandate
>that doesn't account for different subsystem needs.
>
>4. This isn't going away:
>
>Industry is already using AI extensively. We're already receiving AI-generated
>bug reports. Ignoring this won't make it disappear; better to have a thoughtful
>policy than no policy.
>
>5. Language assistance for non-native speakers is legitimate
>
>Using AI to improve documentation and commit messages should not be stigmatized
>or treated the same as AI-generated code.
>
>
>Where We Don't Have Consensus:
>
>1. The nature of AI errors:
>
>Some argue AI makes fundamentally different errors than humans - subtle
>mistakes that slip past review because we're trained to spot human-pattern
>errors. Others argue AI errors are obvious when the model is under-trained, and
>that better training can address most issues. This affects how much scrutiny
>AI-assisted patches need.
>
>2. Same bar or higher bar?
>
>The kernel already has a significant bug rate - roughly 20% of commits in a
>release cycle are fixes. Should we hold AI to the same standard we hold humans,
>or does the kernel's criticality demand a higher bar for AI? There's genuine
>disagreement here.
>
>3. Legal risk tolerance:
>
>DCO clause (a) requires certifying "I have the right to submit it under the
>open source license." With AI training data provenance unclear and litigation
>ongoing, how cautious should we be? Some advocate waiting for legal clarity;
>others argue the legal concerns are overblown and we should focus on practical
>guardrails.
>
>4. The asymmetric effort problem:
>
>AI can generate patches in seconds; review takes hours. Unlike human
>contributors who learn from feedback and improve, AI models will repeat the
>same mistakes. How do we prevent maintainer overload? There's no clear answer
>yet.
>
>
>Questions for the Summit:
>
>1. Policy scope: Should we establish a kernel-wide minimum policy, or
>   simply document that subsystem maintainers set their own rules?
>
>2. Disclosure format: What should disclosure look like? Options discussed
>   include:
>
>   - Trailer tag (e.g., `Assisted-by:`, `Generated-by:`)
>   - Below-the-cut note
>   - Verbose commit log explanation
>   - Technology-agnostic "tooling" terminology vs. AI-specific
>
>3. Generation vs. review: AI for code review and debugging seems less
>   controversial than AI for code generation. Should we treat these
>   differently in policy?
>
>4. What requires disclosure?: Where's the line? Clearly, wholesale
>   AI-generated patches need disclosure. What about:
>
>   - AI-suggested fixes that a human then implements?
>   - Using AI to understand an API before writing code?
>   - AI assistance with commit message wording?
>
>5. Legal stance: Should we take a position on AI-generated code and DCO
>   compliance, or leave that to individual contributors to assess?
>
>6. Enforcement reality: We can't even get everyone to run checkpatch.
>   Whatever policy we adopt, how do we think about enforcement?
>
>Looking forward to the discussion.
>

I contend there is a huge difference between *code* and descriptions/documentation/...

Although git is an enormous help, it is still far harder to unwind code than the auxiliary stuff. 

AI bug reports? We have gotten robot bug reports for decades. Translation tool? Not a problem. Using AI to get ideas? Not really different than reading out of a textbook 

However, *code* taken out of an AI seems to me that it would have to be presumed plagiarized. Furthermore, it is very often wrong in both subtle and blatant ways.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  1:25   ` H. Peter Anvin
@ 2025-12-08  1:59     ` Jonathan Corbet
  2025-12-08  3:15       ` Steven Rostedt
  0 siblings, 1 reply; 56+ messages in thread
From: Jonathan Corbet @ 2025-12-08  1:59 UTC (permalink / raw)
  To: H. Peter Anvin, Sasha Levin, ksummit

"H. Peter Anvin" <hpa@zytor.com> writes:

> I contend there is a huge difference between *code* and
> descriptions/documentation/...
>
> Although git is an enormous help, it is still far harder to unwind
> code than the auxiliary stuff.
>
> AI bug reports? We have gotten robot bug reports for
> decades. Translation tool? Not a problem. Using AI to get ideas? Not
> really different than reading out of a textbook
>
> However, *code* taken out of an AI seems to me that it would have to
> be presumed plagiarized. Furthermore, it is very often wrong in both
> subtle and blatant ways.

As you might imagine, I'm not fully on board with that.  Code is assumed
plagiarized, but text is not?  Subtly wrong documentation is OK?

I think our documentation requires just as much care as our code does.

jon

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  1:59     ` Jonathan Corbet
@ 2025-12-08  3:15       ` Steven Rostedt
  2025-12-08  3:42         ` James Bottomley
  0 siblings, 1 reply; 56+ messages in thread
From: Steven Rostedt @ 2025-12-08  3:15 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: H. Peter Anvin, Sasha Levin, ksummit

On Sun, 07 Dec 2025 18:59:19 -0700
Jonathan Corbet <corbet@lwn.net> wrote:

> > I contend there is a huge difference between *code* and
> > descriptions/documentation/...

> 
> As you might imagine, I'm not fully on board with that.  Code is assumed
> plagiarized, but text is not?  Subtly wrong documentation is OK?
> 
> I think our documentation requires just as much care as our code does.

I assumed what hpa was mentioning about documentation, may be either
translation of original text of the submitter, or AI looking at the
code that was created and created a change log. In either case, the
text was generated from the input of the author. Where as AI generated
code likely comes from somebody else's code. Perhaps AI was trained on
somebody else's text, but the output will likely not be a derivative of
it as the input is still original.

In that case, I agree with hpa on this.

-- Steve

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  3:15       ` Steven Rostedt
@ 2025-12-08  3:42         ` James Bottomley
  2025-12-08  8:41           ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 56+ messages in thread
From: James Bottomley @ 2025-12-08  3:42 UTC (permalink / raw)
  To: Steven Rostedt, Jonathan Corbet; +Cc: H. Peter Anvin, Sasha Levin, ksummit

On Sun, 2025-12-07 at 22:15 -0500, Steven Rostedt wrote:
> On Sun, 07 Dec 2025 18:59:19 -0700
> Jonathan Corbet <corbet@lwn.net> wrote:
> 
> > > I contend there is a huge difference between *code* and
> > > descriptions/documentation/...
> 
> > 
> > As you might imagine, I'm not fully on board with that.  Code is
> > assumed plagiarized, but text is not?  Subtly wrong documentation
> > is OK?
> > 
> > I think our documentation requires just as much care as our code
> > does.
> 
> I assumed what hpa was mentioning about documentation, may be either
> translation of original text of the submitter, or AI looking at the
> code that was created and created a change log. In either case, the
> text was generated from the input of the author

I think this is precisely the problem Jon was referring to: you're
saying that if AI generates *text* based on input prompts it's not a
copyright problem, but if AI generates *code* based on input prompts,
it is.  As simply a neural net operational issue *both* input to output
sets are generated in the same way by the AI process and would have the
same legal probability of being copyright problems.  i.e. if the first
likely isn't a copyright problem, the second likely isn't as well (and
vice versa).

> . Where as AI generated code likely comes from somebody else's code.
> Perhaps AI was trained on somebody else's text, but the output will
> likely not be a derivative of it as the input is still original.

That's an incorrect statement: if the output is a derivative of the
training (which is a big if given the current state of the legal
landscape) and the training set was copyrighted, then even a translated
text using that training data will pick up the copyright violation
regardless of input prompting.

Regards,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  1:12 ` Sasha Levin
  2025-12-08  1:25   ` H. Peter Anvin
@ 2025-12-08  4:15   ` Laurent Pinchart
  2025-12-08  4:31     ` Jonathan Corbet
  2025-12-08  7:00   ` Jiri Kosina
  2 siblings, 1 reply; 56+ messages in thread
From: Laurent Pinchart @ 2025-12-08  4:15 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit

Hi Sasha,

Thank you for summarizing the long discussions. I won't ask if this
summary has been written by an LLM :-)

On Sun, Dec 07, 2025 at 08:12:51PM -0500, Sasha Levin wrote:
> This (and parallel) threads have generated substantial discussion across
> several related topics.  In preperation for the Maintainer's summit, here's a
> summary of where we appear to have consensus, where we don't, and some
> questions to consider before the summit.
> 
> Where We Have Consensus:
> 
> 1. Human accountability is non-negotiable:
> 
> The From: line must always be a human who takes full responsibility for the
> patch. No "but the AI wrote that part" excuses. This maps cleanly to our
> existing DCO requirements and approach to tooling.
> 
> 2. Some form of disclosure is needed:
> 
> Whether it's a trailer tag, a note below the cut line, or something else,
> there's broad agreement that AI involvement should be disclosed. The exact
> mechanism is debatable, but the principle is not.
> 
> 3. Maintainer autonomy matters:
> 
> Individual subsystem maintainers should be empowered to set their own policies.
> An opt-in approach per-subsystem seems preferred over a kernel-wide mandate
> that doesn't account for different subsystem needs.
> 
> 4. This isn't going away:
> 
> Industry is already using AI extensively. We're already receiving AI-generated
> bug reports. Ignoring this won't make it disappear; better to have a thoughtful
> policy than no policy.
> 
> 5. Language assistance for non-native speakers is legitimate
> 
> Using AI to improve documentation and commit messages should not be stigmatized
> or treated the same as AI-generated code.
> 
> 
> Where We Don't Have Consensus:
> 
> 1. The nature of AI errors:
> 
> Some argue AI makes fundamentally different errors than humans - subtle
> mistakes that slip past review because we're trained to spot human-pattern
> errors. Others argue AI errors are obvious when the model is under-trained, and
> that better training can address most issues. This affects how much scrutiny
> AI-assisted patches need.
> 
> 2. Same bar or higher bar?
> 
> The kernel already has a significant bug rate - roughly 20% of commits in a
> release cycle are fixes. Should we hold AI to the same standard we hold humans,
> or does the kernel's criticality demand a higher bar for AI? There's genuine
> disagreement here.
> 
> 3. Legal risk tolerance:
> 
> DCO clause (a) requires certifying "I have the right to submit it under the
> open source license." With AI training data provenance unclear and litigation
> ongoing, how cautious should we be? Some advocate waiting for legal clarity;
> others argue the legal concerns are overblown and we should focus on practical
> guardrails.
> 
> 4. The asymmetric effort problem:
> 
> AI can generate patches in seconds; review takes hours. Unlike human
> contributors who learn from feedback and improve, AI models will repeat the
> same mistakes. How do we prevent maintainer overload? There's no clear answer
> yet.
> 
> 
> Questions for the Summit:
> 
> 1. Policy scope: Should we establish a kernel-wide minimum policy, or
>     simply document that subsystem maintainers set their own rules?
> 
> 2. Disclosure format: What should disclosure look like? Options discussed
>     include:
> 
>     - Trailer tag (e.g., `Assisted-by:`, `Generated-by:`)
>     - Below-the-cut note
>     - Verbose commit log explanation
>     - Technology-agnostic "tooling" terminology vs. AI-specific
> 
> 3. Generation vs. review: AI for code review and debugging seems less
>     controversial than AI for code generation. Should we treat these
>     differently in policy?
> 
> 4. What requires disclosure?: Where's the line? Clearly, wholesale
>     AI-generated patches need disclosure. What about:
> 
>     - AI-suggested fixes that a human then implements?
>     - Using AI to understand an API before writing code?
>     - AI assistance with commit message wording?
> 
> 5. Legal stance: Should we take a position on AI-generated code and DCO
>     compliance, or leave that to individual contributors to assess?
> 
> 6. Enforcement reality: We can't even get everyone to run checkpatch.
>     Whatever policy we adopt, how do we think about enforcement?

This is a pretty good summary. It's missing one point in my opinion,
partly related to the legal stance: the ethical stance.

The Linux kernel is governed by the GPL. There are contributors who care
about the copyleft aspect of the license. Even if the legal issues get
cleared in the future, not everybody will agree that usage of GPL code
as input to create proprietary LLMs is ethical: it may not breach the
letter of the license while breaching the spirit. I would like to see
this question being discussed.

> Looking forward to the discussion.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  4:15   ` Laurent Pinchart
@ 2025-12-08  4:31     ` Jonathan Corbet
  2025-12-08  4:36       ` Laurent Pinchart
  0 siblings, 1 reply; 56+ messages in thread
From: Jonathan Corbet @ 2025-12-08  4:31 UTC (permalink / raw)
  To: Laurent Pinchart, Sasha Levin; +Cc: ksummit

Laurent Pinchart <laurent.pinchart@ideasonboard.com> writes:

> This is a pretty good summary. It's missing one point in my opinion,
> partly related to the legal stance: the ethical stance.
>
> The Linux kernel is governed by the GPL. There are contributors who care
> about the copyleft aspect of the license. Even if the legal issues get
> cleared in the future, not everybody will agree that usage of GPL code
> as input to create proprietary LLMs is ethical:

Somebody needs to say this: that is an important ethical question, but
it is far from the only one:

- The creation of these LLM systems imposes significant costs on anybody
  trying to run any sort of open resource on the Internet.  I have
  ranted at length about this... Konstantin has been rather more
  restrained, but the effects of this industry on kernel.org have been
  significant and expensive.  Using these systems makes us a part of the
  problem.

- Before generating code in this way, we should think about the
  jet-engine turbines now powering data centers.  We work on Linux to
  create a better world; wrecking its climate is not conducive to that
  goal.

- The BitKeeper experience showed us what can happen when we make our
  processes dependent on proprietary software.  We should not forget
  that lesson regardless of how shiny the new toys look.

I'll stop now :)

jon

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  4:31     ` Jonathan Corbet
@ 2025-12-08  4:36       ` Laurent Pinchart
  0 siblings, 0 replies; 56+ messages in thread
From: Laurent Pinchart @ 2025-12-08  4:36 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Sasha Levin, ksummit

On Sun, Dec 07, 2025 at 09:31:22PM -0700, Jonathan Corbet wrote:
> Laurent Pinchart writes:
> 
> > This is a pretty good summary. It's missing one point in my opinion,
> > partly related to the legal stance: the ethical stance.
> >
> > The Linux kernel is governed by the GPL. There are contributors who care
> > about the copyleft aspect of the license. Even if the legal issues get
> > cleared in the future, not everybody will agree that usage of GPL code
> > as input to create proprietary LLMs is ethical:
> 
> Somebody needs to say this: that is an important ethical question, but
> it is far from the only one:
> 
> - The creation of these LLM systems imposes significant costs on anybody
>   trying to run any sort of open resource on the Internet.  I have
>   ranted at length about this... Konstantin has been rather more
>   restrained, but the effects of this industry on kernel.org have been
>   significant and expensive.  Using these systems makes us a part of the
>   problem.
> 
> - Before generating code in this way, we should think about the
>   jet-engine turbines now powering data centers.  We work on Linux to
>   create a better world; wrecking its climate is not conducive to that
>   goal.
> 
> - The BitKeeper experience showed us what can happen when we make our
>   processes dependent on proprietary software.  We should not forget
>   that lesson regardless of how shiny the new toys look.
> 
> I'll stop now :)

I hope you won't, someone needs to keep being vocal about those crucial
points :-) I often feel pretty lonely when I raise those concerns, it's
important to show people they're not alone.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  1:12 ` Sasha Levin
  2025-12-08  1:25   ` H. Peter Anvin
  2025-12-08  4:15   ` Laurent Pinchart
@ 2025-12-08  7:00   ` Jiri Kosina
  2025-12-08  7:38     ` James Bottomley
  2 siblings, 1 reply; 56+ messages in thread
From: Jiri Kosina @ 2025-12-08  7:00 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit

On Sun, 7 Dec 2025, Sasha Levin wrote:

> 3. Legal risk tolerance:
> 
> DCO clause (a) requires certifying "I have the right to submit it under the
> open source license." With AI training data provenance unclear and litigation
> ongoing, how cautious should we be? Some advocate waiting for legal clarity;
> others argue the legal concerns are overblown and we should focus on practical
> guardrails.

In either case, isn't all the legal responsibility with the submitter, who 
submitted under the terms of DCO, and thus this particular aspect is not 
really a concern for us (maintainers) on the receiving side as per the (c) 
DCO clause?

Thanks,

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  7:00   ` Jiri Kosina
@ 2025-12-08  7:38     ` James Bottomley
  0 siblings, 0 replies; 56+ messages in thread
From: James Bottomley @ 2025-12-08  7:38 UTC (permalink / raw)
  To: Jiri Kosina, Sasha Levin; +Cc: ksummit

On Mon, 2025-12-08 at 08:00 +0100, Jiri Kosina wrote:
> On Sun, 7 Dec 2025, Sasha Levin wrote:
> 
> > 3. Legal risk tolerance:
> > 
> > DCO clause (a) requires certifying "I have the right to submit it
> > under the open source license." With AI training data provenance
> > unclear and litigation ongoing, how cautious should we be? Some
> > advocate waiting for legal clarity; others argue the legal concerns
> > are overblown and we should focus on practical guardrails.
> 
> In either case, isn't all the legal responsibility with the
> submitter, who submitted under the terms of DCO, and thus this
> particular aspect is not really a concern for us (maintainers) on the
> receiving side as per the (c) DCO clause?

In terms of liability for maintainers, yes, you're right.  However, in
terms of legal risk for the kernel itself, no, because the problem is
the situation where we've accepted loads of AI code that then has to be
reverted if a court decides AI code is a derivative work of its model.
I think this latter is where we need a collective agreement to risk
tolerance.

However, you can put me down as thinking the risk to the kernel of an
adverse court decision on AI is acceptably small.

Regards,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  3:42         ` James Bottomley
@ 2025-12-08  8:41           ` Mauro Carvalho Chehab
  2025-12-08  9:16             ` James Bottomley
  0 siblings, 1 reply; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-12-08  8:41 UTC (permalink / raw)
  To: James Bottomley
  Cc: Steven Rostedt, Jonathan Corbet, H. Peter Anvin, Sasha Levin, ksummit

Em Mon, 08 Dec 2025 12:42:32 +0900
James Bottomley <James.Bottomley@HansenPartnership.com> escreveu:

> On Sun, 2025-12-07 at 22:15 -0500, Steven Rostedt wrote:
> > On Sun, 07 Dec 2025 18:59:19 -0700
> > Jonathan Corbet <corbet@lwn.net> wrote:
> >   
> > > > I contend there is a huge difference between *code* and
> > > > descriptions/documentation/...  
> >   
> > > 
> > > As you might imagine, I'm not fully on board with that.  Code is
> > > assumed plagiarized, but text is not?  Subtly wrong documentation
> > > is OK?
> > > 
> > > I think our documentation requires just as much care as our code
> > > does.  
> > 
> > I assumed what hpa was mentioning about documentation, may be either
> > translation of original text of the submitter, or AI looking at the
> > code that was created and created a change log. In either case, the
> > text was generated from the input of the author  
> 
> I think this is precisely the problem Jon was referring to: you're
> saying that if AI generates *text* based on input prompts it's not a
> copyright problem, but if AI generates *code* based on input prompts,
> it is.  As simply a neural net operational issue *both* input to output
> sets are generated in the same way by the AI process and would have the
> same legal probability of being copyright problems.  i.e. if the first
> likely isn't a copyright problem, the second likely isn't as well (and
> vice versa).

I'd say that there are different things placed in the same box. Those
two, for example sound OK on my eyes:

- translations - either for documentation of for the code.
  The original copyrights maintain on any translations. This is already
  proofed in courts: if one translates Isaac Asimov's "Foundation" to 
  Greek, his copyright remains at the translation. Ok, if the translation
  is done by a human, he can claim additional copyrights for the
  translation, but a machine doesn't have legal rights to claim for
  copyrights. Plus, the translation is a derivative work of the original
  text, so, I can't see how this could ever be a problem, if the
  copyrights of the original author is placed at the translation;

- code filling - if a prompt requests to automate a repetitive task,
  like creating a skeleton code, adding includes, review coding style
  and other brute force "brainless" activities, the generated code won't
  be different than what other similar tools of what the developer would
  do - AI is simply a tool to speedup it, just like any other similar
  tools. No copyright issues.

Things could be in gray area if one uses AI to write a patch from the
scratch. Still, if the training data is big enough, the weights at the
neuron network will be calibrated to repeat the most common patterns, 
so the code would probably be similar to what most developers would do.

On some experiments I did myself, that's what it happened: the generated
code wasn't much different than what a junior student with C knowledge would
write, with about the same mistakes. The only thing is that, instead of
taking weeks, the code materialized in seconds. To be something that
a maintainer would pick, a senior developer would be required to cleanup
the mess.

> > . Where as AI generated code likely comes from somebody else's code.
> > Perhaps AI was trained on somebody else's text, but the output will
> > likely not be a derivative of it as the input is still original.  
> 
> That's an incorrect statement: if the output is a derivative of the
> training (which is a big if given the current state of the legal
> landscape) and the training set was copyrighted, then even a translated
> text using that training data will pick up the copyright violation
> regardless of input prompting.

If one trains it only with internal code from an specific original 
product that won't have any common patterns which anyone else would
do, then this could be the case.

However, this is usually not the case: models are trained with big
data from lots of different developers and projects. As Neural networks
training is based on settings up weights based on inputs/outputs, if the
training data is big enough, such weights will tend to follow the most
repetitive patterns from similar code/text. 

On other words, AI training will generate a model that tends to repeat
sequences with the most common patterns from its training data. This
is not different than what a programming student would do without
using AI when facing a programming issue: he would likely search
for it on a browser. The search engine algorithms from search
providers are already showing results with the more likely answers
for such question on the top.

The AI generated code won't be much different than that, except that,
instead of taking just the first search result, it would use
a mix of the top search results for the same prompt to produce its
result.

In any case (googling or using AI), the tool-produced code examples
aren't ready for submission. It can be just the beginning of some code 
that will require usually lots of work to be something that could be 
ready for submission - or even - it can be an example of what one should
not do. In the latter case, the developer would need to google again or
to change the prompt, until it gets something that might be applicable
to the real use case.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  8:41           ` Mauro Carvalho Chehab
@ 2025-12-08  9:16             ` James Bottomley
  2025-12-08 10:22               ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 56+ messages in thread
From: James Bottomley @ 2025-12-08  9:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Steven Rostedt, Jonathan Corbet, H. Peter Anvin, Sasha Levin, ksummit

On Mon, 2025-12-08 at 09:41 +0100, Mauro Carvalho Chehab wrote:
> Em Mon, 08 Dec 2025 12:42:32 +0900
> James Bottomley <James.Bottomley@HansenPartnership.com> escreveu:
> 
> > On Sun, 2025-12-07 at 22:15 -0500, Steven Rostedt wrote:
> > > On Sun, 07 Dec 2025 18:59:19 -0700
> > > Jonathan Corbet <corbet@lwn.net> wrote:
> > >   
> > > > > I contend there is a huge difference between *code* and
> > > > > descriptions/documentation/...  
> > >   
> > > > 
> > > > As you might imagine, I'm not fully on board with that.  Code
> > > > is assumed plagiarized, but text is not?  Subtly wrong
> > > > documentation is OK?
> > > > 
> > > > I think our documentation requires just as much care as our
> > > > code does.  
> > > 
> > > I assumed what hpa was mentioning about documentation, may be
> > > either translation of original text of the submitter, or AI
> > > looking at the code that was created and created a change log. In
> > > either case, the text was generated from the input of the author 
> > 
> > I think this is precisely the problem Jon was referring to: you're
> > saying that if AI generates *text* based on input prompts it's not
> > a copyright problem, but if AI generates *code* based on input
> > prompts, it is.  As simply a neural net operational issue *both*
> > input to output sets are generated in the same way by the AI
> > process and would have the same legal probability of being
> > copyright problems.  i.e. if the first likely isn't a copyright
> > problem, the second likely isn't as well (and vice versa).
> 
> I'd say that there are different things placed in the same box. Those
> two, for example sound OK on my eyes:
> 
> - translations - either for documentation of for the code.
>   The original copyrights maintain on any translations. This is
> already
>   proofed in courts: if one translates Isaac Asimov's "Foundation" to
>   Greek, his copyright remains at the translation. Ok, if the
> translation
>   is done by a human, he can claim additional copyrights for the
>   translation, but a machine doesn't have legal rights to claim for
>   copyrights. Plus, the translation is a derivative work of the
> original
>   text, so, I can't see how this could ever be a problem, if the
>   copyrights of the original author is placed at the translation;

I can explain simply how I as a translator could cause a copyright
problem with no AI involvement: let's say I translate Foundation from
English to French but while doing so I embed a load of quotes from the
novels of Annie Ernaux but in a way that it nicely matches the Azimov
original.  Now I've created a work which may be derivative of
Foundation and partly owned by me but which also has claims of
copyright abuse from Annie Ernaux.

The above is directly analogous to what would happen if the AI output
were decided to be a derivative of its training for an AI translator.

> - code filling - if a prompt requests to automate a repetitive task,
>   like creating a skeleton code, adding includes, review coding style
>   and other brute force "brainless" activities, the generated code
> won't
>   be different than what other similar tools of what the developer
> would
>   do - AI is simply a tool to speedup it, just like any other similar
>   tools. No copyright issues.
> 
> Things could be in gray area if one uses AI to write a patch from the
> scratch. Still, if the training data is big enough, the weights at
> the neuron network will be calibrated to repeat the most common
> patterns, so the code would probably be similar to what most
> developers would do.
>
> 
> On some experiments I did myself, that's what it happened: the
> generated code wasn't much different than what a junior student with
> C knowledge would write, with about the same mistakes. The only thing
> is that, instead of taking weeks, the code materialized in seconds.
> To be something that a maintainer would pick, a senior developer
> would be required to cleanup the mess.

How good (or not) AI is at coding is different from the question of
whether the output has its copyright contaminated by the training data.
> 
> > > . Where as AI generated code likely comes from somebody else's
> > > code. Perhaps AI was trained on somebody else's text, but the
> > > output will likely not be a derivative of it as the input is
> > > still original. 
> > 
> > That's an incorrect statement: if the output is a derivative of the
> > training (which is a big if given the current state of the legal
> > landscape) and the training set was copyrighted, then even a
> > translated text using that training data will pick up the copyright
> > violation regardless of input prompting.
> 
> If one trains it only with internal code from an specific original 
> product that won't have any common patterns which anyone else would
> do, then this could be the case.
> 
> However, this is usually not the case: models are trained with big
> data from lots of different developers and projects. As Neural
> networks training is based on settings up weights based on
> inputs/outputs, if the training data is big enough, such weights will
> tend to follow the most repetitive patterns from similar code/text. 
> 
> On other words, AI training will generate a model that tends to
> repeat sequences with the most common patterns from its training
> data. This is not different than what a programming student would do
> without using AI when facing a programming issue: he would likely
> search for it on a browser. The search engine algorithms from search
> providers are already showing results with the more likely answers
> for such question on the top.

Patterns are not expression in the copyright sense.  Indeed, code tends
to be much more amenable to the independent invention defence than
literature: If I give the same programming task to a set of engineers
with the same CS training, most of them would come up with pretty
identical programs even if they don't collaborate.  However, as long as
they didn't copy from each other the programs they come up with are
separate works even if they're very similar in expression.

Just because code is more likely to be independently invented than
literature doesn't make it more prone to copyright violations (although
it does give more scope to the litigious to claim this).

Regards,

James

> The AI generated code won't be much different than that, except that,
> instead of taking just the first search result, it would use
> a mix of the top search results for the same prompt to produce its
> result.
> 
> In any case (googling or using AI), the tool-produced code examples
> aren't ready for submission. It can be just the beginning of some
> code that will require usually lots of work to be something that
> could be ready for submission - or even - it can be an example of
> what one should not do. In the latter case, the developer would need
> to google again or to change the prompt, until it gets something that
> might be applicable to the real use case.
> 
> Thanks,
> Mauro
> 


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process
  2025-12-08  9:16             ` James Bottomley
@ 2025-12-08 10:22               ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-12-08 10:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Mauro Carvalho Chehab, Steven Rostedt, Jonathan Corbet,
	H. Peter Anvin, Sasha Levin, ksummit

On Mon, Dec 08, 2025 at 06:16:52PM +0900, James Bottomley wrote:
> On Mon, 2025-12-08 at 09:41 +0100, Mauro Carvalho Chehab wrote:
> > Em Mon, 08 Dec 2025 12:42:32 +0900
> > James Bottomley <James.Bottomley@HansenPartnership.com> escreveu:
> > 
> > > On Sun, 2025-12-07 at 22:15 -0500, Steven Rostedt wrote:
> > > > On Sun, 07 Dec 2025 18:59:19 -0700
> > > > Jonathan Corbet <corbet@lwn.net> wrote:
> > > >   
> > > > > > I contend there is a huge difference between *code* and
> > > > > > descriptions/documentation/...  
> > > >   
> > > > > 
> > > > > As you might imagine, I'm not fully on board with that.  Code
> > > > > is assumed plagiarized, but text is not?  Subtly wrong
> > > > > documentation is OK?
> > > > > 
> > > > > I think our documentation requires just as much care as our
> > > > > code does.  
> > > > 
> > > > I assumed what hpa was mentioning about documentation, may be
> > > > either translation of original text of the submitter, or AI
> > > > looking at the code that was created and created a change log. In
> > > > either case, the text was generated from the input of the author 
> > > 
> > > I think this is precisely the problem Jon was referring to: you're
> > > saying that if AI generates *text* based on input prompts it's not
> > > a copyright problem, but if AI generates *code* based on input
> > > prompts, it is.  As simply a neural net operational issue *both*
> > > input to output sets are generated in the same way by the AI
> > > process and would have the same legal probability of being
> > > copyright problems.  i.e. if the first likely isn't a copyright
> > > problem, the second likely isn't as well (and vice versa).
> > 
> > I'd say that there are different things placed in the same box. Those
> > two, for example sound OK on my eyes:
> > 
> > - translations - either for documentation of for the code.
> >   The original copyrights maintain on any translations. This is
> > already
> >   proofed in courts: if one translates Isaac Asimov's "Foundation" to
> >   Greek, his copyright remains at the translation. Ok, if the
> > translation
> >   is done by a human, he can claim additional copyrights for the
> >   translation, but a machine doesn't have legal rights to claim for
> >   copyrights. Plus, the translation is a derivative work of the
> > original
> >   text, so, I can't see how this could ever be a problem, if the
> >   copyrights of the original author is placed at the translation;
> 
> I can explain simply how I as a translator could cause a copyright
> problem with no AI involvement: let's say I translate Foundation from
> English to French but while doing so I embed a load of quotes from the
> novels of Annie Ernaux but in a way that it nicely matches the Azimov
> original.  Now I've created a work which may be derivative of
> Foundation and partly owned by me but which also has claims of
> copyright abuse from Annie Ernaux.

A usage like that would likely be fair use/fair dealing.

> The above is directly analogous to what would happen if the AI output
> were decided to be a derivative of its training for an AI translator.

As AI would pick most likely translations, the risk of picking quotes
would be less likely.

Worse case scenario on something digitally published: one can change
the translation to a different translated text/code if a valid copyright
claim would apply.

> 
> > - code filling - if a prompt requests to automate a repetitive task,
> >   like creating a skeleton code, adding includes, review coding style
> >   and other brute force "brainless" activities, the generated code
> > won't
> >   be different than what other similar tools of what the developer
> > would
> >   do - AI is simply a tool to speedup it, just like any other similar
> >   tools. No copyright issues.
> > 
> > Things could be in gray area if one uses AI to write a patch from the
> > scratch. Still, if the training data is big enough, the weights at
> > the neuron network will be calibrated to repeat the most common
> > patterns, so the code would probably be similar to what most
> > developers would do.
> >
> > 
> > On some experiments I did myself, that's what it happened: the
> > generated code wasn't much different than what a junior student with
> > C knowledge would write, with about the same mistakes. The only thing
> > is that, instead of taking weeks, the code materialized in seconds.
> > To be something that a maintainer would pick, a senior developer
> > would be required to cleanup the mess.
> 
> How good (or not) AI is at coding is different from the question of
> whether the output has its copyright contaminated by the training data.

True.

> > 
> > > > . Where as AI generated code likely comes from somebody else's
> > > > code. Perhaps AI was trained on somebody else's text, but the
> > > > output will likely not be a derivative of it as the input is
> > > > still original. 
> > > 
> > > That's an incorrect statement: if the output is a derivative of the
> > > training (which is a big if given the current state of the legal
> > > landscape) and the training set was copyrighted, then even a
> > > translated text using that training data will pick up the copyright
> > > violation regardless of input prompting.
> > 
> > If one trains it only with internal code from an specific original 
> > product that won't have any common patterns which anyone else would
> > do, then this could be the case.
> > 
> > However, this is usually not the case: models are trained with big
> > data from lots of different developers and projects. As Neural
> > networks training is based on settings up weights based on
> > inputs/outputs, if the training data is big enough, such weights will
> > tend to follow the most repetitive patterns from similar code/text. 
> > 
> > On other words, AI training will generate a model that tends to
> > repeat sequences with the most common patterns from its training
> > data. This is not different than what a programming student would do
> > without using AI when facing a programming issue: he would likely
> > search for it on a browser. The search engine algorithms from search
> > providers are already showing results with the more likely answers
> > for such question on the top.
> 
> Patterns are not expression in the copyright sense.  Indeed, code tends
> to be much more amenable to the independent invention defence than
> literature: If I give the same programming task to a set of engineers
> with the same CS training, most of them would come up with pretty
> identical programs even if they don't collaborate.

True. Also, such common patterns that are repeated everywhere are
very likely fair use, if they originally came from copyrighted material.

> However, as long as
> they didn't copy from each other the programs they come up with are
> separate works even if they're very similar in expression.

Those are indeed separate works. A code written by some developer,
either using as basis his CS training, an AI-generated code, a text
book code or a searched code from the Internet as an example can become
copyrighted by the developer who wrote it.

For me, AI, when used as an ancillary tool, is not any different than
what developers have been doing.

Now, using AI as replacement for humans is a hole different thing:
I don't think we are on that stage yet. I'm also not convinced that
this would happen anytime soon.

On some tests I did, even the most complex engines are not currently
capable of generating proper code: it usually requires lots of
interactions to refine prompts and new prompts to modify the produced
results to something more palatable. The output was almost always
a code skeleton that requires manual work.

On such workflow, the prompts can be considered as copyright material.
As such, the transformation into code also carries copyrights from
the developer. As the output requires manual changes to reach
production level, such changes are also copyrighted by the developer.

Again, this is not different than doing a research at specialized
literature and/or the Internet: one needs to do the right research,
classify the results and modify the code examples to generate the
real code.

> Just because code is more likely to be independently invented than
> literature doesn't make it more prone to copyright violations (although
> it does give more scope to the litigious to claim this).

True, but this is not different than not using AI at all.

> 
> Regards,
> 
> James
> 
> > The AI generated code won't be much different than that, except that,
> > instead of taking just the first search result, it would use
> > a mix of the top search results for the same prompt to produce its
> > result.
> > 
> > In any case (googling or using AI), the tool-produced code examples
> > aren't ready for submission. It can be just the beginning of some
> > code that will require usually lots of work to be something that
> > could be ready for submission - or even - it can be an example of
> > what one should not do. In the latter case, the developer would need
> > to google again or to change the prompt, until it gets something that
> > might be applicable to the real use case.
> > 
> > Thanks,
> > Mauro
> > 
> 

-- 
Thanks,
Mauro

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2025-12-08 10:22 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-08-05 16:03 [MAINTAINERS SUMMIT] The role of AI and LLMs in the kernel process Lorenzo Stoakes
2025-08-05 16:43 ` James Bottomley
2025-08-05 17:11   ` Mark Brown
2025-08-05 17:23     ` James Bottomley
2025-08-05 17:43       ` Sasha Levin
2025-08-05 17:58         ` Lorenzo Stoakes
2025-08-05 18:16       ` Mark Brown
2025-08-05 18:01     ` Lorenzo Stoakes
2025-08-05 18:46       ` Mark Brown
2025-08-05 19:18         ` Lorenzo Stoakes
2025-08-05 17:17   ` Stephen Hemminger
2025-08-05 17:55   ` Lorenzo Stoakes
2025-08-05 18:23     ` Lorenzo Stoakes
2025-08-12 13:44       ` Steven Rostedt
2025-08-05 18:34     ` James Bottomley
2025-08-05 18:55       ` Lorenzo Stoakes
2025-08-12 13:50       ` Steven Rostedt
2025-08-05 18:39     ` Sasha Levin
2025-08-05 19:15       ` Lorenzo Stoakes
2025-08-05 20:02         ` James Bottomley
2025-08-05 20:48           ` Al Viro
2025-08-06 19:26           ` Lorenzo Stoakes
2025-08-07 12:25             ` Mark Brown
2025-08-07 13:00               ` Lorenzo Stoakes
2025-08-11 21:26                 ` Luis Chamberlain
2025-08-12 14:19                 ` Steven Rostedt
2025-08-06  4:04       ` Alexey Dobriyan
2025-08-06 20:36         ` Sasha Levin
2025-08-05 21:58   ` Jiri Kosina
2025-08-06  6:58     ` Hannes Reinecke
2025-08-06 19:36       ` Lorenzo Stoakes
2025-08-06 19:35     ` Lorenzo Stoakes
2025-08-05 18:10 ` H. Peter Anvin
2025-08-05 18:19   ` Lorenzo Stoakes
2025-08-06  5:49   ` Julia Lawall
2025-08-06  9:25     ` Dan Carpenter
2025-08-06  9:39       ` Julia Lawall
2025-08-06 19:30       ` Lorenzo Stoakes
2025-08-12 14:37         ` Steven Rostedt
2025-08-12 15:02           ` Sasha Levin
2025-08-12 15:24             ` Paul E. McKenney
2025-08-12 15:25               ` Sasha Levin
2025-08-12 15:28                 ` Paul E. McKenney
2025-12-08  1:12 ` Sasha Levin
2025-12-08  1:25   ` H. Peter Anvin
2025-12-08  1:59     ` Jonathan Corbet
2025-12-08  3:15       ` Steven Rostedt
2025-12-08  3:42         ` James Bottomley
2025-12-08  8:41           ` Mauro Carvalho Chehab
2025-12-08  9:16             ` James Bottomley
2025-12-08 10:22               ` Mauro Carvalho Chehab
2025-12-08  4:15   ` Laurent Pinchart
2025-12-08  4:31     ` Jonathan Corbet
2025-12-08  4:36       ` Laurent Pinchart
2025-12-08  7:00   ` Jiri Kosina
2025-12-08  7:38     ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox