Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content

workflows.vger.kernel.org archive mirror

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
       [not found] <20251105231514.3167738-1-dave.hansen@linux.intel.com>
@ 2025-11-10  7:43 ` Vlastimil Babka
  2025-11-10  8:58   ` Christian Brauner
       [not found] ` <11eaf7fa-27d0-4a57-abf0-5f24c918966c@lucifer.local>
  1 sibling, 1 reply; 24+ messages in thread
From: Vlastimil Babka @ 2025-11-10  7:43 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, workflows, ksummit
  Cc: Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

+Cc ksummit (where the discussions about this topic happened recently) and
workflows (probably the closest list we have for such things in general)
because nobody reads lkml today and this seems to have been going under the
radar until mentioned at lwn yesterday

On 11/6/25 00:15, Dave Hansen wrote:
> In the last few years, the capabilities of coding tools have exploded.
> As those capabilities have expanded, contributors and maintainers have
> more and more questions about how and when to apply those
> capabilities.
> 
> The shiny new AI tools (chatbots, coding assistants and more) are
> impressive.  Add new Documentation to guide contributors on how to
> best use kernel development tools, new and old.
> 
> Note, though, there are fundamentally no new or unique rules in this
> new document. It clarifies expectations that the kernel community has
> had for many years. For example, researchers are already asked to
> disclose the tools they use to find issues in
> Documentation/process/researcher-guidelines.rst. This new document
> just reiterates existing best practices for development tooling.
> 
> In short: Please show your work and make sure your contribution is
> easy to review.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Miguel Ojeda <ojeda@kernel.org>
> Cc: Shuah Khan <shuah@kernel.org>
> 
> --
> 
> This document was a collaborative effort from all the members of
> the TAB. I just reformatted it into .rst and wrote the changelog.
> 
> Changes from v1:
>  * Rename to generated-content.rst and add to documentation index.
>    (Jon)
>  * Rework subject to align with the new filename
>  * Replace commercial names with generic ones. (Jon)
>  * Be consistent about punctuation at the end of bullets for whole
>    sentences. (Miguel)
>  * Formatting sprucing up and minor typos (Miguel)
> ---
>  Documentation/process/generated-content.rst | 94 +++++++++++++++++++++
>  Documentation/process/index.rst             |  1 +
>  2 files changed, 95 insertions(+)
>  create mode 100644 Documentation/process/generated-content.rst
> 
> diff --git a/Documentation/process/generated-content.rst b/Documentation/process/generated-content.rst
> new file mode 100644
> index 0000000000000..5e8ff44190932
> --- /dev/null
> +++ b/Documentation/process/generated-content.rst
> @@ -0,0 +1,94 @@
> +============================================
> +Kernel Guidelines for Tool Generated Content
> +============================================
> +
> +Purpose
> +=======
> +
> +Kernel contributors have been using tooling to generate contributions
> +for a long time. These tools are constantly becoming more capable and
> +undoubtedly improve developer productivity. At the same time, reviewer
> +and maintainer bandwidth is a very scarce resource. Understanding
> +which portions of a contribution come from humans versus tools is
> +critical to maintain those resources and keep kernel development
> +healthy.
> +
> +The goal here is to clarify community expectations around tools. This
> +lets everyone become more productive while also maintaining high
> +degrees of trust between submitters and reviewers.
> +
> +Out of Scope
> +============
> +
> +These guidelines do not apply to tools that make trivial tweaks to
> +preexisting content. Nor do they pertain to AI tooling that helps with
> +menial tasks. Some examples:
> +
> + - Spelling and grammar fix ups, like rephrasing to imperative voice
> + - Typing aids like identifier completion, common boilerplate or
> +   trivial pattern completion
> + - Purely mechanical transformations like variable renaming
> + - Reformatting, like running Lindent, ``clang-format`` or
> +   ``rustfmt``
> +
> +Even if your tool use is out of scope you should still always consider
> +if it would help reviewing your contribution if the reviewer knows
> +about the tool that you used.
> +
> +In Scope
> +========
> +
> +These guidelines apply when a meaningful amount of content in a kernel
> +contribution was not written by a person in the Signed-off-by chain,
> +but was instead created by a tool.
> +
> +Detection of a problem is also a part of the development process; if a
> +tool was used to find a problem addressed by a change, that should be
> +noted in the changelog. This not only gives credit where it is due, it
> +also helps fellow developers find out about these tools.
> +
> +Some examples:
> + - Any tool-suggested fix such as ``checkpatch.pl --fix``
> + - Coccinelle scripts
> + - A chatbot generated a new function in your patch to sort list entries.
> + - A .c file in the patch was originally generated by an LLM but cleaned
> +   up by hand.
> + - The changelog was generated by handing the patch to a generative AI
> +   tool and asking it to write the changelog.
> + - The changelog was translated from another language.
> +
> +If in doubt, choose transparency and assume these guidelines apply to
> +your contribution.
> +
> +Guidelines
> +==========
> +
> +First, read the Developer's Certificate of Origin:
> +Documentation/process/submitting-patches.rst . Its rules are simple
> +and have been in place for a long time. They have covered many
> +tool-generated contributions.
> +
> +Second, when making a contribution, be transparent about the origin of
> +content in cover letters and changelogs. You can be more transparent
> +by adding information like this:
> +
> + - What tools were used?
> + - The input to the tools you used, like the coccinelle source script.
> + - If code was largely generated from a single or short set of
> +   prompts, include those prompts in the commit log. For longer
> +   sessions, include a summary of the prompts and the nature of
> +   resulting assistance.
> + - Which portions of the content were affected by that tool?
> +
> +As with all contributions, individual maintainers have discretion to
> +choose how they handle the contribution. For example, they might:
> +
> + - Treat it just like any other contribution
> + - Reject it outright
> + - Review the contribution with extra scrutiny
> + - Suggest a better prompt instead of suggesting specific code changes
> + - Ask for some other special steps, like asking the contributor to
> +   elaborate on how the tool or model was trained
> + - Ask the submitter to explain in more detail about the contribution
> +   so that the maintainer can feel comfortable that the submitter fully
> +   understands how the code works.
> diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
> index aa12f26601949..e1a8a31389f53 100644
> --- a/Documentation/process/index.rst
> +++ b/Documentation/process/index.rst
> @@ -68,6 +68,7 @@ beyond).
>     stable-kernel-rules
>     management-style
>     researcher-guidelines
> +   generated-content
>  
>  Dealing with bugs
>  -----------------



* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10  7:43 ` [PATCH] [v2] Documentation: Provide guidelines for tool-generated content Vlastimil Babka
@ 2025-11-10  8:58   ` Christian Brauner
  2025-11-10 16:08     ` Dave Hansen
  2025-11-10 17:25     ` Laurent Pinchart
  0 siblings, 2 replies; 24+ messages in thread
From: Christian Brauner @ 2025-11-10  8:58 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, Nov 10, 2025 at 08:43:06AM +0100, Vlastimil Babka wrote:
> +Cc ksummit (where the discussions about this topic happened recently) and
> workflows (probably the closest list we have for such things in general)
> because nobody reads lkml today and this seems to have been going under the
> radar until mentioned at lwn yesterday
> 
> On 11/6/25 00:15, Dave Hansen wrote:
> > In the last few years, the capabilities of coding tools have exploded.
> > As those capabilities have expanded, contributors and maintainers have
> > more and more questions about how and when to apply those
> > capabilities.
> > 
> > The shiny new AI tools (chatbots, coding assistants and more) are
> > impressive.

This reads like a factual statement about "impressiveness" of the tools.
Just drop that sentence, please. It doesn't add value to the commit
message at all.
                  
> > Add new Documentation to guide contributors on how to 
> > best use kernel development tools, new and old.
> > 
> > Note, though, there are fundamentally no new or unique rules in this
> > new document. It clarifies expectations that the kernel community has
> > had for many years. For example, researchers are already asked to
> > disclose the tools they use to find issues in
> > Documentation/process/researcher-guidelines.rst. This new document
> > just reiterates existing best practices for development tooling.
> > 
> > In short: Please show your work and make sure your contribution is
> > easy to review.
> > 
> > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Theodore Ts'o <tytso@mit.edu>
> > Cc: Sasha Levin <sashal@kernel.org>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Kees Cook <kees@kernel.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Miguel Ojeda <ojeda@kernel.org>
> > Cc: Shuah Khan <shuah@kernel.org>
> > 
> > --
> > 
> > This document was a collaborative effort from all the members of
> > the TAB. I just reformatted it into .rst and wrote the changelog.
> > 
> > Changes from v1:
> >  * Rename to generated-content.rst and add to documentation index.
> >    (Jon)
> >  * Rework subject to align with the new filename
> >  * Replace commercial names with generic ones. (Jon)
> >  * Be consistent about punctuation at the end of bullets for whole
> >    sentences. (Miguel)
> >  * Formatting sprucing up and minor typos (Miguel)
> > ---
> >  Documentation/process/generated-content.rst | 94 +++++++++++++++++++++
> >  Documentation/process/index.rst             |  1 +
> >  2 files changed, 95 insertions(+)
> >  create mode 100644 Documentation/process/generated-content.rst
> > 
> > diff --git a/Documentation/process/generated-content.rst b/Documentation/process/generated-content.rst
> > new file mode 100644
> > index 0000000000000..5e8ff44190932
> > --- /dev/null
> > +++ b/Documentation/process/generated-content.rst
> > @@ -0,0 +1,94 @@
> > +============================================
> > +Kernel Guidelines for Tool Generated Content
> > +============================================
> > +
> > +Purpose
> > +=======
> > +
> > +Kernel contributors have been using tooling to generate contributions
> > +for a long time.

> > These tools are constantly becoming more capable and
> > +undoubtedly improve developer productivity. At the same time, reviewer

"undoubtedly improve developer productivity"?
Am I reading an advert or kernel documentation about the policy how to
use new tooling?

Please keep it factual without statements about what perceived value
this adds. People use it and we have to have a policy for it. There's no
need to celebrate it.

> > +and maintainer bandwidth is a very scarce resource. Understanding
> > +which portions of a contribution come from humans versus tools is
> > +critical to maintain those resources and keep kernel development
> > +healthy.
> > +
> > +The goal here is to clarify community expectations around tools. This
> > +lets everyone become more productive while also maintaining high
> > +degrees of trust between submitters and reviewers.
> > +
> > +Out of Scope
> > +============
> > +
> > +These guidelines do not apply to tools that make trivial tweaks to
> > +preexisting content. Nor do they pertain to AI tooling that helps with
> > +menial tasks. Some examples:
> > +
> > + - Spelling and grammar fix ups, like rephrasing to imperative voice
> > + - Typing aids like identifier completion, common boilerplate or
> > +   trivial pattern completion
> > + - Purely mechanical transformations like variable renaming
> > + - Reformatting, like running Lindent, ``clang-format`` or
> > +   ``rustfmt``
> > +
> > +Even if your tool use is out of scope you should still always consider
> > +if it would help reviewing your contribution if the reviewer knows
> > +about the tool that you used.
> > +
> > +In Scope
> > +========
> > +
> > +These guidelines apply when a meaningful amount of content in a kernel
> > +contribution was not written by a person in the Signed-off-by chain,
> > +but was instead created by a tool.
> > +
> > +Detection of a problem is also a part of the development process; if a
> > +tool was used to find a problem addressed by a change, that should be
> > +noted in the changelog. This not only gives credit where it is due, it
> > +also helps fellow developers find out about these tools.
> > +
> > +Some examples:
> > + - Any tool-suggested fix such as ``checkpatch.pl --fix``
> > + - Coccinelle scripts
> > + - A chatbot generated a new function in your patch to sort list entries.
> > + - A .c file in the patch was originally generated by an LLM but cleaned
> > +   up by hand.
> > + - The changelog was generated by handing the patch to a generative AI
> > +   tool and asking it to write the changelog.
> > + - The changelog was translated from another language.
> > +
> > +If in doubt, choose transparency and assume these guidelines apply to
> > +your contribution.
> > +
> > +Guidelines
> > +==========
> > +
> > +First, read the Developer's Certificate of Origin:
> > +Documentation/process/submitting-patches.rst . Its rules are simple
> > +and have been in place for a long time. They have covered many
> > +tool-generated contributions.
> > +
> > +Second, when making a contribution, be transparent about the origin of
> > +content in cover letters and changelogs. You can be more transparent
> > +by adding information like this:
> > +
> > + - What tools were used?
> > + - The input to the tools you used, like the coccinelle source script.
> > + - If code was largely generated from a single or short set of
> > +   prompts, include those prompts in the commit log. For longer
> > +   sessions, include a summary of the prompts and the nature of
> > +   resulting assistance.
> > + - Which portions of the content were affected by that tool?
> > +
> > +As with all contributions, individual maintainers have discretion to
> > +choose how they handle the contribution. For example, they might:
> > +
> > + - Treat it just like any other contribution
> > + - Reject it outright
> > + - Review the contribution with extra scrutiny
> > + - Suggest a better prompt instead of suggesting specific code changes
> > + - Ask for some other special steps, like asking the contributor to
> > +   elaborate on how the tool or model was trained
> > + - Ask the submitter to explain in more detail about the contribution
> > +   so that the maintainer can feel comfortable that the submitter fully
> > +   understands how the code works.
> > diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
> > index aa12f26601949..e1a8a31389f53 100644
> > --- a/Documentation/process/index.rst
> > +++ b/Documentation/process/index.rst
> > @@ -68,6 +68,7 @@ beyond).
> >     stable-kernel-rules
> >     management-style
> >     researcher-guidelines
> > +   generated-content
> >  
> >  Dealing with bugs
> >  -----------------
> 


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
       [not found] ` <11eaf7fa-27d0-4a57-abf0-5f24c918966c@lucifer.local>
@ 2025-11-10 11:15   ` Lorenzo Stoakes
       [not found]   ` <103ee61c-f958-440c-af73-1cf3600d10fd@intel.com>
  1 sibling, 0 replies; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-11-10 11:15 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-kernel, Steven Rostedt, Dan Williams, Theodore Ts'o,
	Sasha Levin, Jonathan Corbet, Kees Cook, Greg Kroah-Hartman,
	Miguel Ojeda, Shuah Khan, Christian Brauner, Vlastimil Babka,
	workflows, ksummit, Dan Carpenter, Julia Lawall, James Bottomley,
	Mark Brown, Paul E. McKenney, Jiri Kosina

+cc potentially interested parties.

Apologies if I missed anybody, just scanned through quickly.

On Mon, Nov 10, 2025 at 10:48:04AM +0000, Lorenzo Stoakes wrote:
> I think it would have been helpful to ping those engaged in the discussion in
> this area in related threads, e.g. [0] and [1].
>
> [0]: https://lore.kernel.org/ksummit/49f1a974-e1e6-4be5-864e-5e0f905e1a8f@paulmck-laptop/T/#m30873ef3dc9bd2c4c95547e81efff3085474f2d9
> [1]: https://lore.kernel.org/all/7e7f485e-93ad-4bc4-9323-f154ce477c39@lucifer.local/
>
> I'm not sure what the process was that led to this, but it feels rather as if
> the community were excluded here.
>
> It also seems slightly odd to produce this in advance of the maintainer's
> summit, as I felt there was some agreement that the topic should be discussed
> there?
>
> Obviously there may be very good reasons for this but it'd be good for them to
> be clarified and those who engaged in these discussions to be cc'd also (or at
> least ping on threads linking!)
>
> On Wed, Nov 05, 2025 at 03:15:14PM -0800, Dave Hansen wrote:
> > In the last few years, the capabilities of coding tools have exploded.
> > As those capabilities have expanded, contributors and maintainers have
> > more and more questions about how and when to apply those
> > capabilities.
> >
> > The shiny new AI tools (chatbots, coding assistants and more) are
> > impressive.  Add new Documentation to guide contributors on how to
> > best use kernel development tools, new and old.
>
> As others have pointed out, this is strangely gleeful, can we please drop it?
>
> As mentioned in the msummit thread I have a great concern about how the press
> might report on this kind of change, as I fear that a 'kernel accepts AI
> patches' story might result in a large influx of AI patches from enthusiastic
> people which will have a direct impact on maintainer workload.
>
> I don't think comments like this help in that respect.
>
> In general I feel that a more restrictive/pessimistic document that can later be
> made less pessimistic/restrictive is a better approach than a broad one on this
> basis.
>
> >
> > Note, though, there are fundamentally no new or unique rules in this
> > new document. It clarifies expectations that the kernel community has
>
> Hmm, I'm not sure the conflation of pre-existing tooling which always required
> some degree of understanding vs. a technique which can simply generate entire
> patch sets with commentary included is justified.
>
> While I _do_ like the idea that basic principles that already existed still
> exist for LLMs (that's a powerful notion), I wonder if we do in fact need
> some new rules here.
>
> I think saying this also pushes back on the concept of maintainer-by-maintainer
> policy as 'it's just like it always was' doesn't suggest that it warrants a
> higher level of scrutiny.
>
> > had for many years. For example, researchers are already asked to
> > disclose the tools they use to find issues in
> > Documentation/process/researcher-guidelines.rst. This new document
> > just reiterates existing best practices for development tooling.
>
> Ironically that document is considerably more strident and firm than this
> one :)
>
> >
> > In short: Please show your work and make sure your contribution is
> > easy to review.
>
> I wonder whether we need to be very explicit in stating - please do not
> generate patches in large volume with no involvement from you and
> _emphasise_ that human involvement is _necessary_.
>
> In discussion with kernel colleagues who use AI extensively, there is a
> very clear pattern that a key part of usefully making use of this tooling
> is for there to be an 'expert in the loop' who reviews what is generated to
> ensure it is correct.
>
> I therefore think we either _should_ have a specific rule for LLM-generated
> content or should (and it really makes sense actually) have a broad
> 'generated content' rule that - you _must_ have a thorough understanding of
> what you are doing such that you can review and filter the generated
> output.
>
> I think stating that we will NOT accept series that are generated without
> understanding would be very beneficial in all respects, rather than leaving
> it somehow implied.
>
> Being soft or vague here is likely to cause maintainer headaches IMO
> (though of course there's only so many who will read a doc etc. being able
> to point at the document in reply as a maintainer is useful too).
>
> >
> > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Theodore Ts'o <tytso@mit.edu>
> > Cc: Sasha Levin <sashal@kernel.org>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Kees Cook <kees@kernel.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Miguel Ojeda <ojeda@kernel.org>
> > Cc: Shuah Khan <shuah@kernel.org>
> >
> > --
> >
> > This document was a collaborative effort from all the members of
> > the TAB. I just reformatted it into .rst and wrote the changelog.
> >
> > Changes from v1:
> >  * Rename to generated-content.rst and add to documentation index.
> >    (Jon)
> >  * Rework subject to align with the new filename
> >  * Replace commercial names with generic ones. (Jon)
> >  * Be consistent about punctuation at the end of bullets for whole
> >    sentences. (Miguel)
> >  * Formatting sprucing up and minor typos (Miguel)
> > ---
> >  Documentation/process/generated-content.rst | 94 +++++++++++++++++++++
> >  Documentation/process/index.rst             |  1 +
> >  2 files changed, 95 insertions(+)
> >  create mode 100644 Documentation/process/generated-content.rst
> >
> > diff --git a/Documentation/process/generated-content.rst b/Documentation/process/generated-content.rst
> > new file mode 100644
> > index 0000000000000..5e8ff44190932
> > --- /dev/null
> > +++ b/Documentation/process/generated-content.rst
> > @@ -0,0 +1,94 @@
> > +============================================
> > +Kernel Guidelines for Tool Generated Content
> > +============================================
> > +
> > +Purpose
> > +=======
> > +
> > +Kernel contributors have been using tooling to generate contributions
> > +for a long time. These tools are constantly becoming more capable and
> > +undoubtedly improve developer productivity. At the same time, reviewer
> > +and maintainer bandwidth is a very scarce resource. Understanding
>
> This is absolutely the key issue here imo, maintainer bandwidth. Glad this
> is in the opener.
>
> > +which portions of a contribution come from humans versus tools is
> > +critical to maintain those resources and keep kernel development
> > +healthy.
>
> Agreed entirely.
>
> > +
> > +The goal here is to clarify community expectations around tools. This
> > +lets everyone become more productive while also maintaining high
> > +degrees of trust between submitters and reviewers.
>
> Also very good.
>
> > +
> > +Out of Scope
> > +============
> > +
> > +These guidelines do not apply to tools that make trivial tweaks to
> > +preexisting content. Nor do they pertain to AI tooling that helps with
> > +menial tasks. Some examples:
> > +
> > + - Spelling and grammar fix ups, like rephrasing to imperative voice
> > + - Typing aids like identifier completion, common boilerplate or
> > +   trivial pattern completion
> > + - Purely mechanical transformations like variable renaming
> > + - Reformatting, like running Lindent, ``clang-format`` or
> > > +   ``rustfmt``
> > +
> > +Even if your tool use is out of scope you should still always consider
> > +if it would help reviewing your contribution if the reviewer knows
> > +about the tool that you used.
>
> This is great, I agree very much that we have to be reasonable about these
> uses.
>
> The final sentence is also great.
>
> > +
> > +In Scope
> > +========
> > +
> > +These guidelines apply when a meaningful amount of content in a kernel
> > +contribution was not written by a person in the Signed-off-by chain,
> > +but was instead created by a tool.
>
> Yes, perhaps useful actually using the term 'meaningful amount' rather than
> trying to be absolutely explicit about what this entails.
>
> Also allows for maintainer discretion.
>
> > +
> > +Detection of a problem is also a part of the development process; if a
> > +tool was used to find a problem addressed by a change, that should be
> > +noted in the changelog. This not only gives credit where it is due, it
> > +also helps fellow developers find out about these tools.
> > +
> > +Some examples:
> > + - Any tool-suggested fix such as ``checkpatch.pl --fix``
> > + - Coccinelle scripts
> > + - A chatbot generated a new function in your patch to sort list entries.
> > > + - A .c file in the patch was originally generated by an LLM but cleaned
> > +   up by hand.
> > + - The changelog was generated by handing the patch to a generative AI
> > +   tool and asking it to write the changelog.
> > + - The changelog was translated from another language.
> > +
> > +If in doubt, choose transparency and assume these guidelines apply to
> > +your contribution.
>
> Yes agreed.
>
> > +
> > +Guidelines
> > +==========
> > +
> > +First, read the Developer's Certificate of Origin:
> > +Documentation/process/submitting-patches.rst . Its rules are simple
> > +and have been in place for a long time. They have covered many
> > +tool-generated contributions.
> > +
> > +Second, when making a contribution, be transparent about the origin of
> > +content in cover letters and changelogs. You can be more transparent
> > +by adding information like this:
> > +
> > + - What tools were used?
> > + - The input to the tools you used, like the coccinelle source script.
>
> Not sure repeatedly using coccinelle as an example is helpful, as
> coccinelle is far less of an issue than LLM tooling, perhaps for the
> avoidance of doubt, expand this to include references to that?
>
> > + - If code was largely generated from a single or short set of
> > +   prompts, include those prompts in the commit log. For longer
> > +   sessions, include a summary of the prompts and the nature of
> > +   resulting assistance.
>
> Maybe worth saying send it in a cover letter if a series, but perhaps
> pedantic.
>
> > + - Which portions of the content were affected by that tool?
> > +
> > +As with all contributions, individual maintainers have discretion to
> > +choose how they handle the contribution. For example, they might:
> > +
> > + - Treat it just like any other contribution
> > + - Reject it outright
> > + - Review the contribution with extra scrutiny
> > + - Suggest a better prompt instead of suggesting specific code changes
> > + - Ask for some other special steps, like asking the contributor to
> > +   elaborate on how the tool or model was trained
> > + - Ask the submitter to explain in more detail about the contribution
> > +   so that the maintainer can feel comfortable that the submitter fully
> > +   understands how the code works.
>
> OK I wrote something suggesting you add this and you already have :) that's
> great. Let me go delete that request :)
>
> However I'm not sure the 'as with all contributions' is right though - as a
> maintainer in mm I don't actually feel that we can reject outright without
> having to give significant explanation as to why.
>
> And I think that's often the case - people (rightly) dislike blanket NAKs
> and it's a terrible practice, which often (also rightly) gets pushback from
> co-maintainers or others in the community.
>
> So I think perhaps it'd also be useful to very explicitly say that
> maintainers may say no summarily in instances where the review load would
> simply be too much to handle, such as large clearly-AI-generated and
> clearly-unfiltered series.
>
> Another point to raise perhaps is that - even in the cases where the
> submitter is carefully reviewing generated output - that submitters must be
> reasonable in terms of the volume they submit. This is perhaps hand wavey
> but mentioning it would be great not least for the ability for maintainers
> to point at the doc and reference it.
>
> > diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
> > index aa12f26601949..e1a8a31389f53 100644
> > --- a/Documentation/process/index.rst
> > +++ b/Documentation/process/index.rst
> > @@ -68,6 +68,7 @@ beyond).
> >     stable-kernel-rules
> >     management-style
> >     researcher-guidelines
> > +   generated-content
> >
> >  Dealing with bugs
> >  -----------------
>
> I guess this is a WIP?
>
> > --
> > 2.34.1
> >
> >
>
> Thanks, Lorenzo


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10  8:58   ` Christian Brauner
@ 2025-11-10 16:08     ` Dave Hansen
  2025-11-10 17:25     ` Laurent Pinchart
  1 sibling, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2025-11-10 16:08 UTC (permalink / raw)
  To: Christian Brauner, Dave Hansen
  Cc: Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On 11/10/25 00:58, Christian Brauner wrote:
...
> This reads like a factual statement about "impressiveness" of the tools.
> Just drop that sentence, please. It doesn't add value to the commit
> message at all.

Sure thing. Dropped.

...
>>> These tools are constantly becoming more capable and
>>> +undoubtedly improve developer productivity. At the same time, reviewer
> 
> "undoubtedly improve developer productivity"?
> Am I reading an advert or kernel documentation about the policy how to
> use new tooling?
> 
> Please keep it factual without statements about what perceived value
> this adds. People use it and we have to have a policy for it. There's no
> need to celebrate it.

I can definitely steer this away from perceived value. But the main
point of this section was to do some impedance matching between
maintainers and contributors. You (the contributor) may be more
productive, but the maintainer just got more patches to review.

So we could easily tone this down by changing:

	These tools are constantly becoming more capable and
	undoubtedly improve developer productivity.

to

	These tools can increase the volume of contributions.

But I do think it's important to make the connection between
reviewer/maintainer scarcity and tooling.


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
       [not found]   ` <103ee61c-f958-440c-af73-1cf3600d10fd@intel.com>
@ 2025-11-10 16:51     ` Lorenzo Stoakes
  0 siblings, 0 replies; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-11-10 16:51 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, linux-kernel, Steven Rostedt, Dan Williams,
	Theodore Ts'o, Sasha Levin, Jonathan Corbet, Kees Cook,
	Greg Kroah-Hartman, Miguel Ojeda, Shuah Khan, Christian Brauner,
	Vlastimil Babka, workflows, ksummit, Dan Carpenter, Julia Lawall,
	James Bottomley, Mark Brown, Paul E. McKenney, Jiri Kosina

+re-cc various - email is a pain!

On Mon, Nov 10, 2025 at 08:35:12AM -0800, Dave Hansen wrote:
> On 11/10/25 02:48, Lorenzo Stoakes wrote:
> ...
> > It also seems slightly odd to produce this in advance of the maintainer's
> > summit, as I felt there was some agreement that the topic should be discussed
> > there?
>
> The TAB discussions have been ongoing and this document was mostly put
> together before the ksummit thread even launched off. This patch just
> suffered from being put on the back burner.

Ah that's useful context!

I mean I (obviously) feel this document is very necessary/useful so it's
nice that this was already an ongoing thing and we're all aligned anyway.

>
> > I think stating that we will NOT accept series that are generated without
> > understanding would be very beneficial in all respects, rather than leaving
> > it somehow implied.
>
> I actually don't think that's a tooling-specific requirement.
>
> If you're posting a series, you should understand it. I've seen quite a
> few cases where folks will pick up someone else's work, forward port it,
> and post it again without a clear understanding of the series.
>
> "Understand and be able to defend what you contribute" is certainly a
> good rule. It's also concise enough to have this document touch on it.
>
> Would that suffice?

That's great thanks. And yes absolutely it applies to everything, but
obviously LLMs are a realm where a person's capacity to exceed their
understanding is amplified.

>
> >> +Guidelines
> >> +==========
> >> +
> >> +First, read the Developer's Certificate of Origin:
> >> +Documentation/process/submitting-patches.rst . Its rules are simple
> >> +and have been in place for a long time. They have covered many
> >> +tool-generated contributions.
> >> +
> >> +Second, when making a contribution, be transparent about the origin of
> >> +content in cover letters and changelogs. You can be more transparent
> >> +by adding information like this:
> >> +
> >> + - What tools were used?
> >> + - The input to the tools you used, like the coccinelle source script.
> >
> > Not sure repeatedly using coccinelle as an example is helpful, as
> > coccinelle is far less of an issue than LLM tooling, perhaps for the
> > avoidance of doubt, expand this to include references to that?
> >
> >> + - If code was largely generated from a single or short set of
> >> +   prompts, include those prompts in the commit log. For longer
> >> +   sessions, include a summary of the prompts and the nature of
> >> +   resulting assistance.
> >
> > Maybe worth saying send it in a cover letter if a series, but perhaps
> > pedantic.
>
> Do we have a good short term that means "commit logs or cover letter"?
> "Changelogs" maybe? But, yeah, we don't want people reading this and
> avoiding putting stuff in cover letters.

Yeah it's maybe not worth specifying to be honest, it might just add
confusion.

>
> >> + - Which portions of the content were affected by that tool?
> >> +
> >> +As with all contributions, individual maintainers have discretion to
> >> +choose how they handle the contribution. For example, they might:
> >> +
> >> + - Treat it just like any other contribution
> >> + - Reject it outright
> >> + - Review the contribution with extra scrutiny
> >> + - Suggest a better prompt instead of suggesting specific code changes
> >> + - Ask for some other special steps, like asking the contributor to
> >> +   elaborate on how the tool or model was trained
> >> + - Ask the submitter to explain in more detail about the contribution
> >> +   so that the maintainer can feel comfortable that the submitter fully
> >> +   understands how the code works.
> >
> > OK I wrote something suggesting you add this and you already have :) that's
> > great. Let me go delete that request :)
> >
> > However I'm not sure the 'as with all contributions' is right though - as a
> > maintainer in mm I don't actually feel that we can reject outright without
> > having to give significant explanation as to why.
> >
> > And I think that's often the case - people (rightly) dislike blanket NAKs
> > and it's a terrible practice, which often (also rightly) gets pushback from
> > co-maintainers or others in the community.
> >
> > So I think perhaps it'd also be useful to very explicitly say that
> > maintainers may say no summarily in instances where the review load would
> > simply be too much to handle large clearly-AI-generated and
> > clearly-unfiltered series.
> >
> > Another point to raise perhaps is that - even in the cases where the
> > submitter is carefully reviewing generated output - that submitters must be
> > reasonable in terms of the volume they submit. This is perhaps hand wavey
> > but mentioning it would be great not least for the ability for maintainers
> > to point at the doc and reference it.
>
> How about we expand this bullet a bit?
>
> - Review the contribution with extra scrutiny
>
> to
>
> - Treat the contribution specially like reviewing with extra scrutiny,
>   or at a lower priority than human-generated content.
>
> That's a good match for the "Treat it just like any other contribution"
> bullet. Maintainers can either treat it normally _or_ specially.

That sounds good.

I do still think an explicit point about volume is important though, at
least to underline it.

Something like:

	Please be wary of the volume of submitted patches - sending an
	unreasonable number of generated patches is more likely to result
	in maintainers rejecting them or deprioritising review.

Perhaps?

>
> >> diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
> >> index aa12f26601949..e1a8a31389f53 100644
> >> --- a/Documentation/process/index.rst
> >> +++ b/Documentation/process/index.rst
> >> @@ -68,6 +68,7 @@ beyond).
> >>     stable-kernel-rules
> >>     management-style
> >>     researcher-guidelines
> >> +   generated-content
> >>
> >>  Dealing with bugs
> >>  -----------------
> >
> > I guess this is a WIP?

Cheers, Lorenzo

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10  8:58   ` Christian Brauner
  2025-11-10 16:08     ` Dave Hansen
@ 2025-11-10 17:25     ` Laurent Pinchart
  2025-11-10 17:41       ` Dave Hansen
                         ` (2 more replies)
  1 sibling, 3 replies; 24+ messages in thread
From: Laurent Pinchart @ 2025-11-10 17:25 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Dave Hansen, Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, Nov 10, 2025 at 09:58:52AM +0100, Christian Brauner wrote:
> On Mon, Nov 10, 2025 at 08:43:06AM +0100, Vlastimil Babka wrote:
> > +Cc ksummit (where the discussions about this topic happened recently) and
> > workflows (probably the closest list we have for such things in general)
> > because nobody reads lkml today and this seems to have been going under the
> > radar until mentioned at lwn yesterday
> > 
> > On 11/6/25 00:15, Dave Hansen wrote:
> > > In the last few years, the capabilities of coding tools have exploded.
> > > As those capabilities have expanded, contributors and maintainers have
> > > more and more questions about how and when to apply those
> > > capabilities.
> > > 
> > > The shiny new AI tools (chatbots, coding assistants and more) are
> > > impressive.
> 
> This reads like a factual statement about "impressiveness" of the tools.
> Just drop that sentence, please. It doesn't add value to the commit
> message at all.
>                   
> > > Add new Documentation to guide contributors on how to 
> > > best use kernel development tools, new and old.
> > > 
> > > Note, though, there are fundamentally no new or unique rules in this
> > > new document. It clarifies expectations that the kernel community has
> > > had for many years. For example, researchers are already asked to
> > > disclose the tools they use to find issues in
> > > Documentation/process/researcher-guidelines.rst. This new document
> > > just reiterates existing best practices for development tooling.
> > > 
> > > In short: Please show your work and make sure your contribution is
> > > easy to review.
> > > 
> > > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Theodore Ts'o <tytso@mit.edu>
> > > Cc: Sasha Levin <sashal@kernel.org>
> > > Cc: Jonathan Corbet <corbet@lwn.net>
> > > Cc: Kees Cook <kees@kernel.org>
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: Miguel Ojeda <ojeda@kernel.org>
> > > Cc: Shuah Khan <shuah@kernel.org>
> > > 
> > > --
> > > 
> > > This document was a collaborative effort from all the members of
> > > the TAB. I just reformatted it into .rst and wrote the changelog.
> > > 
> > > Changes from v1:
> > >  * Rename to generated-content.rst and add to documentation index.
> > >    (Jon)
> > >  * Rework subject to align with the new filename
> > >  * Replace commercial names with generic ones. (Jon)
> > >  * Be consistent about punctuation at the end of bullets for whole
> > >    sentences. (Miguel)
> > >  * Formatting sprucing up and minor typos (Miguel)
> > > ---
> > >  Documentation/process/generated-content.rst | 94 +++++++++++++++++++++
> > >  Documentation/process/index.rst             |  1 +
> > >  2 files changed, 95 insertions(+)
> > >  create mode 100644 Documentation/process/generated-content.rst
> > > 
> > > diff --git a/Documentation/process/generated-content.rst b/Documentation/process/generated-content.rst
> > > new file mode 100644
> > > index 0000000000000..5e8ff44190932
> > > --- /dev/null
> > > +++ b/Documentation/process/generated-content.rst
> > > @@ -0,0 +1,94 @@
> > > +============================================
> > > +Kernel Guidelines for Tool Generated Content
> > > +============================================
> > > +
> > > +Purpose
> > > +=======
> > > +
> > > +Kernel contributors have been using tooling to generate contributions
> > > +for a long time.
> 
> > > These tools are constantly becoming more capable and
> > > +undoubtedly improve developer productivity. At the same time, reviewer
> 
> "undoubtedly improve developer productivity"?
> Am I reading an advert or kernel documentation about the policy how to
> use new tooling?
> 
> Please keep it factual without statements about what perceived value
> this adds. People use it and we have to have a policy for it. There's no
> need to celebrate it.
> 
> > > +and maintainer bandwidth is a very scarce resource. Understanding
> > > +which portions of a contribution come from humans versus tools is
> > > +critical to maintain those resources and keep kernel development
> > > +healthy.
> > > +
> > > +The goal here is to clarify community expectations around tools. This
> > > +lets everyone become more productive while also maintaining high
> > > +degrees of trust between submitters and reviewers.
> > > +
> > > +Out of Scope
> > > +============
> > > +
> > > +These guidelines do not apply to tools that make trivial tweaks to
> > > +preexisting content. Nor do they pertain to AI tooling that helps with
> > > +menial tasks. Some examples:
> > > +
> > > + - Spelling and grammar fix ups, like rephrasing to imperative voice
> > > + - Typing aids like identifier completion, common boilerplate or
> > > +   trivial pattern completion
> > > + - Purely mechanical transformations like variable renaming

Mechanical transformations are often performed with Coccinelle. Given
how you mention that tool below, I wouldn't frame it as out of scope
here.

> > > + - Reformatting, like running Lindent, ``clang-format`` or
> > > +   ``rust-fmt``
> > > +
> > > +Even if your tool use is out of scope you should still always consider
> > > +if it would help reviewing your contribution if the reviewer knows
> > > +about the tool that you used.
> > > +
> > > +In Scope
> > > +========
> > > +
> > > +These guidelines apply when a meaningful amount of content in a kernel
> > > +contribution was not written by a person in the Signed-off-by chain,
> > > +but was instead created by a tool.
> > > +
> > > +Detection of a problem is also a part of the development process; if a
> > > +tool was used to find a problem addressed by a change, that should be
> > > +noted in the changelog. This not only gives credit where it is due, it
> > > +also helps fellow developers find out about these tools.
> > > +
> > > +Some examples:
> > > + - Any tool-suggested fix such as ``checkpatch.pl --fix``
> > > + - Coccinelle scripts
> > > + - A chatbot generated a new function in your patch to sort list entries.
> > > + - A .c file in the patch was originally generated by a LLM but cleaned
> > > +   up by hand.
> > > + - The changelog was generated by handing the patch to a generative AI
> > > +   tool and asking it to write the changelog.
> > > + - The changelog was translated from another language.
> > > +
> > > +If in doubt, choose transparency and assume these guidelines apply to
> > > +your contribution.
> > > +
> > > +Guidelines
> > > +==========
> > > +
> > > +First, read the Developer's Certificate of Origin:
> > > +Documentation/process/submitting-patches.rst . Its rules are simple
> > > +and have been in place for a long time. They have covered many
> > > +tool-generated contributions.
> > > +
> > > +Second, when making a contribution, be transparent about the origin of
> > > +content in cover letters and changelogs. You can be more transparent
> > > +by adding information like this:
> > > +
> > > + - What tools were used?
> > > + - The input to the tools you used, like the coccinelle source script.
> > > + - If code was largely generated from a single or short set of
> > > +   prompts, include those prompts in the commit log. For longer
> > > +   sessions, include a summary of the prompts and the nature of
> > > +   resulting assistance.
> > > + - Which portions of the content were affected by that tool?
> > > +
> > > +As with all contributions, individual maintainers have discretion to
> > > +choose how they handle the contribution. For example, they might:
> > > +
> > > + - Treat it just like any other contribution
> > > + - Reject it outright
> > > + - Review the contribution with extra scrutiny
> > > + - Suggest a better prompt instead of suggesting specific code changes
> > > + - Ask for some other special steps, like asking the contributor to
> > > +   elaborate on how the tool or model was trained
> > > + - Ask the submitter to explain in more detail about the contribution
> > > +   so that the maintainer can feel comfortable that the submitter fully
> > > +   understands how the code works.
> > > diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
> > > index aa12f26601949..e1a8a31389f53 100644
> > > --- a/Documentation/process/index.rst
> > > +++ b/Documentation/process/index.rst
> > > @@ -68,6 +68,7 @@ beyond).
> > >     stable-kernel-rules
> > >     management-style
> > >     researcher-guidelines
> > > +   generated-content
> > >  
> > >  Dealing with bugs
> > >  -----------------

-- 
Regards,

Laurent Pinchart

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 17:25     ` Laurent Pinchart
@ 2025-11-10 17:41       ` Dave Hansen
  2025-11-10 17:44       ` Linus Torvalds
  2025-11-10 17:46       ` Steven Rostedt
  2 siblings, 0 replies; 24+ messages in thread
From: Dave Hansen @ 2025-11-10 17:41 UTC (permalink / raw)
  To: Laurent Pinchart, Christian Brauner
  Cc: Dave Hansen, Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On 11/10/25 09:25, Laurent Pinchart wrote:
>>>> + - Purely mechanical transformations like variable renaming
> Mechanical transformations are often performed with Coccinelle. Given
> how you mention that tool below, I wouldn't frame it as out of scope
> here.

The key here isn't which tool is used, it's how it's used.

If you go use Coccinelle for pure variable renaming, you don't need to
mention it. Same as if you use perl or vim to do a s/foo/bar/.

That said, if you choose to attach your trivial variable renaming
Coccinelle script, everyone will be better off for it.

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 17:25     ` Laurent Pinchart
  2025-11-10 17:41       ` Dave Hansen
@ 2025-11-10 17:44       ` Linus Torvalds
  2025-11-10 17:56         ` Luck, Tony
  2025-11-10 18:39         ` Mike Rapoport
  2025-11-10 17:46       ` Steven Rostedt
  2 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2025-11-10 17:44 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Steven Rostedt, Dan Williams,
	Theodore Ts'o, Sasha Levin, Jonathan Corbet, Kees Cook,
	Greg Kroah-Hartman, Miguel Ojeda, Shuah Khan

On Mon, 10 Nov 2025 at 09:25, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Mechanical transformations are often performed with Coccinelle. Given
> how you mention that tool below, I wouldn't frame it as out of scope
> here.

Honestly, I think the documented rule should not aim to treat AI as
anything special at all, and literally just talk about tooling.

Exactly because we've used things like coccinelle (and much simpler
tools like 'sed', for that matter) for ages.

IOW, this should all be about "tool-assisted patches should be
described as such, and should explain how the tool was used".

If people send in patches that have been generated by tools, we
already ask people to just include the script in the commit message.

I mean, we already have commit messages that say things like

    This is a completely mechanical patch (done with a simple "sed -i"
    statement).

when people do mindless conversions that are so straightforward that
the actual sed patch isn't even documented (in that case it was
something like just

   sed -i 's/__ASSEMBLY__/__ASSEMBLER__/'

or whatever), and in other cases people include the actual script
(random example being commit 96b451d53ae9: "drm/{i915,xe}: convert
i915 and xe display members into pointers").

I think we should treat any AI generated patches similarly: people
should mention the tool it was done with, and the script (ok, the
"scripts" are called "prompts", because AI is so "special") used.

Sure, AI ends up making the result potentially much more subtle, but I
don't think the *issue* is new, and I don't think it should need to be
treated as such.

                 Linus
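
As a rough illustration of the kind of mechanical conversion discussed
above, here is a minimal Python sketch. It is not the command used for
any kernel commit; the function name rename_identifier and the sample
input are hypothetical. Unlike the plain sed substitution quoted in the
message, it matches whole identifiers only, so a longer name that merely
contains the old token is left alone.

```python
import re


def rename_identifier(text: str, old: str, new: str) -> str:
    """Replace whole-identifier occurrences of `old` with `new`.

    The \\b anchors keep the pattern from matching inside a longer
    identifier, because letters, digits and '_' are all word characters.
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return pattern.sub(new, text)


# Hypothetical input: two real uses of the macro and one lookalike name.
source = (
    "#ifndef __ASSEMBLY__\n"
    "int MY__ASSEMBLY__FLAG;\n"
    "#endif /* __ASSEMBLY__ */\n"
)

converted = rename_identifier(source, "__ASSEMBLY__", "__ASSEMBLER__")
print(converted)
```

Including a script like this (or the exact sed line) in the changelog
lets a reviewer re-run the transformation and compare the result with
the patch, instead of checking every hunk by eye.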

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 17:25     ` Laurent Pinchart
  2025-11-10 17:41       ` Dave Hansen
  2025-11-10 17:44       ` Linus Torvalds
@ 2025-11-10 17:46       ` Steven Rostedt
  2 siblings, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-11-10 17:46 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 19:25:07 +0200
Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:

> > > > + - Purely mechanical transformations like variable renaming  
> 
> Mechanical transformations are often performed with Coccinelle. Given
> how you mention that tool below, I wouldn't frame it as out of scope
> here.

Agreed. Tooling that performs "mechanical transformations like variable
renaming" is definitely in scope of this document. The number of times I've
seen this "simple" activity make mistakes. It most definitely should be
disclosed if a tool helped in this regard.

Anyway,

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

* RE: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 17:44       ` Linus Torvalds
@ 2025-11-10 17:56         ` Luck, Tony
  2025-11-10 18:39         ` Mike Rapoport
  1 sibling, 0 replies; 24+ messages in thread
From: Luck, Tony @ 2025-11-10 17:56 UTC (permalink / raw)
  To: Linus Torvalds, Laurent Pinchart
  Cc: Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Steven Rostedt, Williams, Dan J,
	Theodore Ts'o, Sasha Levin, Jonathan Corbet, Kees Cook,
	Greg Kroah-Hartman, Miguel Ojeda, Shuah Khan

> I think we should treat any AI generated patches similarly: people
> should mention the tool it was done with, and the script (ok, the
> "scripts" are called "prompts", because AI is so "special") used.

AI is also special in that it is effectively non-deterministic. You probably won't get
the same output from the same prompt.

Maybe still helpful to include the prompt, but it has less utility than a
sed or coccinelle script.

-Tony

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 17:44       ` Linus Torvalds
  2025-11-10 17:56         ` Luck, Tony
@ 2025-11-10 18:39         ` Mike Rapoport
  2025-11-10 19:05           ` Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: Mike Rapoport @ 2025-11-10 18:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Laurent Pinchart, Christian Brauner, Dave Hansen,
	Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, Nov 10, 2025 at 09:44:00AM -0800, Linus Torvalds wrote:
> On Mon, 10 Nov 2025 at 09:25, Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> >
> > Mechanical transformations are often performed with Coccinelle. Given
> > how you mention that tool below, I wouldn't frame it as out of scope
> > here.
> 
> Honestly, I think the documented rule should not aim to treat AI as
> anything special at all, and literally just talk about tooling.
> 
> I think we should treat any AI generated patches similarly: people
> should mention the tool it was done with, and the script (ok, the
> "scripts" are called "prompts", because AI is so "special") used.
> 
> Sure, AI ends up making the result potentially much more subtle, but I
> don't think the *issue* is new, and I don't think it should need to be
> treated as such.
 
The novelty here is that AI does not only transform the code, it can
generate it from scratch en masse.

>                  Linus
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 18:39         ` Mike Rapoport
@ 2025-11-10 19:05           ` Linus Torvalds
  2025-11-10 19:18             ` H. Peter Anvin
  0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2025-11-10 19:05 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Laurent Pinchart, Christian Brauner, Dave Hansen,
	Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 at 10:39, Mike Rapoport <rppt@kernel.org> wrote:
>
> The novelty here is that AI does not only transform the code, it can
> generate it from scratch en masse.

And why would that make any difference to the basic rules for us?

  "Plus ça change, plus c'est la même chose"

It's a change in degree, not in any fundamentals.

                Linus

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:05           ` Linus Torvalds
@ 2025-11-10 19:18             ` H. Peter Anvin
  2025-11-10 19:36               ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: H. Peter Anvin @ 2025-11-10 19:18 UTC (permalink / raw)
  To: Linus Torvalds, Mike Rapoport
  Cc: Laurent Pinchart, Christian Brauner, Dave Hansen,
	Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On November 10, 2025 11:05:44 AM PST, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>On Mon, 10 Nov 2025 at 10:39, Mike Rapoport <rppt@kernel.org> wrote:
>>
>> The novelty here is that AI does not only transform the code, it can
>> generate it from scratch en masse.
>
>And why would that make any difference to the basic rules for us?
>
>  "Plus ça change, plus c'est la même chose"
>
>It's a change in degree, not in any fundamentals.
>
>                Linus
>
>

Copyright reasons, mainly.

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:18             ` H. Peter Anvin
@ 2025-11-10 19:36               ` Linus Torvalds
  2025-11-10 19:54                 ` Steven Rostedt
  2025-11-11  9:35                 ` Lorenzo Stoakes
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2025-11-10 19:36 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mike Rapoport, Laurent Pinchart, Christian Brauner, Dave Hansen,
	Vlastimil Babka, linux-kernel, workflows, ksummit,
	Steven Rostedt, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 at 11:18, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Copyright reasons, mainly.

I really don't see the argument.

The copyright issues are all true for all other code too. In fact, the
copyright issues are a thing whether tools were involved or not.

Copyright is *always* a thing.

We have a fair chunk of actual generated "new" code, whether it is the
millions of lines of register descriptions from hardware companies, or
it's the millions of lines of unicode data.

(Ok, the unicode data is just a few thousand lines, I exaggerate. But
we really do have several million lines of AMD GPU headers that must have
been generated from hw descriptors, and there we didn't even ask for
the tool or the source, just for the usual copyright sign-off).

I really don't see what makes AI generated content so special.

Yes, I think you need to specify what the tool was and what the
conditions were for the change, but again - none of that is actually
new in ANY way.

This all feels like the usual AI hype-fest. Because THAT is the thing
that is truly special about AI. The hype, and the billions and
billions of dollars.

I claim that copyright is no different just because it was artificial.

What's the copyright difference between artificial intelligence and
good old-fashioned wetware that isn't documented by "I used this tool
and these sources"?

It's just another tool, guys. It's one that makes some people a lot of
money, and yes, it will change society. But it's still just a tool.

                Linus

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:36               ` Linus Torvalds
@ 2025-11-10 19:54                 ` Steven Rostedt
  2025-11-10 20:00                   ` Konstantin Ryabitsev
                                     ` (2 more replies)
  2025-11-11  9:35                 ` Lorenzo Stoakes
  1 sibling, 3 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-11-10 19:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 11:36:00 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> What's the copyright difference between artificial intelligence and
> good old-fashioned wetware that isn't documented by "I used this tool
> and these sources"?

Probably no difference. I would guess the real liability is for those that
use AI to submit patches. With the usual disclaimers of IANAL, I'm assuming
that when you place your "Signed-off-by", you are stating that you have the
right to submit this code. If it comes down that you did not have the right
to submit the code, the original submitter is liable.

I guess the question also is, is the maintainer that took that patch and
added their SoB also liable?

If it is discovered that the AI tool was using source code that it wasn't
supposed to be using, and then injected code that was pretty much verbatim
to the original source, where it would be a copyright infringement, would
the submitter of the patch be responsible? Would the maintainer?

I guess this would be no different if the submitter saw some code from a
proprietary project and cut and pasted it without understanding they were
not allowed to, and submitted that.

If the lawyers come back and say the onus is on the submitter and not the
maintainer that the code being submitted is legal to be submitted under
copyright law, then I'm perfectly fine in accepting any AI code (as long as
the submitter can prove they understand that code and the code is clean).

But until the lawyers state that explicitly, I can see why maintainers can
be nervous about accepting AI generated code. Perhaps this transparency can
make matters worse. As it can be argued that the maintainer knew it was a
questionable AI that generated the code? (Like it would be if a maintainer
knew the code being submitted was copied from a proprietary project)

This is out of scope of the current patch, as the patch is about
transparency and not AI acceptance.

-- Steve

* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:54                 ` Steven Rostedt
@ 2025-11-10 20:00                   ` Konstantin Ryabitsev
  2025-11-10 20:25                     ` Steven Rostedt
  2025-11-10 21:21                   ` James Bottomley
  2025-11-10 23:16                   ` Theodore Ts'o
  2 siblings, 1 reply; 24+ messages in thread
From: Konstantin Ryabitsev @ 2025-11-10 20:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, Nov 10, 2025 at 02:54:05PM -0500, Steven Rostedt wrote:
> Probably no difference. I would guess the real liability is for those that
> use AI to submit patches. With the usual disclaimers of IANAL, I'm assuming
> that when you place your "Signed-off-by", you are stating that you have the
> right to submit this code. If it comes down that you did not have the right
> to submit the code, the original submitter is liable.

And if the lawyers come back and say that the submitter is not liable, what's
to prevent someone from copypasting actual copyrighted code from a proprietary
source and adding a "Generated-by: Chat j'ai-pété" line to absolve themselves?

-K


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 20:00                   ` Konstantin Ryabitsev
@ 2025-11-10 20:25                     ` Steven Rostedt
  0 siblings, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-11-10 20:25 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 15:00:54 -0500
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:

> On Mon, Nov 10, 2025 at 02:54:05PM -0500, Steven Rostedt wrote:
> > Probably no difference. I would guess the real liability is for those that
> > use AI to submit patches. With the usual disclaimers of IANAL, I'm assuming
> > that when you place your "Signed-off-by", you are stating that you have the
> > right to submit this code. If it comes down that you did not have the right
> > to submit the code, the original submitter is liable.  
> 
> And if the lawyers come back and say that the submitter is not liable, what's
> to prevent someone from copypasting actual copyrighted code from a proprietary
> source and adding a "Generated-by: Chat j'ai-pété" line to absolve themselves?
> 

Wouldn't that be up to the courts? I shouldn't say "the lawyers come back
and say", it's more like "a court has ruled", and keeping to court
precedent, the lawyers would say "this is how it was ruled before". Of
course, today I'm not really sure how much "precedent" matters :-p

-- Steve


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:54                 ` Steven Rostedt
  2025-11-10 20:00                   ` Konstantin Ryabitsev
@ 2025-11-10 21:21                   ` James Bottomley
  2025-11-10 21:42                     ` Steven Rostedt
  2025-11-10 23:16                   ` Theodore Ts'o
  2 siblings, 1 reply; 24+ messages in thread
From: James Bottomley @ 2025-11-10 21:21 UTC (permalink / raw)
  To: Steven Rostedt, Linus Torvalds
  Cc: H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 2025-11-10 at 14:54 -0500, Steven Rostedt wrote:
> On Mon, 10 Nov 2025 11:36:00 -0800
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > What's the copyright difference between artificial intelligence and
> > good oldfashioned wetware that isn't documented by "I used this
> > tool and these sources".
> 
> Probably no difference. I would guess the real liability is for those
> that use AI to submit patches. With the usual disclaimers of IANAL,
> I'm assuming that when you place your "Signed-off-by", you are
> stating that you have the right to submit this code. If it comes down
> that you did not have the right to submit the code, the original
> submitter is liable.

Liable for what?  Signed-off-by is a representation by you that you
followed the DCO, nothing more:

https://developercertificate.org/

Liability arises when someone reasonably relies on that representation
for some purpose, and remember that most licences actually disclaim fitness
for a particular purpose in all situations (the no warranty clause), so
we have loads of protection from general "liability" fears.

> I guess the question also is, is the maintainer that took that patch
> and added their SoB also liable?

See above for liability.  If you mean what representations does a
Maintainer give with a signoff, that's usually section (c):

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it

> If it is discovered that the AI tool was using source code that it
> wasn't supposed to be using, and then injected code that was pretty
> much verbatim to the original source, where it would be a copyright
> infringement, would the submitter of the patch be responsible? Would
> the maintainer?
> 
> I guess this would be no different if the submitter saw some code
> from a proprietary project and cut and pasted it without
> understanding they were not allowed to, and submitted that.

Right, the situation is analogous.  However, remember today there's no
legal case law that says a model's output is a derivative work of its
training (although there are still several cases ongoing).

> If the lawyers come back and say the onus is on the submitter and not
> the maintainer that the code being submitted is legal to be submitted
> under copyright law, then I'm perfectly fine in accepting any AI code
> (as long as the submitter can prove they understand that code and the
> code is clean).

Again, what do you mean by Liable?  The representations in the DCO are
fairly clear and as long as you have a good faith basis for following
their requirements the chances are that even if things like CRA pierce
the licence no-warranty clauses you wouldn't end up on the hook for a
copyright violation committed by a downstream author.

Remember also that a big design of the signoff is that if someone does
do something wrong, their contributions can be quickly identified and
excised (which is probably why AI contributions should be tagged with
which AI they came from).

If you want more assurance, let's take the example of the 10 lines of
code SCO eventually decided had been cut and pasted from Unixware by an
SGI engineer.  Their goal was to go after the shipper of the code with
the biggest pockets (IBM); they never made a case against the individual
engineer (probably mostly because the GPL no-warranty would make it
very hard to make the case and in minor part because the recovery would
be minimal).

> But until the lawyers state that explicitly, I can see why
> maintainers can be nervous about accepting AI generated code. Perhaps
> this transparency can make matters worse. As it can be argued that
> the maintainer knew it was a questionable AI that generated the code?
> (Like it would be if a maintainer knew the code being submitted was
> copied from a proprietary project).

There is so far no court case that AI output infringes copyright (there
are cases that have decided that AI training breached copyright
control, but none holding that this makes the output a derivative work of that
training), so that currently means that everyone can accept in good
faith that AI generated code is not infringing.  Even if the copyright
lobby eventually wins a case on the derivative nature of the output,
that won't change your historical good faith basis for accepting code,
although it may mean the project needs to undertake an effort to excise
it.

As far as the copyright status of AI output in the US goes, as long as
it's not derivative of something, then it's a non-human creation and as
such cannot be copyrighted at all, so it's equivalent to public domain.

Regards,

James


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 21:21                   ` James Bottomley
@ 2025-11-10 21:42                     ` Steven Rostedt
  2025-11-10 21:52                       ` Luck, Tony
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Rostedt @ 2025-11-10 21:42 UTC (permalink / raw)
  To: James Bottomley
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Theodore Ts'o, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Mon, 10 Nov 2025 16:21:30 -0500
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> As far as the copyright status of AI output in the US goes, as long as
> it's not derivative of something, then it's a non-human creation and as
> such cannot be copyrighted at all, so it's equivalent to public domain.

I believe that's what is currently being argued in court. If AI is trained
on human content and prints out something based on it, is it a non-human
creation?  This isn't a case of a monkey taking a selfie, where the content
provider is clearly non-human. This is a machine that uses human created
content to derive new creations.

-- Steve


* RE: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 21:42                     ` Steven Rostedt
@ 2025-11-10 21:52                       ` Luck, Tony
  2025-11-10 22:07                         ` James Bottomley
  0 siblings, 1 reply; 24+ messages in thread
From: Luck, Tony @ 2025-11-10 21:52 UTC (permalink / raw)
  To: Steven Rostedt, James Bottomley
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Williams, Dan J, Theodore Ts'o,
	Sasha Levin, Jonathan Corbet, Kees Cook, Greg Kroah-Hartman,
	Miguel Ojeda, Shuah Khan

> I believe that's what is currently being argued in court. If AI is trained
> on human content and prints out something based on it, is it a non-human
> creation?  This isn't a case of a monkey taking a selfie, where the content
> provider is clearly non-human. This is a machine that uses human created
> content to derive new creations.

If the output were deemed copyrightable, who should own that copyright?

Option 1 is "The human that crafted the prompt to generate it"

Option 2 is "The corporation that spent vast resources to create that AI model"

Option 3 is "The owners of the copyrighted material used to train the AI".

If a court ever must decide which to pick, it may well pick the answer requested
by the best funded legal team (which would be option 2).

-Tony


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 21:52                       ` Luck, Tony
@ 2025-11-10 22:07                         ` James Bottomley
  0 siblings, 0 replies; 24+ messages in thread
From: James Bottomley @ 2025-11-10 22:07 UTC (permalink / raw)
  To: Luck, Tony, Steven Rostedt
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Williams, Dan J, Theodore Ts'o,
	Sasha Levin, Jonathan Corbet, Kees Cook, Greg Kroah-Hartman,
	Miguel Ojeda, Shuah Khan

On Mon, 2025-11-10 at 21:52 +0000, Luck, Tony wrote:
> > I believe that's what is currently being argued in court. If AI is
> > trained on human content and prints out something based on it, is
> > it a non-human creation?

So far (Bartz v. Anthropic and Kadrey v. Meta) the decisions have been
that the output is transformative enough that it is, in fact, an
independent creation.

> >   This isn't a case of a monkey taking a selfie, where the
> > content provider is clearly non-human. This is a machine that uses
> > human created content to derive new creations.
> 
> If the output were deemed copyrightable, who should own that
> copyright?
> 
> Option 1 is "The human that crafted the prompt to generate it"

This is possible, but so far hasn't been argued.

> 
> Option 2 is "The corporation that spent vast resources to create that
> AI model"

This would require a change of law in the US to allow a non-human
content creator to hold copyright; absent that there's nothing
copyrightable the corporation can lay claim to (not that the AI
industry might not be motivated to seek this eventually if they have
trouble monetizing AI).

> Option 3 is "The owners of the copyrighted material used to train the
> AI".

This is the derivative-of-training argument, which has so far failed.

Regards,

James



* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:54                 ` Steven Rostedt
  2025-11-10 20:00                   ` Konstantin Ryabitsev
  2025-11-10 21:21                   ` James Bottomley
@ 2025-11-10 23:16                   ` Theodore Ts'o
  2 siblings, 0 replies; 24+ messages in thread
From: Theodore Ts'o @ 2025-11-10 23:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Dan Williams, Sasha Levin, Jonathan Corbet,
	Kees Cook, Greg Kroah-Hartman, Miguel Ojeda, Shuah Khan

On Mon, Nov 10, 2025 at 02:54:05PM -0500, Steven Rostedt wrote:
> Probably no difference. I would guess the real liability is for those that
> use AI to submit patches. With the usual disclaimers of IANAL, I'm assuming
> that when you place your "Signed-off-by", you are stating that you have the
> right to submit this code. If it comes down that you did not have the right
> to submit the code, the original submitter is liable.
> 
> I guess the question also is, is the maintainer that took that patch and
> added their SoB also liable?

ObDisclaimer: Although I have taken one or two law classes at the MIT
Sloan School (e.g., "Law for the I/T Manager"), I am not a lawyer, and
more importantly, I am not *your* lawyer.  So this is not legal
advice.

Maintainers are always assuming that code that has a Signed-off-by is
code that the submitter has a right to submit.  This was true before
AI, and it remains true today, after the advent of AI.  If I receive a
patch from someone who works for Google, or Microsoft, or Amazon, how
do I know that they haven't cut and pasted code from their company's
internal proprietary code base?  I don't.  I rely on the Signed-off-by
and the good faith of the code submitter, and if someone sends me code
that they aren't authorized to submit, it is my personal belief that I
wouldn't be liable; only the submitter would.

What is true for code written by a human (who might or might not have
cut and pasted from their internal code search) should be just as
true for AI-generated code.

In fact, from a strict legal liability perspective, I'd be happier not
knowing whether or not a particular patch had some kind of LLM
involved.  For what I don't know, I can't *possibly* be held liable.

						- Ted


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-10 19:36               ` Linus Torvalds
  2025-11-10 19:54                 ` Steven Rostedt
@ 2025-11-11  9:35                 ` Lorenzo Stoakes
  2025-11-11 13:08                   ` Theodore Ts'o
  1 sibling, 1 reply; 24+ messages in thread
From: Lorenzo Stoakes @ 2025-11-11  9:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Steven Rostedt, Dan Williams,
	Theodore Ts'o, Sasha Levin, Jonathan Corbet, Kees Cook,
	Greg Kroah-Hartman, Miguel Ojeda, Shuah Khan

On Mon, Nov 10, 2025 at 11:36:00AM -0800, Linus Torvalds wrote:
>
> I really don't see what makes AI generated content so special.

The thread's become one of those jump-the-shark 'everybody + their dog
commenting' things, so risking adding more here, but...

As I said (or at least hope I did, or eventually did :) when I first raised
this on Sasha's original thread, in my MS proposal, and in review (which
Dave responded to very graciously - I think the doc is _mostly_ really
good) - I think LLMs really _are_ different in one important respect:

Submitter/maintainer asymmetry.

The issue is that people can generate sensible-looking series _EN MASSE_ that
maintainers now HAVE to deal with.

That's the _BIG_ difference here.

With coccinelle etc. you need _some_ level of understanding of tooling
etc. to do it, which acts as a barrier and maintains submitter/maintainer
symmetry SOMEWHAT (with, err, at least one notable exception ;)

Now 'any idiot' can fire off hundreds of patches that look at a glance as
if they might have some validity.

The asymmetry of this is VERY concerning.

I also hate that we have to think about it, but the second the press puts
out 'the kernel accepts AI patches now!' - and trust me THEY WILL - we are
likely to see an influx like this that maintainers will have to deal with.

And much like the 'Linus doesn't scale' issue we hit some time ago, we
might hit a 'maintainers don't scale' issue here.

SO.

I think what we have to underline is:

1. Maintainers MUST have the ability to JUST SAY NO, go away _en-masse_ to
   regain symmetry on this.

It might throw out the baby with the bath water in some cases, but it may
be a price we have to pay to avoid disaster.

Rightly people don't like BLANKET NAKS. But I think we need to be very
clear that - in this case - you might very well get them so as to avoid
unworkable asymmetry.

2. Those who submit patches MUST UNDERSTAND EVERY PART OF IT.

'that which can be proposed without understanding can be dismissed without
understanding'.

As long as we UNDERLINE these points I think we're good.

TL;DR: we won't take slop.

Otherwise, sure, plus ça change.

Cheers, Lorenzo


* Re: [PATCH] [v2] Documentation: Provide guidelines for tool-generated content
  2025-11-11  9:35                 ` Lorenzo Stoakes
@ 2025-11-11 13:08                   ` Theodore Ts'o
  0 siblings, 0 replies; 24+ messages in thread
From: Theodore Ts'o @ 2025-11-11 13:08 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Linus Torvalds, H. Peter Anvin, Mike Rapoport, Laurent Pinchart,
	Christian Brauner, Dave Hansen, Vlastimil Babka, linux-kernel,
	workflows, ksummit, Steven Rostedt, Dan Williams, Sasha Levin,
	Jonathan Corbet, Kees Cook, Greg Kroah-Hartman, Miguel Ojeda,
	Shuah Khan

On Tue, Nov 11, 2025 at 09:35:18AM +0000, Lorenzo Stoakes wrote:
> 
> Now 'any idiot' can fire off hundreds of patches that look at a glance as
> if they might have some validity.
> 
> The asymmetry of this is VERY concerning.
> 
> I also hate that we have to think about it, but the second the press put
> out 'the kernel accepts AI patches now!' - and trust me THEY WILL - we are
> likely to see an influx like this that maintainers will have to deal with.

Yeah, that's an argument for not requiring any kind of AI tagging.
One of my concerns is that there's no guarantee that people flooding
the kernel with AI slop will disclose that they used an LLM.

> 1. Maintainers MUST have the ability to JUST SAY NO, go away _en-masse_ to
>    regain symmetry on this.

Maintainers do have this already.  There are certain people who are
known to be sending low priority patches, and people just quietly
ignore those patches.

The risk of AI slop is that this will just happen a *lot* more often,
which means that patches from known high quality contributors will get
far more attention than patches from newer contributors --- because we
won't know whether it's a new contributor who is coming up to speed,
or someone who is sending AI slop.  So the more AI slop we get, the
more this dynamic will accelerate, to the point where the accusations
that we are an "old boys/girls" club will become true, and
people will accuse us of not being welcoming to new contributors.

There *will* be a solution to the asymmetry; so I wouldn't consider it
"unworkable".  It's just that we (and especially newcomers) might not
like the solution that naturally comes out of it.  As you put it,
"throw out the baby with the bath water"; the system will survive, but
it might suck to be the baby.

> 2. Those who submit patches MUST UNDERSTAND EVERY PART OF IT.
> 
> 'that which can be proposed without understanding can be dismissed without
> understanding'.

Yeah, it might be that all we can do is to say that people who use
LLMs without understanding all parts of the output may end up
blackening their reputation, with the result that *all* their patches
might get ignored.

And we can warn that if a company has many of its employees sending
lower quality contributions, maintainers might decide to address the
denial of service attack by ignoring *all* patches from a particular
company / domain.  We've done this before, with the University of
Minnesota, due to gross abuse leading to lack of trust of the
institution.

Hopefully things won't come to that, but maybe explicitly warning
people that it *is* a possibility might be useful as a deterrent factor.

And I think it's important to say that low quality contributions
from AI are no different from any other kind of low quality
contributions.  And just as judges in courts of law have sanctioned
lawyers for submitting legal briefs that contained completely
hallucinated court cases, there will be costs to sending cr*p no
matter what the source.

> I think as long as we UNDERLINE these points I think we're good.
> 
> TL;DR: we won't take slop.

Agreed, completely.

						- Ted


Thread overview: 24+ messages
     [not found] <20251105231514.3167738-1-dave.hansen@linux.intel.com>
2025-11-10  7:43 ` [PATCH] [v2] Documentation: Provide guidelines for tool-generated content Vlastimil Babka
2025-11-10  8:58   ` Christian Brauner
2025-11-10 16:08     ` Dave Hansen
2025-11-10 17:25     ` Laurent Pinchart
2025-11-10 17:41       ` Dave Hansen
2025-11-10 17:44       ` Linus Torvalds
2025-11-10 17:56         ` Luck, Tony
2025-11-10 18:39         ` Mike Rapoport
2025-11-10 19:05           ` Linus Torvalds
2025-11-10 19:18             ` H. Peter Anvin
2025-11-10 19:36               ` Linus Torvalds
2025-11-10 19:54                 ` Steven Rostedt
2025-11-10 20:00                   ` Konstantin Ryabitsev
2025-11-10 20:25                     ` Steven Rostedt
2025-11-10 21:21                   ` James Bottomley
2025-11-10 21:42                     ` Steven Rostedt
2025-11-10 21:52                       ` Luck, Tony
2025-11-10 22:07                         ` James Bottomley
2025-11-10 23:16                   ` Theodore Ts'o
2025-11-11  9:35                 ` Lorenzo Stoakes
2025-11-11 13:08                   ` Theodore Ts'o
2025-11-10 17:46       ` Steven Rostedt
     [not found] ` <11eaf7fa-27d0-4a57-abf0-5f24c918966c@lucifer.local>
2025-11-10 11:15   ` Lorenzo Stoakes
     [not found]   ` <103ee61c-f958-440c-af73-1cf3600d10fd@intel.com>
2025-11-10 16:51     ` Lorenzo Stoakes
