From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 596C22582 for ; Mon, 15 Sep 2025 18:01:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757959307; cv=none; b=GtbmwNlby/8HF2vMhiPJdEStFe7O6LYkWAB5Q5rlwsTvPVdRidh3uK3uYfw+Yf9uWdyjr4MLpuh+bEWROBW1nB6JYcUjor5tBI/lQ+vgUKuZytf1gFamKjEH7VvQmkLn9hrK4nDm1JRd8aw1fHXyavp6SuWYOoxWBZ54vR/19qg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757959307; c=relaxed/simple; bh=Yjnje5RjidlGotMHX+anH2bGpF15m5HnR5vghcQP+Ew=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=NxTlGNnpJIvh/4MG1DuWS0Hoa6c259BdzwpQU0uUNfI2S4bZVadZWyLO+UpL0MpRJPlUNx+Iucb8ODPtqRpkhA1x8bj7/8SZJwjLk2/y3oTSGbIE5askBazATD41TGJQx5aWavOmxq3r3Tbq1zMnVHkO7QojjmipMGyVo/367Q4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Oy6hFOzM; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Oy6hFOzM" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3D35C4CEF1; Mon, 15 Sep 2025 18:01:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1757959306; bh=Yjnje5RjidlGotMHX+anH2bGpF15m5HnR5vghcQP+Ew=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Oy6hFOzMU43SDemRf8hXElUXqvvU0Ca+KTucxtksfgK4afPNFEd+bniuaESsxE5Zx pZ1GQqjIZPtRWk01HqYQtxjbtnMtHhQEpDD+KEc015USPtgbATxeUok7LSf/ci24mS sWPRRbBGWHfI7il8NUThH90kO49AElWVQIFTTUz0kZxVwTLu7sF9dZR3R9pw2Pzxgw v8fkzdHieb7XloacqI6cZu6VE0UIR/6fJsBcjCrz3X64jORePISW4FDUnwFJhcmNyd xWbJAU6TwgmJxPMr5zZAB4cdl6Vl+muxJT2BpuWYyOklL34BzFSg3sUC0Gh5JDmQsM QfDLnmr6cZjmw== Date: Mon, 15 Sep 2025 11:01:46 -0700 From: Kees Cook To: Jiri Kosina Cc: ksummit@lists.linux.dev Subject: Re: [MAINTAINERS SUMMIT] Annotating patches containing AI-assisted code Message-ID: <202509151019.CD7AA0C0BE@keescook> References: <1npn33nq-713r-r502-p5op-q627pn5555oo@fhfr.pbz> Precedence: bulk X-Mailing-List: ksummit@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1npn33nq-713r-r502-p5op-q627pn5555oo@fhfr.pbz> On Tue, Aug 05, 2025 at 05:38:36PM +0200, Jiri Kosina wrote: > I believe we need to at least settle on (and document) the way how to > express in patch (meta)data: > > - this patch has been assisted by LLM $X > - the human understanding the generated code is $Y A lot was covered in this thread already, but I noticed that much of the discussion ended up looking at LLM assistance from the perspective of "one shot" execution, either as a comparison against other "mechanical" transformation tools (like Coccinelle) or from the perspective of bulk code creation ("the LLM wrote it all"). What I didn't see discussed, and where I think there is substantially greater utility to be had from LLMs, is contributions created during long running sessions. Such sessions might include more than "just" writing code, like doing builds, running tests, or helping with debugging (like driving gdb analysis of kernel crashes). I think this scenario makes things much more complex to "declare" in a commit log. (Some examples from me: "security fix I drove the LLM to make"[1], "I worked interactively with an LLM to construct API testing coverage"[2] and "I used the LLM to find missed conversion instances and do cross-arch build and test validation"[3].) The awkward analogy I have is that of carving a fish out of drift wood: I picked the wood, and then used a chainsaw, chisel, and knife to remove everything that wasn't the fish I wanted. Normally only the final result is shown. For more complex creations, I might describe why I made various choices. If I'm asked to describe "how I used the chisel" it quickly becomes murky: it was one of several tools used, and its use depended on other tools and other choices and the state of the sculpture at any given time. So, what I mean to say is it's certainly useful to declare "I used a chisel", but that for long running sessions it becomes kind of pointless to include much more than a general gist of what the process was. This immediately gets at the "trust" part of this thread making the mentioned "human understanding the generated code" a central issue. How should that be expressed? Our existing commit logs don't do a lot of "show your work" right now, but rather focus on the why/what of a change, and less "how did I write this". It's not strictly absent (some commit logs discuss what alternatives were tried and eliminated, for example), but we've tended to look only at final results and instead use trust in contributors as a stand-in for "prove to me you understand what you've changed". It seems like a "show your work" approach for commit logs would be valuable regardless of tools involved. I've been struggling to find a short way to describe this, though. Initially I thought we wanted to ask "Why is this contribution correct?" but we actually already expect that to be answered in the commit log. We want something more specific, like "How did you construct this solution?" But that is unlikely to be distilled into a trailer tag. -Kees [1] https://lore.kernel.org/lkml/20250724080756.work.741-kees@kernel.org/ [2] https://lore.kernel.org/lkml/20250717085156.work.363-kees@kernel.org/ [3] https://lore.kernel.org/lkml/20250804163910.work.929-kees@kernel.org/ -- Kees Cook