workflows.vger.kernel.org archive mirror
From: Luis Chamberlain <mcgrof@kernel.org>
To: Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	Sasha Levin <sashal@kernel.org>
Cc: users@kernel.org, tools@kernel.org, workflows@vger.kernel.org
Subject: Re: Toy/demo: using ChatGPT to summarize lengthy LKML threads (b4 integration)
Date: Wed, 28 Feb 2024 11:32:35 -0800	[thread overview]
Message-ID: <Zd-KU5lrdeAk4LiT@bombadil.infradead.org> (raw)
In-Reply-To: <20240227-flawless-capybara-of-drama-e09653@lemur>

On Tue, Feb 27, 2024 at 05:32:34PM -0500, Konstantin Ryabitsev wrote:
> Hi, all:
> 
> I was playing with shell-gpt and wrote a quickie integration that would allow
> retrieving (slimmed-down) threads from lore, feeding them to ChatGPT, and
> asking it to provide some basic analysis of the thread contents. Here's a
> recorded demo session:
> 
> https://asciinema.org/a/643435
> 
> A few notes:
> 
> 1. This is obviously not a replacement for actually reading email, but can
>    potentially be a useful asset for a busy maintainer who just wants a quick
>    summary of a lengthy thread before they look at it in detail.
> 2. This is not free or cheap! To digest a lengthy thread, you can expect
>    ChatGPT to generate enough tokens to cost you $1 or more in API usage fees.
>    I know it's nothing compared to how expensive some of y'all's time is, and
>    you can probably easily get that expensed by your employers, but for many
>    others it's a pretty expensive toy. I managed to make it a bit cheaper by
>    doing some surgery on the threads before feeding them to chatgpt (like
>    removing most of the message headers and throwing out some of the quoted
>    content), but there's a limit to how much we can throw out before the
>    analysis becomes dramatically less useful.
> 3. This only works with ChatGPT-4, as most threads are too long for
>    ChatGPT-3.5 to even process.
> 
> So, the question is -- is this useful at all? Am I wasting time poking in this
> direction, or is this something that would be of benefit to any of you? If the
> latter, I will document how to set this up and commit the thread minimization
> code I hacked together to make it cheaper.
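[Editor's note: a minimal sketch of the kind of thread minimization described above, under assumed rules (keep only From/Date/Subject headers, drop quoting deeper than one level); the actual code Konstantin hacked together is not shown in this thread:]

```python
# Hypothetical sketch: slim down an mbox thread before feeding it to an
# LLM, dropping most headers and deeply quoted content to save tokens.
import mailbox

# Assumed minimal header set; the real script may keep different headers.
KEEP_HEADERS = ("From", "Date", "Subject")

def minimize(msg):
    """Return a slimmed-down plain-text rendering of one message."""
    lines = [f"{h}: {msg[h]}" for h in KEEP_HEADERS if msg[h]]
    body = msg.get_payload(decode=True) or b""
    for line in body.decode(errors="replace").splitlines():
        # Drop quoting beyond one level ("> > ..."), keeping one level
        # so replies still carry enough context to be understandable.
        if line.lstrip().startswith("> >"):
            continue
        lines.append(line)
    return "\n".join(lines)

def minimize_thread(path):
    """Concatenate the minimized messages of an mbox file."""
    return "\n\n".join(minimize(m) for m in mailbox.mbox(path))
```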

While I probably wouldn't use it day to day, I expect younger
generations might use this more than us older generations to be more
productive, even if they get the occasional hallucination.

An LLM trained with more data relevant to patches might be more
suitable, which is why I wanted the tooling for stable candidate patches
to be opened up, to enable more exploration in areas like this.

One example use case might be training a model to identify subsystems
with more memory safety issues.

Another might be to help summarize pull requests in one or two
sentences, or optionally a few bullets. For instance, I try to document
the major changes for modules as a bullet list here:

https://kernelnewbies.org/KernelProjects/modules

so that it is easier to track things / go down memory lane. A tool that
did this automatically would save me that manual work.

  Luis


Thread overview: 23+ messages
2024-02-27 22:32 Konstantin Ryabitsev
2024-02-27 23:35 ` Junio C Hamano
2024-02-28  0:43 ` Linus Torvalds
2024-02-28 20:46   ` Shuah Khan
2024-02-29  0:33   ` James Bottomley
2024-02-28  5:00 ` Willy Tarreau
2024-02-28 14:03   ` Mark Brown
2024-02-28 14:39     ` Willy Tarreau
2024-02-28 15:22     ` Konstantin Ryabitsev
2024-02-28 15:29       ` Willy Tarreau
2024-02-28 17:52         ` Konstantin Ryabitsev
2024-02-28 17:58           ` Willy Tarreau
2024-02-28 19:16             ` Konstantin Ryabitsev
2024-02-28 15:04   ` Hannes Reinecke
2024-02-28 15:15     ` Willy Tarreau
2024-02-28 17:43     ` Jonathan Corbet
2024-02-28 18:52       ` Alex Elder
2024-02-28 18:55 ` Bart Van Assche
2024-02-29  7:18   ` Hannes Reinecke
2024-02-29  8:37     ` Theodore Ts'o
2024-03-01  1:13     ` Bart Van Assche
2024-02-29  9:30   ` James Bottomley
2024-02-28 19:32 ` Luis Chamberlain [this message]
