workflows.vger.kernel.org archive mirror
From: Don Zickus <dzickus@redhat.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Daniel Axtens <dja@axtens.net>,
	workflows@vger.kernel.org, automated-testing@yoctoproject.org,
	Han-Wen Nienhuys <hanwen@google.com>,
	Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Subject: Re: [Automated-testing] Structured feeds
Date: Fri, 8 Nov 2019 10:26:09 -0500
Message-ID: <20191108152609.runqdsajpe7llsvz@redhat.com>
In-Reply-To: <CACT4Y+YTe7kpEvchsHU99fH1y-STYR_tteuHTYKGUOjf6oH9kQ@mail.gmail.com>

On Fri, Nov 08, 2019 at 08:58:44AM +0100, Dmitry Vyukov wrote:
> On Thu, Nov 7, 2019 at 9:44 PM Don Zickus <dzickus@redhat.com> wrote:
> >
> > On Thu, Nov 07, 2019 at 02:35:08AM +1100, Daniel Axtens wrote:
> > > > As soon as we have a bridge from plain-text emails into the structured
> > > > form, we can start building everything else in the structured world.
> > > > Such a bridge needs to parse new incoming emails, try to make sense out
> > > > of them (new patch, new patch version, comment, etc) and then push the
> > > > information in structured form. Then e.g. CIs can fetch info about
> > >
> > > This is a non-trivial problem, fwiw. Patchwork's email parser clocks in
> > > at almost thirteen hundred lines, and that's with the benefit of the
> > > Python standard library. It also regularly gets patched to handle
> > > changes to email systems (e.g. DMARC), changes to git (git request-pull
> > > format changed subtly in 2.14.3), the bizarre ways people send email,
> > > and so on.
> >
> > Does it ever make sense to just use git to do the translation to structured
> > json?  Git has similar logic and can easily handle its own changes.  Tools
> > like git-mailinfo and git-mailsplit probably do a good chunk of the
> > work today.
> >
> > It wouldn't pull together series info.
> 
> Hi Don,
> 
> Could you elaborate? What exactly do you mean? I don't understand the
> overall proposal.

The problem I was looking at is that patchwork has a large, elaborate body
of Python code to translate human-formatted git patches into a structured
form.  And rightfully so.

But git has similar code in order to make git-am work.

When an email is added to public-inbox, I had assumed it used a tool like
git-am, which calls git-mailsplit and git-mailinfo to split the email into
its various pieces and put them in .git/rebase-apply.

At that point most of the text parsing is done.

So the thought was to have another public-inbox tool that takes advantage of
the already-split data and takes the small remaining step of converting it
into a structured file 'j', as opposed to sending the text email through an
external tool like patchwork to re-split the data into structured pieces
again.
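
As a rough sketch of that small step (not how public-inbox works today),
the Python below runs git-mailinfo on one raw email and wraps the result in
JSON.  The function names parse_info and email_to_json and the JSON key
names are made up for illustration.  The only parts that are git's own are
the `git mailinfo <msg> <patch>` invocation and the Author/Email/Subject/Date
lines it prints on stdout.  It assumes git is installed and on PATH.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def parse_info(info_text):
    """Turn git-mailinfo's "Key: value" stdout lines into a dict.

    Keys are lowercased; lines without a "Key: value" shape are skipped.
    """
    fields = {}
    for line in info_text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.lower()] = value
    return fields


def email_to_json(raw_email):
    """Run `git mailinfo` on one raw email (bytes); return a JSON string.

    git-mailinfo reads the email on stdin, writes the commit message and
    the patch to the two files named on the command line, and prints the
    author, email, subject and date on stdout.
    """
    with tempfile.TemporaryDirectory() as tmp:
        msg_path = Path(tmp) / "msg"
        patch_path = Path(tmp) / "patch"
        result = subprocess.run(
            ["git", "mailinfo", str(msg_path), str(patch_path)],
            input=raw_email,
            stdout=subprocess.PIPE,
            check=True,
        )
        record = parse_info(result.stdout.decode("utf-8", errors="replace"))
        # Hypothetical key names for the two files git-mailinfo wrote.
        record["message"] = msg_path.read_text(encoding="utf-8", errors="replace")
        record["patch"] = patch_path.read_text(encoding="utf-8", errors="replace")
    return json.dumps(record, indent=2)
```

Calling email_to_json(open("0001", "rb").read()) on one file produced by
git-mailsplit would give a single JSON object per email.  As noted earlier
in the thread, this does not pull together series info.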

Adding to that thought: every time git changes its format or text output,
leveraging git's own knowledge of the change (assuming public-inbox
consistently uses the latest git tool), instead of updating external tools,
would reduce the ripple effect of having to update all external tools before
developers can use new git features or changes.

But looking through the public-inbox code, it appears to do things
differently, so the idea may not work at all.

So just treat my idea as looking at the problem from a different angle to
see if there is an easier solution.

Cheers,
Don

