From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C034BC4361B for ; Wed, 16 Dec 2020 04:44:56 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5824823139 for ; Wed, 16 Dec 2020 04:44:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5824823139 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DF4E28D0012; Tue, 15 Dec 2020 23:44:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DA5BE6B0074; Tue, 15 Dec 2020 23:44:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C6EC48D0012; Tue, 15 Dec 2020 23:44:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0071.hostedemail.com [216.40.44.71]) by kanga.kvack.org (Postfix) with ESMTP id B19DE6B0073 for ; Tue, 15 Dec 2020 23:44:55 -0500 (EST) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 7DE45180AD80F for ; Wed, 16 Dec 2020 04:44:55 +0000 (UTC) X-FDA: 77597905350.04.dress81_3a0a75d27429 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin04.hostedemail.com (Postfix) with ESMTP id 6642F800005C for ; Wed, 16 Dec 2020 04:44:55 +0000 (UTC) X-HE-Tag: dress81_3a0a75d27429 X-Filterd-Recvd-Size: 8674 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf22.hostedemail.com (Postfix) with ESMTP for ; Wed, 16 Dec 2020 04:44:54 +0000 (UTC) Date: Tue, 15 Dec 2020 20:44:53 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1608093894; bh=fVCRBSg+2X2PmD/hPE/N5n9gWnfFI5sl+EAclvAcRRo=; h=From:To:Subject:In-Reply-To:From; b=0RxLI1qndPvrxUc2xi1t8+AObMDJVo25JGBsx/4tA6y4m08YhxVr3QxQeUVkkAt5q jJbTBFRHUENg1igGC1gU5V+CY7y/fn4PR6OsXNNMsemKqN1MyH83VQCFeih4WUQ7rl OETjHop+/8nIZfEc99hunUZap/PTI3dO3rTvmqKo= From: Andrew Morton To: akpm@linux-foundation.org, dwaipayanray1@gmail.com, joe@perches.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 47/95] checkpatch: improve email parsing Message-ID: <20201216044453.FsnefDBIW%akpm@linux-foundation.org> In-Reply-To: <20201215204156.f05ec694b907845bcfab5c44@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dwaipayan Ray Subject: checkpatch: improve email parsing checkpatch doesn't report warnings for many common mistakes in emails. Some of which are trailing commas and incorrect use of email comments. At the same time several false positives are reported due to incorrect handling of mail comments. The most common of which is due to the pattern: # X.X Improve email parsing in checkpatch. Some general email rules are defined: - Multiple name comments should not be allowed. - Comments inside address should not be allowed. - In general comments should be enclosed within parentheses. Relaxation is given to comments beginning with #. - Stable addresses should not begin with a name. - Comments in stable addresses should begin only with a #. Improvements to parsing: - Detect and report unexpected content after email. - Quoted names are excluded from comment parsing. - Trailing dots, commas or quotes in email are removed during formatting. Correspondingly a BAD_SIGN_OFF warning is emitted. - Improperly quoted email like '"name
"' are now warned about. In addition, added fixes for all the possible rules. Link: https://lore.kernel.org/linux-kernel-mentees/6c275d95c3033422addfc256a30e6ae3dd37941d.camel@perches.com/ Link: https://lore.kernel.org/linux-kernel-mentees/20201105200857.GC1333458@kroah.com/ Link: https://lkml.kernel.org/r/20201108100632.75340-1-dwaipayanray1@gmail.com Signed-off-by: Dwaipayan Ray Suggested-by: Joe Perches Acked-by: Joe Perches Signed-off-by: Andrew Morton --- scripts/checkpatch.pl | 108 +++++++++++++++++++++++++++++++++------- 1 file changed, 91 insertions(+), 17 deletions(-) --- a/scripts/checkpatch.pl~checkpatch-improve-email-parsing +++ a/scripts/checkpatch.pl @@ -1159,6 +1159,7 @@ sub parse_email { my ($formatted_email) = @_; my $name = ""; + my $quoted = ""; my $name_comment = ""; my $address = ""; my $comment = ""; @@ -1190,14 +1191,20 @@ sub parse_email { } } - $comment = trim($comment); - $name = trim($name); - $name =~ s/^\"|\"$//g; - if ($name =~ s/(\s*\([^\)]+\))\s*//) { - $name_comment = trim($1); + # Extract comments from names excluding quoted parts + # "John D. (Doe)" - Do not extract + if ($name =~ s/\"(.+)\"//) { + $quoted = $1; + } + while ($name =~ s/\s*($balanced_parens)\s*/ /) { + $name_comment .= trim($1); } + $name =~ s/^[ \"]+|[ \"]+$//g; + $name = trim("$quoted $name"); + $address = trim($address); $address =~ s/^\<|\>$//g; + $comment = trim($comment); if ($name =~ /[^\w \-]/i) { ##has "must quote" chars $name =~ s/(? 1) { WARN("BAD_SIGN_OFF", - "email address '$email' might be better as '$suggested_email'\n" . $herecurr); + "Use a single name comment in email: '$email'\n" . $herecurr); + } + + + # stable@vger.kernel.org or stable@kernel.org shouldn't + # have an email name. In addition commments should strictly + # begin with a # + if ($email =~ /^.*stable\@(?:vger\.)?kernel\.org/i) { + if (($comment ne "" && $comment !~ /^#.+/) || + ($email_name ne "")) { + my $cur_name = $email_name; + my $new_comment = $comment; + $cur_name =~ s/[a-zA-Z\s\-\"]+//g; + + # Remove brackets enclosing comment text + # and # from start of comments to get comment text + $new_comment =~ s/^\((.*)\)$/$1/; + $new_comment =~ s/^\[(.*)\]$/$1/; + $new_comment =~ s/^[\s\#]+|\s+$//g; + + $new_comment = trim("$new_comment $cur_name") if ($cur_name ne $new_comment); + $new_comment = " # $new_comment" if ($new_comment ne ""); + my $new_email = "$email_address$new_comment"; + + if (WARN("BAD_STABLE_ADDRESS_STYLE", + "Invalid email format for stable: '$email', prefer '$new_email'\n" . $herecurr) && + $fix) { + $fixed[$fixlinenr] =~ s/\Q$email\E/$new_email/; + } + } + } elsif ($comment ne "" && $comment !~ /^(?:#.+|\(.+\))$/) { + my $new_comment = $comment; + + # Extract comment text from within brackets or + # c89 style /*...*/ comments + $new_comment =~ s/^\[(.*)\]$/$1/; + $new_comment =~ s/^\/\*(.*)\*\/$/$1/; + + $new_comment = trim($new_comment); + $new_comment =~ s/^[^\w]$//; # Single lettered comment with non word character is usually a typo + $new_comment = "($new_comment)" if ($new_comment ne ""); + my $new_email = format_email($email_name, $name_comment, $email_address, $new_comment); + + if (WARN("BAD_SIGN_OFF", + "Unexpected content after email: '$email', should be: '$new_email'\n" . $herecurr) && + $fix) { + $fixed[$fixlinenr] =~ s/\Q$email\E/$new_email/; + } } } _