Date: Mon, 20 Jun 2022 11:30:43 -0400
From: Kent Overstreet <kent.overstreet@gmail.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"pmladek@suse.com" <pmladek@suse.com>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"enozhatsky@chromium.org" <enozhatsky@chromium.org>,
	"linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>,
	"willy@infradead.org" <willy@infradead.org>
Subject: Re: [PATCH v4 01/34] lib/printbuf: New data structure for printing
 strings
Message-ID: <20220620153043.vgtfrltebiyprufz@moria.home.lan>
References: <20220620004233.3805-1-kent.overstreet@gmail.com>
 <20220620004233.3805-2-kent.overstreet@gmail.com>
 <f0808aaee9ac4b088121c0fbe7e18f0d@AcuMS.aculab.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <f0808aaee9ac4b088121c0fbe7e18f0d@AcuMS.aculab.com>

On Mon, Jun 20, 2022 at 04:44:10AM +0000, David Laight wrote:
> From: Kent Overstreet
> > Sent: 20 June 2022 01:42
> > 
> > This adds printbufs: a printbuf points to a char * buffer and knows the
> > size of the output buffer as well as the current output position.
> > 
> > Future patches will be adding more features to printbuf, but initially
> > printbufs are targeted at refactoring and improving our existing code in
> > lib/vsprintf.c - so this initial printbuf patch has the features
> > required for that.
> > 
> > Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
> > Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  include/linux/printbuf.h | 122 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 122 insertions(+)
> >  create mode 100644 include/linux/printbuf.h
> > 
> > diff --git a/include/linux/printbuf.h b/include/linux/printbuf.h
> > new file mode 100644
> > index 0000000000..8186c447ca
> > --- /dev/null
> > +++ b/include/linux/printbuf.h
> > @@ -0,0 +1,122 @@
> > +/* SPDX-License-Identifier: LGPL-2.1+ */
> > +/* Copyright (C) 2022 Kent Overstreet */
> > +
> > +#ifndef _LINUX_PRINTBUF_H
> > +#define _LINUX_PRINTBUF_H
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/string.h>
> > +
> > +/*
> > + * Printbufs: String buffer for outputting (printing) to, for vsnprintf
> > + */
> > +
> > +struct printbuf {
> > +	char			*buf;
> > +	unsigned		size;
> > +	unsigned		pos;
> 
> No naked unsigneds.

This is the way I've _always_ written kernel code - single word type names.

> 
> > +};
> > +
> > +/*
> > + * Returns size remaining of output buffer:
> > + */
> > +static inline unsigned printbuf_remaining_size(struct printbuf *out)
> > +{
> > +	return out->pos < out->size ? out->size - out->pos : 0;
> > +}
> > +
> > +/*
> > + * Returns number of characters we can print to the output buffer - i.e.
> > + * excluding the terminating nul:
> > + */
> > +static inline unsigned printbuf_remaining(struct printbuf *out)
> > +{
> > +	return out->pos < out->size ? out->size - out->pos - 1 : 0;
> > +}
> 
> Those two are so similar that mistakes will be made.

If you've got ideas for better names I'd be happy to hear them - we discussed
this and this was what we came up with.
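
To make the distinction between the two helpers concrete, here is a minimal
user-space sketch. It copies the struct and the two functions from the patch
quoted above; remaining_demo() is a hypothetical helper added only for
illustration, and the 8-byte buffer is an arbitrary choice.

```c
#include <assert.h>

/* User-space copy of the struct from the patch (no kernel headers). */
struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

/* Bytes left in the buffer, counting the slot reserved for the nul. */
static inline unsigned printbuf_remaining_size(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos : 0;
}

/* Characters that can still be printed, excluding the terminating nul. */
static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

/* Hypothetical demo, not part of the patch. */
static void remaining_demo(void)
{
	char storage[8];
	struct printbuf out = { .buf = storage, .size = sizeof(storage), .pos = 3 };

	/* Five bytes are free, but one of them is reserved for the nul. */
	assert(printbuf_remaining_size(&out) == 5);
	assert(printbuf_remaining(&out) == 4);

	/* Only the nul slot is left: no more characters fit. */
	out.pos = 7;
	assert(printbuf_remaining_size(&out) == 1);
	assert(printbuf_remaining(&out) == 0);

	/* Once pos has run past size, both helpers report zero. */
	out.pos = 20;
	assert(printbuf_remaining_size(&out) == 0);
	assert(printbuf_remaining(&out) == 0);
}
```

The two results differ by exactly one whenever there is room left, which is
the source of the naming concern raised in the review.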

> You can also just return negatives when the buffer has overflowed
> and get the callers to test < or <= as required.

Yeesh, no.

> I also wonder whether it is necessary to count the total length
> when the buffer isn't long enough?
> Unless there is a real pressing need for it I'd not bother.
> Setting pos == size (after writing the '\0') allows
> overflow to be detected without most of the dangers.

Because that's what snprintf() needs.
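
For readers less familiar with the contract being referred to: standard C
snprintf() returns the length the output would have had with unlimited space,
even when it truncates. The sketch below shows the usual two-pass sizing
pattern built on that return value. It is plain user-space C, and format_id()
and snprintf_demo() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format "id=<n>" into a heap buffer of the exact size. */
static char *format_id(int id)
{
	/* First pass: a NULL buffer with size 0 only measures the output. */
	int len = snprintf(NULL, 0, "id=%d", id);
	char *buf;

	if (len < 0)
		return NULL;

	buf = malloc((size_t) len + 1);
	if (buf)
		snprintf(buf, (size_t) len + 1, "id=%d", id);
	return buf;
}

/* Hypothetical demo of truncation: the return value still counts everything. */
static void snprintf_demo(void)
{
	char small[4];
	int n = snprintf(small, sizeof(small), "id=%d", 12345);

	assert(n == 8);				/* strlen("id=12345") */
	assert(strcmp(small, "id=") == 0);	/* truncated, nul terminated */
}
```

A printbuf that stopped advancing pos at size could not report that full
length back to the caller.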

> > +
> > +static inline unsigned printbuf_written(struct printbuf *out)
> > +{
> > +	return min(out->pos, out->size);
> 
> That excludes the '\0' for short buffers but includes
> it for overlong ones.

It actually doesn't.

> > +}
> > +
> > +/*
> > + * Returns true if output was truncated:
> > + */
> > +static inline bool printbuf_overflowed(struct printbuf *out)
> > +{
> > +	return out->pos >= out->size;
> > +}
> > +
> > +static inline void printbuf_nul_terminate(struct printbuf *out)
> > +{
> > +	if (out->pos < out->size)
> > +		out->buf[out->pos] = 0;
> > +	else if (out->size)
> > +		out->buf[out->size - 1] = 0;
> > +}
> > +
> > +static inline void __prt_char(struct printbuf *out, char c)
> > +{
> > +	if (printbuf_remaining(out))
> > +		out->buf[out->pos] = c;
> 
> At this point it is (should be) always safe to add the '\0'.
> Doing so would save the extra conditionals later on.

True, but at the cost of making the code less straightforward. I may have a look
at it.
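
For reference, a rough sketch of what the suggested variant might look like.
prt_char_eager() is a hypothetical name, not something in the patch, and the
sketch assumes the buffer is already nul terminated before the first call
(for example by an initializer that writes buf[0] = 0), since nothing is
written once the buffer is full.

```c
#include <assert.h>
#include <string.h>

/* User-space copy of the struct from the patch. */
struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

/* Hypothetical variant: write the character and its terminator together. */
static inline void prt_char_eager(struct printbuf *out, char c)
{
	if (printbuf_remaining(out)) {
		out->buf[out->pos] = c;
		/* remaining >= 1 implies pos + 1 <= size - 1, so this is in bounds. */
		out->buf[out->pos + 1] = 0;
	}
	/* pos keeps counting past the end, as in the patch. */
	out->pos++;
}

/* Hypothetical demo: five characters into a four-byte buffer. */
static void eager_demo(void)
{
	char storage[4] = "";	/* assumption: terminated before first use */
	struct printbuf out = { .buf = storage, .size = sizeof(storage), .pos = 0 };
	const char *in = "abcde";

	while (*in)
		prt_char_eager(&out, *in++);

	assert(strcmp(storage, "abc") == 0);
	assert(out.pos == 5);
}
```

This drops the separate printbuf_nul_terminate() call, at the price of relying
on the invariant that the buffer is terminated at all times.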

> 
> > +	out->pos++;
> > +}
> > +
> > +static inline void prt_char(struct printbuf *out, char c)
> > +{
> > +	__prt_char(out, c);
> > +	printbuf_nul_terminate(out);
> > +}
> > +
> > +static inline void __prt_chars(struct printbuf *out, char c, unsigned n)
> > +{
> > +	unsigned i, can_print = min(n, printbuf_remaining(out));
> > +
> > +	for (i = 0; i < can_print; i++)
> > +		out->buf[out->pos++] = c;
> > +	out->pos += n - can_print;
> > +}
> > +
> > +static inline void prt_chars(struct printbuf *out, char c, unsigned n)
> > +{
> > +	__prt_chars(out, c, n);
> > +	printbuf_nul_terminate(out);
> > +}
> > +
> > +static inline void prt_bytes(struct printbuf *out, const void *b, unsigned n)
> > +{
> > +	unsigned i, can_print = min(n, printbuf_remaining(out));
> > +
> > +	for (i = 0; i < can_print; i++)
> > +		out->buf[out->pos++] = ((char *) b)[i];
> > +	out->pos += n - can_print;
> > +
> > +	printbuf_nul_terminate(out);
> 
> jeepers - that can be written so much better.
> Something like:
> 	unsigned int i, pos = out->pos;
> 	int space = out->size - pos - 1;
> 	char *tgt = out->buf + pos;
> 	const char *src = b;
> 	out->pos = pos + n;
> 
> 	if (space <= 0)
> 		return;
> 	if (n > space)
> 		n = space;
> 
> 	for (i = 0; i < n; i++)
> 		tgt[i] = src[i];
> 	tgt[n] = 0;
> 

I find your version considerably harder to read, and I've stared at enough
assembly that I trust the compiler to generate pretty equivalent code.

> > +}
> > +
> > +static inline void prt_str(struct printbuf *out, const char *str)
> > +{
> > +	prt_bytes(out, str, strlen(str));
> 
> Do you really need to call strlen() and then process
> the buffer byte by byte?

Versus introducing a branch to check for nul into the inner loop of prt_bytes()?
You're not serious, are you?
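
To spell out the trade-off, here is a user-space sketch of both shapes. The
first three functions follow the patch, with min() written out by hand so the
code builds without kernel headers. prt_str_onepass() and prt_str_demo() are
hypothetical and exist only to show where the extra per-character test would
go in a single-pass version.

```c
#include <assert.h>
#include <string.h>

struct printbuf {
	char		*buf;
	unsigned	size;
	unsigned	pos;
};

static inline unsigned printbuf_remaining(struct printbuf *out)
{
	return out->pos < out->size ? out->size - out->pos - 1 : 0;
}

static inline void printbuf_nul_terminate(struct printbuf *out)
{
	if (out->pos < out->size)
		out->buf[out->pos] = 0;
	else if (out->size)
		out->buf[out->size - 1] = 0;
}

/* As in the patch: the copy loop has a single, length-based exit test. */
static inline void prt_bytes(struct printbuf *out, const void *b, unsigned n)
{
	unsigned i, room = printbuf_remaining(out);
	unsigned can_print = n < room ? n : room;

	for (i = 0; i < can_print; i++)
		out->buf[out->pos++] = ((const char *) b)[i];
	out->pos += n - can_print;

	printbuf_nul_terminate(out);
}

static inline void prt_str(struct printbuf *out, const char *str)
{
	prt_bytes(out, str, strlen(str));
}

/* Hypothetical single-pass variant: tests for nul and for space per character. */
static inline void prt_str_onepass(struct printbuf *out, const char *str)
{
	unsigned can_print = printbuf_remaining(out);

	while (*str) {
		if (can_print) {
			out->buf[out->pos] = *str;
			can_print--;
		}
		out->pos++;
		str++;
	}
	printbuf_nul_terminate(out);
}

/* Hypothetical demo: both variants leave the printbuf in the same state. */
static void prt_str_demo(void)
{
	char a[6], b[6];
	struct printbuf pa = { .buf = a, .size = sizeof(a), .pos = 0 };
	struct printbuf pb = { .buf = b, .size = sizeof(b), .pos = 0 };

	prt_str(&pa, "hello world");
	prt_str_onepass(&pb, "hello world");

	assert(strcmp(a, "hello") == 0);
	assert(strcmp(a, b) == 0);
	assert(pa.pos == 11 && pb.pos == 11);
}
```

Both produce the same output and the same pos. The difference is only in how
many tests sit inside the copy loop, which is the point being argued above.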