From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4AC8AC433EF for ; Fri, 22 Apr 2022 20:47:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 77C336B0074; Fri, 22 Apr 2022 16:47:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 72BAB6B0075; Fri, 22 Apr 2022 16:47:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 61BC56B0078; Fri, 22 Apr 2022 16:47:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.28]) by kanga.kvack.org (Postfix) with ESMTP id 53F096B0074 for ; Fri, 22 Apr 2022 16:47:50 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 2A43B5F9 for ; Fri, 22 Apr 2022 20:47:50 +0000 (UTC) X-FDA: 79385701500.25.1FAF103 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf20.hostedemail.com (Postfix) with ESMTP id 0A1141C002E for ; Fri, 22 Apr 2022 20:47:47 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 370EAB8325B; Fri, 22 Apr 2022 20:47:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BB5EFC385A4; Fri, 22 Apr 2022 20:47:45 +0000 (UTC) Date: Fri, 22 Apr 2022 16:47:44 -0400 From: Steven Rostedt To: Kent Overstreet Cc: Christoph Hellwig , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org, akpm@linux-foundation.org, linux-clk@vger.kernel.org, linux-tegra@vger.kernel.org, linux-input@vger.kernel.org, roman.gushchin@linux.dev Subject: Re: [PATCH v2 1/8] lib/printbuf: New data structure for heap-allocated strings Message-ID: <20220422164744.6500ca06@gandalf.local.home> In-Reply-To: <20220422203057.iscsmurtrmwkpwnq@moria.home.lan> References: <20220421234837.3629927-1-kent.overstreet@gmail.com> <20220421234837.3629927-7-kent.overstreet@gmail.com> <20220422042017.GA9946@lst.de> <20220422052208.GA10745@lst.de> <20220422113736.460058cc@gandalf.local.home> <20220422193015.2rs2wvqwdlczreh3@moria.home.lan> <20220422153916.7ebf20c3@gandalf.local.home> <20220422203057.iscsmurtrmwkpwnq@moria.home.lan> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 0A1141C002E Authentication-Results: imf20.hostedemail.com; dkim=none; spf=pass (imf20.hostedemail.com: domain of "SRS0=ohlw=VA=goodmis.org=rostedt@kernel.org" designates 145.40.68.75 as permitted sender) smtp.mailfrom="SRS0=ohlw=VA=goodmis.org=rostedt@kernel.org"; dmarc=none X-Rspam-User: X-Stat-Signature: wkmipsh7nbx8yjtn1zi7wm8mx454yy87 X-HE-Tag: 1650660467-346948 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, 22 Apr 2022 16:30:57 -0400 Kent Overstreet wrote: > So here's the story of how I got from where seq_buf is now to where printbuf is > now: > > - Printbuf started out as almost an exact duplicate of seq_buf (like I said, > not intentionally), with an external buffer typically allocated on the stack. Basically seq_buf is designed to be used as an underlining infrastructure. That's why it does not allocate any buffer and leaves that to the user cases. Hence, trace_seq() allocates a page for use, and I even use seq_buf in user space to dynamically allocate when needed. > > - As error/log messages got to be bigger and more structured, stack usage > eventually became an issue, so eventually I added the heap allocations. Which is something you could do on top of seq_buf. Point being, you do not need to re-implement printbuf, and I have not looked at the code, but instead, implement printbuf on top of seq_buf, and extend seq_buf where needed. Like trace_seq does, and the patches I have for seq_file would do. It would leave the string processing and buffer space management to seq_buf, as there's ways to see "oh, we need more space, let's allocate more" and then increase the heap. > > - This made them a lot more convenient to use, and made possible entirely new > ways of using them - so I started using them more, and converting everything > that outputted to strings to them. > > - This lead to the realization that when pretty-printers are easy and > convenient to write, that leads to writing pretty-printers for _more_ stuff, > which makes it easy to stay in the habit of adding anything relevant to > sysfs/debugfs - and log/error messages got a _whole_ lot better when I > realized instead of writing format strings for every basic C type I can just > use the .to_text() methods of the high level objects I'm working with. > > Basically, my debugging life has gotten _drastically_ easier because of this > change in process and approach - deadlocks that I used to have to attach a > debugger for are now trivial because all the relevant state is in debugfs and > greppable, and filesystem inconsistencies that used to suck to debug I now just > take what's in the error message and grep through the journal for. > > I can't understate how invaluable all this stuff has been, and I'm excited to > take the lessons I've learned and apply them to the wider kernel and make other > people's lives easier too. > > The shrinkers-OOM-reporting patch was an obvious starting point because > - shrinkers have internal state that's definitely worth reporting on > - we shouldn't just be logging this on OOM, we should also make this available > in sysfs or debugfs. > > Feature wise, printbufs have: > - heap allocation > - extra state for formatting: indent level, tabstops, and a way of specifying > units. > > That's basically it. Heap allocation adds very little code and eliminates a > _lot_ of headaches in playing the "how much do I need to/can I put on the stack" > game, and you'll want the formatting options as soon as you start formatting > multi line output and writing pretty printers that call other pretty printers. I would be more willing to accept a printbuf, if it was built on top of seq_buf. That is, you don't need to change all your user cases, you just need to make printbuf an extension of seq_buf by using it underneath, like trace_seq does. Then it would not be re-inventing the wheel, but just building on top of it. -- Steve