From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 12B1E98B for ; Wed, 20 Jul 2016 07:32:08 +0000 (UTC) Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id B2D12142 for ; Wed, 20 Jul 2016 07:32:05 +0000 (UTC) Date: Wed, 20 Jul 2016 09:32:01 +0200 (CEST) From: Julia Lawall To: "Eric W. Biederman" In-Reply-To: <87lh0xf9xv.fsf@x220.int.ebiederm.org> Message-ID: References: <87inw1skws.fsf@x220.int.ebiederm.org> <1468962539.2383.82.camel@HansenPartnership.com> <87lh0xf9xv.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: James Bottomley , ksummit-discuss@lists.linuxfoundation.org Subject: Re: [Ksummit-discuss] [CORE TOPIC] More useful types in the linux kernel List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Tue, 19 Jul 2016, Eric W. Biederman wrote: > James Bottomley writes: > > > On Tue, 2016-07-19 at 10:32 -0500, Eric W. Biederman wrote: > >> Historically the types in C came about because the machines > >> fundamentally supported different data types either with different > >> sizes or different characteristics (i.e. u8, u16, float, double). > >> These data types and the C type system were built so programmers > >> could tell the machine what they needed it to do. > >> > >> There is another genesis of types that started with the simply typed > >> lambda calculs that is about eliminating common errors and otherwise > >> helping a programmer get their code right. > >> > >> In the years since C was invented there has been a lot of activity > >> and a > >> little bit of progress in this area. Would people be receptive to > >> improvements in this area? > >> > >> I would like to talk to folks and gague what it would take to make > >> improvements in this area acceptable, practical, and useful. > >> > >> Would a gcc plugin that checks the most interesting things that > >> sparse checks on every build be interesting? (endianness of integer > >> types for example) > > > > How would this be different from simply automatically running sparse in > > the kernel build if the binary is present (effectively making make C=1 > > the default)? > > Nothing. I am just honestly looking at ways that we can get things to > always or almost always run. Sparse isn't getting run regularly now so > I was suspect that would not be as good of a solution. A problem with putting time consuming tools into the normal build process is that they work on all the code you compile, which may be a large superset of the code that is actually changed. julia > > >> Would a type system for pointers derived from separation logic that > >> has the concept that a piece of data is owned by a piece of running > >> code rather than another piece of data be interesting? > > > > By this you mean a thread of execution that should be expected to free > > the data pointed to when it finishes? Sort of like a self garbage > > collecting reference? > > Not really. The big idea is to make expressable in the type system the > key concepts on how a programmer reasons about their data structures, > instead of trying to make programmers perform some convoluted logic to > express things in concepts that are trivial to implement in a type > system. > > Another way to talk about it would be complete alias analysis at type > checking time. > > Which means that the type system then knows essentially everything the > programmer knows about aliases and object lifetimes and if you don't > pass something on or free it, the type checker then knows you made > a mistake. > > >> * This cleanly allows for doubly linked lists. > >> > >> * This is useful to ensure that data is either put in another data > >> structure where it is remembered or it is freed. > >> > >> * This is useful to ensure reference counts are not leaked. > >> > >> * This is useful to ensure that every lock is paired with an > >> unlock. > >> > >> My personal filter for things like this are types that can be checked > >> in time proportional to the amount of code to be built so that it is > >> roughly the same situation we are in now. > >> > >> > >> Given it's heritage and it's history the type system in C has serious > >> limitations that I don't know if they are correctible, when it comes > >> to catching programmer mistakes: silent truncation of unsigned types > >> into smaller unsigned types, casts, etc. Would people be willing to > >> consider a simple, link compatible alternative to C for some of the > >> code in the kernel that had the same low level control of the machine > >> but had a type system that made catching mistakes easier? > >> > >> > >> Deploying solutions like this will take a fair bit of grunt work, and > >> time similar or worse than the big kernel lock removal. Given how > >> widely Linux is used and how annoying some of these bugs can be I > >> think it is worthwhile to dig in and see what kind of improvements > >> can be made. > >> > >> I would really like to get a feel among kernel maintainers and > >> developers if this is something that is interesting, and what kind of > >> constraints they think something like this would need to be usable > >> for the kernel? > > > > I've got to say that rewriting in some as yet undefined language simply > > to get better typing and conversions looks very daunting. Before even > > considering this, can you answer why an extension to sparse, which > > would mostly flag the problems if we use the correct annotations, > > wouldn't work just as well? We'd still have to add the additional > > annotations (and someone would have to update sparse) but it would > > still be a lot easier than rewriting and giving the kernel a build > > dependency on whatever the rewrite is done in. > > That is a very good question. The short answer is that I am not yet > convinced I can figure out how to retrofit the type checks I want onto > C's type system. I have implemented the type checks I am looking at in > a fresh language I am toying with where I can throw out inconvinient > unnecessary corner cases. After I finished boiling the ocean the basic > types were similar enough to C's type system that I think I can make > the connection but I haven't tried yet. > > Assuming things can be sorted so they coexist nicely C's types. No > advantage. If it turns out that to make the code not a pain to write we > need better syntax or some basic type inference there is a large > advantage in converting code. > > The first place I can think of where I might get hung up in the area of > discriminated unions. I am pretty certain that some cases of type > safety will require them and C doesn't have anything like that today. A > practical example would be a type system that ensures you call > IS_ERR(ptr) before you use a pointer as a pointer. > > Ordinary C unions and type casts that serve the same purpose would at the > end of the day all have to be reabstracted. Because otherwise memory > safety cound not be analyized. > > Some of that reabstraction I will need to introduce type variables which > a completely foreing concept to C. > > On the flip side since the linux kernel and low level programming where > precise control of the machine happens is my target audience it is > definitely worth looking at what can be done in a context like sparse. > That is the fastest way to connect the tools with the real world > problems. If it can't be done or if it can't be made nice simple > and easy to use then there will be good arguments for why we need a new > syntax. > > Mostly I asked the question because the type system would be so much > easier in a green field setting, and I wanted to be lazy. So hearing > the common sense request (Please fix C.) Is helpful to get that lazy > part of myself in gear. > > The end game for me is a type system that doesn't permit memory errors, > reference counting errors, locking errors, or deadlocks, while at the > same time allowing pretty much the code and data structures that we use > in the kernel today. > > Eric > _______________________________________________ > Ksummit-discuss mailing list > Ksummit-discuss@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss >