From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Ben Hutchings <ben@decadent.org.uk>, Dave Jones <davej@redhat.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>,
	Sarah Sharp <sarah@minilop.net>,
	ksummit-discuss@lists.linuxfoundation.org,
	Greg KH <gregkh@linuxfoundation.org>,
	Julia Lawall <julia.lawall@lip6.fr>,
	Heinrich Schuchardt <xypron.glpk@gmx.de>,
	Darren Hart <darren@dvhart.com>,
	Dan Carpenter <dan.carpenter@oracle.com>
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Kernel tinification: shrinking the kernel and avoiding size regressions
Date: Sat, 03 May 2014 15:35:08 +0200
Message-ID: <5364F08C.3060301@gmail.com>
In-Reply-To: <1399063518.24523.43.camel@deadeye.wl.decadent.org.uk>

On 05/02/2014 10:45 PM, Ben Hutchings wrote:
> On Fri, 2014-05-02 at 15:49 -0400, Dave Jones wrote:
>> On Fri, May 02, 2014 at 03:33:14PM -0400, Theodore Ts'o wrote:
>>  > There's been a huge focus on system calls in this discussion, and I
>>  > suspect this is a bit of a red herring.  Taking a look at "git log
>>  > arch/x86/syscalls/syscall_64.tbl" --- since all the world's is no
>>  > longer a Vax, but rather an x86_64 :-P --- there really hasn't been
>>  > that many new system calls lately.
>>
>> I may have a vested interest in syscalls :)
>>
>> The rate we're adding them has slowed down, but the rate at which we're
>> finding bugs exposed through them has accelerated enormously over the
>> last few years.

Yes. The APIs delivered to userspace continue to be infested with bugs
and design infelicities, many of which go undetected for a long time.

>> To use just one example, on certain systems I'd love to be able to just
>> turn off sys_perf_event_open given what a trainwreck of vulnerabilities it's been
>> over the last few years [comedy: it is actually a config option, but x86
>> 'selects' it, so you'll have it and you'll like it].
>> Thankfully at least the scarier parts of it are now hidden behind the
>> paranoid sysctl.
> 
> I have considered proposing perf_event_paranoid=3 to disable it
> completely for non-root.
> 
>>  > And if you look at things like renameat(2), the actual code savings by
>>  > removing renameat(2) is pretty small, and IMHO, not worth the
>>  > complexity and uncertainty that it would represent to application
>>  > programmers of "does this system call exist or doesn't it".
>>
>> I think we've got two categories here.
>>
>> "variant" syscalls like renameat, which just offers enhancements over
>> an existing syscall. Stuff that things like glibc tend to care about.
>> This stuff is usually pretty boring, and not even worth considering for
>> potentially disabling imo.
>>
>> And then we have "enable boatload of code" syscalls that are typically
>> used by a few standalone apps/features. kexec, checkpointing, whatever
>> db it was that cares about remap_file_pages, mempolicy, etc. etc.
>>
>> It's this "not used by every user" code that tends to scare me, because
>> it's written with 1-2 well behaved bits of userspace in mind, which
>> usually means "has so many unchecked corner cases it's not even funny"

Well, it's worse than that, I think. Those unchecked corner cases turn
up even in code that is not protected by config options or privileges.
My example of the day: the timeout argument of recvmmsg() does nothing
sensible. There was no (or minimal) testing, there seems to have been
minimal review of the feature, and of course there was no documentation
of how the timeout should work beyond the statement that "recvmmsg
now has a struct timespec timeout, that works in the same fashion as
the ppoll one". (Newsflash: recvmmsg() and ppoll() do very different
things, so describing one in terms of the other doesn't provide much
insight.)

https://bugzilla.kernel.org/show_bug.cgi?id=75371
http://thread.gmane.org/gmane.linux.man/5677
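
For readers who have not used the call, here is a minimal sketch of
recvmmsg() with a timeout. The helper name recv_batch_with_timeout(),
the AF_UNIX socketpair, the buffer sizes, and the 1-second timeout are
illustrative assumptions, not taken from the bug report. The sketch
queues every datagram before calling recvmmsg(), so the call returns
at once. As the BUGS section of the recvmmsg(2) man page notes, the
timeout is checked only after each datagram is received, so a blocking
call like this one can wait indefinitely if fewer than vlen datagrams
ever arrive.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* recvmmsg() and struct mmsghdr are Linux extensions */
#endif
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

#define VLEN  2                /* datagrams requested in one call (assumed value) */
#define BUFSZ 64               /* per-datagram receive buffer size (assumed value) */

/*
 * Hypothetical helper: queue VLEN datagrams on a local socket pair, then
 * collect them with one recvmmsg() call that carries a 1-second timeout.
 * Returns the number of datagrams received, or -1 on error.
 */
int recv_batch_with_timeout(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    /* Queue all VLEN datagrams first so the blocking call below returns at once. */
    for (int i = 0; i < VLEN; i++) {
        if (send(sv[0], "ping", 4, 0) != 4) {
            close(sv[0]);
            close(sv[1]);
            return -1;
        }
    }

    char bufs[VLEN][BUFSZ];
    struct iovec iov[VLEN];
    struct mmsghdr msgs[VLEN];

    /* One single-buffer message header per expected datagram. */
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < VLEN; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = BUFSZ;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /*
     * The timeout is examined only after a datagram has been received.
     * With fewer than VLEN datagrams queued and flags == 0, this call
     * could block well past the 1 second requested here.
     */
    struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
    int n = recvmmsg(sv[1], msgs, VLEN, 0, &timeout);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Called from a main() on a Linux system, the helper should return VLEN.
Sending only one datagram instead is the easy way to see the
surprising behavior, at the cost of a hung process.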

> [...]
> 
> Since Michael often seems to be the one testing those corner cases while
> writing documentation, it seems like you're getting back to the old
> issue of whether lack of documentation should be a blocker for adding
> new system calls.

I think there's really room for a lot more rigor here. There is way
too much crap hitting the userspace API. I've long argued that
(good) documentation is one of the best ways of finding bugs and
design errors. I know, because that's the way I've discovered a lot
of the problems. Of course, perhaps I am just an odd data point,
but I recently got to help out in an experiment that reproduced 
the results.

Heinrich Schuchardt recently took it upon himself to document the 
fanotify API, which has been undocumented since its release in 2.6.37.
(Heinrich's pages will probably be published in the next week or so;
in the meantime, the drafts are here:
http://git.kernel.org/cgit/docs/man-pages/man-pages.git/tree/ )

In the course of writing the pages (and goaded by me at various
points to "explain this detail" or "tell the reader what happens 
in this case"), Heinrich has uncovered (and documented) one or 
two design infelicities and a good crop of bugs (at least one 
of which has some security implications: 
http://thread.gmane.org/gmane.linux.kernel/1686672/focus=1690201 )

So, Heinrich demonstrated what I've long known: show me a new
kernel-user-space API and I can probably pretty quickly show you
a bug. Writing good documentation goes a long way toward finding
those bugs and design problems, and it really should be done
well before an API is released, since, of course, some API
problems can't be fixed later. And it should be a collaborative
effort involving not just the developer concerned but someone
fairly distant from them who can look skeptically at the 
documentation.

Oh, and I didn't explicitly say it, but to me it's obvious:
good documentation necessarily implies good testing. And
that's the thing that made Heinrich's work good: when he
wrote in response to some of my goadings that the answers 
might take a while, because he'd need to write some tests,
that was exactly what I hoped to hear.

Tools like trinity do a great job of catching bizarre behaviors
in APIs, but in the end some bugs (and design problems) are 
only going to be found when human beings sit down and think
deeply about what is going on. (The timeout issue for 
recvmmsg() is a case in point. There's no fuzz testing for
that sort of issue, and for that matter no specification of
the expected behavior against which to test.)

Thanks,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
