From: "Bird, Tim" <Tim.Bird@sonymobile.com>
To: Josh Triplett <josh@joshtriplett.org>, Dave Jones <davej@redhat.com>
Cc: Sarah Sharp <sarah@minilop.net>,
	"ksummit-discuss@lists.linuxfoundation.org"
	<ksummit-discuss@lists.linuxfoundation.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	Julia Lawall <julia.lawall@lip6.fr>,
	Darren Hart <darren@dvhart.com>,
	Dan Carpenter <dan.carpenter@oracle.com>
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Kernel tinification: shrinking the kernel and avoiding size regressions
Date: Fri, 9 May 2014 18:59:16 +0200
Message-ID: <F5184659D418E34EA12B1903EE5EF5FDEE1B9D0114@seldmbx02.corpusers.net>
In-Reply-To: <20140509162229.GB4152@thin>

On Friday, May 09, 2014 9:22 AM Josh Triplett wrote:
> 
> On Fri, May 02, 2014 at 01:11:03PM -0400, Dave Jones wrote:
> > On Fri, May 02, 2014 at 09:44:42AM -0700, Josh Triplett wrote:
> >
> >  > Topics:
> >  > - Kconfig, and avoiding excessive configurability in the pursuit of tiny
> >  > - Optimizing a kernel for its exact target userspace.
> >  > - Examples of shrinking the kernel
> >
> > Something that's partially related here: Making stuff optional
> > reduces attack surface the kernel presents. We're starting to grow
> > more and more CONFIG options to disable syscalls. I'd like to hear
> > people's reactions on introducing even more optionality in this area.
> 
> I'd certainly like to see just about every syscall made optional, for
> userspace that doesn't need it.  For specialized systems, that certainly
> would decrease attack surface.  However, seccomp decreases attack
> surface by the same amount, and for any except those specialized systems
> that would make more sense, because the set of available syscalls can
> then change with a simple policy change rather than a new kernel.
> 
> And this doesn't free us from the obligation to make all new APIs
> secure against hostile userspace.
> 
> > I had a patch to make this particular syscall a cond_syscall, but then
> > XFS ate my homework and I haven't had a chance to revisit this.
> > So, my questions are:
> > - are there other obvious syscalls we could make optional without userspace
> >   freaking out when they suddenly start getting ENOSYS ?
> 
> I've attached a complete list of the syscalls from
> include/linux/syscalls.h that do not appear in kernel/sys_ni.c, and thus
> always exist.  (syscalls.h notably does not include all the
> arch-specific syscalls, some of which might make sense to leave out as
> well.)
> 
> Of those, a few classes of syscalls that seem obvious, for various
> classes of specialized or legacy-free systems:
> 
> - For any syscall updated to have a foo2, foo3, etc, a single config
>   option to leave out all the older versions would make sense, to go
>   with userspace that never calls the older versions.
> - Likewise, the non-64 file calls.
> - Likewise, sys_old*
> - splice/vmsplice/tee.
> - sys_*sync*
> - sys_clock_* and any other time functions.
> - sys_sched_*
> - All signal-related syscalls
> - rlimit syscalls
> - sys_*xattr*
> - sys_nice
> - sys_cap{get,set}
> - fadvise, fallocate, readahead, etc.
> - uid/gid functions.
> - ioperm/iopl
> - ptrace
> - sendfile
> - times
> - utimes and company
> 
> > - how much configurability here is too much ?
> >   r_f_p was an obvious candidate because it's.. well, nasty.  Some of the
> >   more straightforward syscalls may not be such a big deal, but then we
> >   have CONFIG's for kcmp and other 'simple' syscalls already..
> 
> We need a more systematic mechanism, I think.  CONFIG_SYSCALL_FOO for
> every possible FOO seems too much, even for classes of syscalls.
> Ideally, we could feed in a table of syscalls collected by some
> analysis of the target userspace, and the kernel will then have exactly
> those syscalls.

In my system, I set it up so that every syscall had its own
SYSCALL_DEFINE macro, and then used a single header file
consisting of lines like:
#define syscall_setreuid16_unused 1

The SYSCALL_DEFINE macros would then control whether the
syscall was extern'ed or not.  A separate mechanism converted
the CALL macro in calls.S (on ARM) to use sys_ni_syscall, and
LTO made the (now unreferenced) function evaporate.
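
A minimal user-space sketch of the selection step described above, not
the original patch. It assumes the generated header defines
syscall_<name>_unused as 1 for each syscall to drop. The helper macros
IS_UNUSED, SELECT and the C version of CALL are hypothetical names
introduced for illustration, and the function bodies are dummies:

```c
#include <errno.h>

/* Stand-in for the generated header: one line per syscall that the
 * target userspace never issues (same pattern as quoted above). */
#define syscall_setreuid16_unused 1

/* Hypothetical helper: IS_UNUSED(x) expands to 1 if x is defined as 1,
 * and to 0 otherwise. Same preprocessor trick as the kernel's IS_ENABLED(). */
#define PLACEHOLDER_1 0,
#define SECOND(first, second, ...) second
#define IS_UNUSED(x) IS_UNUSED_(x)
#define IS_UNUSED_(val) IS_UNUSED__(PLACEHOLDER_##val)
#define IS_UNUSED__(zero_or_junk) SECOND(zero_or_junk 1, 0, ~)

/* Hypothetical C analogue of the CALL macro in calls.S: pick either the
 * real handler or sys_ni_syscall at preprocessing time. */
#define SELECT_0(name) sys_##name
#define SELECT_1(name) sys_ni_syscall
#define SELECT_(unused, name) SELECT_##unused(name)
#define SELECT(unused, name) SELECT_(unused, name)
#define CALL(name) SELECT(IS_UNUSED(syscall_##name##_unused), name)

typedef long (*syscall_fn)(void);

/* Fallback for syscalls that are configured out. */
long sys_ni_syscall(void) { return -ENOSYS; }

/* Dummy handlers standing in for real syscall implementations. */
long sys_getpid(void) { return 1234; }
long sys_setreuid16(void) { return 0; } /* never referenced by the table */

/* The table only references handlers that are still in use, so a
 * link-time optimizer is free to discard sys_setreuid16. */
const syscall_fn call_table[] = {
    CALL(getpid),     /* expands to sys_getpid */
    CALL(setreuid16), /* expands to sys_ni_syscall */
};
```

Deleting the syscall_setreuid16_unused line puts sys_setreuid16 back
into the table, which corresponds to building with a stub header.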

Overall, this allowed control of every syscall with a single easily
generated (or easily hand-edited) header file.  And, with a stub
header file, everything worked as without the changes.

The header file was auto-generated by tools that scanned the
user-space programs for all possible syscall sequences.
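
The scanning tools themselves are not shown in this message. As a
rough sketch of only the final step, assuming the scan has already
produced a list of syscall names found in use, a generator could emit
one line per remaining syscall. The function names emit_unused_header
and is_used are hypothetical:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` is in the list of syscalls the scan found in use. */
static int is_used(const char *name, const char *const *used, size_t n_used)
{
    for (size_t i = 0; i < n_used; i++)
        if (strcmp(name, used[i]) == 0)
            return 1;
    return 0;
}

/* Write one "#define syscall_<name>_unused 1" line for every syscall in
 * `all` that is absent from `used`. Returns the number of lines written. */
int emit_unused_header(FILE *out,
                       const char *const *all, size_t n_all,
                       const char *const *used, size_t n_used)
{
    int emitted = 0;

    for (size_t i = 0; i < n_all; i++) {
        if (!is_used(all[i], used, n_used)) {
            fprintf(out, "#define syscall_%s_unused 1\n", all[i]);
            emitted++;
        }
    }
    return emitted;
}
```

Passing the full syscall list as `used` yields an empty header, the
stub case in which nothing is removed.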

In hindsight, this system could probably be improved with some
extra tweaks to the base SYSCALL_DEFINE macros, so that no
source changes are required at the function definition sites.

In any event, it's possible to get per-syscall granularity without
having to add new CONFIG options (but at the expense of adding a
generated header file).
 -- Tim
