From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail144.messagelabs.com (mail144.messagelabs.com [216.82.254.51]) by kanga.kvack.org (Postfix) with ESMTP id 531C46B004D for ; Thu, 21 May 2009 17:46:37 -0400 (EDT) Date: Thu, 21 May 2009 14:46:38 -0700 From: "Larry H." Subject: Re: [patch 0/5] Support for sanitization flag in low-level page allocator Message-ID: <20090521214638.GL10756@oblivion.subreption.com> References: <20090520183045.GB10547@oblivion.subreption.com> <1242852158.6582.231.camel@laptop> <4A15A69F.3040604@redhat.com> <20090521202628.39625a5d@lxorguk.ukuu.org.uk> <20090521195603.GK10756@oblivion.subreption.com> <20090521214713.65adfd6e@lxorguk.ukuu.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090521214713.65adfd6e@lxorguk.ukuu.org.uk> Sender: owner-linux-mm@kvack.org To: Alan Cox Cc: Rik van Riel , Peter Zijlstra , linux-kernel@vger.kernel.org, Linus Torvalds , linux-mm@kvack.org, Ingo Molnar List-ID: On 21:47 Thu 21 May , Alan Cox wrote: > In the kernel no and page flags are very precious, very few and if we run > out will cost us a vast amount of extra kernel memory. If page flags were > free the question would be trivial - but they are not. Thus it is worth > asking whether its actually harder to remember to zap the buffer or set > and clear the flag correctly. > > There is no kernel paging (except virtualised but that is an entire other > can of worms we shouldn't open), you can handle reallocation concerns > without page flags by using a SLAB type for 'secure' allocations which > clears the entry on free. > > You don't need a page flag, just a per vma flag and something akin to > madvise() to set the flag on the VMA (and/or split the VMA for partial > maps as we do for anything else). VMA flags are cheap. The patch already implements a SLAB_CONFIDENTIAL flag now (I finished renaming the flags) for this purpose. This is my proposal and summary of the changes I'll do to the patch, based off your feedback: 1. A page flag seems to be frowned upon by you. I can understand this and agree that we must keep in mind that these come in scarce quantities. A page flag is the only way to allow the low level page allocator to mark pages that contain sensitive information (so we support this through normal gfp flags in get_free)pages and so forth). 2. We can independently make SLAB/SLUB aware of a CONFIDENTIAL flag that: a) Sanitizes objects at kfree() time when they've been allocated with the gfp flag or they belong to a cache marked with the SLAB_CONFIDENTIAL flag. b) Does not require changes to the low level page allocator. c) Still can prevent leaks in re-allocation scenarios and other cases. 3. We can implement a vma flag for this purpose and should be no issue to you or other maintainers. I'll split the SLAB/SLUB changes, which add support for the flag and the gfp counterpart, and then have a separate one which adds the page flag. Please read my comments on the latter at the end of this email. We can ditch the page flag patch if we finally reject that approach and stick to the other one. I'm fine with that. Let me know if you are keen on this approach and I'll follow with an updated patch. > If you are paging them to a crypted filestore they should be safe on > disk. What is your problem with that ? If your suspend image is > compromised it doesn't really matter if you wiped the data as what you > resume may then wait for the new keys and compromise those. In fact > having a page flag makes it easier for the attack code to know what to > capture and send to the bad guys... I wasn't talking about disk based attacks. I'm talking about a rogue module or just some information leak which let's an user peek at known addresses. For instance, some operating systems implement disk encryption with IVs and keys stored as global variables. Microsoft's BitLocker operates that way internally if they haven't changed it. Apple does the same for swap encryption, etc. > I was assuming you'd wipe such data from memory on a suspend to disk. > However on a suspend to disk its basically as cheap to wipe all of memory > and safer than wiping random bits and praying you know what the compiler > did and you know what some other bit of library did. Security conscious users normally disable suspend or hibernation altogether. It's far more difficult to get it right than it seems. You will *always* need some static place to store your key. Apple's XNU kernel stores the key in the image header for instance. I bet other systems do the same. > Indeed - but a technically sound solution that doesn't waste a page flag > is still important. It's btw not as simple as a page flag anyway - the > kernel stores some stuff in places that do not have page flags, it also > has kmaps and other things that will give you suprises. I haven't identified a single place that stored potentially sensitive information which can be reasonably protected with a simple approach like this, that doesn't use kmalloc or the low level page allocator directly. I bet there are some, but there are plenty of other, more obvious, ones which need our attention. > Perhaps you should post your threat model to go with the patches. At the > moment your model doesn't seem to make sense. The threat model is simple: 1. The kernel has interfaces which deal with likely sensitive information (from tty input drivers, to crypto api and network stack implementations). 2. Memory allocated by these interfaces will suffer of data remanence problems, even post-release. This will scatter such information and make coldboot/Iceman attacks possible to recover cryptographic secrets (ex. scanning for AES key expansion blocks is trivial, and this has been demonstrated for RSA as well, see the Princeton paper about it). 3. LIFO allocators make re-allocation leaks possible. If an interface allocates a buffer, stores data in it and releases it without clearing, a successive allocation somewhere else can return this same object and let the caller access the original contents out of the context they were meant to. If a network stack implementation allocates a 64 byte buffer after some cryptoapi ctx initialization code got another 64 byte buffer and released it, you've got a problem there. If an attacker couples an uninitialized variable usage bug with this situation, you've got a possibly exploitable problem there. Worst of all, is that he might not need such a bug for abusing it ;) Let me know if you need any further clarifications, please. > Surely we can attack the problem far more directly for all but S2R by > > - choosing to use encrypted swap and encrypted S2D images (already > possible) > - wiping the in memory image on S2D if the user chooses (which would be > smart) > > That has the advantage that nobody has to label pages sensitive - which > is flawed anyway, we want to label pages "non-sensitive" in the ideal > world so we default secure. I agree the ideal, best approach would be to sanitize all pages. If you are interested on a patch doing just that (as long as a Kconfig option enables it), I can provide you with a clean one. The original code in PaX did just that. BTW, this can be extrapolated to .rodata and DEBUG_RODATA, as well as the lack of mprotect restrictions for hosts with SELinux disabled (that is, no execmem/execstack/execheap checks). We should really make .rodata read-only by default, and disallow mprotect to produce RWX mappings by default. Otherwise our NX is flawed. These are matters for another patch, and a different discussion too. Larry -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org