Date: Fri, 15 Sep 2023 14:34:32 -0700
From: Kees Cook
To: Matteo Rizzo
Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
	roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-hardening@vger.kernel.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	corbet@lwn.net, luto@kernel.org, peterz@infradead.org,
	jannh@google.com, evn@google.com, poprdi@google.com,
	jordyzomer@google.com
Subject: Re: [RFC PATCH 14/14] security: add documentation for SLAB_VIRTUAL
Message-ID: <202309151428.C04391065F@keescook>
References: <20230915105933.495735-1-matteorizzo@google.com>
	<20230915105933.495735-15-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-15-matteorizzo@google.com>

On Fri, Sep 15, 2023 at 10:59:33AM +0000, Matteo Rizzo wrote:
> From: Jann Horn
>
> Document what SLAB_VIRTUAL is trying to do, how it's implemented, and
> why.
>
> Signed-off-by: Jann Horn
> Co-developed-by: Matteo Rizzo
> Signed-off-by: Matteo Rizzo
> ---
>  Documentation/security/self-protection.rst | 102 +++++++++++++++++++++
>  1 file changed, 102 insertions(+)
>
> diff --git a/Documentation/security/self-protection.rst b/Documentation/security/self-protection.rst
> index 910668e665cb..5a5e99e3f244 100644
> --- a/Documentation/security/self-protection.rst
> +++ b/Documentation/security/self-protection.rst
> @@ -314,3 +314,105 @@ To help kill classes of bugs that result in kernel addresses being
>  written to userspace, the destination of writes needs to be tracked. If
>  the buffer is destined for userspace (e.g. seq_file backed ``/proc`` files),
>  it should automatically censor sensitive values.
> +
> +
> +Memory Allocator Mitigations
> +============================
> +
> +Protection against cross-cache attacks (SLAB_VIRTUAL)
> +-----------------------------------------------------
> +
> +SLAB_VIRTUAL is a mitigation that deterministically prevents cross-cache
> +attacks.
> +
> +Linux Kernel use-after-free vulnerabilities are commonly exploited by turning
> +them into an object type confusion (having two active pointers of different
> +types to the same memory location) using one of the following techniques:
> +
> +1. Direct object reuse: make the kernel give the victim object back to the slab
> +   allocator, then allocate the object again from the same slab cache as a
> +   different type. This is only possible if the victim object resides in a slab
> +   cache which can contain objects of different types - for example one of the
> +   kmalloc caches.
> +2. "Cross-cache attack": make the kernel give the victim object back to the slab
> +   allocator, then make the slab allocator give the page containing the object
> +   back to the page allocator, then either allocate the page directly as some
> +   other type of page or make the slab allocator allocate it again for a
> +   different slab cache and allocate an object from there.

I feel like adding a link to
https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
would be nice here, as some folks reading this may not understand how
plausible the second attack can be. :)

> +
> +In either case, the important part is that the same virtual address is reused
> +for two objects of different types.
> +
> +The first case can be addressed by separating objects of different types
> +into different slab caches. If a slab cache only contains objects of the
> +same type then directly turning an use-after-free into a type confusion is
> +impossible as long as the slab page that contains the victim object remains
> +assigned to that slab cache. This type of mitigation is easily bypassable
> +by cross-cache attacks: if the attacker can make the slab allocator return
> +the page containing the victim object to the page allocator and then make
> +it use the same page for a different slab cache, type confusion becomes
> +possible again. Addressing the first case is therefore only worthwhile if
> +cross-cache attacks are also addressed. AUTOSLAB uses a combination of

I think you mean CONFIG_RANDOM_KMALLOC_CACHES, not AUTOSLAB, which isn't
upstream.

> +probabilistic mitigations for this. SLAB_VIRTUAL addresses the second case
> +deterministically by changing the way the slab allocator allocates memory.
> +
> +Preventing slab virtual address reuse
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In theory there is an easy fix against cross-cache attacks: modify the slab
> +allocator so that it never gives memory back to the page allocator. In practice
> +this would be problematic because physical memory remains permanently assigned
> +to a slab cache even if it doesn't contain any active objects. A viable
> +cross-cache mitigation must allow the system to reclaim unused physical memory.
> +In the current design of the slab allocator there is no way
> +to keep a region of virtual memory permanently assigned to a slab cache without
> +also permanently reserving physical memory. That is because the virtual
> +addresses that the slab allocator uses come from the linear map region, where
> +there is a 1:1 correspondence between virtual and physical addresses.
> +
> +SLAB_VIRTUAL's solution is to create a dedicated virtual memory region that is
> +only used for slab memory, and to enforce that once a range of virtual addresses
> +is used for a slab cache, it is never reused for any other caches. Using a
> +dedicated region of virtual memory lets us reserve ranges of virtual addresses
> +to prevent cross-cache attacks and at the same time release physical memory back
> +to the system when it's no longer needed. This is what Chromium's PartitionAlloc
> +does in userspace
> +(https://chromium.googlesource.com/chromium/src/+/354da2514b31df2aa14291199a567e10a7671621/base/allocator/partition_allocator/PartitionAlloc.md).
> +
> +Implementation
> +~~~~~~~~~~~~~~
> +
> +SLAB_VIRTUAL reserves a region of virtual memory for the slab allocator. All
> +pointers returned by the slab allocator point to this region. The region is
> +statically partitioned in two sub-regions: the metadata region and the data
> +region. The data region is where the actual objects are allocated from. The
> +metadata region is an array of struct slab objects, one for each PAGE_SIZE bytes
> +in the data region.
> +Without SLAB_VIRTUAL, struct slab is overlaid on top of the struct page/struct
> +folio that corresponds to the physical memory page backing the slab instead of
> +using a dedicated memory region. This doesn't work for SLAB_VIRTUAL, which needs
> +to store metadata for slabs even when no physical memory is allocated to them.
> +Having an array of struct slab lets us implement virt_to_slab efficiently purely
> +with arithmetic. In order to support high-order slabs, the struct slabs
> +corresponding to tail pages contain a pointer to the head slab, which
> +corresponds to the slab's head page.
> +
> +TLB flushing
> +~~~~~~~~~~~~
> +
> +Before it can release a page of physical memory back to the page allocator, the
> +slab allocator must flush the TLB entries for that page on all CPUs. This is not
> +only necessary for the mitigation to work reliably but it's also required for
> +correctness. Without a TLB flush some CPUs might continue using the old mapping
> +if the virtual address range is reused for a new slab and cause memory
> +corruption even in the absence of other bugs. The slab allocator can release
> +pages in contexts where TLB flushes can't be performed (e.g. in hardware
> +interrupt handlers). Pages to free are not freed directly, and instead they are
> +put on a queue and freed from a workqueue context which also flushes the TLB.
> +
> +Performance
> +~~~~~~~~~~~
> +
> +SLAB_VIRTUAL's performance impact depends on the workload. On kernel compilation
> +(kernbench) the slowdown is about 1-2% depending on the machine type and is
> +slightly worse on machines with more cores.

Is there anything that can be added to the docs about future work, areas
of improvement, etc?

-Kees

-- 
Kees Cook
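
For readers following the quoted "Implementation" section, here is a rough
userspace sketch of what "virt_to_slab purely with arithmetic" could look
like. It is not taken from the patch series. Every name and constant in it
(sketch_slab, sketch_region, sketch_virt_to_slab, the 4 KiB page size, the
64-page data region) is a hypothetical stand-in, and storing a head pointer
in every entry (with a head slab pointing at itself) is a simplification
chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration only; not the kernel's values. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_DATA_PAGES 64

/* One metadata entry per page-sized chunk of the data region. */
struct sketch_slab {
	struct sketch_slab *head; /* head slab; a head entry points at itself */
	unsigned int order;       /* log2 of the number of pages in the slab */
};

/* Hypothetical region: a data range plus its parallel metadata array. */
struct sketch_region {
	uintptr_t data_start;     /* page-aligned start of the data region */
	struct sketch_slab meta[SKETCH_DATA_PAGES];
};

/* Mark 2^order consecutive pages, starting at first_page, as one slab. */
void sketch_init_slab(struct sketch_region *r, size_t first_page,
		      unsigned int order)
{
	size_t n = (size_t)1 << order;
	size_t i;

	assert(first_page + n <= SKETCH_DATA_PAGES);
	for (i = 0; i < n; i++) {
		r->meta[first_page + i].head = &r->meta[first_page];
		r->meta[first_page + i].order = order;
	}
}

/*
 * Map an address in the data region to its head slab using only a
 * subtraction, a shift and one pointer load. Returns NULL if no slab
 * has been set up for that page.
 */
struct sketch_slab *sketch_virt_to_slab(struct sketch_region *r,
					uintptr_t addr)
{
	size_t idx;

	assert(addr >= r->data_start);
	idx = (addr - r->data_start) >> SKETCH_PAGE_SHIFT;
	assert(idx < SKETCH_DATA_PAGES);
	return r->meta[idx].head;
}
```

The lookup never consults the physical page behind the address, so the
metadata stays reachable even when a slab currently has no physical memory
mapped. That is the property the quoted text says the struct page overlay
cannot provide.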