From: Topi Miettinen
To: linux-hardening@vger.kernel.org, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Topi Miettinen, Andy Lutomirski, Jann Horn, Kees Cook, Linux API,
	Matthew Wilcox, Mike Rapoport, Vlad Rezki
Subject: [PATCH v4] mm/vmalloc: randomize vmalloc() allocations
Date: Tue, 9 Mar 2021 15:57:57 +0200
Message-Id: <20210309135757.5406-1-toiwoton@gmail.com>

Memory mappings inside kernel allocated with vmalloc() are in
predictable order and packed tightly toward the low addresses, except
for per-cpu areas which start from top of the vmalloc area. With
new kernel boot parameter 'randomize_vmalloc=1', the entire area is
used randomly to make the allocations less predictable and harder to
guess for attackers. Also module and BPF code locations get randomized
(within their dedicated and rather small area though) and if
CONFIG_VMAP_STACK is enabled, also kernel thread stack locations.

On 32 bit systems this may cause problems due to increased VM
fragmentation if the address space gets crowded.

On all systems, it will reduce performance and increase memory and
cache usage due to less efficient use of page tables and inability to
merge adjacent VMAs with compatible attributes. On x86_64 with 5 level
page tables, in the worst case, additional page table entries of up to
4 pages are created for each mapping, so with small mappings there's
considerable penalty.

Without randomize_vmalloc=1:
$ grep -v kernel_clone /proc/vmallocinfo
0xffffc90000000000-0xffffc90000009000   36864 irq_init_percpu_irqstack+0x176/0x1c0 vmap
0xffffc90000009000-0xffffc9000000b000    8192 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x000000001ffe1000 ioremap
0xffffc9000000c000-0xffffc9000000f000   12288 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x000000001ffe0000 ioremap
0xffffc9000000f000-0xffffc90000011000    8192 hpet_enable+0x31/0x4a4 phys=0x00000000fed00000 ioremap
0xffffc90000011000-0xffffc90000013000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffc90000013000-0xffffc90000015000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffc90000015000-0xffffc90000017000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffc90000021000-0xffffc90000023000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffc90000023000-0xffffc90000025000    8192 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x00000000fed00000 ioremap
0xffffc90000025000-0xffffc90000027000    8192 memremap+0x19c/0x280 phys=0x00000000000f5000 ioremap
0xffffc90000031000-0xffffc90000036000   20480 pcpu_create_chunk+0xe8/0x260 pages=4 vmalloc
0xffffc90000043000-0xffffc90000047000   16384 n_tty_open+0x11/0xe0 pages=3 vmalloc
0xffffc90000211000-0xffffc90000232000  135168 crypto_scomp_init_tfm+0xc6/0xf0 pages=32 vmalloc
0xffffc90000232000-0xffffc90000253000  135168 crypto_scomp_init_tfm+0x67/0xf0 pages=32 vmalloc
0xffffc900005a9000-0xffffc900005ba000   69632 pcpu_create_chunk+0x7b/0x260 pages=16 vmalloc
0xffffc900005ba000-0xffffc900005cc000   73728 pcpu_create_chunk+0xb2/0x260 pages=17 vmalloc
0xffffe8ffffc00000-0xffffe8ffffe00000 2097152 pcpu_get_vm_areas+0x0/0x2290 vmalloc

With randomize_vmalloc=1, the allocations are randomized:
$ grep -v kernel_clone /proc/vmallocinfo
0xffffc9759d443000-0xffffc9759d445000    8192 hpet_enable+0x31/0x4a4 phys=0x00000000fed00000 ioremap
0xffffccf1e9f66000-0xffffccf1e9f68000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffcd2fc02a4000-0xffffcd2fc02a6000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffcdaefb898000-0xffffcdaefb89b000   12288 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x000000001ffe0000 ioremap
0xffffcef8074c3000-0xffffcef8074cc000   36864 irq_init_percpu_irqstack+0x176/0x1c0 vmap
0xffffcf725ca2e000-0xffffcf725ca4f000  135168 crypto_scomp_init_tfm+0xc6/0xf0 pages=32 vmalloc
0xffffd0efb25e1000-0xffffd0efb25f2000   69632 pcpu_create_chunk+0x7b/0x260 pages=16 vmalloc
0xffffd27054678000-0xffffd2705467c000   16384 n_tty_open+0x11/0xe0 pages=3 vmalloc
0xffffd2adf716e000-0xffffd2adf7180000   73728 pcpu_create_chunk+0xb2/0x260 pages=17 vmalloc
0xffffd4ba5fb6b000-0xffffd4ba5fb6d000    8192 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x000000001ffe1000 ioremap
0xffffded126192000-0xffffded126194000    8192 memremap+0x19c/0x280 phys=0x00000000000f5000 ioremap
0xffffe01a4dbcd000-0xffffe01a4dbcf000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffe4b649952000-0xffffe4b649954000    8192 acpi_os_map_iomem+0x2ac/0x2d0 phys=0x00000000fed00000 ioremap
0xffffe71ed592a000-0xffffe71ed592c000    8192 gen_pool_add_owner+0x49/0x130 pages=1 vmalloc
0xffffe7dc5824f000-0xffffe7dc58270000  135168 crypto_scomp_init_tfm+0x67/0xf0 pages=32 vmalloc
0xffffe8f4f9800000-0xffffe8f4f9a00000 2097152 pcpu_get_vm_areas+0x0/0x2290 vmalloc
0xffffe8f4f9a19000-0xffffe8f4f9a1e000   20480 pcpu_create_chunk+0xe8/0x260 pages=4 vmalloc

With CONFIG_VMAP_STACK, also kernel thread stacks are placed in
vmalloc area and therefore they also get randomized (only one example
line from /proc/vmallocinfo shown for
brevity):

unrandomized:
0xffffc90000018000-0xffffc90000021000   36864 kernel_clone+0xf9/0x560 pages=8 vmalloc

randomized:
0xffffcb57611a8000-0xffffcb57611b1000   36864 kernel_clone+0xf9/0x560 pages=8 vmalloc

CC: Andrew Morton
CC: Andy Lutomirski
CC: Jann Horn
CC: Kees Cook
CC: Linux API
CC: Matthew Wilcox
CC: Mike Rapoport
CC: Vlad Rezki
Signed-off-by: Topi Miettinen
---
v2: retry allocation from other end of vmalloc space in case of
failure (Matthew Wilcox), improve commit message and documentation
v3: randomize also percpu allocations (pcpu_get_vm_areas())
v4: use static branches (Kees Cook) and make the parameter boolean.
---
 .../admin-guide/kernel-parameters.txt         | 24 ++++++++++
 mm/vmalloc.c                                  | 44 +++++++++++++++++--
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a10b545c2070..726aec542079 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4024,6 +4024,30 @@
 
 	ramdisk_start=	[RAM] RAM disk image start address
 
+	randomize_vmalloc= [KNL] Boolean option to randomize vmalloc()
+			allocations. When enabled, the entire
+			vmalloc() area is used randomly to make the
+			allocations less predictable and harder to
+			guess for attackers. Also module and BPF code
+			locations get randomized (within their
+			dedicated and rather small area though) and if
+			CONFIG_VMAP_STACK is enabled, also kernel
+			thread stack locations.
+
+			On 32 bit systems this may cause problems due
+			to increased VM fragmentation if the address
+			space gets crowded.
+
+			On all systems, it will reduce performance and
+			increase memory and cache usage due to less
+			efficient use of page tables and inability to
+			merge adjacent VMAs with compatible
+			attributes. On x86_64 with 5 level page
+			tables, in the worst case, additional page
+			table entries of up to 4 pages are created for
+			each mapping, so with small mappings there's
+			considerable penalty.
+
 	random.trust_cpu={on,off}
 			[KNL] Enable or disable trusting the use of the
 			CPU's random number generator (if available) to
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e6f352bf0498..b5ecf27dc98e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -1089,6 +1090,25 @@ adjust_va_to_fit_type(struct vmap_area *va,
 	return 0;
 }
 
+static DEFINE_STATIC_KEY_FALSE_RO(randomize_vmalloc);
+
+static int __init set_randomize_vmalloc(char *str)
+{
+	int ret;
+	bool bool_result;
+
+	ret = kstrtobool(str, &bool_result);
+	if (ret)
+		return ret;
+
+	if (bool_result)
+		static_branch_enable(&randomize_vmalloc);
+	else
+		static_branch_disable(&randomize_vmalloc);
+	return 1;
+}
+__setup("randomize_vmalloc=", set_randomize_vmalloc);
+
 /*
  * Returns a start address of the newly allocated area, if success.
  * Otherwise a vend is returned that indicates failure.
@@ -1162,7 +1182,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 				int node, gfp_t gfp_mask)
 {
 	struct vmap_area *va, *pva;
-	unsigned long addr;
+	unsigned long addr, voffset;
 	int purged = 0;
 	int ret;
 
@@ -1217,11 +1237,24 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
 		kmem_cache_free(vmap_area_cachep, pva);
 
+	/* Randomize allocation */
+	if (static_branch_unlikely(&randomize_vmalloc)) {
+		voffset = get_random_long() & (roundup_pow_of_two(vend - vstart) - 1);
+		voffset = PAGE_ALIGN(voffset);
+		if (voffset + size > vend - vstart)
+			voffset = vend - vstart - size;
+	} else
+		voffset = 0;
+
 	/*
 	 * If an allocation fails, the "vend" address is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	addr = __alloc_vmap_area(size, align, vstart, vend);
+	addr = __alloc_vmap_area(size, align, vstart + voffset, vend);
+
+	if (unlikely(addr == vend) && voffset)
+		/* Retry randomization from other end */
+		addr = __alloc_vmap_area(size, align, vstart, vstart + voffset + size);
 	spin_unlock(&free_vmap_area_lock);
 
 	if (unlikely(addr == vend))
@@ -3258,7 +3291,12 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 	start = offsets[area];
 	end = start + sizes[area];
 
-	va = pvm_find_va_enclose_addr(vmalloc_end);
+	if (static_branch_unlikely(&randomize_vmalloc))
+		va = pvm_find_va_enclose_addr(vmalloc_start +
+					      (get_random_long() &
+					       (roundup_pow_of_two(vmalloc_end - vmalloc_start) - 1)));
+	else
+		va = pvm_find_va_enclose_addr(vmalloc_end);
 	base = pvm_determine_end_from_reverse(&va, align) - end;
 
 	while (true) {
-- 
2.30.1
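
For readers following the alloc_vmap_area() hunk, the offset selection
can be sketched in userspace C roughly as below. This is only an
illustration under assumptions, not the kernel code: the sketch_* names
and SKETCH_PAGE_SIZE are hypothetical stand-ins for the kernel's
roundup_pow_of_two(), PAGE_ALIGN() and PAGE_SIZE, a 4 KiB page is
assumed, and the random value is passed in by the caller instead of
coming from get_random_long().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; hypothetical stand-in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE UINT64_C(4096)

/* Smallest power of two >= n; stand-in for roundup_pow_of_two(). */
static uint64_t sketch_roundup_pow_of_two(uint64_t n)
{
	uint64_t p = 1;

	/* Keep the shift below from overflowing and looping forever. */
	assert(n != 0 && n <= (UINT64_C(1) << 63));
	while (p < n)
		p <<= 1;
	return p;
}

/* Round x up to the next page boundary; stand-in for PAGE_ALIGN(). */
static uint64_t sketch_page_align(uint64_t x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Map a raw random value to a page-aligned offset in [0, range - size],
 * following the same three steps as the patch: mask, align, clamp.
 * "range" plays the role of (vend - vstart).
 */
static uint64_t sketch_random_offset(uint64_t rnd, uint64_t range, uint64_t size)
{
	uint64_t voffset;

	assert(size <= range);

	/* Mask the random value down to the next power of two above range. */
	voffset = rnd & (sketch_roundup_pow_of_two(range) - 1);
	/* Allocations start on page boundaries. */
	voffset = sketch_page_align(voffset);
	/* Clamp so that an area of "size" bytes still fits inside the range. */
	if (voffset + size > range)
		voffset = range - size;
	return voffset;
}
```

As a worked case, with range = 0x180000 and size = 0x1000 the mask is
0x1fffff, so a random value of 0x1f0000 passes the mask unchanged,
exceeds the range and is clamped to 0x17f000. In this sketch, whenever
the range is not a power of two, every masked value above range - size
lands on that same last slot, so the resulting offsets are not
uniformly distributed.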