From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 15 Sep 2023 10:59:29 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Mime-Version: 1.0
References: <20230915105933.495735-1-matteorizzo@google.com>
X-Mailer: git-send-email 2.42.0.459.ge4e396fd5e-goog
Message-ID: <20230915105933.495735-11-matteorizzo@google.com>
Subject: [RFC PATCH 10/14] x86: Create virtual memory region for SLUB
From: Matteo Rizzo <matteorizzo@google.com>
To: cl@linux.com, penberg@kernel.org, rientjes@google.com,
 iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
 roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, keescook@chromium.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org,
 linux-hardening@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
 bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 corbet@lwn.net, luto@kernel.org, peterz@infradead.org
Cc: jannh@google.com, matteorizzo@google.com, evn@google.com, poprdi@google.com,
 jordyzomer@google.com
Content-Type: text/plain; charset="UTF-8"

From: Jann Horn <jannh@google.com>

SLAB_VIRTUAL reserves 512 GiB of virtual memory and uses them for both
struct slab and the actual slab memory. The pointers returned by
kmem_cache_alloc will point to this range of memory.

Signed-off-by: Jann Horn <jannh@google.com>
Co-developed-by: Matteo Rizzo <matteorizzo@google.com>
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
---
 Documentation/arch/x86/x86_64/mm.rst    |  4 ++--
 arch/x86/include/asm/pgtable_64_types.h | 16 ++++++++++++++++
 arch/x86/mm/init_64.c                   | 19 +++++++++++++++----
 arch/x86/mm/kaslr.c                     |  9 +++++++++
 arch/x86/mm/mm_internal.h               |  4 ++++
 mm/slub.c                               |  4 ++++
 security/Kconfig.hardening              |  2 ++
 7 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/arch/x86/x86_64/mm.rst b/Documentation/arch/x86/x86_64/mm.rst
index 35e5e18c83d0..121179537175 100644
--- a/Documentation/arch/x86/x86_64/mm.rst
+++ b/Documentation/arch/x86/x86_64/mm.rst
@@ -57,7 +57,7 @@ Complete virtual memory map with 4-level page tables
    fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                     |            |                  |         | vaddr_end for KASLR
    fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
+   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | SLUB virtual memory
    ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
@@ -116,7 +116,7 @@ Complete virtual memory map with 5-level page tables
    fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                     |            |                  |         | vaddr_end for KASLR
    fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
+   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | SLUB virtual memory
    ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
    ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
    ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 38b54b992f32..e1a91eb084c4 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -6,6 +6,7 @@
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
+#include <linux/align.h>
 #include <asm/kaslr.h>
 
 /*
@@ -199,6 +200,21 @@ extern unsigned int ptrs_per_p4d;
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
+#ifdef CONFIG_SLAB_VIRTUAL
+#define SLAB_PGD_ENTRY		_AC(-3, UL)
+#define SLAB_BASE_ADDR		(SLAB_PGD_ENTRY << P4D_SHIFT)
+#define SLAB_END_ADDR		(SLAB_BASE_ADDR + P4D_SIZE)
+
+/*
+ * We need to define this here because we need it to compute SLAB_META_SIZE
+ * and including slab.h causes a dependency cycle.
+ */
+#define STRUCT_SLAB_SIZE (32 * sizeof(void *))
+#define SLAB_VPAGES ((SLAB_END_ADDR - SLAB_BASE_ADDR) / PAGE_SIZE)
+#define SLAB_META_SIZE ALIGN(SLAB_VPAGES * STRUCT_SLAB_SIZE, PAGE_SIZE)
+#define SLAB_DATA_BASE_ADDR (SLAB_BASE_ADDR + SLAB_META_SIZE)
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 #define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a190aae8ceaf..d716ddfd9880 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1279,16 +1279,19 @@ static void __init register_page_bootmem_info(void)
 }
 
 /*
- * Pre-allocates page-table pages for the vmalloc area in the kernel page-table.
+ * Pre-allocates page-table pages for the vmalloc and SLUB areas in the kernel
+ * page-table.
  * Only the level which needs to be synchronized between all page-tables is
  * allocated because the synchronization can be expensive.
  */
-static void __init preallocate_vmalloc_pages(void)
+static void __init preallocate_top_level_entries_range(unsigned long start,
+						       unsigned long end)
 {
 	unsigned long addr;
 	const char *lvl;
 
-	for (addr = VMALLOC_START; addr <= VMEMORY_END; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
+
+	for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
 		pgd_t *pgd = pgd_offset_k(addr);
 		p4d_t *p4d;
 		pud_t *pud;
@@ -1328,6 +1331,14 @@ static void __init preallocate_vmalloc_pages(void)
 	panic("Failed to pre-allocate %s pages for vmalloc area\n", lvl);
 }
 
+static void __init preallocate_top_level_entries(void)
+{
+	preallocate_top_level_entries_range(VMALLOC_START, VMEMORY_END);
+#ifdef CONFIG_SLAB_VIRTUAL
+	preallocate_top_level_entries_range(SLAB_BASE_ADDR, SLAB_END_ADDR - 1);
+#endif
+}
+
 void __init mem_init(void)
 {
 	pci_iommu_alloc();
@@ -1351,7 +1362,7 @@ void __init mem_init(void)
 	if (get_gate_vma(&init_mm))
 		kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR, PAGE_SIZE, KCORE_USER);
 
-	preallocate_vmalloc_pages();
+	preallocate_top_level_entries();
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 37db264866b6..7b297d372a8c 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -136,6 +136,15 @@ void __init kernel_randomize_memory(void)
 		vaddr = round_up(vaddr + 1, PUD_SIZE);
 		remain_entropy -= entropy;
 	}
+
+#ifdef CONFIG_SLAB_VIRTUAL
+	/*
+	 * slub_addr_base is initialized separately from the
+	 * kaslr_memory_regions because it comes after CPU_ENTRY_AREA_BASE.
+	 */
+	prandom_bytes_state(&rand_state, &rand, sizeof(rand));
+	slub_addr_base += (rand & ((1UL << 36) - PAGE_SIZE));
+#endif
 }
 
 void __meminit init_trampoline_kaslr(void)
diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h
index 3f37b5c80bb3..fafb79b7e019 100644
--- a/arch/x86/mm/mm_internal.h
+++ b/arch/x86/mm/mm_internal.h
@@ -25,4 +25,8 @@ void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache);
 
 extern unsigned long tlb_single_page_flush_ceiling;
 
+#ifdef CONFIG_SLAB_VIRTUAL
+extern unsigned long slub_addr_base;
+#endif
+
 #endif	/* __X86_MM_INTERNAL_H */
diff --git a/mm/slub.c b/mm/slub.c
index 4f77e5d4fe6c..a731fdc79bff 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -166,6 +166,10 @@
  * the fast path and disables lockless freelists.
  */
 
+#ifdef CONFIG_SLAB_VIRTUAL
+unsigned long slub_addr_base = SLAB_DATA_BASE_ADDR;
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 /*
  * We could simply use migrate_disable()/enable() but as long as it's a
  * function call even on !PREEMPT_RT, use inline preempt_disable() there.
diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index 9f4e6e38aa76..f4a0af424149 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -357,6 +357,8 @@ config GCC_PLUGIN_RANDSTRUCT
 
 config SLAB_VIRTUAL
 	bool "Allocate slab objects from virtual memory"
+	# For virtual memory region allocation
+	depends on X86_64
 	depends on SLUB && !SLUB_TINY
 	# If KFENCE support is desired, it could be implemented on top of our
 	# virtual memory allocation facilities
-- 
2.42.0.459.ge4e396fd5e-goog