From: Bo Li
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
 kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
 juri.lelli@redhat.com, vincent.guittot@linaro.org, peterz@infradead.org
Cc: dietmar.eggemann@arm.com, hpa@zytor.com, acme@kernel.org,
 namhyung@kernel.org, mark.rutland@arm.com,
 alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com,
 adrian.hunter@intel.com, kan.liang@linux.intel.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 rppt@kernel.org, surenb@google.com, mhocko@suse.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, jannh@google.com,
 pfalcato@suse.de, riel@surriel.com, harry.yoo@oracle.com,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 duanxiongchun@bytedance.com, yinhongbo@bytedance.com,
 dengliang.1214@bytedance.com, xieyongji@bytedance.com,
 chaiwen.cc@bytedance.com, songmuchun@bytedance.com, yuanzhu@bytedance.com,
 chengguozhu@bytedance.com, sunjiadong.lff@bytedance.com, Bo Li
Subject: [RFC v2 05/35] RPAL: enable virtual address space partitions
Date: Fri, 30 May 2025 17:27:33 +0800

Each RPAL service occupies a contiguous 512GB virtual address space whose
base address is determined by the id assigned during initialization. The
userspace virtual address ranges outside this 512GB window are occupied by
memory balloon mappings, which ensures that the process cannot use those
virtual addresses.

Since the address space layout is determined when the process is loaded,
an RPAL binary is marked by writing the characters "RPAL" into unused
bytes of the ELF header. The loader recognizes this mark and changes how
the process is loaded, so that it is placed within the correct 512GB
address space.

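The partitioning described above can be sketched in user-space C. This is
a minimal sketch, not the kernel code: the sketch_* names are hypothetical,
the constants mirror the values this patch adds to include/linux/rpal.h,
and the user address limit is assumed to be exactly 128TB (4-level paging),
which simplifies the kernel's TASK_SIZE.

```c
#include <stdint.h>

/* Values mirror include/linux/rpal.h in this patch; names are hypothetical. */
#define SKETCH_SZ_1G           (1ULL << 30)
#define SKETCH_ADDR_SPACE_SIZE (512ULL * SKETCH_SZ_1G)  /* 512GB per service */
#define SKETCH_ADDR_SPACE_LOW  SKETCH_ADDR_SPACE_SIZE   /* first 512GB is reserved */
#define SKETCH_USER_TOP        (1ULL << 47)             /* assumption: 128TB user limit */

struct sketch_range {
	uint64_t start;
	uint64_t len;
};

/* Base of the 512GB window that belongs to service 'id'. */
static uint64_t sketch_base(int id)
{
	return SKETCH_ADDR_SPACE_LOW + SKETCH_ADDR_SPACE_SIZE * (uint64_t)id;
}

/* First address above the window of service 'id'. */
static uint64_t sketch_top(int id)
{
	return sketch_base(id) + SKETCH_ADDR_SPACE_SIZE;
}

/*
 * Compute the two ranges a balloon would have to cover so that only
 * [base, top) stays usable: one below the window, one above it.
 */
static void sketch_balloon_ranges(int id, uint64_t min_addr,
				  struct sketch_range *low,
				  struct sketch_range *high)
{
	uint64_t base = sketch_base(id);
	uint64_t top = sketch_top(id);

	low->start = min_addr;
	low->len = base > min_addr ? base - min_addr : 0;
	high->start = top;
	high->len = SKETCH_USER_TOP - top;
}
```

Under these assumptions, id 0 with a 64KB minimum address gives a low
balloon of [64KB, 512GB) and a high balloon of [1TB, 128TB); for id 254
the window ends exactly at 128TB and the high balloon is empty.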
Signed-off-by: Bo Li
---
 arch/x86/mm/mmap.c      | 10 +++++
 arch/x86/rpal/Makefile  |  2 +-
 arch/x86/rpal/mm.c      | 70 +++++++++++++++++++++++++++++
 arch/x86/rpal/service.c |  8 ++++
 fs/binfmt_elf.c         | 98 ++++++++++++++++++++++++++++++++++++++++-
 include/linux/rpal.h    | 65 +++++++++++++++++++++++++++
 6 files changed, 251 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/rpal/mm.c

diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 5ed2109211da..504f2b9a0e8e 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -119,6 +120,15 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
 		*base = mmap_base(random_factor, task_size, rlim_stack);
 }

+#ifdef CONFIG_RPAL
+void rpal_pick_mmap_base(struct mm_struct *mm, struct rlimit *rlim_stack)
+{
+	arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
+			    arch_rnd(RPAL_MAX_RAND_BITS), rpal_get_top(mm->rpal_rs),
+			    rlim_stack);
+}
+#endif
+
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
 	if (mmap_is_legacy())
diff --git a/arch/x86/rpal/Makefile b/arch/x86/rpal/Makefile
index ee3698b5a9b3..2c858a8d7b9e 100644
--- a/arch/x86/rpal/Makefile
+++ b/arch/x86/rpal/Makefile
@@ -2,4 +2,4 @@

 obj-$(CONFIG_RPAL) += rpal.o

-rpal-y := service.o core.o
+rpal-y := service.o core.o mm.o
diff --git a/arch/x86/rpal/mm.c b/arch/x86/rpal/mm.c
new file mode 100644
index 000000000000..f469bcf57b66
--- /dev/null
+++ b/arch/x86/rpal/mm.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * RPAL service level operations
+ * Copyright (c) 2025, ByteDance. All rights reserved.
+ *
+ * Author: Jiadong Sun
+ */
+
+#include
+#include
+#include
+#include
+
+static inline int rpal_balloon_mapping(unsigned long base, unsigned long size)
+{
+	struct vm_area_struct *vma;
+	unsigned long addr, populate;
+	int is_fail = 0;
+
+	if (size == 0)
+		return 0;
+
+	addr = do_mmap(NULL, base, size, PROT_NONE,
+		       MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE,
+		       VM_DONTEXPAND | VM_PFNMAP | VM_DONTDUMP, 0, &populate,
+		       NULL);
+
+	is_fail = base != addr;
+
+	if (is_fail) {
+		pr_info("rpal: Balloon mapping 0x%016lx - 0x%016lx, %s, addr: 0x%016lx\n",
+			base, base + size, is_fail ? "Fail" : "Success", addr);
+	}
+	vma = find_vma(current->mm, addr);
+	if (!vma || vma->vm_start != addr || vma->vm_end != addr + size) {
+		is_fail = 1;
+		rpal_err("rpal: find vma 0x%016lx - 0x%016lx fail\n", addr,
+			 addr + size);
+	}
+
+	return is_fail;
+}
+
+#define RPAL_USER_TOP TASK_SIZE
+
+int rpal_balloon_init(unsigned long base)
+{
+	unsigned long top;
+	struct mm_struct *mm = current->mm;
+	int ret;
+
+	top = base + RPAL_ADDR_SPACE_SIZE;
+
+	mmap_write_lock(mm);
+
+	if (base > mmap_min_addr) {
+		ret = rpal_balloon_mapping(mmap_min_addr, base - mmap_min_addr);
+		if (ret)
+			goto out;
+	}
+
+	ret = rpal_balloon_mapping(top, RPAL_USER_TOP - top);
+	if (ret && base > mmap_min_addr)
+		do_munmap(mm, mmap_min_addr, base - mmap_min_addr, NULL);
+
+out:
+	mmap_write_unlock(mm);
+
+	return ret;
+}
diff --git a/arch/x86/rpal/service.c b/arch/x86/rpal/service.c
index 55ecb7e0ef8c..caa4afa5a2c6 100644
--- a/arch/x86/rpal/service.c
+++ b/arch/x86/rpal/service.c
@@ -143,6 +143,11 @@ static void delete_service(struct rpal_service *rs)
 	spin_unlock_irqrestore(&hash_table_lock, flags);
 }

+static inline unsigned long calculate_base_address(int id)
+{
+	return RPAL_ADDRESS_SPACE_LOW + RPAL_ADDR_SPACE_SIZE * id;
+}
+
 struct rpal_service *rpal_register_service(void)
 {
 	struct rpal_service *rs;
@@ -168,6 +173,9 @@ struct rpal_service *rpal_register_service(void)
 	if (unlikely(rs->key == RPAL_INVALID_KEY))
 		goto key_fail;

+	rs->bad_service = false;
+	rs->base = calculate_base_address(rs->id);
+
 	current->rpal_rs = rs;
 	rs->group_leader = get_task_struct(current);

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index a43363d593e5..9d27d9922de4 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -47,6 +47,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -814,6 +815,61 @@ static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr,
 	return ret == -ENOENT ? 0 : ret;
 }

+#if IS_ENABLED(CONFIG_RPAL)
+static int rpal_create_service(char *e_ident, struct rpal_service **rs,
+			       unsigned long *rpal_base, int *retval,
+			       struct linux_binprm *bprm, int executable_stack)
+{
+	/*
+	 * The first 16 bytes of the ELF binary are the e_ident field, and the
+	 * last 7 bytes of that are reserved and ignored. We use the last 4
+	 * bytes to mark an RPAL binary. If they contain "RPAL", this is an
+	 * RPAL binary and we need to run the registration routine.
+	 */
+	if (memcmp(e_ident + RPAL_MAGIC_OFFSET, RPAL_MAGIC, RPAL_MAGIC_LEN) ==
+	    0) {
+		unsigned long rpal_stack_top = STACK_TOP;
+
+		*rs = rpal_register_service();
+		if (*rs != NULL) {
+			*rpal_base = rpal_get_base(*rs);
+			rpal_stack_top = *rpal_base + RPAL_ADDR_SPACE_SIZE;
+			/*
+			 * We need to recalculate the mmap_base, otherwise the address space
+			 * layout randomization will not make any difference.
+			 */
+			rpal_pick_mmap_base(current->mm, &bprm->rlim_stack);
+		}
+		/*
+		 * An RPAL process only has a contiguous 512GB address space, whose
+		 * base address is given by its struct rpal_service. We need to
+		 * rearrange the user stack in this 512GB address space.
+		 */
+		*retval = setup_arg_pages(bprm,
+					  randomize_stack_top(rpal_stack_top),
+					  executable_stack);
+		/*
+		 * Use a memory balloon to keep the kernel from allocating VMAs
+		 * outside the process's 512GB address space.
+		 */
+		if (unlikely(*rs != NULL && rpal_balloon_init(*rpal_base))) {
+			rpal_err("pid: %d, comm: %s: rpal balloon init fail\n",
+				 current->pid, current->comm);
+			rpal_unregister_service(*rs);
+			*rs = NULL;
+			*retval = -EINVAL;
+			goto out;
+		}
+	} else {
+		*retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP),
+					  executable_stack);
+	}
+
+out:
+	return 0;
+}
+#endif
+
 static int load_elf_binary(struct linux_binprm *bprm)
 {
 	struct file *interpreter = NULL; /* to shut gcc up */
@@ -836,6 +892,10 @@ static int load_elf_binary(struct linux_binprm *bprm)
 	struct arch_elf_state arch_state = INIT_ARCH_ELF_STATE;
 	struct mm_struct *mm;
 	struct pt_regs *regs;
+#ifdef CONFIG_RPAL
+	struct rpal_service *rs = NULL;
+	unsigned long rpal_base;
+#endif

 	retval = -ENOEXEC;
 	/* First of all, some simple consistency checks */
@@ -1008,10 +1068,19 @@ static int load_elf_binary(struct linux_binprm *bprm)

 	setup_new_exec(bprm);

+#ifdef CONFIG_RPAL
+	/* Fall back to the original setup if this fails */
+	if (rpal_create_service((char *)&elf_ex->e_ident, &rs, &rpal_base,
+				&retval, bprm, executable_stack))
+		retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP),
+					 executable_stack);
+#else
 	/* Do this so that we can load the interpreter, if need be.  We will
 	   change some of these later */
 	retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP),
 				 executable_stack);
+#endif
+
 	if (retval < 0)
 		goto out_free_dentry;

@@ -1055,6 +1124,22 @@ static int load_elf_binary(struct linux_binprm *bprm)
 			 * is needed.
 			 */
 			elf_flags |= MAP_FIXED_NOREPLACE;
+#ifdef CONFIG_RPAL
+			/*
+			 * If we load a MAP_FIXED binary, it will either fail in
+			 * mmap, because the balloon mappings already exist, or
+			 * succeed when its fixed address happens to fall inside
+			 * its RPAL address space. A MAP_FIXED binary should by
+			 * no means be an RPAL service. For now we only print an
+			 * error; this may be handled in the future.
+			 */
+			if (unlikely(rs != NULL)) {
+				rpal_err(
+					"pid: %d, comm: %s, load a binary with MAP_FIXED segment\n",
+					current->pid, current->comm);
+				rs->bad_service = true;
+			}
+#endif
 		} else if (elf_ex->e_type == ET_DYN) {
 			/*
 			 * This logic is run once for the first LOAD Program
@@ -1128,6 +1213,12 @@ static int load_elf_binary(struct linux_binprm *bprm)
 			/* Adjust alignment as requested. */
 			if (alignment)
 				load_bias &= ~(alignment - 1);
+#ifdef CONFIG_RPAL
+			if (rs != NULL) {
+				load_bias &= RPAL_RAND_ADDR_SPACE_MASK;
+				load_bias += rpal_base;
+			}
+#endif
 			elf_flags |= MAP_FIXED_NOREPLACE;
 		} else {
 			/*
@@ -1306,7 +1397,12 @@ static int load_elf_binary(struct linux_binprm *bprm)
 	if (!IS_ENABLED(CONFIG_COMPAT_BRK) &&
 	    IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) &&
 	    elf_ex->e_type == ET_DYN && !interpreter) {
-		elf_brk = ELF_ET_DYN_BASE;
+#ifdef CONFIG_RPAL
+		if (rs && !rs->bad_service)
+			elf_brk = rpal_base;
+		else
+#endif
+			elf_brk = ELF_ET_DYN_BASE;
 		/* This counts as moving the brk, so let brk(2) know. */
 		brk_moved = true;
 	}
diff --git a/include/linux/rpal.h b/include/linux/rpal.h
index 7b9d90b62b3f..f7c0de747f55 100644
--- a/include/linux/rpal.h
+++ b/include/linux/rpal.h
@@ -15,11 +15,17 @@
 #include
 #include
 #include
+#include

 #define RPAL_ERROR_MSG "rpal error: "
 #define rpal_err(x...) pr_err(RPAL_ERROR_MSG x)
 #define rpal_err_ratelimited(x...) pr_err_ratelimited(RPAL_ERROR_MSG x)

+/* RPAL magic macros in binary elf header */
+#define RPAL_MAGIC "RPAL"
+#define RPAL_MAGIC_OFFSET 12
+#define RPAL_MAGIC_LEN 4
+
 /*
  * The first 512GB is reserved due to mmap_min_addr.
  * The last 512GB is dropped since stack will be initially
@@ -30,6 +36,47 @@
 #define RPAL_FIRST_KEY _AC(1, UL)
 #define RPAL_INVALID_KEY _AC(0, UL)

+/*
+ * Process Virtual Address Space Layout (For 4-level Paging)
+ * |-------------|
+ * | No Mapping  |
+ * |-------------| <-- 64 KB (mmap_min_addr)
+ * |     ...     |
+ * |-------------| <-- 1 * 512 GB
+ * |  Service 0  |
+ * |-------------| <-- 2 * 512 GB
+ * |  Service 1  |
+ * |-------------| <-- 3 * 512 GB
+ * |  Service 2  |
+ * |-------------| <-- 4 * 512 GB
+ * |     ...     |
+ * |-------------| <-- 255 * 512 GB
+ * | Service 254 |
+ * |-------------| <-- 128 TB
+ * |             |
+ * |     ...     |
+ * |-------------| <-- PAGE_OFFSET
+ * |             |
+ * |   Kernel    |
+ * |_____________|
+ *
+ */
+#define RPAL_ADDR_SPACE_SIZE (_AC(512, UL) * SZ_1G)
+/*
+ * Since RPAL restricts the virtual address space used by a single
+ * process to 512GB, the number of bits for address randomization
+ * must be correspondingly reduced; otherwise, issues such as overlaps
+ * in randomized addresses could occur. RPAL employs 20-bit (page number)
+ * address randomization to balance security and usability.
+ */
+#define RPAL_RAND_ADDR_SPACE_MASK _AC(0xfffffff0, UL)
+#define RPAL_MAX_RAND_BITS 20
+
+#define RPAL_NR_ADDR_SPACE 256
+
+#define RPAL_ADDRESS_SPACE_LOW ((0UL) + RPAL_ADDR_SPACE_SIZE)
+#define RPAL_ADDRESS_SPACE_HIGH ((0UL) + RPAL_NR_ADDR_SPACE * RPAL_ADDR_SPACE_SIZE)
+
 /*
  * Each RPAL process (a.k.a RPAL service) should have a pointer to
  * struct rpal_service in all its tasks' task_struct.
@@ -52,6 +99,10 @@ struct rpal_service {
 	u64 key;
 	/* virtual address space id */
 	int id;
+	/* virtual address space base address of this service */
+	unsigned long base;
+	/* bad rpal binary */
+	bool bad_service;

 	/*
 	 * Fields above should never change after initialization.
@@ -86,6 +137,16 @@ struct rpal_service *rpal_get_service(struct rpal_service *rs);
  */
 void rpal_put_service(struct rpal_service *rs);

+static inline unsigned long rpal_get_base(struct rpal_service *rs)
+{
+	return rs->base;
+}
+
+static inline unsigned long rpal_get_top(struct rpal_service *rs)
+{
+	return rs->base + RPAL_ADDR_SPACE_SIZE;
+}
+
 #ifdef CONFIG_RPAL
 static inline struct rpal_service *rpal_current_service(void)
 {
@@ -100,4 +161,8 @@ struct rpal_service *rpal_register_service(void);
 struct rpal_service *rpal_get_service_by_key(u64 key);
 void copy_rpal(struct task_struct *p);
 void exit_rpal(bool group_dead);
+int rpal_balloon_init(unsigned long base);
+
+extern void rpal_pick_mmap_base(struct mm_struct *mm,
+				struct rlimit *rlim_stack);
 #endif
-- 
2.20.1
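
For reference, the ELF marking that the loader checks for can be sketched
in user-space C. This is only an illustration under stated assumptions:
the sketch_* names are hypothetical, the offset and length mirror
RPAL_MAGIC_OFFSET and RPAL_MAGIC_LEN from this patch, and how real RPAL
binaries get tagged (linker option, post-link tool, or otherwise) is not
specified here.

```c
#include <stdbool.h>
#include <string.h>

#define SKETCH_EI_NIDENT    16      /* size of e_ident in the ELF header */
#define SKETCH_MAGIC        "RPAL"  /* mirrors RPAL_MAGIC */
#define SKETCH_MAGIC_OFFSET 12      /* mirrors RPAL_MAGIC_OFFSET */
#define SKETCH_MAGIC_LEN    4       /* mirrors RPAL_MAGIC_LEN */

/* Return true if the last 4 bytes of e_ident carry the "RPAL" mark. */
static bool sketch_is_rpal_binary(const unsigned char *e_ident)
{
	return memcmp(e_ident + SKETCH_MAGIC_OFFSET, SKETCH_MAGIC,
		      SKETCH_MAGIC_LEN) == 0;
}

/* Write the mark into the padding bytes at the end of e_ident. */
static void sketch_mark_rpal_binary(unsigned char *e_ident)
{
	memcpy(e_ident + SKETCH_MAGIC_OFFSET, SKETCH_MAGIC, SKETCH_MAGIC_LEN);
}
```

A tagging tool along these lines would read the first SKETCH_EI_NIDENT
bytes of the file, call sketch_mark_rpal_binary() on them, and write them
back; bytes 0 to 11, including the ELF magic and class fields, are left
untouched.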