From mboxrd@z Thu Jan 1 00:00:00 1970
From: hejingxian <hejingxian@huawei.com>
To: Andrew Morton, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org"
CC: Hushiyuan, "hewenliang (C)"
Subject: [PATCH] add pin memory method for checkpoint and restore
Date: Fri, 18 Dec 2020 14:25:09 +0000
From: Jingxian He <hejingxian@huawei.com>
Date: Thu, 10 Dec 2020 20:31:15 +0800
Subject: [PATCH] add pin memory method for checkpoint and restore

We can use checkpoint and restore in userspace (CRIU) to dump and restore
tasks when updating the kernel. Currently, CRIU needs to dump all memory
data of the tasks to files. When the memory size is very large (larger than
1 GB), dumping the data takes a long time (more than 1 minute).

We can instead pin the memory data of the tasks and collect the
corresponding physical page mapping info during checkpoint, then remap the
physical pages into the restored tasks after the kernel is updated.

The pin memory area info is saved in a reserved memblock named
nvwa_res_first, which stays usable across the kernel update.

The pin memory driver provides the following ioctl commands for CRIU
(an illustrative user-space sketch is appended after the patch):
1) SET_PIN_MEM_AREA: set a pin memory area, which can later be remapped
   into the restored task.
2) CLEAR_PIN_MEM_AREA: clear the pin memory area info, so the user can
   reset the pinned data.
3) REMAP_PIN_MEM_AREA: remap the pinned pages into the restored task.

Signed-off-by: Jingxian He <hejingxian@huawei.com>
---
 arch/arm64/kernel/setup.c  |   7 +
 arch/arm64/mm/init.c       |  62 +++-
 drivers/char/Kconfig       |   7 +
 drivers/char/Makefile      |   1 +
 drivers/char/pin_memory.c  | 198 +++++++++++++
 include/linux/crash_core.h |   5 +
 include/linux/pin_mem.h    |  62 ++++
 kernel/crash_core.c        |  11 +
 mm/Kconfig                 |   6 +
 mm/Makefile                |   1 +
 mm/huge_memory.c           |  61 ++++
 mm/memory.c                |  68 +++++
 mm/pin_mem.c               | 691 +++++++++++++++++++++++++++++++++++++++++++
 13 files changed, 1179 insertions(+), 1 deletion(-)
 create mode 100644 drivers/char/pin_memory.c
 create mode 100644 include/linux/pin_mem.h
 create mode 100644 mm/pin_mem.c

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 56f6645..40751ed 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -50,6 +50,9 @@
 #include <asm/efi.h>
 #include <asm/xen/hypervisor.h>
 #include <asm/mmu_context.h>
+#ifdef CONFIG_PIN_MEMORY
+#include <linux/pin_mem.h>
+#endif
 
 static int num_standard_resources;
 static struct resource *standard_resources;
@@ -243,6 +246,10 @@ static void __init request_standard_resources(void)
 			    crashk_res.end <= res->end)
 				request_resource(res, &crashk_res);
 #endif
+#ifdef CONFIG_PIN_MEMORY
+		if (pin_memory_resource.end)
+			insert_resource(&iomem_resource, &pin_memory_resource);
+#endif
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b65dffd..dee3192 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -41,7 +41,9 @@
 #include <linux/sizes.h>
 #include <asm/tlb.h>
 #include <asm/alternative.h>
-
+#ifdef CONFIG_PIN_MEMORY
+#include <linux/pin_mem.h>
+#endif
 #define ARM64_ZONE_DMA_BITS	30
 
 /*
@@ -68,6 +70,16 @@
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static phys_addr_t arm64_dma32_phys_limit __ro_after_init;
 
+#ifdef CONFIG_PIN_MEMORY
+struct resource pin_memory_resource = {
+	.name = "Pin memory maps",
+	.start = 0,
+	.end = 0,
+	.flags = IORESOURCE_MEM,
+	.desc = IORES_DESC_PIN_MEM_MAPS
+};
+#endif
+
 #ifdef CONFIG_KEXEC_CORE
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
@@ -129,6 +141,47 @@ static void __init reserve_crashkernel(void)
 }
 #endif /* CONFIG_KEXEC_CORE */
 
+#ifdef CONFIG_PIN_MEMORY
+static void __init reserve_pin_memory_res(void)
+{
+	unsigned long long mem_start, mem_len;
+	int ret;
+
+	ret = parse_pin_memory(boot_command_line, memblock_phys_mem_size(),
+			       &mem_len, &mem_start);
+	if (ret || !mem_len)
+		return;
+
+	mem_len =
PAGE_ALIGN(mem_len); + + if (!memblock_is_region_memory(mem_start, mem_len)) { + pr_warn("cannot reserve for pin memory: region is not memo= ry!\n"); + return; + } + + if (memblock_is_region_reserved(mem_start, mem_len)) { + pr_warn("cannot reserve for pin memory: region overlaps re= served memory!\n"); + return; + } + + if (!IS_ALIGNED(mem_start, SZ_2M)) { + pr_warn("cannot reserve for pin memory: base address is no= t 2MB aligned\n"); + return; + } + + memblock_reserve(mem_start, mem_len); + pr_debug("pin memory resource reserved: 0x%016llx - 0x%016llx (%lld= MB)\n", + mem_start, mem_start + mem_len, mem_len >> 20); + + pin_memory_resource.start =3D mem_start; + pin_memory_resource.end =3D mem_start + mem_len - 1; +} +#else +static void __init reserve_pin_memory_res(void) +{ +} +#endif /* CONFIG_PIN_MEMORY */ + #ifdef CONFIG_CRASH_DUMP static int __init early_init_dt_scan_elfcorehdr(unsigned long node, const char *uname, int depth, void *data) @@ -452,6 +505,8 @@ void __init arm64_memblock_init(void) reserve_crashkernel(); + reserve_pin_memory_res(); + reserve_elfcorehdr(); high_memory =3D __va(memblock_end_of_DRAM() - 1) + 1; @@ -573,6 +628,11 @@ void __init mem_init(void) /* this will put all unused low memory onto the freelists */ memblock_free_all(); +#ifdef CONFIG_PIN_MEMORY + /* pre alloc the pages for pin memory */ + init_reserve_page_map((unsigned long)pin_memory_resource.start, + (unsigned long)(pin_memory_resource.end - pin_memory_resou= rce.start)); +#endif mem_init_print_info(NULL); /* diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig index 26956c0..73af2f0 100644 --- a/drivers/char/Kconfig +++ b/drivers/char/Kconfig @@ -560,3 +560,10 @@ config RANDOM_TRUST_BOOTLOADER booloader is trustworthy so it will be added to the kernel's entropy pool. Otherwise, say N here so it will be regarded as device input t= hat only mixes the entropy pool. + +config PIN_MEMORY_DEV + bool "/dev/pinmem character device" + depends PIN_MEMORY + default n + help + pin memory driver diff --git a/drivers/char/Makefile b/drivers/char/Makefile index 7c5ea6f..1941642 100644 --- a/drivers/char/Makefile +++ b/drivers/char/Makefile @@ -52,3 +52,4 @@ js-rtc-y =3D rtc.o obj-$(CONFIG_XILLYBUS) +=3D xillybus/ obj-$(CONFIG_POWERNV_OP_PANEL) +=3D powernv-op-panel.o obj-$(CONFIG_ADI) +=3D adi.o +obj-$(CONFIG_PIN_MEMORY_DEV) +=3D pin_memory.o diff --git a/drivers/char/pin_memory.c b/drivers/char/pin_memory.c new file mode 100644 index 00000000..a0464e1 --- /dev/null +++ b/drivers/char/pin_memory.c @@ -0,0 +1,198 @@ +/* + * Copyright @ Huawei Technologies Co., Ltd. 2020-2020. ALL rights reserve= d. 
+ * Description: Euler pin memory driver + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MAX_PIN_MEM_AREA_NUM 16 +struct _pin_mem_area { + unsigned long virt_start; + unsigned long virt_end; +}; + +struct pin_mem_area_set { + unsigned int pid; + unsigned int area_num; + struct _pin_mem_area mem_area[MAX_PIN_MEM_AREA_NUM]; +}; + +#define PIN_MEM_MAGIC 0x59 +#define _SET_PIN_MEM_AREA 1 +#define _CLEAR_PIN_MEM_AREA 2 +#define _REMAP_PIN_MEM_AREA 3 +#define SET_PIN_MEM_AREA _IOW(PIN_MEM_MAGIC, _SET_PIN_MEM_AREA, struct = pin_mem_area_set) +#define CLEAR_PIN_MEM_AREA _IOW(PIN_MEM_MAGIC, _CLEAR_PIN_MEM_AREA, i= nt) +#define REMAP_PIN_MEM_AREA _IOW(PIN_MEM_MAGIC, _REMAP_PIN_MEM_AREA, i= nt) + +static int set_pin_mem(struct pin_mem_area_set *pmas) +{ + int i; + int ret =3D 0; + struct _pin_mem_area *pma; + struct mm_struct *mm; + struct task_struct *task; + struct pid *pid_s; + + pid_s =3D find_get_pid(pmas->pid); + if (!pid_s) { + pr_warn("Get pid struct fail:%d.\n", pmas->pid); + goto fail; + } + rcu_read_lock(); + task =3D pid_task(pid_s, PIDTYPE_PID); + if (!task) { + pr_warn("Get task struct fail:%d.\n", pmas->pid); + goto fail; + } + mm =3D get_task_mm(task); + for (i =3D 0; i < pmas->area_num; i++) { + pma =3D &(pmas->mem_area[i]); + ret =3D pin_mem_area(task, mm, pma->virt_start, pma->virt_= end); + if (ret) { + mmput(mm); + goto fail; + } + } + mmput(mm); + rcu_read_unlock(); + return ret; + +fail: + rcu_read_unlock(); + return -EFAULT; +} + +static int set_pin_mem_area(unsigned long arg) +{ + struct pin_mem_area_set pmas; + void __user *buf =3D (void __user *)arg; + + if (!access_ok(buf, sizeof(pmas))) + return -EFAULT; + if (copy_from_user(&pmas, buf, sizeof(pmas))) + return -EINVAL; + if (pmas.area_num > MAX_PIN_MEM_AREA_NUM) { + pr_warn("Input area_num is too large.\n"); + return -EINVAL; + } + + return set_pin_mem(&pmas); +} + +static int pin_mem_remap(unsigned long arg) +{ + int pid; + struct task_struct *task; + struct mm_struct *mm; + vm_fault_t ret; + void __user *buf =3D (void __user *)arg; + struct pid *pid_s; + + if (!access_ok(buf, sizeof(int))) + return -EINVAL; + if (copy_from_user(&pid, buf, sizeof(int))) + return -EINVAL; + + pid_s =3D find_get_pid(pid); + if (!pid_s) { + pr_warn("Get pid struct fail:%d.\n", pid); + return -EINVAL; + } + rcu_read_lock(); + task =3D pid_task(pid_s, PIDTYPE_PID); + if (!task) { + pr_warn("Get task struct fail:%d.\n", pid); + goto fault; + } + mm =3D get_task_mm(task); + ret =3D do_mem_remap(pid, mm); + if (ret) { + pr_warn("Handle pin memory remap fail.\n"); + mmput(mm); + goto fault; + } + mmput(mm); + rcu_read_unlock(); + return 0; + +fault: + rcu_read_unlock(); + return -EFAULT; +} + +static long pin_memory_ioctl(struct file *file, unsigned cmd, unsigned lon= g arg) +{ + long ret =3D 0; + + if (_IOC_TYPE(cmd) !=3D PIN_MEM_MAGIC) + return -EINVAL; + if (_IOC_NR(cmd) > _REMAP_PIN_MEM_AREA) + return -EINVAL; + + switch (cmd) { + case SET_PIN_MEM_AREA: + ret =3D set_pin_mem_area(arg); + break; + case CLEAR_PIN_MEM_AREA: + clear_pin_memory_record(); + break; + case REMAP_PIN_MEM_AREA: + ret =3D pin_mem_remap(arg); + break; + default: + return -EINVAL; + } + return ret; +} + +static const struct file_operations pin_memory_fops =3D { + .owner =3D THIS_MODULE, + .unlocked_ioctl =3D pin_memory_ioctl, + .compat_ioctl =3D pin_memory_ioctl, +}; + +static struct miscdevice pin_memory_miscdev =3D { + .minor =3D 
MISC_DYNAMIC_MINOR, + .name =3D "pinmem", + .fops =3D &pin_memory_fops, +}; + +static int pin_memory_init(void) +{ + int err =3D misc_register(&pin_memory_miscdev); + if (!err) { + pr_info("pin_memory init\n"); + } else { + pr_warn("pin_memory init failed!\n"); + } + return err; +} + +static void pin_memory_exit(void) +{ + misc_deregister(&pin_memory_miscdev); + pr_info("pin_memory ko exists!\n"); +} + +module_init(pin_memory_init); +module_exit(pin_memory_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Euler"); +MODULE_DESCRIPTION("pin memory"); diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h index 525510a..5baf40d 100644 --- a/include/linux/crash_core.h +++ b/include/linux/crash_core.h @@ -75,4 +75,9 @@ int parse_crashkernel_high(char *cmdline, unsigned long l= ong system_ram, int parse_crashkernel_low(char *cmdline, unsigned long long system_ram, unsigned long long *crash_size, unsigned long long *crash_b= ase); +#ifdef CONFIG_PIN_MEMORY +int __init parse_pin_memory(char *cmdline, unsigned long long system_ram, + unsigned long long *pin_size, unsigned long long *pin_base= ); +#endif + #endif /* LINUX_CRASH_CORE_H */ diff --git a/include/linux/pin_mem.h b/include/linux/pin_mem.h new file mode 100644 index 00000000..0ca44ac --- /dev/null +++ b/include/linux/pin_mem.h @@ -0,0 +1,62 @@ +/* + * Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. + * Provide the pin memory method for check point and restore task. + */ +#ifndef _LINUX_PIN_MEMORY_H +#define _LINUX_PIN_MEMORY_H + +#ifdef CONFIG_PIN_MEMORY +#include +#include +#include +#include +#ifdef CONFIG_ARM64 +#include +#endif + +#define PAGE_BUDDY_MAPCOUNT_VALUE (~PG_buddy) + +#define COLLECT_PAGES_FINISH 1 +#define COLLECT_PAGES_NEED_CONTINUE -1 +#define COLLECT_PAGES_FAIL 0 + +#define COMPOUND_PAD_MASK 0xffffffff +#define COMPOUND_PAD_START 0x88 +#define COMPOUND_PAD_DELTA 0x40 +#define LIST_POISON4 0xdead000000000400 + +#define next_pme(pme) ((unsigned long *)(pme + 1) + pme->nr_pages) + +struct page_map_entry { + unsigned long virt_addr; + unsigned int nr_pages; + unsigned int is_huge_page; + unsigned long phy_addr_array[0]; +}; + +struct page_map_info { + int pid; + int pid_reserved; + unsigned int entry_num; + struct page_map_entry *pme; +}; + +extern struct page_map_info *get_page_map_info(int pid); +extern struct page_map_info *create_page_map_info(int pid); +extern vm_fault_t do_mem_remap(int pid, struct mm_struct *mm); +extern vm_fault_t do_anon_page_remap(struct vm_area_struct *vma, unsigned = long address, + pmd_t *pmd, struct page *page); +extern void clear_pin_memory_record(void); +extern int pin_mem_area(struct task_struct *task, struct mm_struct *mm, + unsigned long start_addr, unsigned long end_addr); +extern vm_fault_t do_anon_huge_page_remap(struct vm_area_struct *vma, unsi= gned long address, + pmd_t *pmd, struct page *page); + +/* reserve space for pin memory*/ +#ifdef CONFIG_ARM64 +extern struct resource pin_memory_resource; +#endif +extern void init_reserve_page_map(unsigned long map_addr, unsigned long ma= p_size); + +#endif /* CONFIG_PIN_MEMORY */ +#endif /* _LINUX_PIN_MEMORY_H */ diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 9f1557b..7512696 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -292,6 +292,17 @@ int __init parse_crashkernel_low(char *cmdline, "crashkernel=3D", suffix_tbl[SUFFIX_LOW]= ); } +#ifdef CONFIG_PIN_MEMORY +int __init parse_pin_memory(char *cmdline, + unsigned long long system_ram, + unsigned long long *pin_size, + unsigned 
long long *pin_base) +{ + return __parse_crashkernel(cmdline, system_ram, pin_size, pin_base, + "pinmemory=3D", NULL); +} +#endif + Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type, void *data, size_t data_len) { diff --git a/mm/Kconfig b/mm/Kconfig index ab80933..c2dd088 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -739,4 +739,10 @@ config ARCH_HAS_HUGEPD config MAPPING_DIRTY_HELPERS bool +config PIN_MEMORY + bool "Support for pin memory" + depends on CHECKPOINT_RESTORE + help + Say y here to enable the pin memory feature for checkpoint + and restore. endmenu diff --git a/mm/Makefile b/mm/Makefile index 1937cc2..7e1984e 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -108,3 +108,4 @@ obj-$(CONFIG_ZONE_DEVICE) +=3D memremap.o obj-$(CONFIG_HMM_MIRROR) +=3D hmm.o obj-$(CONFIG_MEMFD_CREATE) +=3D memfd.o obj-$(CONFIG_MAPPING_DIRTY_HELPERS) +=3D mapping_dirty_helpers.o +obj-$(CONFIG_PIN_MEMORY) +=3D pin_mem.o diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a880932..93dc582 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3083,4 +3083,65 @@ void remove_migration_pmd(struct page_vma_mapped_wal= k *pvmw, struct page *new) mlock_vma_page(new); update_mmu_cache_pmd(vma, address, pvmw->pmd); } + +#ifdef CONFIG_PIN_MEMORY +vm_fault_t do_anon_huge_page_remap(struct vm_area_struct *vma, unsigned lo= ng address, + pmd_t *pmd, struct page *page) +{ + gfp_t gfp; + pgtable_t pgtable; + spinlock_t *ptl; + pmd_t entry; + vm_fault_t ret =3D 0; + struct mem_cgroup *memcg; + + if (unlikely(anon_vma_prepare(vma))) + return VM_FAULT_OOM; + if (unlikely(khugepaged_enter(vma, vma->vm_flags))) + return VM_FAULT_OOM; + gfp =3D alloc_hugepage_direct_gfpmask(vma); + prep_transhuge_page(page); + if (mem_cgroup_try_charge_delay(page, vma->vm_mm, gfp, &memcg, true= )) { + put_page(page); + count_vm_event(THP_FAULT_FALLBACK); + return VM_FAULT_FALLBACK; + } + pgtable =3D pte_alloc_one(vma->vm_mm, address); + if (unlikely(!pgtable)) { + ret =3D VM_FAULT_OOM; + goto release; + } + __SetPageUptodate(page); + ptl =3D pmd_lock(vma->vm_mm, pmd); + if (unlikely(!pmd_none(*pmd))) { + goto unlock_release; + } else { + ret =3D check_stable_address_space(vma->vm_mm); + if (ret) + goto unlock_release; + entry =3D mk_huge_pmd(page, vma->vm_page_prot); + entry =3D maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); + page_add_new_anon_rmap(page, vma, address, true); + mem_cgroup_commit_charge(page, memcg, false, true); + lru_cache_add_active_or_unevictable(page, vma); + pgtable_trans_huge_deposit(vma->vm_mm, pmd, pgtable); + set_pmd_at(vma->vm_mm, address, pmd, entry); + add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR); + mm_inc_nr_ptes(vma->vm_mm); + spin_unlock(ptl); + count_vm_event(THP_FAULT_ALLOC); + } + + return 0; +unlock_release: + spin_unlock(ptl); +release: + if (pgtable) + pte_free(vma->vm_mm, pgtable); + mem_cgroup_cancel_charge(page, memcg, true); + put_page(page); + return ret; +} +#endif + #endif diff --git a/mm/memory.c b/mm/memory.c index 45442d9..dd416fd 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4799,4 +4799,72 @@ void ptlock_free(struct page *page) { kmem_cache_free(page_ptl_cachep, page->ptl); } + +#ifdef CONFIG_PIN_MEMORY +vm_fault_t do_anon_page_remap(struct vm_area_struct *vma, unsigned long ad= dress, + pmd_t *pmd, struct page *page) +{ + struct mem_cgroup *memcg; + pte_t entry; + spinlock_t *ptl; + pte_t *pte; + vm_fault_t ret =3D 0; + + if (pte_alloc(vma->vm_mm, pmd, address)) + return VM_FAULT_OOM; + + /* See the comment in pte_alloc_one_map() */ + if 
(unlikely(pmd_trans_unstable(pmd))) + return 0; + + /* Allocate our own private page. */ + if (unlikely(anon_vma_prepare(vma))) + goto oom; + + if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memc= g, + false)) + goto oom_free_page; + + /* + * The memory barrier inside __SetPageUptodate makes sure that + * preceeding stores to the page contents become visible before + * the set_pte_at() write. + */ + __SetPageUptodate(page); + + entry =3D mk_pte(page, vma->vm_page_prot); + if (vma->vm_flags & VM_WRITE) + entry =3D pte_mkwrite(pte_mkdirty(entry)); + pte =3D pte_offset_map_lock(vma->vm_mm, pmd, address, + &ptl); + if (!pte_none(*pte)) { + ret =3D VM_FAULT_FALLBACK; + goto release; + } + + ret =3D check_stable_address_space(vma->vm_mm); + if (ret) + goto release; + inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); + page_add_new_anon_rmap(page, vma, address, false); + mem_cgroup_commit_charge(page, memcg, false, false); + lru_cache_add_active_or_unevictable(page, vma); + + set_pte_at(vma->vm_mm, address, pte, entry); + /* No need to invalidate - it was non-present before */ + update_mmu_cache(vma, address, pte); +unlock: + pte_unmap_unlock(pte, ptl); + return ret; +release: + mem_cgroup_cancel_charge(page, memcg, false); + put_page(page); + goto unlock; +oom_free_page: + put_page(page); +oom: + return VM_FAULT_OOM; +} +#endif + #endif diff --git a/mm/pin_mem.c b/mm/pin_mem.c new file mode 100644 index 00000000..ca3f23a --- /dev/null +++ b/mm/pin_mem.c @@ -0,0 +1,691 @@ +/* + * Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. + * Provide the pin memory method for check point and restore task. + */ +#ifdef CONFIG_PIN_MEMORY +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MAX_PIN_PID_NUM 128 +static DEFINE_SPINLOCK(page_map_entry_lock); + +unsigned int pin_pid_num; +static unsigned int *pin_pid_num_addr; +static unsigned long __page_map_entry_start; +static unsigned long page_map_entry_end; +static struct page_map_info *user_space_reserve_start; +static struct page_map_entry *page_map_entry_start; +unsigned int max_pin_pid_num __read_mostly; + +static int __init setup_max_pin_pid_num(char *str) +{ + int ret =3D 1; + + if (!str) + goto out; + + ret =3D kstrtouint(str, 10, &max_pin_pid_num); +out: + if (ret) { + pr_warn("Unable to parse max pin pid num.\n"); + } else { + if (max_pin_pid_num > MAX_PIN_PID_NUM) { + max_pin_pid_num =3D 0; + pr_warn("Input max_pin_pid_num is too large.\n")= ; + } + } + return ret; +} +early_param("max_pin_pid_num", setup_max_pin_pid_num); + +struct page_map_info *create_page_map_info(int pid) +{ + struct page_map_info *new; + + if (!user_space_reserve_start) + return NULL; + + if (pin_pid_num >=3D max_pin_pid_num) { + pr_warn("Pin pid num too large than max_pin_pid_num, fail = create: %d!", pid); + return NULL; + } + new =3D (struct page_map_info *)(user_space_reserve_start + pin_pid= _num); + new->pid =3D pid; + new->pme =3D NULL; + new->entry_num =3D 0; + new->pid_reserved =3D false; + (*pin_pid_num_addr)++; + pin_pid_num++; + return new; +} +EXPORT_SYMBOL_GPL(create_page_map_info); + +struct page_map_info *get_page_map_info(int pid) +{ + int i; + + if (!user_space_reserve_start) + return NULL; + + for (i =3D 0; i < pin_pid_num; i++) { + if (user_space_reserve_start[i].pid =3D=3D pid) { + return &(user_space_reserve_start[i]); + } + } + return NULL; +} +EXPORT_SYMBOL_GPL(get_page_map_info); + +static struct page 
*find_head_page(struct page *page) +{ + struct page *p =3D page; + + while (!PageBuddy(p)) { + if (PageLRU(p)) + return NULL; + p--; + } + return p; +} + +static void spilt_page_area_left(struct zone *zone, struct free_area *area= , struct page *page, + unsigned long size, int order) +{ + unsigned long cur_size =3D 1 << order; + unsigned long total_size =3D 0; + struct page *tmp; + unsigned long tmp_size =3D size; + + while (size && cur_size > size) { + cur_size >>=3D 1; + order--; + area--; + if (cur_size <=3D size) { + list_add(&page[total_size].lru, &area->free_list= [MIGRATE_MOVABLE]); + atomic_set(&(page[total_size]._mapcount), PAGE_B= UDDY_MAPCOUNT_VALUE); + set_page_private(&page[total_size], order); + set_pageblock_migratetype(&page[total_size], MIG= RATE_MOVABLE); + area->nr_free++; + total_size +=3D cur_size; + size -=3D cur_size; + } + } +} + +static void spilt_page_area_right(struct zone *zone, struct free_area *are= a, struct page *page, + unsigned long size, int order) +{ + unsigned long cur_size =3D 1 << order; + struct page *right_page, *head_page; + unsigned long tmp_size =3D size; + + right_page =3D page + size; + while (size && cur_size > size) { + cur_size >>=3D 1; + order--; + area--; + if (cur_size <=3D size) { + head_page =3D right_page - cur_size; + list_add(&head_page->lru, &area->free_list[MIGRA= TE_MOVABLE]); + atomic_set(&(head_page->_mapcount), PAGE_BUDDY_M= APCOUNT_VALUE); + set_page_private(head_page, order); + set_pageblock_migratetype(head_page, MIGRATE_MOV= ABLE); + area->nr_free++; + size -=3D cur_size; + right_page =3D head_page; + } + } +} + +void reserve_page_from_buddy(unsigned long nr_pages, struct page *page) +{ + unsigned int current_order; + struct page *page_end; + struct free_area *area; + struct zone *zone; + struct page *head_page; + + head_page =3D find_head_page(page); + if (!head_page) { + pr_warn("Find page head fail."); + return; + } + current_order =3D head_page->private; + page_end =3D head_page + (1 << current_order); + zone =3D page_zone(head_page); + area =3D &(zone->free_area[current_order]); + list_del(&head_page->lru); + atomic_set(&head_page->_mapcount, -1); + set_page_private(head_page, 0); + area->nr_free--; + if (head_page !=3D page) + spilt_page_area_left(zone, area, head_page, + (unsigned long)(page - head_page), current_order= ); + page =3D page + nr_pages; + if (page < page_end) { + spilt_page_area_right(zone, area, page, + (unsigned long)(page_end - page), current_order)= ; + } else if (page > page_end) { + pr_warn("Find page end smaller than page."); + } +} + +static inline void reserve_user_normal_pages(struct page *page) +{ + if (!atomic_read(&page->_refcount)) { + atomic_inc(&page->_refcount); + reserve_page_from_buddy(1, page); + } else { + pr_warn("Page %pK refcount %d large than zero, no need res= erve.\n", + page, page->_refcount.counter); + } +} + +static void init_huge_pmd_pages(struct page *head_page) +{ + int i =3D 0; + struct page *page =3D head_page; + unsigned long *temp; + unsigned long compound_pad =3D COMPOUND_PAD_START; + + __set_bit(PG_head, &page->flags); + __set_bit(PG_active, &page->flags); + atomic_set(&page->_refcount, 1); + page++; + i++; + page->compound_head =3D (unsigned long)head_page + 1; + page->_compound_pad_2 =3D (unsigned long)head_page & COMPOUND_PAD_M= ASK; + temp =3D (unsigned long *)(&(page->_compound_pad_2)); + temp[1] =3D LIST_POISON4; + page->compound_dtor =3D HUGETLB_PAGE_DTOR + 1; + page->compound_order =3D HPAGE_PMD_ORDER; + page++; + i++; + page->compound_head =3D (unsigned 
long)head_page + 1; + page->_compound_pad_2 =3D (unsigned long)head_page + compound_pad; + i++; + INIT_LIST_HEAD(&(page->deferred_list)); + for (; i < HPAGE_PMD_NR; i++) { + page =3D head_page + i; + page->compound_head =3D (unsigned long)head_page + 1; + compound_pad +=3D COMPOUND_PAD_DELTA; + page->_compound_pad_2 =3D (unsigned long)head_page + compo= und_pad; + temp =3D (unsigned long *)(&(page->_compound_pad_2)); + temp[1] =3D LIST_POISON4; + } +} + +static void reserve_user_huge_pmd_pages(struct page *page) +{ + struct page *head_page; + + if (!atomic_read(&page->_refcount)) { + atomic_inc(&page->_refcount); + head_page =3D find_head_page(page); + reserve_page_from_buddy((1 << HPAGE_PMD_ORDER), page); + init_huge_pmd_pages(page); + } else { + pr_warn("Page %pK refcount %d large than zero, no need res= erve.\n", + page, page->_refcount.counter); + } +} + +static void reserve_user_space_map_pages(void) +{ + struct page_map_info *pmi; + struct page_map_entry *pme; + unsigned int i, j, index; + struct page *page; + unsigned long flags; + unsigned long page_size; + int err =3D 0; + unsigned long phy_addr; + + if (!user_space_reserve_start) + return; + spin_lock_irqsave(&page_map_entry_lock, flags); + for (index =3D 0; index < pin_pid_num; index++) { + pmi =3D &(user_space_reserve_start[index]); + pme =3D pmi->pme; + + for (i =3D 0; i < pmi->entry_num; i++) { + err =3D 0; + for (j =3D 0; j < pme->nr_pages; j++) { + phy_addr =3D pme->phy_addr_array[j]; + if (!phy_addr) + continue; + page =3D phys_to_page(phy_addr); + if (atomic_read(&page->_refcount)) { + pme->phy_addr_array[j] =3D 0; + page_size =3D pme->is_huge_pag= e ? HPAGE_PMD_SIZE : PAGE_SIZE; + continue; + } + if (!pme->is_huge_page) { + reserve_user_normal_pages(page= ); + } else { + reserve_user_huge_pmd_pages(pa= ge); + } + } + pme =3D (struct page_map_entry *)next_pme(pme); + if (err) + err_phy_num++; + } + page_size =3D pme->is_huge_page ? HPAGE_PMD_SIZE : PAGE_SI= ZE; + } + spin_unlock(&page_map_entry_lock); +} + + +/* The whole page map entry collect process must be Sequentially. + The user_space_reserve_start points to the first page map info for + the first dump task. And the page_map_entry_start points to + the first page map entry of the first dump vma. 
*/ +static void init_page_map_info(unsigned int *map_addr) +{ + unsigned long map_len =3D pin_memory_resource.end - pin_memory_reso= urce.start; + + if (user_space_reserve_start || !max_pin_pid_num) + return; + pin_pid_num =3D *map_addr; + pin_pid_num_addr =3D map_addr; + user_space_reserve_start =3D + (struct kup_page_map_info *)(map_addr + 1); + page_map_entry_start =3D + (struct page_map_entry *)(user_space_reserve_start + max_p= in_pid_num); + page_map_entry_end =3D (unsigned long)map_addr + map_len; + if (pin_pid_num > 0) + reserve_user_space_map_pages(); +} + +int collect_pmd_huge_pages(struct task_struct *task, + unsigned long start_addr, unsigned long end_addr, struct page_map_e= ntry *pme) +{ + long res; + int index =3D 0; + unsigned long start =3D start_addr; + struct page *temp_page; + + while (start < end_addr) { + temp_page =3D NULL; + res =3D get_user_pages_remote(task, task->mm, start, 1, + FOLL_TOUCH|FOLL_GET, &temp_page, NULL, NULL); + if (!res) { + pr_warn("Get huge page for addr(%lx) fail.", sta= rt); + return COLLECT_PAGES_FAIL; + } + if (PageHead(temp_page)) { + start +=3D HPAGE_PMD_SIZE; + pme->phy_addr_array[index] =3D page_to_phys(temp= _page); + index++; + } else { + pme->nr_pages =3D index; + atomic_dec(&((temp_page)->_refcount)); + return COLLECT_PAGES_NEED_CONTINUE; + } + } + pme->nr_pages =3D index; + return COLLECT_PAGES_FINISH; +} + +int collect_normal_pages(struct task_struct *task, + unsigned long start_addr, unsigned long end_addr, struct page_map_e= ntry *pme) +{ + int res; + unsigned long next; + unsigned long i, nr_pages; + struct page *tmp_page; + unsigned long *phy_addr_array =3D pme->phy_addr_array; + struct page **page_array =3D (struct page **)pme->phy_addr_array; + + next =3D (start_addr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE; + next =3D (next > end_addr) ? end_addr : next; + pme->nr_pages =3D 0; + while (start_addr < next) { + nr_pages =3D (next - start_addr) / PAGE_SIZE; + res =3D get_user_pages_remote(task, task->mm, start_addr, = 1, + FOLL_TOUCH|FOLL_GET, &tmp_page, NULL, N= ULL); + if (!res) { + pr_warn("Get user pages of %lx fail.\n", start_a= ddr); + return COLLECT_PAGES_FAIL; + } + if (PageHead(tmp_page)) { + atomic_dec(&(tmp_page->_refcount)); + return COLLECT_PAGES_NEED_CONTINUE; + } + atomic_dec(&(tmp_page->_refcount)); + if (PageTail(tmp_page)) { + start_addr =3D next; + pme->virt_addr =3D start_addr; + next =3D (next + HPAGE_PMD_SIZE) > end_addr ? en= d_addr : (next + HPAGE_PMD_SIZE); + continue; + } + res =3D get_user_pages_remote(task, task->mm, start_addr, = nr_pages, + FOLL_TOUCH|FOLL_GET, page_array, NULL, NULL); + if (!res) { + pr_warn("Get user pages of %lx fail.\n", start_a= ddr); + return COLLECT_PAGES_FAIL; + } + for (i =3D 0; i < nr_pages; i++) { + phy_addr_array[i] =3D page_to_phys(page_array[i]= ); + } + pme->nr_pages +=3D nr_pages; + page_array +=3D nr_pages; + phy_addr_array +=3D nr_pages; + start_addr =3D next; + next =3D (next + HPAGE_PMD_SIZE) > end_addr ? end_addr : (= next + HPAGE_PMD_SIZE); + } + return COLLECT_PAGES_FINISH; +} + +/* Users make sure that the pin memory belongs to anonymous vma. 
*/ +int pin_mem_area(struct task_struct *task, struct mm_struct *mm, + unsigned long start_addr, unsigned long end_addr) +{ + int pid, ret; + int is_huge_page =3D false; + unsigned int page_size; + unsigned long nr_pages, flags; + struct page_map_entry *pme; + struct page_map_info *pmi; + struct vm_area_struct *vma; + unsigned long i; + struct page *tmp_page; + + if (!page_map_entry_start + || !task || !mm + || start_addr >=3D end_addr) + return -EFAULT; + + pid =3D task->pid; + spin_lock_irqsave(&page_map_entry_lock, flags); + nr_pages =3D ((end_addr - start_addr) / PAGE_SIZE); + if ((unsigned long)page_map_entry_start + nr_pages * sizeof(struct = page *) + >=3D page_map_entry_end) { + pr_warn("Page map entry use up!\n"); + ret =3D -EFAULT; + goto finish; + } + vma =3D find_extend_vma(mm, start_addr); + if (!vma) { + pr_warn("Find no match vma!\n"); + ret =3D -EFAULT; + goto finish; + } + if (start_addr =3D=3D (start_addr & HPAGE_PMD_MASK) && + transparent_hugepage_enabled(vma)) { + page_size =3D HPAGE_PMD_SIZE; + is_huge_page =3D true; + } else { + page_size =3D PAGE_SIZE; + } + pme =3D page_map_entry_start; + pme->virt_addr =3D start_addr; + pme->is_huge_page =3D is_huge_page; + memset(pme->phy_addr_array, 0, nr_pages * sizeof(unsigned long)); + down_write(&mm->mmap_sem); + if (!is_huge_page) { + ret =3D collect_normal_pages(task, start_addr, end_addr, p= me); + if (!pme->nr_pages) { + if (ret =3D=3D COLLECT_PAGES_FINISH) { + ret =3D 0; + up_write(&mm->mmap_sem); + goto finish; + } + pme->is_huge_page =3D true; + page_size =3D HPAGE_PMD_SIZE; + ret =3D collect_pmd_huge_pages(task, pme->virt_a= ddr, end_addr, pme); + } + } else { + ret =3D collect_pmd_huge_pages(task, start_addr, end_addr,= pme); + if (!pme->nr_pages) { + if (ret =3D=3D COLLECT_PAGES_FINISH) { + ret =3D 0; + up_write(&mm->mmap_sem); + goto finish; + } + pme->is_huge_page =3D false; + page_size =3D PAGE_SIZE; + ret =3D collect_normal_pages(task, pme->virt_add= r, end_addr, pme); + } + } + up_write(&mm->mmap_sem); + if (ret =3D=3D COLLECT_PAGES_FAIL) { + ret =3D -EFAULT; + goto finish; + } + + /* check for zero pages */ + for (i =3D 0; i < pme->nr_pages; i++) { + tmp_page =3D phys_to_page(pme->phy_addr_array[i]); + if (!pme->is_huge_page) { + if (page_to_pfn(tmp_page) =3D=3D my_zero_pfn(pme= ->virt_addr + i * PAGE_SIZE)) + pme->phy_addr_array[i] =3D 0; + } else if (is_huge_zero_page(tmp_page)) + pme->phy_addr_array[i] =3D 0; + } + + page_map_entry_start =3D (struct page_map_entry *)(next_pme(pme)); + pmi =3D get_page_map_info(pid); + if (!pmi) + pmi =3D create_page_map_info(pid); + if (!pmi) { + pr_warn("Create page map info fail for pid: %d!\n", pid); + ret =3D -EFAULT; + goto finish; + } + if (!pmi->pme) + pmi->pme =3D pme; + pmi->entry_num++; + + if (ret =3D=3D COLLECT_PAGES_NEED_CONTINUE) { + ret =3D pin_mem_area(task, mm, pme->virt_addr + pme->nr_pa= ges * page_size, end_addr); + } + +finish: + spin_unlock_irqrestore(&page_map_entry_lock, flags); + return ret; +} +EXPORT_SYMBOL_GPL(pin_mem_area); + +vm_fault_t remap_normal_pages(struct mm_struct *mm, struct vm_area_struct = *vma, + struct page_map_entry *pme) +{ + int ret; + unsigned int j; + pgd_t *pgd; + p4d_t *p4d; + pmd_t *pmd; + pud_t *pud; + struct page *page; + unsigned long address; + unsigned long phy_addr; + + for (j =3D 0; j < pme->nr_pages; j++) { + address =3D pme->virt_addr + j * PAGE_SIZE; + phy_addr =3D pme->phy_addr_array[j]; + if (!phy_addr) + continue; + page =3D phys_to_page(phy_addr); + if (page->flags & (1 << PG_reserved)) + page->flags -=3D (1 
<< PG_reserved); + if (page_to_pfn(page) =3D=3D my_zero_pfn(address)) { + pme->phy_addr_array[j] =3D 0; + continue; + } + page->mapping =3D NULL; + pgd =3D pgd_offset(mm, address); + p4d =3D p4d_alloc(mm, pgd, address); + if (!p4d) + return VM_FAULT_OOM; + pud =3D pud_alloc(mm, p4d, address); + if (!pud) + return VM_FAULT_OOM; + pmd =3D pmd_alloc(mm, pud, address); + if (!pmd) + return VM_FAULT_OOM; + ret =3D do_anon_page_remap(vma, address, pmd, page); + if (ret =3D=3D VM_FAULT_OOM) + return ret; + } + return 0; +} + +vm_fault_t remap_huge_pmd_pages(struct mm_struct *mm, struct vm_area_struc= t *vma, + struct page_map_entry *pme) +{ + int ret; + unsigned int j; + pgd_t *pgd; + p4d_t *p4d; + pmd_t *pmd; + pud_t *pud; + struct page *page; + unsigned long address; + unsigned long phy_addr; + + for (j =3D 0; j < pme->nr_pages; j++) { + address =3D pme->virt_addr + j * HPAGE_PMD_SIZE; + phy_addr =3D pme->phy_addr_array[j]; + if (!phy_addr) + continue; + page =3D phys_to_page(phy_addr); + if (page->flags & (1 << PG_reserved)) + page->flags -=3D (1 << PG_reserved); + if (is_huge_zero_page(page)) { + pme->phy_addr_array[j] =3D 0; + continue; + } + pgd =3D pgd_offset(mm, address); + p4d =3D p4d_alloc(mm, pgd, address); + if (!p4d) + return VM_FAULT_OOM; + pud =3D pud_alloc(mm, p4d, address); + if (!pud) + return VM_FAULT_OOM; + pmd =3D pmd_alloc(mm, pud, address); + if (!pmd) + return VM_FAULT_OOM; + ret =3D do_anon_huge_page_remap(vma, address, pmd, page); + if (ret =3D=3D VM_FAULT_OOM) + return ret; + } + return 0; +} + +vm_fault_t do_mem_remap(int pid, struct mm_struct *mm) +{ + unsigned int i =3D 0; + vm_fault_t ret =3D 0; + struct vm_area_struct *vma; + struct page_map_info *pmi; + struct page_map_entry *pme; + + pmi =3D get_page_map_info(pid); + if (!pmi) + return -EFAULT; + down_write(&mm->mmap_sem); + pme =3D pmi->pme; + vma =3D mm->mmap; + while ((i < pmi->entry_num) && (vma !=3D NULL)) { + if (pme->virt_addr >=3D vma->vm_start && pme->virt_addr < = vma->vm_end) { + i++; + if (!vma_is_anonymous(vma)) { + pme =3D (struct page_map_entry *)(next_= pme(pme)); + continue; + } + if (!pme->is_huge_page) { + ret =3D remap_normal_pages(mm, vma, pme= ); + if (ret < 0) + goto out; + } else { + ret =3D remap_huge_pmd_pages(mm, vma, p= me); + if (ret < 0) + goto out; + } + pme =3D (struct page_map_entry *)(next_pme(pme))= ; + } else { + vma =3D vma->vm_next; + } + } +out: + up_write(&mm->mmap_sem); + return ret; +} +EXPORT_SYMBOL_GPL(do_mem_remap); + +#if defined(CONFIG_ARM64) +void init_reserve_page_map(unsigned long map_addr, unsigned long map_size) +{ + void *addr; + + if (!map_addr || !map_size) + return; + addr =3D phys_to_virt(map_addr); + init_page_map_info((unsigned int *)addr); +} +#else +void init_reserve_page_map(unsigned long map_addr, unsigned long map_size) +{ +} +#endif + +/* Clear all pin memory record. 
 */
+void clear_pin_memory_record(void)
+{
+	if (pin_pid_num_addr) {
+		*pin_pid_num_addr = 0;
+		pin_pid_num = 0;
+		page_map_entry_start = (struct page_map_entry *)__page_map_entry_start;
+	}
+	if (kernel_space_reserve_start && kernel_pin_space_size > 0) {
+		*(unsigned long *)kernel_space_reserve_start = 0;
+	}
+}
+EXPORT_SYMBOL_GPL(clear_pin_memory_record);
+
+vm_fault_t reserve_kernel_space_mem(unsigned long start_addr, unsigned int pages)
+{
+	unsigned long i;
+	unsigned long entry_num;
+	struct page_map_entry *pme, *pme_start;
+
+	entry_num = *(unsigned long *)kernel_space_reserve_start;
+	pme_start = (struct page_map_entry *)(kernel_space_reserve_start + sizeof(entry_num));
+	pme = pme_start;
+	spin_lock(&page_map_entry_lock);
+	for (i = 0; i < entry_num; i++) {
+		if (start_addr == pme->virt_addr) {
+			spin_unlock(&page_map_entry_lock);
+			return 0;
+		}
+		pme = pme + 1;
+	}
+	if ((unsigned long)(pme_start + entry_num) >= kernel_space_reserve_end) {
+		spin_unlock(&page_map_entry_lock);
+		return VM_FAULT_OOM;
+	}
+	pme = pme_start + entry_num;
+	pme->virt_addr = start_addr;
+	pme->nr_pages = pages;
+	pme->is_huge_page = false;
+	*(unsigned long *)kernel_space_reserve_start = entry_num + 1;
+	spin_unlock(&page_map_entry_lock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(reserve_kernel_space_mem);
+
+#endif /* CONFIG_PIN_MEMORY */
--
1.8.3.1
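
For reference, below is a minimal user-space sketch (not part of the patch) of
how CRIU, or any checkpoint/restore tool, might drive the /dev/pinmem ioctls
described above. The reserved region comes from the crashkernel-style
"pinmemory=<size>@<2MB-aligned base>" command-line option parsed by
parse_pin_memory(), and "max_pin_pid_num=" bounds how many tasks can be
recorded; any concrete values are illustrative. Since the patch does not
install a uapi header, the struct layout and ioctl numbers are simply mirrored
here from drivers/char/pin_memory.c and must be kept in sync with it.

/*
 * Illustrative sketch only -- not part of the patch. Definitions below are
 * copied from drivers/char/pin_memory.c.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

#define MAX_PIN_MEM_AREA_NUM 16

struct _pin_mem_area {
	unsigned long virt_start;	/* start of the anonymous VMA range */
	unsigned long virt_end;
};

struct pin_mem_area_set {
	unsigned int pid;		/* task whose pages are pinned */
	unsigned int area_num;		/* number of valid entries below */
	struct _pin_mem_area mem_area[MAX_PIN_MEM_AREA_NUM];
};

#define PIN_MEM_MAGIC		0x59
#define SET_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 1, struct pin_mem_area_set)
#define CLEAR_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 2, int)
#define REMAP_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 3, int)

int main(int argc, char **argv)
{
	struct pin_mem_area_set pmas = { 0 };
	int fd, pid;

	if (argc != 4) {
		fprintf(stderr, "usage: %s <pid> <virt_start> <virt_end>\n", argv[0]);
		return 1;
	}
	pid = atoi(argv[1]);

	fd = open("/dev/pinmem", O_RDWR);
	if (fd < 0) {
		perror("open /dev/pinmem");
		return 1;
	}

	/* Checkpoint phase: record and pin one anonymous area of the task. */
	pmas.pid = pid;
	pmas.area_num = 1;
	pmas.mem_area[0].virt_start = strtoul(argv[2], NULL, 0);
	pmas.mem_area[0].virt_end = strtoul(argv[3], NULL, 0);
	if (ioctl(fd, SET_PIN_MEM_AREA, &pmas) < 0)
		perror("SET_PIN_MEM_AREA");

	/*
	 * Restore phase: after the kernel update, the restored task (same
	 * pid) gets the pinned pages mapped back. In a real flow this runs
	 * in a separate invocation of the tool; it is shown back to back
	 * here only to keep the sketch short.
	 */
	if (ioctl(fd, REMAP_PIN_MEM_AREA, &pid) < 0)
		perror("REMAP_PIN_MEM_AREA");

	close(fd);
	return 0;
}

CLEAR_PIN_MEM_AREA (not shown) simply resets the recorded mapping info so a
new checkpoint can start from scratch.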

From: Jingxian He <hejingxian@huawei.com><= /span>

Date: Thu, 10 Dec 2020 20:31:15= +0800

Subject: [PATCH] add pin memory= method for checkout add restore

 

We can use the checkpoint and r= estore in userspace(criu) method to dump and restore tasks

when updating the kernel. Curre= ntly, criu needs dump all memory data of tasks to files.<= /p>

When the memory size is very la= rge(larger than 1G), the cost time of the dumping data

will be very long(more than 1 m= in).

 

We can pin the memory data of t= asks and collect the corresponding physical pages mapping info

in checkpoint process, and rema= p the physical pages to restore tasks after kernel is updated.

 

The pin memory area info is sav= ed in the reserve memblock named nvwa_res_first, which can keep<= /span>

usable in the kernel update pro= cess.

 

The pin memory driver provides = the following ioctl command for criu:

1) SET_PIN_MEM_AREA: set pin me= mory area, which can be remap to the restore task.

2) CLEAR_PIN_MEM_AREA: clear th= e pin memory area info, which enable user reset the pin data.

3) REMAP_PIN_MEM_AREA: remap th= e pages of the pin memory to the restore task.

 

Signed-off-by: Jingxian He <= hejingxian@huawei.com>=

---

arch/arm64/kernel/setup.c = |   7 +

arch/arm64/mm/init.c  = ;     |  62 +++-

drivers/char/Kconfig  = ;     |   7 +

drivers/char/Makefile &nbs= p;    |   1 +

drivers/char/pin_memory.c = | 198 +++++++++++++

include/linux/crash_core.h |&nb= sp;  5 +

include/linux/pin_mem.h  &= nbsp; |  62 ++++

kernel/crash_core.c  =       |  11 +

mm/Kconfig   &nb= sp;            = |   6 +

mm/Makefile   &n= bsp;            |&nb= sp;  1 +

mm/huge_memory.c  &nb= sp;        |  61 +++= 3;

mm/memory.c   &n= bsp;            |&nb= sp; 68 +++++

mm/pin_mem.c   &= nbsp;           | 691 = 3;++++++++++++++= 3;++++++++++++++= 3;++++++++++++++

13 files changed, 1179 insertio= ns(+), 1 deletion(-)

create mode 100644 drivers/char= /pin_memory.c

create mode 100644 include/linu= x/pin_mem.h

create mode 100644 mm/pin_mem.c=

 

diff --git a/arch/arm64/kernel/= setup.c b/arch/arm64/kernel/setup.c

index 56f6645..40751ed 100644

--- a/arch/arm64/kernel/setup.c=

+++ b/arch/arm64/ke= rnel/setup.c

@@ -50,6 +50,9 @@

#include <asm/efi.h>=

#include <asm/xen/hypervisor= .h>

#include <asm/mmu_context.h&= gt;

+#ifdef CONFIG_PIN_MEMORY

+#include <linux/pin_mem= ory.h>

+#endif

 

 static int num_standard_r= esources;

static struct resource *standar= d_resources;

@@ -243,6 +246,10 @@ static= void __init request_standard_resources(void)

     &= nbsp;            &nb= sp; crashk_res.end <=3D res->end)

     &= nbsp;           &nbs= p;        request_resource(res, &cra= shk_res);

#endif

+#ifdef CONFIG_PIN_MEMORY

+    &n= bsp;           if (pin_me= mory_resource.end)

+    &n= bsp;            = ;         insert_resource(&iome= m_resource, &pin_memory_resource);

+#endif

     &= nbsp; }

}

 

diff --git a/arch/arm64/mm/init= .c b/arch/arm64/mm/init.c

index b65dffd..dee3192 100644

--- a/arch/arm64/mm/init.c=

+++ b/arch/arm64/mm= /init.c

@@ -41,7 +41,9 @@

#include <linux/sizes.h><= o:p>

#include <asm/tlb.h>=

#include <asm/alternative.h&= gt;

-

+#ifdef CONFIG_PIN_MEMORY

+#include <linux/pin_mem= ory.h>

+#endif

#define ARM64_ZONE_DMA_BITS&nbs= p;       30

 

 /*

@@ -68,6 +70,16 @@

phys_addr_t arm64_dma_phys_limi= t __ro_after_init;

static phys_addr_t arm64_dma32_= phys_limit __ro_after_init;

 

+#ifdef CONFIG_PIN_MEMORY

+struct resource pin_memory= _resource =3D {

+    &n= bsp;   .name =3D "Pin memory maps",

+    &n= bsp;   .start =3D 0,

+    &n= bsp;   .end =3D 0,

+    &n= bsp;   .flags =3D IORESOURCE_MEM,

+    &n= bsp;   .desc =3D IORES_DESC_PIN_MEM_MAPS

+};

+#endif

+

#ifdef CONFIG_KEXEC_CORE

/*

  * reserve_crashkernel() = - reserves memory for crash kernel

@@ -129,6 +141,47 @@ static= void __init reserve_crashkernel(void)

}

#endif /* CONFIG_KEXEC_CORE */<= o:p>

 

+#ifdef CONFIG_PIN_MEMORY

+static void __init reserve= _pin_memory_res(void)

+{

+    &n= bsp;  unsigned long long mem_start, mem_len;

+    &n= bsp;  int ret;

+

+    &n= bsp;  ret =3D parse_pin_memory(boot_command_line, memblock_phys_mem_si= ze(),

+    &n= bsp;            = ;            &n= bsp;            = ;            &me= m_len, &mem_start);

+    &n= bsp;  if (ret || !mem_len)

+    &n= bsp;            = ;         return;=

+

+    &n= bsp;  mem_len =3D PAGE_ALIGN(mem_len);

+

+    &n= bsp;  if (!memblock_is_region_memory(mem_start, mem_len)) {=

+    &n= bsp;           pr_warn(&q= uot;cannot reserve for pin memory: region is not memory!\n");

+    &n= bsp;           return;

+    &n= bsp;  }

+

+    &n= bsp;  if (memblock_is_region_reserved(mem_start, mem_len)) {

+    &n= bsp;           pr_warn(&q= uot;cannot reserve for pin memory: region overlaps reserved memory!\n"= );

+    &n= bsp;           return;

+    &n= bsp;  }

+

+    &n= bsp;  if (!IS_ALIGNED(mem_start, SZ_2M)) {

+    &n= bsp;           pr_warn(&q= uot;cannot reserve for pin memory: base address is not 2MB aligned\n")= ;

+    &n= bsp;           return;

+    &n= bsp;  }

+

+    &n= bsp;  memblock_reserve(mem_start, mem_len);

+    &n= bsp;  pr_debug("pin memory resource reserved: 0x%016llx - 0x%016l= lx (%lld MB)\n",

+    &n= bsp;           mem_start,= mem_start + mem_len, mem_len >> 20);

+

+    &n= bsp;  pin_memory_resource.start =3D mem_start;

+    &n= bsp;  pin_memory_resource.end =3D mem_start + mem_len - 1;

+}

+#else

+static void __init reserve= _pin_memory_res(void)

+{

+}

+#endif /* CONFIG_PIN_MEMOR= Y */

+

#ifdef CONFIG_CRASH_DUMP

static int __init early_init_dt= _scan_elfcorehdr(unsigned long node,

     &= nbsp;          const char *una= me, int depth, void *data)

@@ -452,6 +505,8 @@ void __= init arm64_memblock_init(void)

 

     &= nbsp;  reserve_crashkernel();

 

+    &n= bsp;  reserve_pin_memory_res();

+

     &= nbsp; reserve_elfcorehdr();

 

     &= nbsp;  high_memory =3D __va(memblock_end_of_DRAM() - 1) + 1;<= /o:p>

@@ -573,6 +628,11 @@ void _= _init mem_init(void)

     &= nbsp; /* this will put all unused low memory onto the freelists */

     &= nbsp; memblock_free_all();

 

+#ifdef CONFIG_PIN_MEMORY

+    &n= bsp;  /* pre alloc the pages for pin memory */

+    &n= bsp;  init_reserve_page_map((unsigned long)pin_memory_resource.start,<= o:p>

+    &n= bsp;           (unsigned = long)(pin_memory_resource.end - pin_memory_resource.start));

+#endif

     &= nbsp; mem_init_print_info(NULL);

 

     &= nbsp;  /*

diff --git a/drivers/char/Kconf= ig b/drivers/char/Kconfig

index 26956c0..73af2f0 100644

--- a/drivers/char/Kconfig=

+++ b/drivers/char/= Kconfig

@@ -560,3 +560,10 @@ config= RANDOM_TRUST_BOOTLOADER

     &= nbsp; booloader is trustworthy so it will be added to the kernel's entropy<= o:p>

     &= nbsp; pool. Otherwise, say N here so it will be regarded as device input th= at

     &= nbsp; only mixes the entropy pool.

+

+config PIN_MEMORY_DEV=

+    &n= bsp;  bool "/dev/pinmem character device"<= /p>

+    &n= bsp;  depends PIN_MEMORY

+    &n= bsp;  default n

+    &n= bsp;  help

+    &n= bsp;  pin memory driver

diff --git a/drivers/char/Makef= ile b/drivers/char/Makefile

index 7c5ea6f..1941642 100644

--- a/drivers/char/Makefile

+++ b/drivers/char/= Makefile

@@ -52,3 +52,4 @@ js-rtc-y = =3D rtc.o

obj-$(CONFIG_XILLYBUS) &nb= sp;            =   +=3D xillybus/

obj-$(CONFIG_POWERNV_OP_PANEL) = +=3D powernv-op-panel.o

obj-$(CONFIG_ADI)  &n= bsp;            = ;  +=3D adi.o

+obj-$(CONFIG_PIN_MEMORY_DE= V)     +=3D pin_memory.o

diff --git a/drivers/char/pin_m= emory.c b/drivers/char/pin_memory.c

new file mode 100644=

index 00000000..a0464e1

--- /dev/null=

+++ b/drivers/char/= pin_memory.c

@@ -0,0 +1,198 @@

+/*

+ * Copyright @ Huawei Tech= nologies Co., Ltd. 2020-2020. ALL rights reserved.

+ * Description: Euler pin = memory driver

+ */

+#include <linux/kernel.= h>

+#include <linux/module.= h>

+#include <linux/kprobes= .h>

+#include <linux/spinloc= k.h>

+#include <linux/workque= ue.h>

+#include <linux/sched.h= >

+#include <linux/mm.h>= ;

+#include <linux/init.h&= gt;

+#include <linux/miscdev= ice.h>

+#include <linux/fs.h>= ;

+#include <linux/mm_type= s.h>

+#include <asm/processor= .h>

+#include <uapi/asm-gene= ric/ioctl.h>

+#include <uapi/asm-gene= ric/mman-common.h>

+#include <uapi/asm/setu= p.h>

+#include <linux/pin_mem= .h>

+#include <linux/sched/m= m.h>

+

+#define MAX_PIN_MEM_AREA_N= UM  16

+struct _pin_mem_area {

+    &n= bsp;  unsigned long virt_start;

+    &n= bsp;  unsigned long virt_end;

+};

+

+struct pin_mem_area_set {<= o:p>

+    &n= bsp;  unsigned int pid;

+    &n= bsp;  unsigned int area_num;

+    &n= bsp;  struct _pin_mem_area mem_area[MAX_PIN_MEM_AREA_NUM];<= /span>

+};

+

+#define PIN_MEM_MAGIC 0x59=

+#define _SET_PIN_MEM_AREA&= nbsp;   1

+#define _CLEAR_PIN_MEM_ARE= A  2

+#define _REMAP_PIN_MEM_ARE= A  3

+#define SET_PIN_MEM_AREA&n= bsp;   _IOW(PIN_MEM_MAGIC, _SET_PIN_MEM_AREA, struct pin_mem_area= _set)

+#define CLEAR_PIN_MEM_AREA=       _IOW(PIN_MEM_MAGIC, _CLEAR_PIN_MEM_AREA, int= )

+#define REMAP_PIN_MEM_AREA=       _IOW(PIN_MEM_MAGIC, _REMAP_PIN_MEM_AREA, int= )

+

+static int set_pin_mem(str= uct pin_mem_area_set *pmas)

+{

+    &n= bsp;  int i;

+    &n= bsp;  int ret =3D 0;

+    &n= bsp;  struct _pin_mem_area *pma;

+    &n= bsp;  struct mm_struct *mm;

+    &n= bsp;  struct task_struct *task;

+    &n= bsp;  struct pid *pid_s;

+

+    &n= bsp;  pid_s =3D find_get_pid(pmas->pid);

+    &n= bsp;  if (!pid_s) {

+    &n= bsp;           pr_warn(&q= uot;Get pid struct fail:%d.\n", pmas->pid);

+    &n= bsp;           goto fail;=

+    &n= bsp;  }

+    &n= bsp;  rcu_read_lock();

+    &n= bsp;  task =3D pid_task(pid_s, PIDTYPE_PID);

+    &n= bsp;  if (!task) {

+    &n= bsp;           pr_warn(&q= uot;Get task struct fail:%d.\n", pmas->pid);

+    &n= bsp;           goto fail;=

+    &n= bsp;  }

+    &n= bsp;  mm =3D get_task_mm(task);

+    &n= bsp;  for (i =3D 0; i < pmas->area_num; i++) {

+    &n= bsp;           pma =3D &a= mp;(pmas->mem_area[i]);

+    &n= bsp;           ret =3D pi= n_mem_area(task, mm, pma->virt_start, pma->virt_end);

+    &n= bsp;           if (ret) {=

+    &n= bsp;            = ;         mmput(mm);

+    &n= bsp;            = ;         goto fail;

+    &n= bsp;           }

+    &n= bsp;  }

+    &n= bsp;  mmput(mm);

+    &n= bsp;  rcu_read_unlock();

+    &n= bsp;  return ret;

+

+fail:

+    &n= bsp;  rcu_read_unlock();

+    &n= bsp;  return -EFAULT;

+}

+

+static int set_pin_mem_are= a(unsigned long arg)

+{

+    &n= bsp;  struct pin_mem_area_set pmas;

+    &n= bsp;  void __user *buf =3D (void __user *)arg;

+

+    &n= bsp;  if (!access_ok(buf, sizeof(pmas)))

+    &n= bsp;           return -EF= AULT;

+    &n= bsp;  if (copy_from_user(&pmas, buf, sizeof(pmas)))

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  if (pmas.area_num > MAX_PIN_MEM_AREA_NUM) {=

+    &n= bsp;           pr_warn(&q= uot;Input area_num is too large.\n");

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  }

+

+    &n= bsp;  return set_pin_mem(&pmas);

+}

+

+static int pin_mem_remap(u= nsigned long arg)

+{

+    &n= bsp;  int pid;

+    &n= bsp;  struct task_struct *task;

+    &n= bsp;  struct mm_struct *mm;

+    &n= bsp;  vm_fault_t ret;

+    &n= bsp;  void __user *buf =3D (void __user *)arg;

+    &n= bsp;  struct pid *pid_s;

+

+    &n= bsp;  if (!access_ok(buf, sizeof(int)))

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  if (copy_from_user(&pid, buf, sizeof(int)))

+    &n= bsp;           return -EI= NVAL;

+

+    &n= bsp;  pid_s =3D find_get_pid(pid);

+    &n= bsp;  if (!pid_s) {

+    &n= bsp;           pr_warn(&q= uot;Get pid struct fail:%d.\n", pid);

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  }

+    &n= bsp;  rcu_read_lock();

+    &n= bsp;  task =3D pid_task(pid_s, PIDTYPE_PID);

+    &n= bsp;  if (!task) {

+    &n= bsp;           pr_warn(&q= uot;Get task struct fail:%d.\n", pid);

+    &n= bsp;           goto fault= ;

+    &n= bsp;  }

+    &n= bsp;  mm =3D get_task_mm(task);

+    &n= bsp;  ret =3D do_mem_remap(pid, mm);

+    &n= bsp;  if (ret) {

+    &n= bsp;           pr_warn(&q= uot;Handle pin memory remap fail.\n");

+    &n= bsp;           mmput(mm);=

+    &n= bsp;           goto fault= ;

+    &n= bsp;  }

+    &n= bsp;  mmput(mm);

+    &n= bsp;  rcu_read_unlock();

+    &n= bsp;  return 0;

+

+fault:

+    &n= bsp;  rcu_read_unlock();

+    &n= bsp;  return -EFAULT;

+}

+

+static long pin_memory_ioc= tl(struct file *file, unsigned cmd, unsigned long arg)

+{

+    &n= bsp;  long ret =3D 0;

+

+    &n= bsp;  if (_IOC_TYPE(cmd) !=3D PIN_MEM_MAGIC)

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  if (_IOC_NR(cmd) > _REMAP_PIN_MEM_AREA)

+    &n= bsp;           return -EI= NVAL;

+

+    &n= bsp;  switch (cmd) {

+    &n= bsp;  case SET_PIN_MEM_AREA:

+    &n= bsp;           ret =3D se= t_pin_mem_area(arg);

+    &n= bsp;           break;

+    &n= bsp;  case CLEAR_PIN_MEM_AREA:

+    &n= bsp;           clear_pin_= memory_record();

+    &n= bsp;           break;

+    &n= bsp;  case REMAP_PIN_MEM_AREA:

+    &n= bsp;           ret =3D pi= n_mem_remap(arg);

+    &n= bsp;           break;

+    &n= bsp;  default:

+    &n= bsp;           return -EI= NVAL;

+    &n= bsp;  }

+    &n= bsp;  return ret;

+}

+

+static const struct file_o= perations pin_memory_fops =3D {

+    &n= bsp;  .owner     =3D THIS_MODULE,

+    &n= bsp;  .unlocked_ioctl =3D pin_memory_ioctl,

+    &n= bsp;  .compat_ioctl   =3D pin_memory_ioctl,

+};

+

+static struct miscdevice p= in_memory_miscdev =3D {

+    &n= bsp;  .minor      =3D MISC_DYNAMIC_MINOR,

+    &n= bsp;  .name    =3D "pinmem",

+    &n= bsp;  .fops       =3D &pin_memory_fops,

+};

+

+static int pin_memory_init= (void)

+{

+    &n= bsp;  int err =3D misc_register(&pin_memory_miscdev);

+    &n= bsp;  if (!err) {

+    &n= bsp;           pr_info(&q= uot;pin_memory init\n");

+    &n= bsp;  } else {

+    &n= bsp;           pr_warn(&q= uot;pin_memory init failed!\n");

+    &n= bsp;  }

+    &n= bsp;  return err;

+}

+

+static void pin_memory_exi= t(void)

+{

+    &n= bsp;  misc_deregister(&pin_memory_miscdev);

+    &n= bsp;  pr_info("pin_memory ko exists!\n");<= /p>

+}

+

+module_init(pin_memory_ini= t);

+module_exit(pin_memory_exi= t);

+

+MODULE_LICENSE("GPL&q= uot;);

+MODULE_AUTHOR("Euler&= quot;);

+MODULE_DESCRIPTION("p= in memory");

diff --git a/include/linux/cras= h_core.h b/include/linux/crash_core.h

index 525510a..5baf40d 100644

--- a/include/linux/crash_core.= h

+++ b/include/linux= /crash_core.h

@@ -75,4 +75,9 @@ int parse= _crashkernel_high(char *cmdline, unsigned long long system_ram,<= /span>

int parse_crashkernel_low(char = *cmdline, unsigned long long system_ram,

     &= nbsp;          unsigned long l= ong *crash_size, unsigned long long *crash_base);

 

+#ifdef CONFIG_PIN_MEMORY

+int __init parse_pin_memor= y(char *cmdline, unsigned long long system_ram,

+    &n= bsp;           unsigned l= ong long *pin_size, unsigned long long *pin_base);

+#endif

+

#endif /* LINUX_CRASH_CORE_H */=

diff --git a/include/linux/pin_mem.h b/include/linux/pin_mem.h
new file mode 100644
index 00000000..0ca44ac
--- /dev/null
+++ b/include/linux/pin_mem.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved.
+ * Provide the pin memory method for checkpoint and restore tasks.
+ */
+#ifndef _LINUX_PIN_MEMORY_H
+#define _LINUX_PIN_MEMORY_H
+
+#ifdef CONFIG_PIN_MEMORY
+#include <linux/errno.h>
+#include <linux/kabi.h>
+#include <linux/mm_types.h>
+#include <linux/err.h>
+#ifdef CONFIG_ARM64
+#include <linux/ioport.h>
+#endif
+
+#define PAGE_BUDDY_MAPCOUNT_VALUE  (~PG_buddy)
+
+#define COLLECT_PAGES_FINISH         1
+#define COLLECT_PAGES_NEED_CONTINUE  -1
+#define COLLECT_PAGES_FAIL           0
+
+#define COMPOUND_PAD_MASK   0xffffffff
+#define COMPOUND_PAD_START  0x88
+#define COMPOUND_PAD_DELTA  0x40
+#define LIST_POISON4        0xdead000000000400
+
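+/*
+ * Each page_map_entry is immediately followed by its phy_addr_array, so
+ * the next entry starts right after the nr_pages recorded addresses.
+ */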

+#define next_pme(pme)  ((unsigned long *)(pme + 1) + pme->nr_pages)
+
+struct page_map_entry {
+       unsigned long virt_addr;
+       unsigned int nr_pages;
+       unsigned int is_huge_page;
+       unsigned long phy_addr_array[0];
+};
+
+struct page_map_info {
+       int pid;
+       int pid_reserved;
+       unsigned int entry_num;
+       struct page_map_entry *pme;
+};
+
+extern struct page_map_info *get_page_map_info(int pid);
+extern struct page_map_info *create_page_map_info(int pid);
+extern vm_fault_t do_mem_remap(int pid, struct mm_struct *mm);
+extern vm_fault_t do_anon_page_remap(struct vm_area_struct *vma, unsigned long address,
+               pmd_t *pmd, struct page *page);
+extern void clear_pin_memory_record(void);
+extern int pin_mem_area(struct task_struct *task, struct mm_struct *mm,
+               unsigned long start_addr, unsigned long end_addr);
+extern vm_fault_t do_anon_huge_page_remap(struct vm_area_struct *vma, unsigned long address,
+               pmd_t *pmd, struct page *page);
+
+/* reserve space for pin memory */
+#ifdef CONFIG_ARM64
+extern struct resource pin_memory_resource;
+#endif
+extern void init_reserve_page_map(unsigned long map_addr, unsigned long map_size);
+
+#endif /* CONFIG_PIN_MEMORY */
+#endif /* _LINUX_PIN_MEMORY_H */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b..7512696 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -292,6 +292,17 @@ int __init parse_crashkernel_low(char *cmdline,
                                "crashkernel=", suffix_tbl[SUFFIX_LOW]);
 }
 
+#ifdef CONFIG_PIN_MEMORY
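+/*
+ * Parse the "pinmemory=" kernel command line option.  The pin memory
+ * reservation reuses the crashkernel parser, so it accepts the same
+ * "size[KMG][@offset[KMG]]" syntax as crashkernel=.
+ */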

+int __init parse_pin_memory(char *cmdline,
+                       unsigned long long system_ram,
+                       unsigned long long *pin_size,
+                       unsigned long long *pin_base)
+{
+       return __parse_crashkernel(cmdline, system_ram, pin_size, pin_base,
+                               "pinmemory=", NULL);
+}
+#endif
+
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
                          void *data, size_t data_len)
 {
diff --git a/mm/Kconfig b/mm/Kconfig
index ab80933..c2dd088 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -739,4 +739,10 @@ config ARCH_HAS_HUGEPD
 config MAPPING_DIRTY_HELPERS
        bool
 
+config PIN_MEMORY
+       bool "Support for pin memory"
+       depends on CHECKPOINT_RESTORE
+       help
+         Say y here to enable the pin memory feature for checkpoint
+         and restore.
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index 1937cc2..7e1984e 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -108,3 +108,4 @@ obj-$(CONFIG_ZONE_DEVICE) += memremap.o
 obj-$(CONFIG_HMM_MIRROR) += hmm.o
 obj-$(CONFIG_MEMFD_CREATE) += memfd.o
 obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o
+obj-$(CONFIG_PIN_MEMORY) += pin_mem.o
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a880932..93dc582 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3083,4 +3083,65 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
                mlock_vma_page(new);
        update_mmu_cache_pmd(vma, address, pvmw->pmd);
 }
+
+#ifdef CONFIG_PIN_MEMORY
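+/*
+ * Map an already reserved transparent huge page into the restore task at
+ * @address: charge it to the memcg, set up the anon rmap and install the
+ * huge pmd, mirroring the THP anonymous fault path.
+ */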

+vm_fault_t do_anon_huge_page_remap(struct vm_area_struct *vma, unsigned long address,
+               pmd_t *pmd, struct page *page)
+{
+       gfp_t gfp;
+       pgtable_t pgtable;
+       spinlock_t *ptl;
+       pmd_t entry;
+       vm_fault_t ret = 0;
+       struct mem_cgroup *memcg;
+
+       if (unlikely(anon_vma_prepare(vma)))
+               return VM_FAULT_OOM;
+       if (unlikely(khugepaged_enter(vma, vma->vm_flags)))
+               return VM_FAULT_OOM;
+       gfp = alloc_hugepage_direct_gfpmask(vma);
+       prep_transhuge_page(page);
+       if (mem_cgroup_try_charge_delay(page, vma->vm_mm, gfp, &memcg, true)) {
+               put_page(page);
+               count_vm_event(THP_FAULT_FALLBACK);
+               return VM_FAULT_FALLBACK;
+       }
+       pgtable = pte_alloc_one(vma->vm_mm, address);
+       if (unlikely(!pgtable)) {
+               ret = VM_FAULT_OOM;
+               goto release;
+       }
+       __SetPageUptodate(page);
+       ptl = pmd_lock(vma->vm_mm, pmd);
+       if (unlikely(!pmd_none(*pmd))) {
+               goto unlock_release;
+       } else {
+               ret = check_stable_address_space(vma->vm_mm);
+               if (ret)
+                       goto unlock_release;
+               entry = mk_huge_pmd(page, vma->vm_page_prot);
+               entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
+               page_add_new_anon_rmap(page, vma, address, true);
+               mem_cgroup_commit_charge(page, memcg, false, true);
+               lru_cache_add_active_or_unevictable(page, vma);
+               pgtable_trans_huge_deposit(vma->vm_mm, pmd, pgtable);
+               set_pmd_at(vma->vm_mm, address, pmd, entry);
+               add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
+               mm_inc_nr_ptes(vma->vm_mm);
+               spin_unlock(ptl);
+               count_vm_event(THP_FAULT_ALLOC);
+       }
+
+       return 0;
+unlock_release:
+       spin_unlock(ptl);
+release:
+       if (pgtable)
+               pte_free(vma->vm_mm, pgtable);
+       mem_cgroup_cancel_charge(page, memcg, true);
+       put_page(page);
+       return ret;
+}
+#endif
+
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 45442d9..dd416fd 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4799,4 +4799,72 @@ void ptlock_free(struct page *page)
 {
        kmem_cache_free(page_ptl_cachep, page->ptl);
 }
+
+#ifdef CONFIG_PIN_MEMORY
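+/*
+ * Map a single reserved anonymous page into the restore task at @address,
+ * mirroring the normal anonymous fault path (memcg charge, rmap, lru) but
+ * using the page that was recorded at checkpoint time.
+ */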

+vm_fault_t do_anon_page_remap(struct vm_area_struct *vma, unsigned long address,
+               pmd_t *pmd, struct page *page)
+{
+       struct mem_cgroup *memcg;
+       pte_t entry;
+       spinlock_t *ptl;
+       pte_t *pte;
+       vm_fault_t ret = 0;
+
+       if (pte_alloc(vma->vm_mm, pmd, address))
+               return VM_FAULT_OOM;
+
+       /* See the comment in pte_alloc_one_map() */
+       if (unlikely(pmd_trans_unstable(pmd)))
+               return 0;
+
+       /* Allocate our own private page. */
+       if (unlikely(anon_vma_prepare(vma)))
+               goto oom;
+
+       if (mem_cgroup_try_charge_delay(page, vma->vm_mm, GFP_KERNEL, &memcg,
+                       false))
+               goto oom_free_page;
+
+       /*
+        * The memory barrier inside __SetPageUptodate makes sure that
+        * preceding stores to the page contents become visible before
+        * the set_pte_at() write.
+        */
+       __SetPageUptodate(page);
+
+       entry = mk_pte(page, vma->vm_page_prot);
+       if (vma->vm_flags & VM_WRITE)
+               entry = pte_mkwrite(pte_mkdirty(entry));
+       pte = pte_offset_map_lock(vma->vm_mm, pmd, address, &ptl);
+       if (!pte_none(*pte)) {
+               ret = VM_FAULT_FALLBACK;
+               goto release;
+       }
+
+       ret = check_stable_address_space(vma->vm_mm);
+       if (ret)
+               goto release;
+       inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
+       page_add_new_anon_rmap(page, vma, address, false);
+       mem_cgroup_commit_charge(page, memcg, false, false);
+       lru_cache_add_active_or_unevictable(page, vma);
+
+       set_pte_at(vma->vm_mm, address, pte, entry);
+       /* No need to invalidate - it was non-present before */
+       update_mmu_cache(vma, address, pte);
+unlock:
+       pte_unmap_unlock(pte, ptl);
+       return ret;
+release:
+       mem_cgroup_cancel_charge(page, memcg, false);
+       put_page(page);
+       goto unlock;
+oom_free_page:
+       put_page(page);
+oom:
+       return VM_FAULT_OOM;
+}
+#endif
+
 #endif
diff --git a/mm/pin_mem.c b/mm/pin_mem.c
new file mode 100644
index 00000000..ca3f23a
--- /dev/null
+++ b/mm/pin_mem.c
@@ -0,0 +1,691 @@
+/*
+ * Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved.
+ * Provide the pin memory method for checkpoint and restore tasks.
+ */
+#ifdef CONFIG_PIN_MEMORY
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/time.h>
+#include <linux/sched/cputime.h>
+#include <linux/tick.h>
+#include <asm/uaccess.h>
+#include <linux/mm.h>
+#include <linux/pin_mem.h>
+#include <linux/idr.h>
+#include <linux/page-isolation.h>
+#include <linux/sched/mm.h>
+#include <linux/ctype.h>
+
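+/*
+ * Layout of the reserved pin memory region: a persistent pin_pid_num
+ * counter, followed by max_pin_pid_num page_map_info slots, followed by
+ * the packed page_map_entry records for every pinned area.
+ */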

+#define MAX_PIN_PID_NUM  128
+static DEFINE_SPINLOCK(page_map_entry_lock);
+
+unsigned int pin_pid_num;
+static unsigned int *pin_pid_num_addr;
+static unsigned long __page_map_entry_start;
+static unsigned long page_map_entry_end;
+static struct page_map_info *user_space_reserve_start;
+static struct page_map_entry *page_map_entry_start;
+unsigned int max_pin_pid_num __read_mostly;
+
+/* kernel-space pin memory region, used by reserve_kernel_space_mem() */
+static unsigned long kernel_space_reserve_start;
+static unsigned long kernel_space_reserve_end;
+static unsigned long kernel_pin_space_size;
+
+static int __init setup_max_pin_pid_num(char *str)
+{
+       int ret = 1;
+
+       if (!str)
+               goto out;
+
+       ret = kstrtouint(str, 10, &max_pin_pid_num);
+out:
+       if (ret) {
+               pr_warn("Unable to parse max_pin_pid_num.\n");
+       } else {
+               if (max_pin_pid_num > MAX_PIN_PID_NUM) {
+                       max_pin_pid_num = 0;
+                       pr_warn("Input max_pin_pid_num is too large.\n");
+               }
+       }
+       return ret;
+}
+early_param("max_pin_pid_num", setup_max_pin_pid_num);
+
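+/*
+ * Grab the next free page_map_info slot in the reserved region for @pid
+ * and bump the persistent pin_pid_num counter.
+ */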

+struct page_map_info *create_page_map_info(int pid)
+{
+       struct page_map_info *new;
+
+       if (!user_space_reserve_start)
+               return NULL;
+
+       if (pin_pid_num >= max_pin_pid_num) {
+               pr_warn("pin_pid_num reached max_pin_pid_num, failed to create info for pid: %d!\n", pid);
+               return NULL;
+       }
+       new = (struct page_map_info *)(user_space_reserve_start + pin_pid_num);
+       new->pid = pid;
+       new->pme = NULL;
+       new->entry_num = 0;
+       new->pid_reserved = false;
+       (*pin_pid_num_addr)++;
+       pin_pid_num++;
+       return new;
+}
+EXPORT_SYMBOL_GPL(create_page_map_info);
+
+struct page_map_info *get_page_map_info(int pid)
+{
+       int i;
+
+       if (!user_space_reserve_start)
+               return NULL;
+
+       for (i = 0; i < pin_pid_num; i++) {
+               if (user_space_reserve_start[i].pid == pid)
+                       return &(user_space_reserve_start[i]);
+       }
+       return NULL;
+}
+EXPORT_SYMBOL_GPL(get_page_map_info);
+
+static struct page *find_head_page(struct page *page)
+{
+       struct page *p = page;
+
+       while (!PageBuddy(p)) {
+               if (PageLRU(p))
+                       return NULL;
+               p--;
+       }
+       return p;
+}
+
+static void split_page_area_left(struct zone *zone, struct free_area *area, struct page *page,
+               unsigned long size, int order)
+{
+       unsigned long cur_size = 1 << order;
+       unsigned long total_size = 0;
+
+       while (size && cur_size > size) {
+               cur_size >>= 1;
+               order--;
+               area--;
+               if (cur_size <= size) {
+                       list_add(&page[total_size].lru, &area->free_list[MIGRATE_MOVABLE]);
+                       atomic_set(&(page[total_size]._mapcount), PAGE_BUDDY_MAPCOUNT_VALUE);
+                       set_page_private(&page[total_size], order);
+                       set_pageblock_migratetype(&page[total_size], MIGRATE_MOVABLE);
+                       area->nr_free++;
+                       total_size += cur_size;
+                       size -= cur_size;
+               }
+       }
+}
+
+static void split_page_area_right(struct zone *zone, struct free_area *area, struct page *page,
+               unsigned long size, int order)
+{
+       unsigned long cur_size = 1 << order;
+       struct page *right_page, *head_page;
+
+       right_page = page + size;
+       while (size && cur_size > size) {
+               cur_size >>= 1;
+               order--;
+               area--;
+               if (cur_size <= size) {
+                       head_page = right_page - cur_size;
+                       list_add(&head_page->lru, &area->free_list[MIGRATE_MOVABLE]);
+                       atomic_set(&(head_page->_mapcount), PAGE_BUDDY_MAPCOUNT_VALUE);
+                       set_page_private(head_page, order);
+                       set_pageblock_migratetype(head_page, MIGRATE_MOVABLE);
+                       area->nr_free++;
+                       size -= cur_size;
+                       right_page = head_page;
+               }
+       }
+}
+
+void reserve_page_from_buddy(unsigned long nr_pages, struct page *page)
+{
+       unsigned int current_order;
+       struct page *page_end;
+       struct free_area *area;
+       struct zone *zone;
+       struct page *head_page;
+
+       head_page = find_head_page(page);
+       if (!head_page) {
+               pr_warn("Failed to find the buddy head page.\n");
+               return;
+       }
+       current_order = head_page->private;
+       page_end = head_page + (1 << current_order);
+       zone = page_zone(head_page);
+       area = &(zone->free_area[current_order]);
+       list_del(&head_page->lru);
+       atomic_set(&head_page->_mapcount, -1);
+       set_page_private(head_page, 0);
+       area->nr_free--;
+       if (head_page != page)
+               split_page_area_left(zone, area, head_page,
+                               (unsigned long)(page - head_page), current_order);
+       page = page + nr_pages;
+       if (page < page_end) {
+               split_page_area_right(zone, area, page,
+                               (unsigned long)(page_end - page), current_order);
+       } else if (page > page_end) {
+               pr_warn("Page end is smaller than the reserved range end.\n");
+       }
+}
+
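+/*
+ * Take a recorded page back out of the buddy allocator so it cannot be
+ * handed out again before the restore task reclaims it.
+ */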

+static inline void reserve_user_normal_pages(struct page *page)
+{
+       if (!atomic_read(&page->_refcount)) {
+               atomic_inc(&page->_refcount);
+               reserve_page_from_buddy(1, page);
+       } else {
+               pr_warn("Page %pK refcount %d is larger than zero, no need to reserve.\n",
+                               page, page->_refcount.counter);
+       }
+}
+
+static void init_huge_pmd_pages(struct page *head_page)
+{
+       int i = 0;
+       struct page *page = head_page;
+       unsigned long *temp;
+       unsigned long compound_pad = COMPOUND_PAD_START;
+
+       __set_bit(PG_head, &page->flags);
+       __set_bit(PG_active, &page->flags);
+       atomic_set(&page->_refcount, 1);
+       page++;
+       i++;
+       page->compound_head = (unsigned long)head_page + 1;
+       page->_compound_pad_2 = (unsigned long)head_page & COMPOUND_PAD_MASK;
+       temp = (unsigned long *)(&(page->_compound_pad_2));
+       temp[1] = LIST_POISON4;
+       page->compound_dtor = HUGETLB_PAGE_DTOR + 1;
+       page->compound_order = HPAGE_PMD_ORDER;
+       page++;
+       i++;
+       page->compound_head = (unsigned long)head_page + 1;
+       page->_compound_pad_2 = (unsigned long)head_page + compound_pad;
+       i++;
+       INIT_LIST_HEAD(&(page->deferred_list));
+       for (; i < HPAGE_PMD_NR; i++) {
+               page = head_page + i;
+               page->compound_head = (unsigned long)head_page + 1;
+               compound_pad += COMPOUND_PAD_DELTA;
+               page->_compound_pad_2 = (unsigned long)head_page + compound_pad;
+               temp = (unsigned long *)(&(page->_compound_pad_2));
+               temp[1] = LIST_POISON4;
+       }
+}
+
+static void reserve_user_huge_pmd_pages(struct page *page)
+{
+       struct page *head_page;
+
+       if (!atomic_read(&page->_refcount)) {
+               atomic_inc(&page->_refcount);
+               head_page = find_head_page(page);
+               reserve_page_from_buddy((1 << HPAGE_PMD_ORDER), page);
+               init_huge_pmd_pages(page);
+       } else {
+               pr_warn("Page %pK refcount %d is larger than zero, no need to reserve.\n",
+                               page, page->_refcount.counter);
+       }
+}
+
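+/*
+ * After the new kernel has booted, walk every recorded page_map_entry and
+ * re-reserve the physical pages that are still free so they survive until
+ * the tasks are restored; pages that already have users are dropped from
+ * the record.
+ */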

+static void reserve_user_space_map_pages(void)
+{
+       struct page_map_info *pmi;
+       struct page_map_entry *pme;
+       unsigned int i, j, index;
+       struct page *page;
+       unsigned long flags;
+       unsigned long phy_addr;
+
+       if (!user_space_reserve_start)
+               return;
+       spin_lock_irqsave(&page_map_entry_lock, flags);
+       for (index = 0; index < pin_pid_num; index++) {
+               pmi = &(user_space_reserve_start[index]);
+               pme = pmi->pme;
+
+               for (i = 0; i < pmi->entry_num; i++) {
+                       for (j = 0; j < pme->nr_pages; j++) {
+                               phy_addr = pme->phy_addr_array[j];
+                               if (!phy_addr)
+                                       continue;
+                               page = phys_to_page(phy_addr);
+                               if (atomic_read(&page->_refcount)) {
+                                       pme->phy_addr_array[j] = 0;
+                                       continue;
+                               }
+                               if (!pme->is_huge_page)
+                                       reserve_user_normal_pages(page);
+                               else
+                                       reserve_user_huge_pmd_pages(page);
+                       }
+                       pme = (struct page_map_entry *)next_pme(pme);
+               }
+       }
+       spin_unlock_irqrestore(&page_map_entry_lock, flags);
+}
+
+/*
+ * The whole page map entry collecting process must be sequential:
+ * user_space_reserve_start points at the page_map_info of the first
+ * dumped task, and page_map_entry_start points at the first page map
+ * entry of the first dumped vma.
+ */
+static void init_page_map_info(unsigned int *map_addr)
+{
+       unsigned long map_len = pin_memory_resource.end - pin_memory_resource.start;
+
+       if (user_space_reserve_start || !max_pin_pid_num)
+               return;
+       pin_pid_num = *map_addr;
+       pin_pid_num_addr = map_addr;
+       user_space_reserve_start =
+               (struct page_map_info *)(map_addr + 1);
+       page_map_entry_start =
+               (struct page_map_entry *)(user_space_reserve_start + max_pin_pid_num);
+       __page_map_entry_start = (unsigned long)page_map_entry_start;
+       page_map_entry_end = (unsigned long)map_addr + map_len;
+       if (pin_pid_num > 0)
+               reserve_user_space_map_pages();
+}
+
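+/*
+ * Record the physical address of every PMD-sized huge page in
+ * [start_addr, end_addr).  Returns COLLECT_PAGES_NEED_CONTINUE when a
+ * non-huge page is hit so the caller can fall back to normal pages.
+ */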

+int collect_pmd_huge_pages(struct task_struct *task,
+       unsigned long start_addr, unsigned long end_addr, struct page_map_entry *pme)
+{
+       long res;
+       int index = 0;
+       unsigned long start = start_addr;
+       struct page *temp_page;
+
+       while (start < end_addr) {
+               temp_page = NULL;
+               res = get_user_pages_remote(task, task->mm, start, 1,
+                               FOLL_TOUCH | FOLL_GET, &temp_page, NULL, NULL);
+               if (!res) {
+                       pr_warn("Get huge page for addr(%lx) fail.\n", start);
+                       return COLLECT_PAGES_FAIL;
+               }
+               if (PageHead(temp_page)) {
+                       start += HPAGE_PMD_SIZE;
+                       pme->phy_addr_array[index] = page_to_phys(temp_page);
+                       index++;
+               } else {
+                       pme->nr_pages = index;
+                       atomic_dec(&((temp_page)->_refcount));
+                       return COLLECT_PAGES_NEED_CONTINUE;
+               }
+       }
+       pme->nr_pages = index;
+       return COLLECT_PAGES_FINISH;
+}
+
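+/*
+ * Record the physical addresses of the normal 4K pages in
+ * [start_addr, end_addr), walking in huge-page-sized chunks.  Returns
+ * COLLECT_PAGES_NEED_CONTINUE when a huge page head is hit so the caller
+ * can switch to collect_pmd_huge_pages().
+ */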

+int collect_normal_pages(struct task_struct *task,
+       unsigned long start_addr, unsigned long end_addr, struct page_map_entry *pme)
+{
+       int res;
+       unsigned long next;
+       unsigned long i, nr_pages;
+       struct page *tmp_page;
+       unsigned long *phy_addr_array = pme->phy_addr_array;
+       struct page **page_array = (struct page **)pme->phy_addr_array;
+
+       next = (start_addr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
+       next = (next > end_addr) ? end_addr : next;
+       pme->nr_pages = 0;
+       while (start_addr < next) {
+               nr_pages = (next - start_addr) / PAGE_SIZE;
+               res = get_user_pages_remote(task, task->mm, start_addr, 1,
+                               FOLL_TOUCH | FOLL_GET, &tmp_page, NULL, NULL);
+               if (!res) {
+                       pr_warn("Get user pages of %lx fail.\n", start_addr);
+                       return COLLECT_PAGES_FAIL;
+               }
+               if (PageHead(tmp_page)) {
+                       atomic_dec(&(tmp_page->_refcount));
+                       return COLLECT_PAGES_NEED_CONTINUE;
+               }
+               atomic_dec(&(tmp_page->_refcount));
+               if (PageTail(tmp_page)) {
+                       start_addr = next;
+                       pme->virt_addr = start_addr;
+                       next = (next + HPAGE_PMD_SIZE) > end_addr ? end_addr : (next + HPAGE_PMD_SIZE);
+                       continue;
+               }
+               res = get_user_pages_remote(task, task->mm, start_addr, nr_pages,
+                               FOLL_TOUCH | FOLL_GET, page_array, NULL, NULL);
+               if (!res) {
+                       pr_warn("Get user pages of %lx fail.\n", start_addr);
+                       return COLLECT_PAGES_FAIL;
+               }
+               for (i = 0; i < nr_pages; i++)
+                       phy_addr_array[i] = page_to_phys(page_array[i]);
+               pme->nr_pages += nr_pages;
+               page_array += nr_pages;
+               phy_addr_array += nr_pages;
+               start_addr = next;
+               next = (next + HPAGE_PMD_SIZE) > end_addr ? end_addr : (next + HPAGE_PMD_SIZE);
+       }
+       return COLLECT_PAGES_FINISH;
+}
+
+/* The caller must make sure that the pinned memory belongs to an anonymous vma. */
+int pin_mem_area(struct task_struct *task, struct mm_struct *mm,
+               unsigned long start_addr, unsigned long end_addr)
+{
+       int pid, ret;
+       int is_huge_page = false;
+       unsigned int page_size;
+       unsigned long nr_pages, flags;
+       struct page_map_entry *pme;
+       struct page_map_info *pmi;
+       struct vm_area_struct *vma;
+       unsigned long i;
+       struct page *tmp_page;
+
+       if (!page_map_entry_start
+               || !task || !mm
+               || start_addr >= end_addr)
+               return -EFAULT;
+
+       pid = task->pid;
+       spin_lock_irqsave(&page_map_entry_lock, flags);
+       nr_pages = ((end_addr - start_addr) / PAGE_SIZE);
+       if ((unsigned long)page_map_entry_start + nr_pages * sizeof(struct page *)
+               >= page_map_entry_end) {
+               pr_warn("Page map entries are used up!\n");
+               ret = -EFAULT;
+               goto finish;
+       }
+       vma = find_extend_vma(mm, start_addr);
+       if (!vma) {
+               pr_warn("Find no matching vma!\n");
+               ret = -EFAULT;
+               goto finish;
+       }
+       if (start_addr == (start_addr & HPAGE_PMD_MASK) &&
+                       transparent_hugepage_enabled(vma)) {
+               page_size = HPAGE_PMD_SIZE;
+               is_huge_page = true;
+       } else {
+               page_size = PAGE_SIZE;
+       }
+       pme = page_map_entry_start;
+       pme->virt_addr = start_addr;
+       pme->is_huge_page = is_huge_page;
+       memset(pme->phy_addr_array, 0, nr_pages * sizeof(unsigned long));
+       down_write(&mm->mmap_sem);
+       if (!is_huge_page) {
+               ret = collect_normal_pages(task, start_addr, end_addr, pme);
+               if (!pme->nr_pages) {
+                       if (ret == COLLECT_PAGES_FINISH) {
+                               ret = 0;
+                               up_write(&mm->mmap_sem);
+                               goto finish;
+                       }
+                       pme->is_huge_page = true;
+                       page_size = HPAGE_PMD_SIZE;
+                       ret = collect_pmd_huge_pages(task, pme->virt_addr, end_addr, pme);
+               }
+       } else {
+               ret = collect_pmd_huge_pages(task, start_addr, end_addr, pme);
+               if (!pme->nr_pages) {
+                       if (ret == COLLECT_PAGES_FINISH) {
+                               ret = 0;
+                               up_write(&mm->mmap_sem);
+                               goto finish;
+                       }
+                       pme->is_huge_page = false;
+                       page_size = PAGE_SIZE;
+                       ret = collect_normal_pages(task, pme->virt_addr, end_addr, pme);
+               }
+       }
+       up_write(&mm->mmap_sem);
+       if (ret == COLLECT_PAGES_FAIL) {
+               ret = -EFAULT;
+               goto finish;
+       }
+
+       /* check for zero pages */
+       for (i = 0; i < pme->nr_pages; i++) {
+               tmp_page = phys_to_page(pme->phy_addr_array[i]);
+               if (!pme->is_huge_page) {
+                       if (page_to_pfn(tmp_page) == my_zero_pfn(pme->virt_addr + i * PAGE_SIZE))
+                               pme->phy_addr_array[i] = 0;
+               } else if (is_huge_zero_page(tmp_page))
+                       pme->phy_addr_array[i] = 0;
+       }
+
+       page_map_entry_start = (struct page_map_entry *)(next_pme(pme));
+       pmi = get_page_map_info(pid);
+       if (!pmi)
+               pmi = create_page_map_info(pid);
+       if (!pmi) {
+               pr_warn("Failed to create page map info for pid %d!\n", pid);
+               ret = -EFAULT;
+               goto finish;
+       }
+       if (!pmi->pme)
+               pmi->pme = pme;
+       pmi->entry_num++;
+
+       if (ret == COLLECT_PAGES_NEED_CONTINUE)
+               ret = pin_mem_area(task, mm, pme->virt_addr + pme->nr_pages * page_size, end_addr);
+
+finish:
+       spin_unlock_irqrestore(&page_map_entry_lock, flags);
+       return ret;
+}
+EXPORT_SYMBOL_GPL(pin_mem_area);
+
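+/*
+ * Map every recorded normal page of @pme back into @vma at its original
+ * virtual address, skipping zero pages and entries that were dropped.
+ */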

+vm_fault_t remap_normal_pages(struct mm_struct *mm, struct vm_area_struct *vma,
+               struct page_map_entry *pme)
+{
+       int ret;
+       unsigned int j;
+       pgd_t *pgd;
+       p4d_t *p4d;
+       pmd_t *pmd;
+       pud_t *pud;
+       struct page *page;
+       unsigned long address;
+       unsigned long phy_addr;
+
+       for (j = 0; j < pme->nr_pages; j++) {
+               address = pme->virt_addr + j * PAGE_SIZE;
+               phy_addr = pme->phy_addr_array[j];
+               if (!phy_addr)
+                       continue;
+               page = phys_to_page(phy_addr);
+               if (page->flags & (1 << PG_reserved))
+                       page->flags -= (1 << PG_reserved);
+               if (page_to_pfn(page) == my_zero_pfn(address)) {
+                       pme->phy_addr_array[j] = 0;
+                       continue;
+               }
+               page->mapping = NULL;
+               pgd = pgd_offset(mm, address);
+               p4d = p4d_alloc(mm, pgd, address);
+               if (!p4d)
+                       return VM_FAULT_OOM;
+               pud = pud_alloc(mm, p4d, address);
+               if (!pud)
+                       return VM_FAULT_OOM;
+               pmd = pmd_alloc(mm, pud, address);
+               if (!pmd)
+                       return VM_FAULT_OOM;
+               ret = do_anon_page_remap(vma, address, pmd, page);
+               if (ret == VM_FAULT_OOM)
+                       return ret;
+       }
+       return 0;
+}
+
+vm_fault_t remap_huge_pmd_pages(struct mm_struct *mm, struct vm_area_struct *vma,
+               struct page_map_entry *pme)
+{
+       int ret;
+       unsigned int j;
+       pgd_t *pgd;
+       p4d_t *p4d;
+       pmd_t *pmd;
+       pud_t *pud;
+       struct page *page;
+       unsigned long address;
+       unsigned long phy_addr;
+
+       for (j = 0; j < pme->nr_pages; j++) {
+               address = pme->virt_addr + j * HPAGE_PMD_SIZE;
+               phy_addr = pme->phy_addr_array[j];
+               if (!phy_addr)
+                       continue;
+               page = phys_to_page(phy_addr);
+               if (page->flags & (1 << PG_reserved))
+                       page->flags -= (1 << PG_reserved);
+               if (is_huge_zero_page(page)) {
+                       pme->phy_addr_array[j] = 0;
+                       continue;
+               }
+               pgd = pgd_offset(mm, address);
+               p4d = p4d_alloc(mm, pgd, address);
+               if (!p4d)
+                       return VM_FAULT_OOM;
+               pud = pud_alloc(mm, p4d, address);
+               if (!pud)
+                       return VM_FAULT_OOM;
+               pmd = pmd_alloc(mm, pud, address);
+               if (!pmd)
+                       return VM_FAULT_OOM;
+               ret = do_anon_huge_page_remap(vma, address, pmd, page);
+               if (ret == VM_FAULT_OOM)
+                       return ret;
+       }
+       return 0;
+}
+
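+/*
+ * Walk the restore task's vmas together with the page_map_entry list
+ * recorded for @pid and remap every pinned anonymous area into the new mm.
+ */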

+vm_fault_t do_mem_remap(int pid, struct mm_struct *mm)
+{
+       unsigned int i = 0;
+       vm_fault_t ret = 0;
+       struct vm_area_struct *vma;
+       struct page_map_info *pmi;
+       struct page_map_entry *pme;
+
+       pmi = get_page_map_info(pid);
+       if (!pmi)
+               return -EFAULT;
+       down_write(&mm->mmap_sem);
+       pme = pmi->pme;
+       vma = mm->mmap;
+       while ((i < pmi->entry_num) && (vma != NULL)) {
+               if (pme->virt_addr >= vma->vm_start && pme->virt_addr < vma->vm_end) {
+                       i++;
+                       if (!vma_is_anonymous(vma)) {
+                               pme = (struct page_map_entry *)(next_pme(pme));
+                               continue;
+                       }
+                       if (!pme->is_huge_page) {
+                               ret = remap_normal_pages(mm, vma, pme);
+                               if (ret < 0)
+                                       goto out;
+                       } else {
+                               ret = remap_huge_pmd_pages(mm, vma, pme);
+                               if (ret < 0)
+                                       goto out;
+                       }
+                       pme = (struct page_map_entry *)(next_pme(pme));
+               } else {
+                       vma = vma->vm_next;
+               }
+       }
+out:
+       up_write(&mm->mmap_sem);
+       return ret;
+}
+EXPORT_SYMBOL_GPL(do_mem_remap);
+
+#if defined(CONFIG_ARM64)
+void init_reserve_page_map(unsigned long map_addr, unsigned long map_size)
+{
+       void *addr;
+
+       if (!map_addr || !map_size)
+               return;
+       addr = phys_to_virt(map_addr);
+       init_page_map_info((unsigned int *)addr);
+}
+#else
+void init_reserve_page_map(unsigned long map_addr, unsigned long map_size)
+{
+}
+#endif
+
+/* Clear all pin memory records. */
+void clear_pin_memory_record(void)
+{
+       if (pin_pid_num_addr) {
+               *pin_pid_num_addr = 0;
+               pin_pid_num = 0;
+               page_map_entry_start = (struct page_map_entry *)__page_map_entry_start;
+       }
+       if (kernel_space_reserve_start && kernel_pin_space_size > 0)
+               *(unsigned long *)kernel_space_reserve_start = 0;
+}
+EXPORT_SYMBOL_GPL(clear_pin_memory_record);
+
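+/*
+ * Record a kernel-space region in the reserved area so the same pages can
+ * be located again after the kernel update; regions that were already
+ * recorded are ignored.
+ */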

+vm_fault_t reserve_kernel_space_mem(unsigned long start_addr, unsigned int pages)
+{
+       unsigned long i;
+       unsigned long entry_num;
+       struct page_map_entry *pme, *pme_start;
+
+       entry_num = *(unsigned long *)kernel_space_reserve_start;
+       pme_start = (struct page_map_entry *)(kernel_space_reserve_start + sizeof(entry_num));
+       pme = pme_start;
+       spin_lock(&page_map_entry_lock);
+       for (i = 0; i < entry_num; i++) {
+               if (start_addr == pme->virt_addr) {
+                       spin_unlock(&page_map_entry_lock);
+                       return 0;
+               }
+               pme = pme + 1;
+       }
+       if ((unsigned long)(pme_start + entry_num) >= kernel_space_reserve_end) {
+               spin_unlock(&page_map_entry_lock);
+               return VM_FAULT_OOM;
+       }
+       pme = pme_start + entry_num;
+       pme->virt_addr = start_addr;
+       pme->nr_pages = pages;
+       pme->is_huge_page = false;
+       *(unsigned long *)kernel_space_reserve_start = entry_num + 1;
+       spin_unlock(&page_map_entry_lock);
+       return 0;
+}
+EXPORT_SYMBOL_GPL(reserve_kernel_space_mem);
+
+#endif /* CONFIG_PIN_MEMORY */
--

1.8.3.1
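
For context, here is a minimal userspace sketch of how a checkpoint/restore
tool such as criu might drive the /dev/pinmem device added above. Only the
command names (SET_PIN_MEM_AREA, CLEAR_PIN_MEM_AREA, REMAP_PIN_MEM_AREA) and
the device name come from this patch; the PIN_MEM_MAGIC value, the ioctl
number encodings and the struct pin_mem_area_set layout below are assumptions
made purely for illustration and must be taken from the real driver headers.

/*
 * Illustrative sketch only - not part of the patch.
 * ASSUMED: the request layout (struct pin_mem_area_set), the magic value
 * and the _IOW encodings are invented here for illustration.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define PIN_MEM_MAGIC 0x59                      /* assumed magic */

struct pin_mem_area_set {                       /* assumed request layout */
	unsigned int pid;
	unsigned long virt_start;
	unsigned long virt_end;
};

#define SET_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 1, struct pin_mem_area_set)
#define CLEAR_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 2, int)
#define REMAP_PIN_MEM_AREA	_IOW(PIN_MEM_MAGIC, 3, int)

int main(int argc, char **argv)
{
	struct pin_mem_area_set pmas;
	int pid, fd;

	if (argc < 4) {
		fprintf(stderr, "usage: %s <pid> <start> <end>\n", argv[0]);
		return 1;
	}
	fd = open("/dev/pinmem", O_RDWR);
	if (fd < 0) {
		perror("open /dev/pinmem");
		return 1;
	}

	/* Checkpoint side: pin one anonymous area of the dump task. */
	pid = (int)strtol(argv[1], NULL, 0);
	pmas.pid = pid;
	pmas.virt_start = strtoul(argv[2], NULL, 0);
	pmas.virt_end = strtoul(argv[3], NULL, 0);
	if (ioctl(fd, SET_PIN_MEM_AREA, &pmas) < 0)
		perror("SET_PIN_MEM_AREA");

	/*
	 * Restore side, after the kernel update: the restore process asks
	 * the driver to remap the pinned pages into its own mm.
	 */
	if (ioctl(fd, REMAP_PIN_MEM_AREA, &pid) < 0)
		perror("REMAP_PIN_MEM_AREA");

	close(fd);
	return 0;
}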

--_000_a68df79992c04bbf8167748dbeca1fcchuaweicom_--