From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 60B92C6FA83
        for ; Tue, 27 Sep 2022 16:27:49 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
        id E30078E00DC; Tue, 27 Sep 2022 12:27:48 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40)
        id DDEA98E00C1; Tue, 27 Sep 2022 12:27:48 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
        id C5A418E00DC; Tue, 27 Sep 2022 12:27:48 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13])
        by kanga.kvack.org (Postfix) with ESMTP id B48B38E00C1
        for ; Tue, 27 Sep 2022 12:27:48 -0400 (EDT)
Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1])
        by unirelay03.hostedemail.com (Postfix) with ESMTP id 61C5AA04E1
        for ; Tue, 27 Sep 2022 16:27:48 +0000 (UTC)
X-FDA: 79958396616.13.3F04BC0
Received: from mail-pl1-f176.google.com (mail-pl1-f176.google.com [209.85.214.176])
        by imf02.hostedemail.com (Postfix) with ESMTP id EC6FA80010
        for ; Tue, 27 Sep 2022 16:27:47 +0000 (UTC)
Received: by mail-pl1-f176.google.com with SMTP id n7so817909plp.1
        for ; Tue, 27 Sep 2022 09:27:47 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date;
        bh=Ch3iRYZG7BvL1WANyT2EwsRBrbvSXhTcdOkD0Wx+tJI=;
        b=pHNTTWGV0DU4hVyvtY8TSpgEsj5JBeZ0rgMCeZ1pu5d+GY8morwfkAM9ptjbMVdEyV
         imM/g2tHkw7xSPN6fTQ6IGpuWJJr0EOnbyn/XoIvshsjJ50vnPRCxNlUkF3jaUVWmU7F
         xASEF/UvfT3Vh4pOYFibeqK6mPl8pI81Ac3DVe2ScTQs1DfV2bOmjuoQeJLPlrvZIcC4
         HzYkpeh0ZE+m4I8MtBo4z484pUCJHOrvcY0OE0LoR9klOfG7RTIF6qbmdP1WpWscKnwK
         RZx0TUoBv9qUKFVVTOWkZO8yQqdNeoJpi/In6PXF7/whmu03EZ7QotXz5Qsb+WOLMr4n
         T7OA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date;
        bh=Ch3iRYZG7BvL1WANyT2EwsRBrbvSXhTcdOkD0Wx+tJI=;
        b=D1mKJrFMlITsfUycoGJqjuwNBTzvbN4nJH3ozQClilDbA7fxbp+P1hULvcJXrT8lBl
         j5hZxPErobrs0E8js6/3d/I9Y8rS81GNMDcqvS01ag6Hlj+40wk3fRoKR5BpiWmJeev+
         LygBnei5LE4L1vVbyEht7Z+ihSxS4dj3Y5F1FijsZtVe7r7Db6Oe2fcDqFikqtyzzRiS
         XKLeS0//2B4D1Cs+oUjRF9lAZQ4oBLC/L10lk3arhL1l4HT+BgC7WPQk0OH75V60uZiI
         6mXwcfVpJzfgwD40sqcaF90k3lohmgqHET98q2A3vS1gQ53WUp2/TlR49yArx6lhmMNj
         WrbQ==
X-Gm-Message-State: ACrzQf1pcDMenK6oqV5Q4u2E+gW1x7XbOVF2SAAqE2JL19UlVrKN0/h8
        Y5PRDO+qDlbWJpgAAYIjlkU=
X-Google-Smtp-Source: AMsMyM7/Fqm4HVnIjXka29tQsuAw6B6XFfSVE1NQCCiSrxD8Nnbtv5yV5dZwRsdAW0VLVfM3KrVz5w==
X-Received: by 2002:a17:902:f70e:b0:178:9a17:8e42 with SMTP id
        h14-20020a170902f70e00b001789a178e42mr28423664plo.14.1664296066874;
        Tue, 27 Sep 2022 09:27:46 -0700 (PDT)
Received: from archlinux.localdomain ([140.121.198.213])
        by smtp.googlemail.com with ESMTPSA id
        9-20020a17090a0f0900b001f333fab3d6sm8602360pjy.18.2022.09.27.09.27.41
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 27 Sep 2022 09:27:46 -0700 (PDT)
From: Chih-En Lin
To: Andrew Morton, Qi Zheng, David Hildenbrand, Matthew Wilcox,
        Christophe Leroy
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Luis Chamberlain,
        Kees Cook, Iurii Zaikin, Vlastimil Babka, William Kucharski,
        "Kirill A . Shutemov", Peter Xu, Suren Baghdasaryan, Arnd Bergmann,
        Tong Tiangen, Pasha Tatashin, Li kunyu, Nadav Amit,
        Anshuman Khandual, Minchan Kim, Yang Shi, Song Liu, Miaohe Lin,
        Thomas Gleixner, Sebastian Andrzej Siewior, Andy Lutomirski,
        Fenghua Yu, Dinglan Peng, Pedro Fonseca, Jim Huang, Huichun Feng,
        Chih-En Lin
Subject: [RFC PATCH v2 2/9] mm: pgtable: Add sysctl to enable COW PTE
Date: Wed, 28 Sep 2022 00:29:50 +0800
Message-Id: <20220927162957.270460-3-shiyn.lin@gmail.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20220927162957.270460-1-shiyn.lin@gmail.com>
References: <20220927162957.270460-1-shiyn.lin@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
ARC-Authentication-Results: i=1; imf02.hostedemail.com;
        dkim=pass header.d=gmail.com header.s=20210112 header.b=pHNTTWGV;
        spf=pass (imf02.hostedemail.com: domain of shiyn.lin@gmail.com
        designates 209.85.214.176 as permitted sender)
        smtp.mailfrom=shiyn.lin@gmail.com;
        dmarc=pass (policy=none) header.from=gmail.com
ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664296068;
        a=rsa-sha256; cv=none;
        b=gep5A5A0VKJdo+M+3UBZD+Kg2hppUrF47Md8cSprUj/rupTfy0/f5D/2DET9lI51a+eQB3
         2ZRhHiOHA4rRJVrQtXwBNP1V+ZJYHhyNIvG2wLDm9lUNi7k0xqQWQLCNsxFjF0qGIZfrE8
         8PVgKcx0LzOsnDNK7fZc3iFH3VE3UmU=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed;
        d=hostedemail.com; s=arc-20220608; t=1664296068;
        h=from:from:sender:reply-to:subject:subject:date:date:
         message-id:message-id:to:to:cc:cc:mime-version:mime-version:
         content-type:content-transfer-encoding:content-transfer-encoding:
         in-reply-to:in-reply-to:references:references:dkim-signature;
        bh=Ch3iRYZG7BvL1WANyT2EwsRBrbvSXhTcdOkD0Wx+tJI=;
        b=qnzsWKtp6RgO7D1qnfjilmknSgs6ybDzuhB4y7XplfHuWOmE5gqh2tV7xSFom+/QnXPSts
         Ff1RqCOGr583DoOaSXIibuLiOKLUs5wWegAwC1pixtyL6/NH06I3xcVKM9jTAbx0UjlW+y
         py/9b2FoZcVLE/1gZRUL9iRzPZ4uO1E=
X-Rspamd-Server: rspam11
X-Rspamd-Queue-Id: EC6FA80010
X-Rspam-User:
Authentication-Results: imf02.hostedemail.com;
        dkim=pass header.d=gmail.com
        header.s=20210112 header.b=pHNTTWGV;
        spf=pass (imf02.hostedemail.com: domain of shiyn.lin@gmail.com
        designates 209.85.214.176 as permitted sender)
        smtp.mailfrom=shiyn.lin@gmail.com;
        dmarc=pass (policy=none) header.from=gmail.com
X-Stat-Signature: epuxntf44im3h7gi6ktdqgotdxfktx6r
X-HE-Tag: 1664296067-291879
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Add a new sysctl, vm.cow_pte, that sets the MMF_COW_PTE_READY flag on a
task so that copy-on-write (COW) of the PTE page table is enabled at the
task's next fork. Because there is a time gap between enabling COW PTE
through the sysctl and doing the fork, two states are used to distinguish
a task that wants to do COW PTE from one that is already doing it.

Signed-off-by: Chih-En Lin
---
 include/linux/pgtable.h |  6 ++++++
 kernel/fork.c           |  5 +++++
 kernel/sysctl.c         |  8 ++++++++
 mm/Makefile             |  2 +-
 mm/cow_pte.c            | 39 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 mm/cow_pte.c

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 014ee8f0fbaab..d03d01aefe989 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -937,6 +937,12 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
 	__ptep_modify_prot_commit(vma, addr, ptep, pte);
 }
 #endif /* __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION */
+
+int cow_pte_handler(struct ctl_table *table, int write, void *buffer,
+		size_t *lenp, loff_t *ppos);
+
+extern int sysctl_cow_pte_pid;
+
 #endif /* CONFIG_MMU */
 
 /*
diff --git a/kernel/fork.c b/kernel/fork.c
index 8a9e92068b150..6981944a7c6ec 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2671,6 +2671,11 @@ pid_t kernel_clone(struct kernel_clone_args *args)
 			trace = 0;
 	}
 
+	if (current->mm && test_bit(MMF_COW_PTE_READY, &current->mm->flags)) {
+		clear_bit(MMF_COW_PTE_READY, &current->mm->flags);
+		set_bit(MMF_COW_PTE, &current->mm->flags);
+	}
+
 	p = copy_process(NULL, trace, NUMA_NO_NODE, args);
 	add_latent_entropy();
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 205d605cacc5b..c4f54412ae3a9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2360,6 +2360,14 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= mmap_min_addr_handler,
 	},
+	{
+		.procname	= "cow_pte",
+		.data		= &sysctl_cow_pte_pid,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= cow_pte_handler,
+		.extra1		= SYSCTL_ZERO,
+	},
 #endif
 #ifdef CONFIG_NUMA
 	{
diff --git a/mm/Makefile b/mm/Makefile
index 9a564f8364035..7a568d5066ee6 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -40,7 +40,7 @@ mmu-y			:= nommu.o
 mmu-$(CONFIG_MMU)	:= highmem.o memory.o mincore.o \
 			   mlock.o mmap.o mmu_gather.o mprotect.o mremap.o \
 			   msync.o page_vma_mapped.o pagewalk.o \
-			   pgtable-generic.o rmap.o vmalloc.o
+			   pgtable-generic.o rmap.o vmalloc.o cow_pte.o
 
 
 ifdef CONFIG_CROSS_MEMORY_ATTACH
diff --git a/mm/cow_pte.c b/mm/cow_pte.c
new file mode 100644
index 0000000000000..4e50aa4294ce7
--- /dev/null
+++ b/mm/cow_pte.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+#include
+
+/* sysctl will write to this variable */
+int sysctl_cow_pte_pid = -1;
+
+static void set_cow_pte_task(void)
+{
+	struct pid *pid;
+	struct task_struct *task;
+
+	pid = find_get_pid(sysctl_cow_pte_pid);
+	if (!pid) {
+		pr_info("pid %d does not exist\n", sysctl_cow_pte_pid);
+		sysctl_cow_pte_pid = -1;
+		return;
+	}
+	task = get_pid_task(pid, PIDTYPE_PID);
+	if (!test_bit(MMF_COW_PTE, &task->mm->flags))
+		set_bit(MMF_COW_PTE_READY, &task->mm->flags);
+	sysctl_cow_pte_pid = -1;
+}
+
+int cow_pte_handler(struct ctl_table *table, int write, void *buffer,
+		size_t *lenp, loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec(table, write, buffer, lenp, ppos);
+
+	if (write && !ret)
+		set_cow_pte_task();
+
+	return ret;
+}
-- 
2.37.3
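
For readers following the two-state scheme described in the commit message, here is a minimal userspace sketch of the flag transitions. It is not kernel code and not part of the patch: struct fake_mm, COW_PTE_READY, COW_PTE_ACTIVE, request_cow_pte() and on_fork() are hypothetical stand-ins for mm->flags, MMF_COW_PTE_READY, MMF_COW_PTE, the sysctl write path and the check added to kernel_clone(), and plain bit operations replace the kernel's atomic set_bit()/clear_bit().

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two mm flag bits used by the patch. */
#define COW_PTE_READY  (1u << 0)  /* models MMF_COW_PTE_READY */
#define COW_PTE_ACTIVE (1u << 1)  /* models MMF_COW_PTE */

/* Hypothetical stand-in for the flags word of an mm_struct. */
struct fake_mm {
	unsigned int flags;
};

/*
 * Models the sysctl write path: arm the READY state unless the task is
 * already doing COW PTE.
 */
static void request_cow_pte(struct fake_mm *mm)
{
	if (!(mm->flags & COW_PTE_ACTIVE))
		mm->flags |= COW_PTE_READY;
}

/*
 * Models the check at fork time: a pending READY request is consumed
 * and replaced by the ACTIVE state. Without a pending request nothing
 * changes.
 */
static void on_fork(struct fake_mm *mm)
{
	if (mm->flags & COW_PTE_READY) {
		mm->flags &= ~COW_PTE_READY;
		mm->flags |= COW_PTE_ACTIVE;
	}
}

/* Prints which of the modeled states are currently set. */
static void print_state(const struct fake_mm *mm)
{
	printf("ready=%d active=%d\n",
	       !!(mm->flags & COW_PTE_READY),
	       !!(mm->flags & COW_PTE_ACTIVE));
}
```

In this model a request followed by a fork moves the flags from READY to ACTIVE, and a second request on an already active mm is a no-op. On a kernel carrying this patch, the request itself would be made by writing the target pid to vm.cow_pte, for example through /proc/sys/vm/cow_pte.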