From: Chih-En Lin
To: Andrew Morton, Qi Zheng, David Hildenbrand,
	"Matthew Wilcox (Oracle)", Christophe Leroy, John Hubbard,
	Nadav Amit, Barry Song
Cc: Steven Rostedt, Masami Hiramatsu, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Yang Shi, Peter Xu, Vlastimil Babka,
	"Zach O'Keefe", Yun Zhou, Hugh Dickins, Suren Baghdasaryan,
	Pasha Tatashin, Yu Zhao, Juergen Gross, Tong Tiangen, Liu Shixin,
	Anshuman Khandual, Li kunyu, Minchan Kim, Miaohe Lin,
	Gautam Menghani, Catalin Marinas, Mark Brown, Will Deacon,
	Vincenzo Frascino, Thomas Gleixner, "Eric W. Biederman",
	Andy Lutomirski, Sebastian Andrzej Siewior, "Liam R. Howlett",
	Fenghua Yu, Andrei Vagin, Barret Rhoden, Michal Hocko,
	"Jason A. Donenfeld", Alexey Gladkov,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
	linux-perf-users@vger.kernel.org, Dinglan Peng, Pedro Fonseca,
	Jim Huang, Huichun Feng, Chih-En Lin
Subject: [PATCH v4 01/14] mm: Allow user to control COW PTE via prctl
Date: Tue, 7 Feb 2023 11:51:26 +0800
Message-Id: <20230207035139.272707-2-shiyn.lin@gmail.com>
In-Reply-To: <20230207035139.272707-1-shiyn.lin@gmail.com>
References: <20230207035139.272707-1-shiyn.lin@gmail.com>

Add a new prctl, PR_SET_COW_PTE, to allow the user to enable COW PTE.
Because there is a time gap between enabling COW PTE with the prctl and
calling fork(), we use two states (MMF_COW_PTE_READY and MMF_COW_PTE) to
distinguish a task that wants to do COW PTE from one that is already
doing it.

The MMF_COW_PTE_READY flag marks the task to do COW PTE at the next
fork(). During fork(), if MMF_COW_PTE_READY is set, fork() will clear the
flag and set the MMF_COW_PTE flag. After that, fork() might share PTEs
instead of duplicating them.

Signed-off-by: Chih-En Lin
---
 include/linux/sched/coredump.h | 12 +++++++++++-
 include/uapi/linux/prctl.h     |  6 ++++++
 kernel/sys.c                   | 11 +++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 8270ad7ae14c..570d599ebc85 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -83,7 +83,17 @@ static inline int get_dumpable(struct mm_struct *mm)
 #define MMF_HAS_PINNED		27	/* FOLL_PIN has run, never cleared */
 #define MMF_DISABLE_THP_MASK	(1 << MMF_DISABLE_THP)
 
+/*
+ * MMF_COW_PTE_READY: Mark the task to do COW PTE at the next fork().
+ * During fork(), if MMF_COW_PTE_READY is set, fork() will clear the
+ * flag and set the MMF_COW_PTE flag. After that, fork() might share
+ * PTEs rather than duplicate them.
+ */
+#define MMF_COW_PTE_READY	29 /* Share PTE tables at the next fork() */
+#define MMF_COW_PTE		30 /* PTE tables are shared between processes */
+#define MMF_COW_PTE_MASK	(1 << MMF_COW_PTE)
+
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
-				 MMF_DISABLE_THP_MASK)
+				 MMF_DISABLE_THP_MASK | MMF_COW_PTE_MASK)
 
 #endif /* _LINUX_SCHED_COREDUMP_H */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a5e06dcbba13..664a3c023019 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -284,4 +284,10 @@ struct prctl_mm_map {
 #define PR_SET_VMA		0x53564d41
 # define PR_SET_VMA_ANON_NAME		0
 
+/*
+ * Set the prepare flag, MMF_COW_PTE_READY, so that page tables are shared
+ * (copy-on-write) at the next fork().
+ */
+#define PR_SET_COW_PTE			65
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 88b31f096fb2..eeab3093026f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2350,6 +2350,14 @@ static int prctl_set_vma(unsigned long opt, unsigned long start,
 }
 #endif /* CONFIG_ANON_VMA_NAME */
 
+static int prctl_set_cow_pte(struct mm_struct *mm)
+{
+	if (test_bit(MMF_COW_PTE, &mm->flags))
+		return -EINVAL;
+	set_bit(MMF_COW_PTE_READY, &mm->flags);
+	return 0;
+}
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
@@ -2628,6 +2636,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_SET_VMA:
 		error = prctl_set_vma(arg2, arg3, arg4, arg5);
 		break;
+	case PR_SET_COW_PTE:
+		error = prctl_set_cow_pte(me->mm);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.34.1
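
For readers following the two-state scheme in the commit message, the flag transitions can be sketched as a small userspace model. This is only an illustration, not code from the series: struct model_mm and the model_*() helpers are hypothetical names, the bit operations are plain (non-atomic) stand-ins for test_bit()/set_bit(), and the fork()-side transition is modelled from the commit message alone, since this patch only adds the prctl side.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Bit numbers as defined by the patch in include/linux/sched/coredump.h. */
#define MMF_COW_PTE_READY	29
#define MMF_COW_PTE		30

/* Hypothetical stand-in for the flags word of struct mm_struct. */
struct model_mm {
	unsigned long flags;
};

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool model_test_bit(int nr, const struct model_mm *mm)
{
	return (mm->flags >> nr) & 1UL;
}

/* Mirrors prctl_set_cow_pte(): refuse if already enabled, else arm READY. */
static int model_prctl_set_cow_pte(struct model_mm *mm)
{
	if (model_test_bit(MMF_COW_PTE, mm))
		return -EINVAL;
	mm->flags |= 1UL << MMF_COW_PTE_READY;
	return 0;
}

/*
 * Models the fork()-time transition described in the commit message:
 * READY is consumed and replaced by MMF_COW_PTE. Returns true if this
 * fork would be allowed to share PTE tables.
 */
static bool model_fork(struct model_mm *mm)
{
	if (model_test_bit(MMF_COW_PTE_READY, mm)) {
		mm->flags &= ~(1UL << MMF_COW_PTE_READY);
		mm->flags |= 1UL << MMF_COW_PTE;
	}
	return model_test_bit(MMF_COW_PTE, mm);
}

/* Walks the sequence fork -> prctl -> fork -> prctl and checks each state. */
static void model_walkthrough(void)
{
	struct model_mm mm = { .flags = 0 };

	assert(!model_fork(&mm));			/* not armed: ordinary fork */
	assert(model_prctl_set_cow_pte(&mm) == 0);	/* arm READY */
	assert(model_fork(&mm));			/* READY -> COW_PTE */
	assert(!model_test_bit(MMF_COW_PTE_READY, &mm));
	assert(model_prctl_set_cow_pte(&mm) == -EINVAL);/* already enabled */
}
```

On a kernel carrying this patch, the corresponding userspace call would presumably be prctl(PR_SET_COW_PTE, 0, 0, 0, 0) before fork(). The option number 65 is only meaningful on such a kernel; elsewhere it may be unassigned or mean something different.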