From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Ingo Molnar, Marcelo Tosatti,
 Michal Hocko, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
 Valentin Schneider, Vlastimil Babka, linux-mm@kvack.org
Subject: [PATCH 5/6] sched/isolation: Introduce isolated task work
Date: Thu, 3 Jul 2025 16:07:16 +0200
Message-ID: <20250703140717.25703-6-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250703140717.25703-1-frederic@kernel.org>
References: <20250703140717.25703-1-frederic@kernel.org>

Some asynchronous kernel work may be pending upon resume to userspace
and execute later on.
On isolated workloads this becomes problematic once the process is done
with preparatory work involving syscalls and wants to run in userspace
without being interrupted.

Provide an infrastructure to queue a work to be executed from the current
isolated task context right before resuming to userspace. This goes with
the assumption that isolated tasks are pinned to a single nohz_full CPU.

Signed-off-by: Frederic Weisbecker
---
 include/linux/sched.h           |  4 ++++
 include/linux/sched/isolation.h | 17 +++++++++++++++++
 kernel/sched/core.c             |  1 +
 kernel/sched/isolation.c        | 23 +++++++++++++++++++++++
 kernel/sched/sched.h            |  1 +
 kernel/time/Kconfig             | 12 ++++++++++++
 6 files changed, 58 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 117aa20b8fb6..931065b5744f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1448,6 +1448,10 @@ struct task_struct {
 	atomic_t			tick_dep_mask;
 #endif
 
+#ifdef CONFIG_NO_HZ_FULL_WORK
+	struct callback_head		nohz_full_work;
+#endif
+
 #ifdef CONFIG_FAULT_INJECTION
 	int				make_it_fail;
 	unsigned int			fail_nth;
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index d8501f4709b5..9481b7d152c9 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -77,4 +77,21 @@ static inline bool cpu_is_isolated(int cpu)
 	       cpuset_cpu_is_isolated(cpu);
 }
 
+#if defined(CONFIG_NO_HZ_FULL_WORK)
+extern int __isolated_task_work_queue(void);
+
+static inline int isolated_task_work_queue(void)
+{
+	if (!housekeeping_cpu(raw_smp_processor_id(), HK_TYPE_KERNEL_NOISE))
+		return -ENOTSUPP;
+
+	return __isolated_task_work_queue();
+}
+
+extern void isolated_task_work_init(struct task_struct *tsk);
+#else
+static inline int isolated_task_work_queue(void) { return -ENOTSUPP; }
+static inline void isolated_task_work_init(struct task_struct *tsk) { }
+#endif /* CONFIG_NO_HZ_FULL_WORK */
+
 #endif /* _LINUX_SCHED_ISOLATION_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 35783a486c28..eca8242bd81d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4538,6 +4538,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 	p->migration_pending = NULL;
 #endif
 	init_sched_mm_cid(p);
+	isolated_task_work_init(p);
 }
 
 DEFINE_STATIC_KEY_FALSE(sched_numa_balancing);
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 93b038d48900..d74c4ef91ce2 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -249,3 +249,26 @@ static int __init housekeeping_isolcpus_setup(char *str)
 	return housekeeping_setup(str, flags);
 }
 __setup("isolcpus=", housekeeping_isolcpus_setup);
+
+#ifdef CONFIG_NO_HZ_FULL_WORK
+static void isolated_task_work(struct callback_head *head)
+{
+}
+
+int __isolated_task_work_queue(void)
+{
+	if (current->flags & (PF_KTHREAD | PF_USER_WORKER | PF_IO_WORKER))
+		return -EINVAL;
+
+	guard(irqsave)();
+	if (task_work_queued(&current->nohz_full_work))
+		return 0;
+
+	return task_work_add(current, &current->nohz_full_work, TWA_RESUME);
+}
+
+void isolated_task_work_init(struct task_struct *tsk)
+{
+	init_task_work(&tsk->nohz_full_work, isolated_task_work);
+}
+#endif /* CONFIG_NO_HZ_FULL_WORK */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 475bb5998295..50e0cada1e1b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -60,6 +60,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index b0b97a60aaa6..34591fc50ab1 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -146,6 +146,18 @@ config NO_HZ_FULL
 
 endchoice
 
+config NO_HZ_FULL_WORK
+	bool "Full dynticks work flush on kernel exit"
+	depends on NO_HZ_FULL
+	help
+	  Selectively flush pending asynchronous kernel work upon user exit.
+	  Assuming userspace is not performing any critical isolated work while
+	  issuing syscalls, some per-CPU kernel works are flushed before resuming
+	  to userspace so that they don't get remotely queued later when the CPU
+	  doesn't want to be disturbed.
+
+	  If in doubt say N.
+
 config CONTEXT_TRACKING_USER
 	bool
 	depends on HAVE_CONTEXT_TRACKING_USER
-- 
2.48.1
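
The queue-once behaviour of __isolated_task_work_queue() can be modeled in plain userspace C, which may help when reasoning about repeated callers. This is only a sketch under stated assumptions: struct task, init_work(), work_queued(), isolated_work_queue(), resume_to_user() and isolated_work_demo() are hypothetical stand-ins, not the kernel's task_work API. The self-pointing "next" sentinel is an assumed convention for this model, and the guard(irqsave) protection, the PF_* flag checks and the housekeeping test from the patch are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not kernel code. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *head);
};

struct task {
	struct callback_head *task_works;	/* pending work, NULL-terminated */
	struct callback_head nohz_full_work;	/* embedded in the task, as in the patch */
	int flushes;				/* how many times the callback ran */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Assumed convention: a work pointing at itself is not queued. */
static void init_work(struct callback_head *head,
		      void (*func)(struct callback_head *))
{
	head->next = head;
	head->func = func;
}

static int work_queued(const struct callback_head *head)
{
	return head->next != head;
}

/* Stand-in for the (empty) isolated_task_work() callback: count the run. */
static void isolated_work(struct callback_head *head)
{
	struct task *t = container_of(head, struct task, nohz_full_work);

	t->flushes++;
}

/* Queue the per-task work unless it is already pending. */
static int isolated_work_queue(struct task *t)
{
	if (work_queued(&t->nohz_full_work))
		return 0;

	t->nohz_full_work.next = t->task_works;
	t->task_works = &t->nohz_full_work;
	return 0;
}

/* Model of the return-to-userspace path: run and clear pending work. */
static void resume_to_user(struct task *t)
{
	struct callback_head *work = t->task_works;

	t->task_works = NULL;
	while (work) {
		struct callback_head *next = work->next;

		work->next = work;	/* mark as not queued before running */
		work->func(work);
		work = next;
	}
}

/* Several queue requests within one "syscall" collapse into one flush. */
static int isolated_work_demo(void)
{
	struct task t = { .task_works = NULL, .flushes = 0 };

	init_work(&t.nohz_full_work, isolated_work);

	isolated_work_queue(&t);
	isolated_work_queue(&t);
	isolated_work_queue(&t);
	resume_to_user(&t);
	assert(t.flushes == 1);

	isolated_work_queue(&t);	/* can be queued again after the flush */
	resume_to_user(&t);
	resume_to_user(&t);		/* nothing pending: no extra run */

	return t.flushes;
}
```

Calling isolated_work_demo() from a test program's main() exercises both the coalescing of repeated queue requests and the re-arming after a flush. Embedding the work in the task, as the patch does with nohz_full_work, is what makes the "already queued" check a simple pointer comparison with no allocation.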