Date: Wed, 8 Nov 2023 10:36:45 +0100
From: Peter Zijlstra
To: Ankur Arora
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
	torvalds@linux-foundation.org, paulmck@kernel.org, linux-mm@kvack.org,
	x86@kernel.org, akpm@linux-foundation.org, luto@kernel.org,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	mingo@redhat.com, juri.lelli@redhat.com, vincent.guittot@linaro.org,
	willy@infradead.org, mgorman@suse.de, jon.grimm@amd.com,
	bharata@amd.com, raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com, jgross@suse.com, andrew.cooper3@citrix.com,
	mingo@kernel.org, bristot@kernel.org, mathieu.desnoyers@efficios.com,
	geert@linux-m68k.org, glaubitz@physik.fu-berlin.de,
	anton.ivanov@cambridgegreys.com, mattst88@gmail.com,
	krypton@ulrich-teichert.org, rostedt@goodmis.org,
	David.Laight@aculab.com, richard@nod.at, mjguzik@gmail.com
Subject: Re: [RFC PATCH 41/86] sched: handle resched policy in resched_curr()
Message-ID: <20231108093645.GL8262@noisy.programming.kicks-ass.net>
References: <20231107215742.363031-1-ankur.a.arora@oracle.com>
	<20231107215742.363031-42-ankur.a.arora@oracle.com>
In-Reply-To: <20231107215742.363031-42-ankur.a.arora@oracle.com>

On Tue, Nov 07, 2023 at 01:57:27PM -0800, Ankur Arora wrote:

> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1027,13 +1027,13 @@ void wake_up_q(struct wake_q_head *head)
>  }
>
>  /*
> - * resched_curr - mark rq's current task 'to be rescheduled now'.
> + * __resched_curr - mark rq's current task 'to be rescheduled'.
>   *
> - * On UP this means the setting of the need_resched flag, on SMP it
> - * might also involve a cross-CPU call to trigger the scheduler on
> - * the target CPU.
> + * On UP this means the setting of the need_resched flag, on SMP, for
> + * eager resched it might also involve a cross-CPU call to trigger
> + * the scheduler on the target CPU.
>   */
> -void resched_curr(struct rq *rq)
> +void __resched_curr(struct rq *rq, resched_t rs)
>  {
>  	struct task_struct *curr = rq->curr;
>  	int cpu;
> @@ -1046,17 +1046,77 @@ void resched_curr(struct rq *rq)
>  	cpu = cpu_of(rq);
>
>  	if (cpu == smp_processor_id()) {
> -		set_tsk_need_resched(curr, RESCHED_eager);
> -		set_preempt_need_resched();
> +		set_tsk_need_resched(curr, rs);
> +		if (rs == RESCHED_eager)
> +			set_preempt_need_resched();
>  		return;
>  	}
>
> -	if (set_nr_and_not_polling(curr, RESCHED_eager))
> -		smp_send_reschedule(cpu);
> -	else
> +	if (set_nr_and_not_polling(curr, rs)) {
> +		if (rs == RESCHED_eager)
> +			smp_send_reschedule(cpu);

I think you just broke things. Not all idle threads have POLLING
support, in which case you need that IPI to wake them up, even if it's
LAZY.

> +	} else if (rs == RESCHED_eager)
>  		trace_sched_wake_idle_without_ipi(cpu);
>  }
>
> +/*
> + * resched_curr - mark rq's current task 'to be rescheduled' eagerly
> + * or lazily according to the current policy.
> + *
> + * Always schedule eagerly, if:
> + *
> + *  - running under full preemption
> + *
> + *  - idle: when not polling (or if we don't have TIF_POLLING_NRFLAG)
> + *    force TIF_NEED_RESCHED to be set and send a resched IPI.
> + *    (the polling case has already set TIF_NEED_RESCHED via
> + *     set_nr_if_polling()).
> + *
> + *  - in userspace: run to completion semantics are only for kernel tasks
> + *
> + * Otherwise (regardless of priority), run to completion.
> + */
> +void resched_curr(struct rq *rq)
> +{
> +	resched_t rs = RESCHED_lazy;
> +	int context;
> +
> +	if (IS_ENABLED(CONFIG_PREEMPT) ||
> +	    (rq->curr->sched_class == &idle_sched_class)) {
> +		rs = RESCHED_eager;
> +		goto resched;
> +	}
> +
> +	/*
> +	 * We might race with the target CPU while checking its ct_state:
> +	 *
> +	 * 1. The task might have just entered the kernel, but has not yet
> +	 *    called user_exit(). We will see stale state (CONTEXT_USER) and
> +	 *    send an unnecessary resched-IPI.
> +	 *
> +	 * 2. The user task is through with exit_to_user_mode_loop() but has
> +	 *    not yet called user_enter().
> +	 *
> +	 *    We'll see the thread's state as CONTEXT_KERNEL and will try to
> +	 *    schedule it lazily. There's obviously nothing that will handle
> +	 *    this need-resched bit until the thread enters the kernel next.
> +	 *
> +	 *    The scheduler will still do tick accounting, but a potentially
> +	 *    higher priority task waited to be scheduled for a user tick,
> +	 *    instead of execution time in the kernel.
> +	 */
> +	context = ct_state_cpu(cpu_of(rq));
> +	if ((context == CONTEXT_USER) ||
> +	    (context == CONTEXT_GUEST)) {
> +
> +		rs = RESCHED_eager;
> +		goto resched;
> +	}

Like said, this simply cannot be. You must not rely on the remote CPU
being in some state or not. Also, it's racy, you could observe USER and
then it enters KERNEL.

> +
> +resched:
> +	__resched_curr(rq, rs);
> +}
> +
>  void resched_cpu(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
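
To make the first objection concrete, here is a small userspace model of
the remote-CPU branch of __resched_curr(). It is only a sketch: the
MODEL_* flag bits, enum model_resched and all three functions are
hypothetical names invented for illustration, C11 atomics stand in for
the kernel's fetch_or() on the thread flags, and review_sends_ipi() is
just one reading of the review comment, not a proposed fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model; not the kernel's TIF_* values. */
#define MODEL_NEED_RESCHED      (1u << 0)
#define MODEL_NEED_RESCHED_LAZY (1u << 1)
#define MODEL_POLLING           (1u << 2)

enum model_resched { MODEL_LAZY, MODEL_EAGER };

/*
 * Atomically set the requested resched bit in the target's flags.
 * Returns true when the target was NOT polling, i.e. when it will not
 * notice the new bit unless something else interrupts it.
 */
static bool model_set_nr_and_not_polling(atomic_uint *flags,
					 enum model_resched rs)
{
	unsigned int bit = (rs == MODEL_EAGER) ? MODEL_NEED_RESCHED
					       : MODEL_NEED_RESCHED_LAZY;
	unsigned int old;

	assert(flags != NULL);
	old = atomic_fetch_or(flags, bit);
	return !(old & MODEL_POLLING);
}

/* Remote-CPU decision as in the quoted hunk: IPI only for eager requests. */
static bool patch_sends_ipi(atomic_uint *flags, enum model_resched rs)
{
	if (model_set_nr_and_not_polling(flags, rs))
		return rs == MODEL_EAGER;
	return false;
}

/* One reading of the review: a non-polling target always gets the IPI. */
static bool review_sends_ipi(atomic_uint *flags, enum model_resched rs)
{
	return model_set_nr_and_not_polling(flags, rs);
}
```

With the flags word at 0 (a non-polling target, such as an idle thread
without POLLING support), patch_sends_ipi(&flags, MODEL_LAZY) sets the
lazy bit and returns false, so in this model nothing wakes the target.
review_sends_ipi() returns true for the same input, and false only when
MODEL_POLLING was already set.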