Date: Wed, 8 Nov 2023 10:56:00 +0100
From: Peter Zijlstra <peterz@infradead.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
	torvalds@linux-foundation.org, paulmck@kernel.org, linux-mm@kvack.org,
	x86@kernel.org, akpm@linux-foundation.org, luto@kernel.org,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	mingo@redhat.com, juri.lelli@redhat.com, vincent.guittot@linaro.org,
	willy@infradead.org, mgorman@suse.de, jon.grimm@amd.com,
	bharata@amd.com, raghavendra.kt@amd.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com, jgross@suse.com, andrew.cooper3@citrix.com,
	mingo@kernel.org, bristot@kernel.org, mathieu.desnoyers@efficios.com,
	geert@linux-m68k.org, glaubitz@physik.fu-berlin.de,
	anton.ivanov@cambridgegreys.com, mattst88@gmail.com,
	krypton@ulrich-teichert.org, rostedt@goodmis.org,
	David.Laight@aculab.com, richard@nod.at, mjguzik@gmail.com
Subject: Re: [RFC PATCH 42/86] sched: force preemption on tick expiration
Message-ID: <20231108095600.GM8262@noisy.programming.kicks-ass.net>
References: <20231107215742.363031-1-ankur.a.arora@oracle.com>
 <20231107215742.363031-43-ankur.a.arora@oracle.com>
In-Reply-To: <20231107215742.363031-43-ankur.a.arora@oracle.com>

On Tue, Nov 07, 2023 at 01:57:28PM -0800, Ankur Arora wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4d86c618ffa2..fe7e5e9b2207 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1016,8 +1016,11 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);
>   * XXX: strictly: vd_i += N*r_i/w_i such that: vd_i > ve_i
>   * this is probably good enough.
>   */
> -static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +static void update_deadline(struct cfs_rq *cfs_rq,
> +			    struct sched_entity *se, bool tick)
>  {
> +	struct rq *rq = rq_of(cfs_rq);
> +
>  	if ((s64)(se->vruntime - se->deadline) < 0)
>  		return;
> 
> @@ -1033,13 +1036,19 @@ static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  	 */
>  	se->deadline = se->vruntime + calc_delta_fair(se->slice, se);
> 
> +	if (cfs_rq->nr_running < 2)
> +		return;
> +
>  	/*
> -	 * The task has consumed its request, reschedule.
> +	 * The task has consumed its request, reschedule; eagerly
> +	 * if it ignored our last lazy reschedule.
>  	 */
> -	if (cfs_rq->nr_running > 1) {
> -		resched_curr(rq_of(cfs_rq));
> -		clear_buddies(cfs_rq, se);
> -	}
> +	if (tick && test_tsk_thread_flag(rq->curr, TIF_NEED_RESCHED_LAZY))
> +		__resched_curr(rq, RESCHED_eager);
> +	else
> +		resched_curr(rq);
> +
> +	clear_buddies(cfs_rq, se);
>  }
> 
>  #include "pelt.h"
> @@ -1147,7 +1156,7 @@ static void update_tg_load_avg(struct cfs_rq *cfs_rq)
>  /*
>   * Update the current task's runtime statistics.
>   */
> -static void update_curr(struct cfs_rq *cfs_rq)
> +static void __update_curr(struct cfs_rq *cfs_rq, bool tick)
>  {
>  	struct sched_entity *curr = cfs_rq->curr;
>  	u64 now = rq_clock_task(rq_of(cfs_rq));
> @@ -1174,7 +1183,7 @@ static void update_curr(struct cfs_rq *cfs_rq)
>  	schedstat_add(cfs_rq->exec_clock, delta_exec);
> 
>  	curr->vruntime += calc_delta_fair(delta_exec, curr);
> -	update_deadline(cfs_rq, curr);
> +	update_deadline(cfs_rq, curr, tick);
>  	update_min_vruntime(cfs_rq);
> 
>  	if (entity_is_task(curr)) {
> @@ -1188,6 +1197,11 @@ static void update_curr(struct cfs_rq *cfs_rq)
>  	account_cfs_rq_runtime(cfs_rq, delta_exec);
>  }
> 
> +static void update_curr(struct cfs_rq *cfs_rq)
> +{
> +	__update_curr(cfs_rq, false);
> +}
> +
>  static void update_curr_fair(struct rq *rq)
>  {
>  	update_curr(cfs_rq_of(&rq->curr->se));
> @@ -5309,7 +5323,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>  	/*
>  	 * Update run-time statistics of the 'current'.
>  	 */
> -	update_curr(cfs_rq);
> +	__update_curr(cfs_rq, true);
> 
>  	/*
>  	 * Ensure that runnable average is periodically updated.

I'm thinking this will be less of a mess if you flip it around some.

(Ignore the hrtick mess, I'll try and get that cleaned up.)

This way you have two distinct sites to handle the preemption: the
update_curr() one would be 'FULL ? force : lazy', while the tick one
gets the special magic bits.

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df348aa55d3c..5399696de9e0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1016,10 +1016,10 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);
  * XXX: strictly: vd_i += N*r_i/w_i such that: vd_i > ve_i
  * this is probably good enough.
  */
-static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
+static bool update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	if ((s64)(se->vruntime - se->deadline) < 0)
-		return;
+		return false;
 
 	/*
 	 * For EEVDF the virtual time slope is determined by w_i (iow.
@@ -1037,9 +1037,11 @@ static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 * The task has consumed its request, reschedule.
 	 */
 	if (cfs_rq->nr_running > 1) {
-		resched_curr(rq_of(cfs_rq));
 		clear_buddies(cfs_rq, se);
+		return true;
 	}
+
+	return false;
 }
 
 #include "pelt.h"
@@ -1147,18 +1149,19 @@ static void update_tg_load_avg(struct cfs_rq *cfs_rq)
 /*
  * Update the current task's runtime statistics.
  */
-static void update_curr(struct cfs_rq *cfs_rq)
+static bool __update_curr(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *curr = cfs_rq->curr;
 	u64 now = rq_clock_task(rq_of(cfs_rq));
 	u64 delta_exec;
+	bool ret;
 
 	if (unlikely(!curr))
-		return;
+		return false;
 
 	delta_exec = now - curr->exec_start;
 	if (unlikely((s64)delta_exec <= 0))
-		return;
+		return false;
 
 	curr->exec_start = now;
 
@@ -1174,7 +1177,7 @@ static void update_curr(struct cfs_rq *cfs_rq)
 	schedstat_add(cfs_rq->exec_clock, delta_exec);
 
 	curr->vruntime += calc_delta_fair(delta_exec, curr);
-	update_deadline(cfs_rq, curr);
+	ret = update_deadline(cfs_rq, curr);
 	update_min_vruntime(cfs_rq);
 
 	if (entity_is_task(curr)) {
@@ -1186,6 +1189,14 @@ static void update_curr(struct cfs_rq *cfs_rq)
 	}
 
 	account_cfs_rq_runtime(cfs_rq, delta_exec);
+
+	return ret;
+}
+
+static void update_curr(struct cfs_rq *cfs_rq)
+{
+	if (__update_curr(cfs_rq))
+		resched_curr(rq_of(cfs_rq));
 }
 
 static void update_curr_fair(struct rq *rq)
@@ -5309,7 +5320,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
 	/*
 	 * Update run-time statistics of the 'current'.
 	 */
-	update_curr(cfs_rq);
+	bool resched = __update_curr(cfs_rq);
 
 	/*
 	 * Ensure that runnable average is periodically updated.
@@ -5317,22 +5328,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
 	update_load_avg(cfs_rq, curr, UPDATE_TG);
 	update_cfs_group(curr);
 
-#ifdef CONFIG_SCHED_HRTICK
-	/*
-	 * queued ticks are scheduled to match the slice, so don't bother
-	 * validating it and just reschedule.
-	 */
-	if (queued) {
-		resched_curr(rq_of(cfs_rq));
-		return;
-	}
-	/*
-	 * don't let the period tick interfere with the hrtick preemption
-	 */
-	if (!sched_feat(DOUBLE_TICK) &&
-	    hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
-		return;
-#endif
+	return resched;
 }
@@ -12387,12 +12383,16 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
 {
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &curr->se;
+	bool resched = false;
 
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
-		entity_tick(cfs_rq, se, queued);
+		resched |= entity_tick(cfs_rq, se, queued);
 	}
 
+	if (resched)
+		resched_curr(rq);
+
 	if (static_branch_unlikely(&sched_numa_balancing))
 		task_tick_numa(rq, curr);
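
FWIW, the "special magic bits" on the tick side boil down to a two-stage
escalation: the first slice expiry only sets TIF_NEED_RESCHED_LAZY, and a
later tick that finds the lazy bit still set upgrades to a real resched.
A user-space toy model of just that decision logic (not kernel code; the
RESCHED_* names and tick_resched() helper merely mirror the naming in the
series):

/* toy.c -- model of lazy -> eager tick escalation */
#include <stdbool.h>
#include <stdio.h>

enum resched_kind { RESCHED_none, RESCHED_lazy, RESCHED_eager };

/* stand-ins for TIF_NEED_RESCHED_LAZY / TIF_NEED_RESCHED */
struct task {
	bool need_resched_lazy;
	bool need_resched;
};

/* Decide what a tick should do once the task's slice has expired. */
static enum resched_kind tick_resched(struct task *t)
{
	if (t->need_resched)
		return RESCHED_none;	/* a reschedule is already pending */

	if (t->need_resched_lazy) {
		t->need_resched = true;	/* lazy hint was ignored: escalate */
		return RESCHED_eager;
	}

	t->need_resched_lazy = true;	/* first expiry: ask politely */
	return RESCHED_lazy;
}

int main(void)
{
	struct task t = { 0 };

	printf("tick 1 -> %d (expect lazy = 1)\n", tick_resched(&t));
	printf("tick 2 -> %d (expect eager = 2)\n", tick_resched(&t));
	printf("tick 3 -> %d (expect none = 0)\n", tick_resched(&t));
	return 0;
}

Built with 'cc -o toy toy.c', successive ticks print lazy, eager, none:
exactly the escalation order intended, with the non-tick update_curr()
site free to pick 'FULL ? force : lazy' independently.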