From: Mathieu Desnoyers
To: Steven Rostedt, Peter Zijlstra
Cc: LKML, Thomas Gleixner, Ankur Arora, Linus Torvalds,
    linux-mm@kvack.org, x86@kernel.org, akpm@linux-foundation.org,
    luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com,
    hpa@zytor.com, mingo@redhat.com, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, willy@infradead.org, mgorman@suse.de,
    jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
    boris.ostrovsky@oracle.com, konrad.wilk@oracle.com, jgross@suse.com,
    andrew.cooper3@citrix.com, Joel Fernandes, Youssef Esmat,
    Vineeth Pillai, Suleiman Souhlal, Ingo Molnar,
    Daniel Bristot de Oliveira
Subject: Re: [POC][RFC][PATCH] sched: Extended Scheduler Time Slice
Date: Wed, 25 Oct 2023 11:42:34 -0400
Message-ID: <884e4603-4d29-41ae-8715-a070c43482c4@efficios.com>
In-Reply-To: <20231025103105.5ec64b89@gandalf.local.home>
References: <20231025054219.1acaa3dd@gandalf.local.home>
    <20231025102952.GG37471@noisy.programming.kicks-ass.net>
    <20231025085434.35d5f9e0@gandalf.local.home>
    <20231025135545.GG31201@noisy.programming.kicks-ass.net>
    <20231025103105.5ec64b89@gandalf.local.home>

On 2023-10-25 10:31, Steven Rostedt wrote:
> On Wed, 25 Oct 2023 15:55:45 +0200
> Peter Zijlstra wrote:

[...]

After digging through lore for context, here are some thoughts about the
actual proposal: AFAIU the intent here is to boost the scheduling slice
for a userspace thread running with a mutex held so it can complete
faster, and therefore reduce contention.

I suspect this is not completely unrelated to priority inheritance
futexes, except that one goal stated by Steven is to increase the owner
slice without requiring a system call on the fast path.

Compared to PI futexes, I think Steven's proposal misses the part where
a thread waiting on a futex boosts the lock owner's priority so it can
complete faster. By making the lock owner selfishly claim that it needs
a larger scheduling slice, it opens the door to scheduler disruption,
and it's hard to come up with upper bounds that work for all cases.

Hopefully I'm not oversimplifying if I state that we have mainly two
actors to consider:

[A] the lock owner thread

[B] threads that block trying to acquire the lock

The fast path here is [A]. [B] can go through a system call, I don't
think it matters at all.

So perhaps we can extend the rseq per-thread area with a field that
implements a "held locks" list that allows [A] to let the kernel know
that it is currently holding a set of locks (those can be chained when
locks are nested).
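
A minimal sketch of what the userspace side of such a "held locks" list
could look like follows. Everything in it is hypothetical and only for
illustration: struct held_lock, the thread-local held_locks_head (a
stand-in for the proposed rseq field, which does not exist in the
current rseq ABI), and the helper names. It also assumes that nested
locks are released in reverse acquisition order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node, one per held lock (it could be embedded in the
 * owner's stack frame or in per-thread lock state). */
struct held_lock {
	uintptr_t lock_addr;      /* lock address, usable as a key */
	struct held_lock *next;   /* previously acquired (outer) lock */
};

/* Hypothetical stand-in for the proposed rseq field. A real version would
 * live in the per-thread rseq area so that the kernel can read it. */
static _Thread_local struct held_lock *held_locks_head;

/* Call after acquiring @lock: two stores to fill the node, one to publish. */
static inline void held_locks_push(struct held_lock *node, void *lock)
{
	node->lock_addr = (uintptr_t)lock;
	node->next = held_locks_head;
	/* Compiler barrier: fill in the node before publishing it. The reader
	 * is assumed to run in the context of this same thread. */
	atomic_signal_fence(memory_order_seq_cst);
	held_locks_head = node;
}

/* Call when releasing the lock: a single store. Assumes LIFO unlock order. */
static inline void held_locks_pop(struct held_lock *node)
{
	assert(held_locks_head == node);
	held_locks_head = node->next;
}

/* What a reader of the list would do: walk the chain of held locks. */
static inline size_t held_locks_count(void)
{
	size_t n = 0;

	for (const struct held_lock *p = held_locks_head; p; p = p->next)
		n++;
	return n;
}
```

Chaining through the next pointer is what lets nested locks show up as
one list, and neither helper issues a system call.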
The list would be updated on lock/unlock with just a few stores in
userspace.

Those lock addresses could then be used as keys for private locks, or
transformed into inode/offset keys for shared-memory locks. Threads [B]
blocking trying to acquire the lock can call a system call which would
boost the lock owner's slice and/or priority for a given lock key.

When the scheduler preempts [A], it would check whether the rseq
per-thread area has a "held locks" field set and use this information to
find the slice/priority boosts which are currently active for each lock,
and use this information to boost the task slice/priority accordingly.

A scheme like this should allow lock priority inheritance without
requiring system calls on the userspace lock/unlock fast path.

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com