Date: Sat, 1 Feb 2025 18:06:17 -0500
From: Steven Rostedt
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 Thomas Gleixner, Ankur Arora, Linus Torvalds, linux-mm@kvack.org,
 x86@kernel.org, akpm@linux-foundation.org, luto@kernel.org, bp@alien8.de,
 dave.hansen@linux.intel.com, hpa@zytor.com, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, willy@infradead.org, mgorman@suse.de,
 jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
 boris.ostrovsky@oracle.com, konrad.wilk@oracle.com, jgross@suse.com,
 andrew.cooper3@citrix.com, Joel Fernandes, Vineeth Pillai,
 Suleiman Souhlal, Ingo Molnar, Mathieu Desnoyers, Clark Williams,
 bigeasy@linutronix.de, daniel.wagner@suse.com,
 joseph.salisbury@oracle.com, broonie@gmail.com
Subject: Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice
Message-ID: <20250201180617.491ce087@batman.local.home>
In-Reply-To: <20250201181129.GA34937@noisy.programming.kicks-ass.net>
References: <20250131225837.972218232@goodmis.org>
 <20250131225942.365475324@goodmis.org>
 <20250201115906.GB8256@noisy.programming.kicks-ass.net>
 <20250201181129.GA34937@noisy.programming.kicks-ass.net>

On Sat, 1 Feb 2025 19:11:29 +0100
Peter Zijlstra wrote:

> On Sat, Feb 01, 2025 at 07:47:32AM -0500, Steven Rostedt wrote:
> >
> >
> > On February 1, 2025 6:59:06 AM EST, Peter Zijlstra wrote:
> >
> > >I still have full hate for this approach.
> >
> > So what approach would you prefer?
>
> The one that does not rely on the preemption method -- I think I posted
> something along those lines, and someone else recently reposted something
> based on it.
>
> Tying things to the preemption method is absurdly bad design -- and I've
> told you that before.

How exactly is it "bad design"? Changing the preemption method itself
changes the way applications schedule and can be very noticeable to the
applications themselves.

With no preemption, applications will see high latency every time any
application does a system call. Voluntary preemption is a little more
reactive, but more random in when it preempts.

The preempt lazy kconfig has:

  This option provides a scheduler driven preemption model that
  is fundamentally similar to full preemption, but is less
  eager to preempt SCHED_NORMAL tasks in an attempt to
  reduce lock holder preemption and recover some of the performance
  gains seen from using Voluntary preemption.

This could be a config option called PREEMPT_USER_LAZY that extends the
"reduce lock holder preemption" idea to user space spin locks.

But if your issue is with relying on the preemption method, does that mean
you prefer to have this feature for any preemption method? That may still
require using the LAZY flag, which can cause a schedule in the kernel but
not in user space?

Note, my group is actually more interested in implementing this for VMs.
But that requires another level of indirection of the pointers. That is,
qemu could create a device that shares memory between the guest kernel and
the qemu VCPU thread.
The guest kernel could update the counter in this shared memory before
grabbing a raw_spin_lock, which would act like this patch set does. The
difference would be that the counter would need to live in a memory page
that only has this information in it, and not in the rseq structure itself.
Mathieu was concerned about leaks and corruption in the rseq structure by
a malicious guest. Thus, the counter would have to be in a clean memory
page that is shared between the guest and the qemu thread. The rseq would
then have a pointer to this memory, and the host kernel would then have to
traverse that pointer to the location of the counter.

In other words, my real goal is to have this working for guests on their
raw_spin_locks. We first tried to do this in KVM directly, but the KVM
maintainers said this is more of a generic scheduling issue and doesn't
belong in KVM. I agreed with them.

-- Steve
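
To make the indirection described above concrete, here is a rough
user-space sketch of the idea, not the actual patch set or any existing
kernel/KVM interface. Every name in it (struct ext_page, struct vcpu_rseq,
guest_cs_enter(), guest_cs_exit(), host_should_extend()) is hypothetical.
It assumes the counter sits alone in its own shared page, that the
rseq-like area holds only a pointer to that page, and that the host treats
a non-zero counter as "a lock is held, consider extending the time slice".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared page: holds only the counter, no other rseq data. */
struct ext_page {
	_Atomic uint32_t cs_count;	/* > 0 while the guest holds a lock */
};

/* Hypothetical stand-in for the rseq area of the qemu VCPU thread. */
struct vcpu_rseq {
	struct ext_page *ext;		/* pointer the host must traverse */
};

/* Guest side: bump the counter before grabbing a raw_spin_lock. */
static inline void guest_cs_enter(struct ext_page *page)
{
	atomic_fetch_add(&page->cs_count, 1);
}

/* Guest side: drop the counter after releasing the lock. */
static inline void guest_cs_exit(struct ext_page *page)
{
	uint32_t old = atomic_fetch_sub(&page->cs_count, 1);

	assert(old > 0);	/* an exit must be paired with an enter */
}

/*
 * Host side: follow the pointer from the rseq-like area to the counter
 * and report whether the guest says it is inside a critical section.
 */
static inline bool host_should_extend(const struct vcpu_rseq *rs)
{
	if (!rs || !rs->ext)
		return false;

	return atomic_load(&rs->ext->cs_count) != 0;
}
```

In this sketch the host never reads anything from the guest other than the
single counter, which is the point of keeping it in its own page. A real
implementation would also need to bound how long an extension can last,
which is not modeled here.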