From: Thomas Gleixner
To: "Christoph Lameter (Ampere)"
Cc: Christoph Lameter via B4 Relay, Anna-Maria Behnsen, Frederic Weisbecker,
    Ingo Molnar, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    sh@gentwo.org, Darren Hart, Arjan van de Ven
Subject: Re: [PATCH] Skew tick for systems with a large number of processors
References: <87sejew87r.ffs@tglx>
Date: Thu, 03 Jul 2025 11:12:21 +0200
Message-ID: <87ms9lwscq.ffs@tglx>

On Wed, Jul 02 2025 at 17:25, Christoph Lameter wrote:
> On Thu, 3 Jul 2025, Thomas Gleixner wrote:
>
>> The above aside. As you completely failed to provide at least the
>> minimal historical background in the change log, let me fill in the
>> blanks.
>>
>> commit 3704540b4829 ("tick management: spread timer interrupt") added the
>> skew unconditionally in 2007 to avoid lock contention on xtime lock.
>
> Right but that was only one reason why the timer interrupts where
> staggered.

It was the main reason because all CPUs contended on xtime lock and
other global locks. The subsequent issues you describe were not
observable back then to the extent they are today for bloody obvious
reasons.

>> commit af5ab277ded0 ("clockevents: Remove the per cpu tick skew")
>> removed it in 2010 because the xtime lock contention was gone and the
>> skew affected the power consumption of slightly loaded _large_ servers.
>
> But then the tick also executes other code that can cause contention. Why
> merge such an obvious problematic patch without considering the reasons
> for the 2007 patch?

As I said above, the main reason was contention on xtime lock and some
other global locks. These contention issues had been resolved over time,
so the initial reason to have the skew was gone.

The power consumption issue was a valid reason to remove it and the
testing back then did not show any negative side effects. The
subsequently discovered issues were not observable and some of them
got introduced by later code changes.

Obviously the patch is problematic in hindsight, but hindsight is always
20/20.

>> commit 5307c9556bc1 ("tick: Add tick skew boot option") brought it back
>> with a command line option to address contention and jitter issues on
>> larger systems.
>
> And then issues resulted because the scaling issues where not
> considered when merging the 2010 patch.

What are you trying to do here? Playing a blame game is not helping to
find a solution.

>> So while you preserved the behaviour of the command line option in the
>> most obscure way, you did not even make an attempt to explain why this
>> change does not bring back the issues which caused the removal in commit
>> af5ab277ded0 or why they are irrelevant today.
>
> As pointed out in the patch description: The synchronized tick (aside from
> the jitter) also causes power spikes on large core systems which can cause
> system instabilities.

That's a _NEW_ problem and has nothing to do with the power saving
concerns which led to af5ab277ded0.

>> "Scratches my itch" does not work and you know that. This needs to be
>> consolidated both on the implementation side and also on the user
>> side.
>
> We can get to that but I at least need some direction on how to approach
> this and figure out the concerns that exist. Frankly my initial idea was
> just to remove the buggy patches since this caused a regression in
> performance and system stability but I guess there were power savings
> concerns.

Guessing is not a valid engineering approach, as you might know already.

It's not rocket science to validate whether these power saving concerns
still apply and to reach out to people who have been involved in this
and ask them to revalidate. I just Cc'ed Arjan for you.

> How can we address this issue in a better way then?

By analysing the consequences of flipping the default for skew_tick to
on. That can be evaluated upfront trivially, without a single line of
code change, by adding 'skew_tick=1' to the kernel command line, running
tests and asking others to help evaluate.
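
For anyone picking up that test: a minimal sketch, assuming a Linux
system that exposes the booted command line in /proc/cmdline, of
confirming that 'skew_tick=1' actually took effect before collecting
numbers. The helper name skew_tick_enabled() is hypothetical, and the
"last occurrence wins, only the value 1 counts as on" handling is an
assumption for illustration, not a statement about the kernel's parser.

```python
from pathlib import Path


def skew_tick_enabled(cmdline: str) -> bool:
    """Return True if the last skew_tick= token on the command line is '1'.

    Assumption: when the option is repeated, the last occurrence is the
    one that counts, and only the value '1' is treated as enabled.
    """
    enabled = False
    for token in cmdline.split():
        if token.startswith("skew_tick="):
            enabled = token.split("=", 1)[1] == "1"
    return enabled


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        state = "on" if skew_tick_enabled(path.read_text()) else "off"
        print(f"skew_tick is {state} on this boot")
    else:
        print("/proc/cmdline not available on this system")
```

Recording that state next to every power or jitter measurement keeps
results from different machines comparable.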

There is only a limited range of scenarios that need to be looked at:

    - Big servers and the power saving issues on lightly loaded machines

    - Battery operated devices

    - Virtualization (guests)

That might not cover 100% of the use cases, but should be good enough
coverage to base an informed decision on.

> The kernel should not come up all wobbly and causing power spikes
> every tick.

The kernel should not do a lot of things, but does them due to
historical decisions, which turn out to be suboptimal when technology
advances. The power spike problem simply did not exist 18 years ago, at
least not to the extent that it mattered or caused concerns.

If we could have predicted the future and the consequences of ad hoc
decisions, we wouldn't have had a BKL, which took only 20 years of
effort to get rid of (except for the well hidden leftovers in tty).

But what we learned from the past is to avoid hacky ad hoc workarounds,
which are guaranteed to just make the situation worse.

Thanks,

        tglx