Date: Wed, 29 Mar 2023 13:38:48 -0700 (PDT)
From: Hugh Dickins
To: Tejun Heo
Cc: Hugh Dickins, Yosry Ahmed, Shakeel Butt, Josef Bacik, Jens Axboe,
    Zefan Li, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Andrew Morton, Vasily Averin, cgroups@vger.kernel.org,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, bpf@vger.kernel.org
Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock
Message-ID: <98cb3ce-7ed9-3d17-9015-ef7193d6627@google.com>

On Wed, 29 Mar 2023, Tejun Heo wrote:
> Hello, Hugh. How have you been?
>
> On Wed, Mar 29, 2023 at 12:22:24PM -0700, Hugh Dickins wrote:
> > Hi Tejun,
> > Butting in here, I'm fascinated. This is certainly not my area, I know
> > nothing about rstat, but this is the first time I ever heard someone
> > arguing for more disabling of interrupts rather than less.
> >
> > An interrupt coming in while holding a contended resource can certainly
> > add to latencies, that I accept of course. But until now, I thought it
> > was agreed best practice to disable irqs only regretfully, when strictly
> > necessary.
> >
> > If that has changed, I for one want to know about it. How should we
> > now judge which spinlocks should disable interrupts and which should not?
> > Page table locks are currently my main interest - should those be changed?
>
> For rstat, it's a simple case because the global lock here wraps around
> per-cpu locks which have to be irq-safe, so the only difference we get
> between making the global irq-unsafe and keeping it so but releasing
> in between is:
>
>   Global lock held: G
>   IRQ disabled: I
>   Percpu lock held: P
>
> 1. IRQ unsafe
>
>   GGGGGGGGGGGGGGG~~GGGGG
>   IIII IIII IIII ~~ IIII
>   PPPP PPPP PPPP ~~ PPPP
>
> 2. IRQ safe released in between cpus
>
>   GGGG GGGG GGGG ~~ GGGG
>   IIII IIII IIII ~~ IIII
>   PPPP PPPP PPPP ~~ PPPP
>
> #2 seems like the obvious thing to do here given how the lock is used and
> each P section may take a bit of time.

Many thanks for the detailed response. I'll leave it to the rstat folks,
to agree or disagree with your analysis there.

>
> So, in the rstat case, the choice is, at least to me, obvious, but even for
> more generic cases where the bulk of actual work isn't done w/ irq disabled,
> I don't think the picture is as simple as "use the least protected variant
> possible" anymore because the underlying hardware changed.
>
> For an SMP kernel running on an UP system, "the least protected variant" is
> the obvious choice to make because you don't lose anything by holding a
> spinlock longer than necessary. However, as you increase the number of CPUs,
> there rises a tradeoff between local irq servicing latency and global lock
> contention.
>
> Imagine a, say, 128 cpu system with a few cores servicing relatively high
> frequency interrupts. Let's say there's a mildly hot lock.
> Usually, it shows up in the system profile but only just. Let's say
> something happens and the irq rate on those cores went up for some reason
> to the point where it becomes a rather common occurrence when the lock is
> held on one of those cpus, irqs are likely to intervene lengthening how
> long the lock is held, sometimes, significantly. Now because the lock is on
> average held for much longer, it becomes a lot hotter as more CPUs would
> stall on it and depending on luck or lack thereof these stalls can span
> many CPUs on the system for quite a while. This is actually something we
> saw in production.
>
> So, in general, there's a trade off between local irq service latency and
> inducing global lock contention when using unprotected locks. With more and
> more CPUs, the balance keeps shifting. The balance still very much depends
> on the specifics of a given lock but yeah I think it's something we need to
> be a lot more careful about now.

And this looks a very plausible argument to me: I'll let it sink in.

But I hadn't heard that the RT folks were clamouring for more irq disabling:
perhaps they partition their machines with more care, and are not devotees
of high CPU counts.

What I hope is that others will chime in one way or the other - it does
sound as if a reappraisal of the balances is overdue.

Thanks,
Hugh (disabling interrupts for as long as he can)
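
For readers who want numbers to go with the quoted G/I/P timelines, here is
a toy model of the two schemes. It is only a sketch under stated
assumptions: time units are arbitrary, and the function names and
parameters (percpu_time, gap_time, irq_time_in_gaps) are hypothetical, not
taken from the rstat code. It compares the longest uninterrupted hold of
the global lock with the longest irq-off stretch in each scheme.

```python
from dataclasses import dataclass


@dataclass
class FlushCost:
    longest_global_hold: float  # longest stretch with G held
    longest_irq_off: float      # longest stretch with irqs disabled


def irq_unsafe_global(ncpus, percpu_time, gap_time, irq_time_in_gaps=0.0):
    """Scheme 1: G is held across every CPU; irqs are off only inside P.

    Hypothetical model: any interrupt time that lands in the gaps between
    P sections is added to how long G stays held.
    """
    hold = ncpus * percpu_time + (ncpus - 1) * gap_time + irq_time_in_gaps
    return FlushCost(longest_global_hold=hold, longest_irq_off=percpu_time)


def irq_safe_released_between(percpu_time):
    """Scheme 2: G is irq-safe and dropped between CPUs.

    G, I and P then cover the same interval, one P section at a time.
    """
    return FlushCost(longest_global_hold=percpu_time,
                     longest_irq_off=percpu_time)


if __name__ == "__main__":
    # Assumed figures for illustration only: 128 CPUs, P = 4, gap = 1.
    print(irq_unsafe_global(128, 4.0, 1.0))
    print(irq_safe_released_between(4.0))
```

In this model the irq-off window is the same in both schemes, which matches
the point made in the quoted text: the only thing that changes is how long
the global lock stays held in one stretch.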