From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6665C0218F for ; Sun, 2 Feb 2025 20:44:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E35676B007B; Sun, 2 Feb 2025 15:44:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DE5556B0083; Sun, 2 Feb 2025 15:44:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CACF36B0085; Sun, 2 Feb 2025 15:44:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id AE52B6B007B for ; Sun, 2 Feb 2025 15:44:57 -0500 (EST) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 0BCDB1221AA for ; Sun, 2 Feb 2025 20:44:57 +0000 (UTC) X-FDA: 83076183834.08.8BA1BE3 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) by imf29.hostedemail.com (Postfix) with ESMTP id 104ED120008 for ; Sun, 2 Feb 2025 20:44:54 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Fp+odXMF; spf=pass (imf29.hostedemail.com: domain of david.laight.linux@gmail.com designates 209.85.221.52 as permitted sender) smtp.mailfrom=david.laight.linux@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738529095; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=cbSx/ISaCz+L+0YMvLHQK84mnVS9XArcMwXBbJKLocA=; b=SddFJt55hKGjamm81FLqVHeMI8SUhSIvV7Qj6O5N+RK9M/sewiliJt6dNHbllVWxejzjOQ +3OCHYXEbo3XBcCiULHc2y9LTbo59gl1rvBWqpZJ7oYtakTzc+CNby9FS9K2xNeKeNHvWR mklfm9y6688Z4eH0Gh+75nPeho5H1/A= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738529095; a=rsa-sha256; cv=none; b=jINb2i4O48ll+LCevQzSQ/cOcPmfYt2n5ujb8kIJkCoRlZsgk9fFwlbWZaqbyOi3/zQ6qw 66bnkxt9O34LvGG9Sg15OuS88sUBjDU6GU9geUalLEPEVAoH9ML2GOts9QDeNgL3LMn0q6 5qeOhOmCjSU9tcCmvHdYIqvmGL/3TaU= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=Fp+odXMF; spf=pass (imf29.hostedemail.com: domain of david.laight.linux@gmail.com designates 209.85.221.52 as permitted sender) smtp.mailfrom=david.laight.linux@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-385e1fcb0e1so1841767f8f.2 for ; Sun, 02 Feb 2025 12:44:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1738529093; x=1739133893; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=cbSx/ISaCz+L+0YMvLHQK84mnVS9XArcMwXBbJKLocA=; b=Fp+odXMFxTjUCw5bsA05F9j76asBrQF5RTE7oAl3idwOz3FnZAol5UBdscu3H4T8UN 77aIFbvVyk22ga91rAaJXliklY/M003eykMlh2VC4oGuKEsMS/ZnfrbGH89hPVSNrhQW Nk4Q2GgGIoFwKk6grIXxOeU1iJHms572lHA9fp5b7Z5sinv4JXne/P8vU2v7T+BivRfL 7Y+KFjnk51Oh+i7LXYq3dfUDqeGPeaS/efjUqA7KDd6b9v8E/x+FvfPC+FxD2c0RgWhV b/FFr0zjGJA4z2Cj2mc6atAZspcXQR/CNcfVbbytKWe0BkyIWuNeyUKK2vRRaH5TlPR1 kQHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738529093; x=1739133893; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cbSx/ISaCz+L+0YMvLHQK84mnVS9XArcMwXBbJKLocA=; b=mDbFSLkXT213sWa5vaiqfiqUZgFRQrgF2rsb+ZX/ACQhXqX3C2TrdXo8/wGrRdqMdR 2PtUaQFEI3B+Gdl3NNe0tCaRPSriTih/Hd0Qlu44w1ASHvQyGNL9GRJlrTOXm2QU7IeO AnNHykL+ldgP2PS+oGttNn5ekaGYPT19eAB47GH9mQmfEUmRSVqypjdREDwtlMsJWD0w 2/AIRad32yl7oKaBJH1InBsPX4VCQZfe6I73XX1211EtLGIVzIUCD+NqWVA89DR+tHs1 fYpsXJkjqSmD2QA1rgArjfAk8v7+HA+69h7nHCK3sSTjkI8usGAX7ADG4nDcZnqZuMLv rO4A== X-Forwarded-Encrypted: i=1; AJvYcCXBXPS4mYDX958OyfOj/5g1EXFEnSy2+PNYFawk9xSCQKn0oiCFhC2V8Dys2l8KOjMa9nzzGkoX8g==@kvack.org X-Gm-Message-State: AOJu0YwcjEmdoCBfrBzZWnhREuwKc7cAMRf1eaHKYWgGtIpAJ/l9wiCU zfknn5OYTRKV/wCYbVwkfQtBsKg1Kd4P0tXIW8+5R63xYX0q4TXq X-Gm-Gg: ASbGnctQ0Y4ErHxmlck6XyItuMxDRaDZt6CXOQ1Y9GS0hfCqdhIlSeYf1qLm8J+1kWn xd8909fXntrgRi8z8HlpON9w5JaWeHtKPejaJ37R6WaHiOp9Qfl76Nb3+Tk2xHhPN8jCi9T7xGI +nbM/r1D1b2mwZ5mwkXssWkoFg5w36RemKEFj2POJxViTamzS8kgPGLlfDWZ/tY92giDDV++82t Qut63RG+Ouw9ZqtffNfmUJYUlH7G5d1Go4x3jINegKa1nGKD5sGHIhA0F9C1uX+U6z/TnnM7e1E R0bZNknlgWDtvsAOCybJKB2/4r2AbIFPj2T2cjvTMMwWOSHXvkRP1g== X-Google-Smtp-Source: AGHT+IHLsWh/YGpljPw48TJZyadtyPAWHtbeX9ie2f3Bzm4IzJqabYq5/PboSRajCtTjNtqAWvC6/w== X-Received: by 2002:a05:6000:1561:b0:38c:5b52:3a47 with SMTP id ffacd0b85a97d-38c5b523e1dmr9683706f8f.4.1738529093086; Sun, 02 Feb 2025 12:44:53 -0800 (PST) Received: from pumpkin (82-69-66-36.dsl.in-addr.zen.co.uk. [82.69.66.36]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38c5c1b595asm10875448f8f.66.2025.02.02.12.44.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 02 Feb 2025 12:44:52 -0800 (PST) Date: Sun, 2 Feb 2025 20:44:49 +0000 From: David Laight To: Mateusz Guzik Cc: Matthew Wilcox , ebiederm@xmission.com, oleg@redhat.com, brauner@kernel.org, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: Re: [PATCH v3 6/6] pid: drop irq disablement around pidmap_lock Message-ID: <20250202204449.77cab5e5@pumpkin> In-Reply-To: References: <20250201163106.28912-1-mjguzik@gmail.com> <20250201163106.28912-7-mjguzik@gmail.com> <20250201181933.07a3e7e2@pumpkin> <20250201215105.55c0319a@pumpkin> <20250202135503.4e377fb0@pumpkin> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; arm-unknown-linux-gnueabihf) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 104ED120008 X-Stat-Signature: ggz8uzek1uyj5gftgrcf9feyfqjcm7xt X-Rspam-User: X-Rspamd-Server: rspam12 X-HE-Tag: 1738529094-841975 X-HE-Meta: U2FsdGVkX18XUobaE7OZgNAxlqC7/FdDoENEDLaXI8DmXwSad057C7I90PDPl2OQwF3BdLrMBqrATiE55HEgdMe9ogtX1XuONFSsJ5/8pEOqBL1QGL108hzrPNCladdHTRVV+Rwt9U5M1oPoxvrTf+hH0YPIYiXucIdxQR3diq1WVjbiJI1a10KkZftOm94v6pwwLE5rldFZCQp6o5hHCpr5EHf+9grIvV0vYapmbKWTWSII7V9ZdUqq/2fSXqv53vMsLC1aoKUg7kbSSfbSVWQxuOoi0yp8+xJ9Cn2gxE2p0lnviKv8zQBTE24lQkcaPti8YI7/rC9I3gZKLLPbuSWkmPFCX2pUb2QPed4bVuCwA/S8VgJmfwa2nv5kw2qCrR1MqN9LtpVxE0jcvnxRJgdeFt5tU4jG0qh0WNf0uPZygnIvJTYikuQTxRkdt/0hFmOGsfzuQ/gKvBH/i6SgOFs45X3fVFF6+NnvKp7aWGSF/YQa+iYJhdcFVHz/YG9f4kiblomoA5H1fulUQ+tatRR2F05uGqRX8+DPdoeWwLhRO40DsYLwtoGSjWk/ZH6GW9JI3GbjK/Hv3+zpIGcY3+RoigjWu2fapkKBTLOZbaneA0M8Hs+BFOgHpmmEF2auWq0c96IciJHNrW+dZRZ2VZC0vkDTjZGxpKGEIXW5b5ybuSmypPuqdNn8nual8CK7g4nX52DEbqtHtwNHpjQnmxyx392GwJXVGre30f0qfL8CEgJALNuQRgUJLTdkO5l/loLSZkGqyj3HYxRCkg1eyax7grFikUMJKpmUAy2Ws5Me0DN968M2vYoDL4k2TA3lP9VkGppZEgYmxzX1TLkIaIejN07wxFAADwS8jl5ypx2ZYC8LS6BZQAqCKChnhV48/fvxl0HlWaemC3yn+Gg3Y8DCncqTOiJ+Ji10bHZyao8L5Rs6BUqwFfqVNqqfwN95B5JiS0n2nu3sStV3Y4X PYQr4UmX J0JDss6a3OqhzYCreMPNasGvoruvcBDnY+pNvEfzx1bOsN/IzdwpD90mKqcyyKHKAX2tzOarAEvom/gv0i58JXp9+y3CxR8ZvZ/Q/Vt8Rw+HreW0cbLbA46MKWmTue/jRXmeCHIZy+b1JoNDGY8sI1wHeuVBlC6mtr2sI8LjBV0sZfpX71MjFvV9YvoI16jmnd8pGdK7/jMlUSAhJeFy82Ixgt9PG/EMncq6TKhXdgZeNwj9upmj7jYbhDHOnMMxxpdk0XI+QTy45aaruMYFN0w1mmOTqUAourhNrmwXzhGOB21Y190tF+jnX8H/zvAzaTamayucE7W1BqedEAb/L0BuvpYC9IaR+tu6ALP6SILpkoRSPOf0AESdKM9ckPbRDhzAPTY1twpC8stxaJFTqNFrwRxX10iVX1FOeDTVhqoQRpwdTdZZiueOZCD7Co48jSqY9np/l4FNfUaz9a1keU8iSbdmTEGPJVeePyTZwrd0Fhh5qhsdC3NLQJHbsVgAWjYhxHg5DkD8sXts= X-Bogosity: Ham, tests=bogofilter, spamicity=0.008431, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sun, 2 Feb 2025 20:34:48 +0100 Mateusz Guzik wrote: > On Sun, Feb 2, 2025 at 2:55=E2=80=AFPM David Laight > wrote: > > > > On Sat, 1 Feb 2025 22:00:06 +0000 > > Matthew Wilcox wrote: > > =20 > > > On Sat, Feb 01, 2025 at 09:51:05PM +0000, David Laight wrote: =20 > > > > I'm not sure what you mean. > > > > Disabling interrupts isn't as cheap as it ought to be, but probably= isn't > > > > that bad. =20 > > > > > > Time it. You'll see. =20 > > > > The best scheme I've seen is to just increment a per-cpu value. > > Let the interrupt happen, notice it isn't allowed and return with > > interrupts disabled. > > Then re-issue the interrupt when the count is decremented to zero. > > Easy with level sensitive interrupts. > > But I don't think Linux ever uses that scheme. > > =20 >=20 > I presume you are talking about the splhigh/splx set of primivitives > from Unix kernels. >=20 > While "entering" is indeed cheap, undoing the work still needs to be > atomic vs interrupts. >=20 > I see NetBSD uses local cmpxchg8b on the interrupt level and interrupt > mask, while the rest takes the irq trip. >=20 > The NetBSD solution is still going to be visibly slower than not > messing with any of it as spin_unlock on amd64 is merely a store of 0 > and cmpxchg even without the lock prefix costs several clocks. >=20 > Maybe there is other hackery which could be done, but see below. I was thinking it might be possible to merge an 'interrupts disabled' count with the existing 'pre-emption disabled' count. IIRC (on x86 at least) this is just a per-cpu variabled accessed from %fs/%= gs. So you add one to disable pre-emption and (say) 1<<16 to disable interrupts. If an interrupt happens while the count is 'big' the count value is changed so the last decrement of 1<<16 will set carry (or overflow), and a return from interrupt is done that leaves interrupts disabled (traditionally easy). The interrupt enable call just subtracts the 1<<16 and checks for carry (or overflow), if not set all is fine, it set it needs to call something to re-issue the interrupt - that is probably the hard bit. >=20 > > =20 > > > > > So while this is indeed a tradeoff, as I understand the sane defa= ult > > > > > is to *not* disable interrupts unless necessary. =20 > > > > > > > > I bet to differ. =20 > > > > > > You're wrong. It is utterly standard to take spinlocks without > > > disabling IRQs. We do it all over the kernel. If you think that nee= ds > > > to change, then make your case, don't throw a driveby review. > > > > > > And I don't mean by arguing. Make a change, measure the difference. = =20 > > > > The analysis was done on some userspace code that basically does: > > for (;;) { > > pthread_mutex_enter(lock); > > item =3D get_head(list); > > if (!item) > > break; > > pthead_mutex_exit(lock); > > process(item); > > } > > For the test there were about 10000 items on the list and 30 threads > > processing it (that was the target of the tests). > > The entire list needs to be processed in 10ms (RTP audio). > > There was a bit more code with the mutex held, but only 100 or so > > instructions. > > Mostly it works fine, some threads get delayed by interrupts (etc) but > > the other threads carry on working and all the items get processed. > > > > However sometimes an interrupt happens while the mutex is held. > > In that case the other 29 threads get stuck waiting for the mutex. > > No progress is made until the interrupt completes and it overruns > > the 10ms period. > > > > While this is a userspace test, the same thing will happen with > > spin locks in the kernel. > > > > In userspace you can't disable interrupts, but for kernel spinlocks > > you can. > > > > The problem is likely to show up as unexpected latency affecting > > code with a hot mutex that is only held for short periods while > > running a lot of network traffic. > > That is also latency that affects all cpu at the same time. > > The interrupt itself will always cause latency to one cpu. > > =20 >=20 > Nobody is denying there is potential that lock hold time will get > significantly extended if you get unlucky enough vs interrupts. It is > questioned whether defaulting to irqs off around lock-protected areas > is the right call. I really commented because you were changing one lock which could easily be 'hot' enough for there to be side effects, without even a comment about any pitfalls. David >=20 > As I noted in my previous e-mail the spin_lock_irq stuff disables > interrupts upfront and does not touch them afterwards even when > waiting for the lock to become free. Patching that up with queued > locks may be non-trivial, if at all possible. Thus contention on > irq-disabled locks *will* add latency to their handling unless this > gets addressed. Note maintaining forward progress guarantee in the > locking mechanism is non-negotiable, so punting to TAS or similar > unfair locks does not cut it. >=20 > This is on top of having to solve the overhead problem for taking the > trips (see earlier in the e-mail). >=20 > I would argue if the network stuff specifically is known to add > visible latency, then perhaps that's something to investigate. >=20 > Anyhow, as Willy said, you are welcome to code this up and demonstrate > it is better overall. >=20