From: Mateusz Guzik
Date: Sun, 2 Feb 2025 20:34:48 +0100
Subject: Re: [PATCH v3 6/6] pid: drop irq disablement around pidmap_lock
To: David Laight
Cc: Matthew Wilcox, ebiederm@xmission.com, oleg@redhat.com, brauner@kernel.org, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <20250202135503.4e377fb0@pumpkin>
References: <20250201163106.28912-1-mjguzik@gmail.com> <20250201163106.28912-7-mjguzik@gmail.com> <20250201181933.07a3e7e2@pumpkin> <20250201215105.55c0319a@pumpkin> <20250202135503.4e377fb0@pumpkin>

On Sun, Feb 2, 2025 at 2:55 PM David Laight wrote:
>
> On Sat, 1 Feb 2025 22:00:06 +0000
> Matthew Wilcox wrote:
>
> > On Sat, Feb 01, 2025 at 09:51:05PM +0000, David Laight wrote:
> > > I'm not sure what you mean.
> > > Disabling interrupts isn't as cheap as it ought to be, but probably isn't
> > > that bad.
> >
> > Time it. You'll see.
>
> The best scheme I've seen is to just increment a per-cpu value.
> Let the interrupt happen, notice it isn't allowed and return with
> interrupts disabled.
> Then re-issue the interrupt when the count is decremented to zero.
> Easy with level sensitive interrupts.
> But I don't think Linux ever uses that scheme.
>

I presume you are talking about the splhigh/splx set of primitives
from Unix kernels.

While "entering" is indeed cheap, undoing the work still needs to be
atomic vs interrupts.

I see NetBSD uses local cmpxchg8b on the interrupt level and interrupt
mask, while the rest takes the irq trip.

The NetBSD solution is still going to be visibly slower than not
messing with any of it, as spin_unlock on amd64 is merely a store of 0
and cmpxchg even without the lock prefix costs several clocks.

Maybe there is other hackery which could be done, but see below.

>
> > > > So while this is indeed a tradeoff, as I understand the sane default
> > > > is to *not* disable interrupts unless necessary.
> > >
> > > I bet to differ.
> >
> > You're wrong. It is utterly standard to take spinlocks without
> > disabling IRQs. We do it all over the kernel. If you think that needs
> > to change, then make your case, don't throw a driveby review.
> >
> > And I don't mean by arguing. Make a change, measure the difference.
>
> The analysis was done on some userspace code that basically does:
>     for (;;) {
>         pthread_mutex_enter(lock);
>         item = get_head(list);
>         if (!item)
>             break;
>         pthead_mutex_exit(lock);
>         process(item);
>     }
> For the test there were about 10000 items on the list and 30 threads
> processing it (that was the target of the tests).
> The entire list needs to be processed in 10ms (RTP audio).
> There was a bit more code with the mutex held, but only 100 or so
> instructions.
> Mostly it works fine, some threads get delayed by interrupts (etc) but
> the other threads carry on working and all the items get processed.
>
> However sometimes an interrupt happens while the mutex is held.
> In that case the other 29 threads get stuck waiting for the mutex.
> No progress is made until the interrupt completes and it overruns
> the 10ms period.
>
> While this is a userspace test, the same thing will happen with
> spin locks in the kernel.
>
> In userspace you can't disable interrupts, but for kernel spinlocks
> you can.
>
> The problem is likely to show up as unexpected latency affecting
> code with a hot mutex that is only held for short periods while
> running a lot of network traffic.
> That is also latency that affects all cpu at the same time.
> The interrupt itself will always cause latency to one cpu.
>

Nobody is denying there is potential that lock hold time will get
significantly extended if you get unlucky enough vs interrupts. What
is questioned is whether defaulting to irqs off around lock-protected
areas is the right call.

As I noted in my previous e-mail, the spin_lock_irq stuff disables
interrupts upfront and does not touch them afterwards, even when
waiting for the lock to become free. Patching that up with queued
locks may be non-trivial, if at all possible. Thus contention on
irq-disabled locks *will* add latency to interrupt handling unless
this gets addressed. Note that maintaining a forward progress
guarantee in the locking mechanism is non-negotiable, so punting to
TAS or similar unfair locks does not cut it.

This is on top of having to solve the overhead problem for taking the
trips (see earlier in the e-mail).

I would argue that if the network stuff specifically is known to add
visible latency, then perhaps that's something to investigate.

Anyhow, as Willy said, you are welcome to code this up and demonstrate
it is better overall.

-- 
Mateusz Guzik
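
For readers who have not seen the counter-based scheme from the quoted
text (bump a per-cpu value, let the interrupt arrive, defer it, replay
it when the count drops to zero), here is a toy single-threaded model
in C. It is only a sketch: struct cpu_model and the model_irq_*()
helpers are hypothetical names made up for illustration, not kernel or
NetBSD interfaces, and the "interrupt" is an ordinary function call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the toy model; not a kernel structure. */
struct cpu_model {
	int  block_depth;	/* > 0 means interrupts are "soft" disabled */
	bool pending;		/* an interrupt arrived while blocked */
	int  handled;		/* how many times the handler body ran */
};

/* Entering is cheap: bump the counter, no hardware access. */
static void model_irq_block(struct cpu_model *cpu)
{
	cpu->block_depth++;
}

/* Interrupt entry: run the handler, or record it and return if blocked. */
static void model_irq_arrive(struct cpu_model *cpu)
{
	if (cpu->block_depth > 0) {
		cpu->pending = true;	/* deferred, replayed on unblock */
		return;
	}
	cpu->handled++;
}

/*
 * Leaving: drop the counter and replay a deferred interrupt at zero.
 * As noted in the e-mail, on real hardware this step has to be atomic
 * vs interrupts; the model is single-threaded, so it sidesteps that.
 */
static void model_irq_unblock(struct cpu_model *cpu)
{
	assert(cpu->block_depth > 0);
	if (--cpu->block_depth == 0 && cpu->pending) {
		cpu->pending = false;
		cpu->handled++;		/* replay the deferred interrupt */
	}
}
```

Because pending is a single flag, several arrivals while blocked
collapse into one replay, which matches the level-sensitive case
mentioned in the quote. The model says nothing about cost; it only
shows where the extra work on the exit path comes from.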