Date: Tue, 28 May 2024 13:56:58 -0700
From: Dennis Zhou
To: Mateusz Guzik, Andrew Morton
Cc: tj@kernel.org, hughd@google.com, akpm@linux-foundation.org, vbabka@suse.cz, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4] percpu_counter: add a cmpxchg-based _add_batch variant
References: <20240528204257.434817-1-mjguzik@gmail.com>
In-Reply-To: <20240528204257.434817-1-mjguzik@gmail.com>

On Tue, May 28, 2024 at 10:42:57PM +0200, Mateusz Guzik wrote:
> Interrupt disable/enable trips are quite expensive on x86-64 compared to
> a mere cmpxchg (note: no lock prefix!) and percpu counters are used
> quite often.
>
> With this change I get a bump of 1% ops/s for negative path lookups,
> plugged into will-it-scale:
>
> void testcase(unsigned long long *iterations, unsigned long nr)
> {
> 	while (1) {
> 		int fd = open("/tmp/nonexistent", O_RDONLY);
> 		assert(fd == -1);
>
> 		(*iterations)++;
> 	}
> }
>
> The win would be higher if it was not for other slowdowns, but one has
> to start somewhere.
>
> Signed-off-by: Mateusz Guzik
> Acked-by: Vlastimil Babka
> ---
>
> v4:
> - fix a misplaced paren in unlikely(), reported by lkp:
> https://lore.kernel.org/oe-lkp/ZlZAbkjOylfZC5Os@snowbird/T/#t
>
> v3:
> - add a missing word to the new comment
>
> v2:
> - dodge preemption
> - use this_cpu_try_cmpxchg
> - keep the old variant depending on CONFIG_HAVE_CMPXCHG_LOCAL
>
>
>  lib/percpu_counter.c | 44 +++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
> index 44dd133594d4..51bc5246986d 100644
> --- a/lib/percpu_counter.c
> +++ b/lib/percpu_counter.c
> @@ -73,17 +73,50 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount)
>  EXPORT_SYMBOL(percpu_counter_set);
>
>  /*
> - * local_irq_save() is needed to make the function irq safe:
> - * - The slow path would be ok as protected by an irq-safe spinlock.
> - * - this_cpu_add would be ok as it is irq-safe by definition.
> - * But:
> - * The decision slow path/fast path and the actual update must be atomic, too.
> + * Add to a counter while respecting batch size.
> + *
> + * There are 2 implementations, both dealing with the following problem:
> + *
> + * The decision slow path/fast path and the actual update must be atomic.
>   * Otherwise a call in process context could check the current values and
>   * decide that the fast path can be used. If now an interrupt occurs before
>   * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
>   * then the this_cpu_add() that is executed after the interrupt has completed
>   * can produce values larger than "batch" or even overflows.
>   */
> +#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
> +/*
> + * Safety against interrupts is achieved in 2 ways:
> + * 1. the fast path uses local cmpxchg (note: no lock prefix)
> + * 2. the slow path operates with interrupts disabled
> + */
> +void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
> +{
> +	s64 count;
> +	unsigned long flags;
> +
> +	count = this_cpu_read(*fbc->counters);
> +	do {
> +		if (unlikely(abs(count + amount) >= batch)) {
> +			raw_spin_lock_irqsave(&fbc->lock, flags);
> +			/*
> +			 * Note: by now we might have migrated to another CPU
> +			 * or the value might have changed.
> +			 */
> +			count = __this_cpu_read(*fbc->counters);
> +			fbc->count += count + amount;
> +			__this_cpu_sub(*fbc->counters, count);
> +			raw_spin_unlock_irqrestore(&fbc->lock, flags);
> +			return;
> +		}
> +	} while (!this_cpu_try_cmpxchg(*fbc->counters, &count, count + amount));
> +}
> +#else
> +/*
> + * local_irq_save() is used to make the function irq safe:
> + * - The slow path would be ok as protected by an irq-safe spinlock.
> + * - this_cpu_add would be ok as it is irq-safe by definition.
> + */
>  void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
>  {
>  	s64 count;
> @@ -101,6 +134,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
>  	}
>  	local_irq_restore(flags);
>  }
> +#endif
>  EXPORT_SYMBOL(percpu_counter_add_batch);
>
>  /*
> --
> 2.39.2
>

Andrew, you picked up the __this_cpu_try_cmpxchg() patches. At this point
you might as well pick this up too. The cpumask cleanups are likely
going to give me trouble later this week when I rebase, so I'll probably
have to base my percpu hotplug branch on your mm-unstable now.

Acked-by: Dennis Zhou

Feel free to toss my ack on the __this_cpu_try_cmpxchg() patches too.

Thanks,
Dennis
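
For anyone reading along outside the kernel tree, here is a minimal
userspace sketch of the fast path/slow path split in the quoted patch.
It is an illustration under stated assumptions, not the kernel code:
struct sketch_counter and the sketch_*() helpers are hypothetical names,
a single C11 atomic slot stands in for the per-CPU counter, and a small
atomic_flag spinlock stands in for the irq-safe raw spinlock. The slow
path uses an atomic exchange where the kernel does a read followed by a
subtract with interrupts disabled.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for struct percpu_counter: a single
 * atomic "local" slot replaces the real per-CPU array.
 */
struct sketch_counter {
	atomic_flag lock;	/* stands in for the irq-safe raw spinlock */
	int64_t count;		/* shared total, protected by lock */
	_Atomic int64_t local;	/* stands in for this CPU's slot */
};

static void sketch_init(struct sketch_counter *c)
{
	atomic_flag_clear(&c->lock);
	c->count = 0;
	atomic_init(&c->local, 0);
}

static void sketch_lock(struct sketch_counter *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void sketch_unlock(struct sketch_counter *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void sketch_add_batch(struct sketch_counter *c, int64_t amount,
			     int32_t batch)
{
	int64_t old = atomic_load_explicit(&c->local, memory_order_relaxed);

	do {
		if (llabs(old + amount) >= batch) {
			/* Slow path: fold the local delta into the shared total. */
			sketch_lock(c);
			/* Re-read: the slot may have changed since the load above. */
			old = atomic_exchange_explicit(&c->local, 0,
						       memory_order_relaxed);
			c->count += old + amount;
			sketch_unlock(c);
			return;
		}
		/*
		 * Fast path: the decision above and the update below are tied
		 * together by the compare-exchange. On failure "old" is
		 * refreshed and the batch check is redone.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&c->local, &old,
							old + amount,
							memory_order_relaxed,
							memory_order_relaxed));
}

/* Approximate total: shared count plus whatever is still in the slot. */
static int64_t sketch_sum(struct sketch_counter *c)
{
	int64_t sum;

	sketch_lock(c);
	sum = c->count + atomic_load_explicit(&c->local, memory_order_relaxed);
	sketch_unlock(c);
	return sum;
}
```

With a batch of 32, adding 5 stays in the local slot, and a following
add of 30 crosses the threshold and folds both into the shared count.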