Date: Sun, 28 Nov 2021 19:05:14 +0100
From: Michał Mirosław
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, "James E.J. Bottomley",
	"Martin K. Petersen", "Paul E. McKenney", "Rafael J. Wysocki",
	Alexander Shishkin, Alexey Klimov, Amitkumar Karwar, Andi Kleen,
	Andrew Lunn, Andrew Morton, Andy Gross, Andy Lutomirski,
	Andy Shevchenko, Anup Patel, Ard Biesheuvel,
	Arnaldo Carvalho de Melo, Arnd Bergmann, Borislav Petkov,
	Catalin Marinas, Christoph Hellwig, Christoph Lameter,
	Daniel Vetter, Dave Hansen, David Airlie, David Laight,
	Dennis Zhou, Dinh Nguyen, Geetha sowjanya, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guo Ren, Hans de Goede, Heiko Carstens,
	Ian Rogers, Ingo Molnar, Jakub Kicinski, Jason Wessel, Jens Axboe,
	Jiri Olsa, Jonathan Cameron, Juri Lelli, Kalle Valo, Kees Cook,
	Krzysztof Kozlowski, Lee Jones, Marc Zyngier, Marcin Wojtas,
	Mark Gross, Mark Rutland, Matti Vaittinen, Mauro Carvalho Chehab,
	Mel Gorman, Michael Ellerman, Mike Marciniszyn, Nicholas Piggin,
	Palmer Dabbelt, Peter Zijlstra, Petr Mladek, Randy Dunlap,
	Rasmus Villemoes, Roy Pledge, Russell King, Saeed Mahameed,
	Sagi Grimberg, Sergey Senozhatsky, Solomon Peachy, Stephen Boyd,
	Stephen Rothwell, Steven Rostedt, Subbaraya Sundeep, Sudeep Holla,
	Sunil Goutham, Tariq Toukan, Tejun Heo, Thomas Bogendoerfer,
	Thomas Gleixner, Ulf Hansson, Vincent Guittot, Vineet Gupta,
	Viresh Kumar, Vivien Didelot, Vlastimil Babka, Will Deacon,
	bcm-kernel-feedback-list@broadcom.com, kvm@vger.kernel.org,
	linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-snps-arc@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9]
 lib/bitmap: optimize bitmap_weight() usage
Message-ID: <20211128180514.nepaRaB8WjMDQcy9o9_XdZpNhfU0WCdBjPs62W6tRS4@z>

On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> In many cases people use bitmap_weight()-based functions like this:
>
> 	if (num_present_cpus() > 1)
> 		do_something();
>
> This may take considerable amount of time on many-cpus machines because
> num_present_cpus() will traverse every word of underlying cpumask
> unconditionally.
>
> We can significantly improve on it for many real cases if stop traversing
> the mask as soon as we count present cpus to any number greater than 1:
>
> 	if (num_present_cpus_gt(1))
> 		do_something();
>
> To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> functions together with corresponding wrappers in cpumask and nodemask.

Having slept on it, I have more structured thoughts:

First, I like substituting bitmap_empty/full where possible - I think
that change stands on its own, so it could be split out and sent as is.

I don't like the proposed API very much. One problem is that it hides
the comparison operator and makes call sites less readable:

	bitmap_weight(...) > N

becomes:

	bitmap_weight_gt(..., N)

and:

	bitmap_weight(...) <= N

becomes:

	bitmap_weight_lt(..., N+1)

or:

	!bitmap_weight_gt(..., N)

I'd rather see something resembling the memcmp() API, which is well
known and so easier to grasp. For the above examples:

	bitmap_weight_cmp(..., N) > 0
	bitmap_weight_cmp(..., N) <= 0
	...

This would also simplify the implementation, since the code would not
have to be copied and pasted three times. It could also use a simple
optimization that reduces code size:

#include <linux/bitmap.h>	/* BITMAP_LAST_WORD_MASK(), hweight_long() */
#include <linux/overflow.h>	/* check_sub_overflow() */

/* Returns >0, 0 or <0 if the weight is greater than, equal to or less than cmp. */
int bitmap_weight_cmp(const unsigned long *bits, size_t nbits, size_t cmp)
{
	/* Full words: bail out as soon as the running count exceeds cmp. */
	for (size_t i = 0; i < nbits / BITS_PER_LONG; ++i, ++bits)
		if (check_sub_overflow(cmp, (size_t)hweight_long(*bits), &cmp))
			return 1;

	/* Trailing partial word: count only the remaining low bits. */
	nbits %= BITS_PER_LONG;
	if (nbits &&
	    check_sub_overflow(cmp,
			       (size_t)hweight_long(*bits & BITMAP_LAST_WORD_MASK(nbits)),
			       &cmp))
		return 1;

	return cmp ? -1 : 0;
}

Best Regards
Michał Mirosław
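
For anyone who wants to try the memcmp()-style idea outside the kernel, here is a minimal userspace sketch. It is an illustration under assumptions, not the code from the series: it relies on the GCC/Clang builtins __builtin_popcountl() and __builtin_sub_overflow() in place of the kernel helpers, defines its own BITS_PER_LONG, and the wrappers more_than_one_set() and at_most() are hypothetical names that exist only to show how call sites would read.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Local stand-in for the kernel macro of the same name (assumption). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Compare the number of set bits in the first nbits of a bitmap with cmp.
 * Returns 1 if the weight is greater than cmp, 0 if equal, -1 if smaller.
 * cmp is used as a countdown: the unsigned subtraction overflows exactly
 * when the running weight exceeds the original cmp, so the walk stops early.
 */
int bitmap_weight_cmp(const unsigned long *bits, size_t nbits, size_t cmp)
{
	size_t i;

	assert(bits != NULL);

	/* Full words. */
	for (i = 0; i < nbits / BITS_PER_LONG; ++i, ++bits) {
		if (__builtin_sub_overflow(cmp, (size_t)__builtin_popcountl(*bits), &cmp))
			return 1;
	}

	/* Trailing partial word: keep only the low (nbits % BITS_PER_LONG) bits. */
	nbits %= BITS_PER_LONG;
	if (nbits) {
		unsigned long last = *bits & (~0UL >> (BITS_PER_LONG - nbits));

		if (__builtin_sub_overflow(cmp, (size_t)__builtin_popcountl(last), &cmp))
			return 1;
	}

	return cmp ? -1 : 0;
}

/* Hypothetical call site: the equivalent of "bitmap_weight(...) > 1". */
int more_than_one_set(const unsigned long *bits, size_t nbits)
{
	return bitmap_weight_cmp(bits, nbits, 1) > 0;
}

/* Hypothetical call site: the equivalent of "bitmap_weight(...) <= n". */
int at_most(const unsigned long *bits, size_t nbits, size_t n)
{
	return bitmap_weight_cmp(bits, nbits, n) <= 0;
}
```

The comparison operator stays visible at each call site, and one function covers the gt, le and eq cases, which is the point of following the memcmp() convention.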