From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B0D8C001DE for ; Thu, 3 Aug 2023 00:45:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C95A5280204; Wed, 2 Aug 2023 20:45:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C44B92801EB; Wed, 2 Aug 2023 20:45:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B0C96280204; Wed, 2 Aug 2023 20:45:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A31922801EB for ; Wed, 2 Aug 2023 20:45:50 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 6796540461 for ; Thu, 3 Aug 2023 00:45:50 +0000 (UTC) X-FDA: 81080950860.11.289C1E8 Received: from mail-qt1-f178.google.com (mail-qt1-f178.google.com [209.85.160.178]) by imf25.hostedemail.com (Postfix) with ESMTP id 79698A0019 for ; Thu, 3 Aug 2023 00:45:47 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=j7CTXFrn; spf=pass (imf25.hostedemail.com: domain of yury.norov@gmail.com designates 209.85.160.178 as permitted sender) smtp.mailfrom=yury.norov@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1691023547; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GrMEgIpkm4DOHGO7jkt8EZd96GkcH4rGkdwb4adkLh4=; b=dHMCBDz37xXpz4bIkiPx5nkORz/KvcE/TqlZsDvTeoH637Ez8HtSTeYuNTKLFrZXj2qc8t +yKoPWPR/Wo7L58jTBqrgT9zgHQ6zUN5xS0ZXo8HX9gGPDKbYUurBUo6TAIB0+6M+eINm/ pFjoiQz9dPLok4K00OjQI98zqGa78LE= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=j7CTXFrn; spf=pass (imf25.hostedemail.com: domain of yury.norov@gmail.com designates 209.85.160.178 as permitted sender) smtp.mailfrom=yury.norov@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1691023547; a=rsa-sha256; cv=none; b=Aw1p21UbZgNCdEjn2Yslc2OijKggi3KmAQUUIG1pRgYFk9UTt/dbveDKekiDsVkDvRCvtI M/BSTFs7fTLO08r9q6AsbqzOwkJ8LjAAi6tV8eYuwhp+VlhpVCMrnbrRZj1e4dm3lo6OsS RfXoQTdJ4FRDzWaJwikWuA+oLG73Y7Q= Received: by mail-qt1-f178.google.com with SMTP id d75a77b69052e-40fbf360a9cso2875061cf.3 for ; Wed, 02 Aug 2023 17:45:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1691023546; x=1691628346; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=GrMEgIpkm4DOHGO7jkt8EZd96GkcH4rGkdwb4adkLh4=; b=j7CTXFrnyFko0+2H/M8nyAfarmrhDwmXt+n4w1bxjxq7ssdjRIzyoNV1MTFDwz7GLY ijN47+5abnB1LdeMxVLJNF94q0+dTvXR1nqut0q5GcfbzLZ6fVCfEcLOEocZy8pW2cBr isfOWiXkCNZ8KGrfwq2GsoPlxyRX1j61IIA4kcW9kUybk8vASOMRsxi3HWAKPE1317Um Zxw8JE2YWRZTuSMK1CD8WVt6AasrDWt5oykDdVP41NGRkyySoI/PsU3g7CB5urTSf0Yz JhW1eN2s0tVTVXC0TYqGnGtAhBBp4o/yuvfZObgtneSWG4dVwh+Wa1x/xZ/xBS+smfog iK5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691023546; x=1691628346; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=GrMEgIpkm4DOHGO7jkt8EZd96GkcH4rGkdwb4adkLh4=; b=CstKf2vR2zta6SI0qdWfxrC74XFHolHzJ3555FrAIpf4MHv0tXXf3k2alNn+6eU5j3 qpinzfiQckdIUxEpjFP7YysV2EjzpRZf5UtKtYPPY4XH9GjrG8sfhAvDby3D5Ve5Q8VL 0sOlzv+jdfKAqFrY0jbaQ5ib5Z42LeOp4WBTCtH3saac/r+7G/QvZVWuvo5fp0QUw5Nu yQU824qKHO89qKarIpCSr7setSrArPdBLM3coJg3Wczc+nrxHg4fwY288mbONwvyCCYc 8GavghCv6rojFV64LhPM/5Z1XxERPkZcCoytdCmeYi9HQadP8/0uFAPvWVtkdZcboXYv ePMA== X-Gm-Message-State: ABy/qLYFWxfypeKODP7C04sQRfqClmA+LpeRjcCltVwqkbWnPMssHatE AAJlI4XDs0up4cYbe9mGv0I= X-Google-Smtp-Source: APBJJlFeJ5QHYsTLYvciZrfmRwYH5Do4DFRCqMp05HjanrOhz1z506QNy3sHJ0LayCmbrXV9OuDe/g== X-Received: by 2002:ac8:4e4c:0:b0:40c:82f:b66c with SMTP id e12-20020ac84e4c000000b0040c082fb66cmr19106129qtw.34.1691023546451; Wed, 02 Aug 2023 17:45:46 -0700 (PDT) Received: from localhost ([2607:fb90:8de1:d00d:8c37:1fd4:e9e6:8e5a]) by smtp.gmail.com with ESMTPSA id j13-20020a37c24d000000b00763b94432ebsm5430212qkm.18.2023.08.02.17.45.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Aug 2023 17:45:46 -0700 (PDT) Date: Wed, 2 Aug 2023 17:45:44 -0700 From: Yury Norov To: Peter Zijlstra Cc: andriy.shevchenko@linux.intel.com, linux@rasmusvillemoes.dk, Andrew Morton , Linus Torvalds , Mateusz Guzik , linux-kernel@vger.kernel.org, linux-mm@kvack.org, tglx@linutronix.de, rppt@kernel.org Subject: Re: [PATCH v2 2/2] mm,nodemask: Use nr_node_ids Message-ID: References: <20230802112458.230221601@infradead.org> <20230802112525.633758009@infradead.org> <20230802193616.GC231007@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20230802193616.GC231007@hirez.programming.kicks-ass.net> X-Rspamd-Queue-Id: 79698A0019 X-Rspam-User: X-Stat-Signature: n58msets7w95wpgcez8kasop8ecjp79n X-Rspamd-Server: rspam01 X-HE-Tag: 1691023547-49260 X-HE-Meta: U2FsdGVkX19y1Q78wgTeN/dYfYtWJmm2OnoSrrmtwTMMss2b0REkqkFe7oDk3yzf15BHriU28G3lHfWAgvaHJuEHfWrVIpZkIHFtl1MXCoZe50SKfzV61ajJHtzlQSqmey6udIQZouE2mVe+20MMthoGgxMXnB3kRWU9XmczL+7Vc8aeZ2960Nhp24+SlawKjBKh4YGNMGwNtJXNqqV58y1F+BknTAHfwzW0JZTnSGchyypTl+jeOAxIMdXOEGBShsE9NrabbQpOvgEq+/AiY5rxNbwg2LY9kp/IjJZkNCyiDBO2gBXrKQh1EJTfOBY3d1rFIhMhI5JzbAMBVaky+UiY0CsSzU35wwWpg+DcabVqpZhOSEEVaQy9acnPbG8GrC1G6qNNWsnyqGu5GufV2daGVsOk+gpO49PDPJVW738fMsaXPJ6fxPdTTBa+5DkLy9SxgxtkqxZv1QpboHOsECOGleuGbyYXvBUGirzweZt82sooXbcbsRhUD531N9uQhyBM/s0Qbz4koNHN57rNAEt67YXj84QlYDObO8mHNo4g1nTuXVtpluLh23AH5Wr6jz+CzeiOavznXCTIwT46TwRI83GeO6stV71i33Cas3z0Hy79JRQIrijbA5RIj7zlPYz2nDxMF3k2pQzEnZgi20hqWrzWIiW43Ak2ZodG0yBRKiGZQ3FCXG2QtnpZfU0NVyNUpljG3ogx3glpSVFK2O/RCKInZptgq8xz8tYqtkRU7mec+65zMBkJt6O90VoDbk1i4haBMQGtEydGwT2/NPCo6pXQBPAXFa2T3CIzqFvm92veXgYN0JqporN+866eQPbudKnhMyRo9s3TYr1+BJcsYdQRaEL4i+FLTgEOIkVHgoUEXbrStwj8HA64MqY+29ZwdW4POLdOBgJ5m+aR0QYPXsldVp/D4eeSOZ6AtEkCs7P55MLgLhH8rOipFsQK3X0VKihJ2hfW5aElguo 6aHELHRX CAG+SZiuZLoexYyI5IbzpDsi0V/8fpQeB995hnIz6Zj6slsJ8b98rosyGvKzmPMPJlOtvcWG6RfH/7FFO/wRk/TWfWfiO7c14gJa4HRmUf0IZwGqU/7dcxifCJFl4tZHzAP0tGgj8JeVOYRvaQ4gRZun5BZALTJwYPaG18LbL5yOpLp38R73glm0UBWj5tD/btEzqzqh7Cf9VcRzBzeVh0ESbe1C46Q0FwHozbnWTXC1A1P+gCfTZwMURkV+onnNB4f0XMYphVV/V929MFxVZLlHWoE6q0M0oxrOTfdAsGRvu2Tai8ChGWLGsQg2bZSevHm/eGfuuCoAf+MQkaW5VHOZF+MTHvvlRz1y4m/Po/XyllSMlZUdJWh2XX/7Azgh7c7YYunBU7r2JpCQfCuNOoJJQSzI8xivwp17FtHVlZq9RDRwprZzMalPGZa3IZtA84TOp341cZXvw9Gr9bdm92f+davl5mVw3Pvv6/MwWO0GqDhsd/sHKeBqYLG99g3gbqrRaUstbD7WDsgbw5bTmAnXfAExVOnufnSEsbjbDpMxddNTTT8iNTsBNxeQ/5bqHgP5HQiYBZxc98M3xyB5Nz/4ONkdLBoXj0aRSYWlrxmpmB0c4/gaaa/tYOBbiGkabTyjlrdY4NkKoGTzLCEMr5W+fYES25Fb94Cu9Jhf4+1yOycJVTFQg7J6rEKW7lv3Jjoof X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: + Linus, Mateusz On Wed, Aug 02, 2023 at 09:36:16PM +0200, Peter Zijlstra wrote: > > Just like how cpumask uses nr_cpu_ids to limit the bitmap scanning, > make nodemask use nr_node_ids. > > Since current users expect MAX_NUMNODES as the end-of-bitmap marker, > retain this behaviour for now. > > Signed-off-by: Peter Zijlstra (Intel) > Reviewed-by: Mike Rapoport (IBM) > --- > Changes since v1: > - updated and reflowed the 'borrowed' comment some more (rppt) Hi Peter, Thanks for the patch! I wanted to do it sooner or later. Can you mention the commit that you used to borrow the approach. Maybe suggested-by? The motivation for the original patch was that the following test revealed broken small_const_nbits() optimization for cpumasks: On Fri, Mar 3, 2023 at 12:39 PM Mateusz Guzik wrote: > > as an example here is a one-liner to show crappers which do 0-sized ops: > bpftrace -e 'kprobe:memset,kprobe:memcpy /arg2 == 0/ { @[probe, > kstack(2)] = count(); }' See: https://lore.kernel.org/lkml/CAHk-=wgfNrMFQCFWFtn+UXjAdJAGAAFFJZ1JpEomTneza32A6g@mail.gmail.com/ Can you make sure your patch doesn't brake the test for nodemasks? > include/linux/nodemask.h | 121 ++++++++++++++++++++++++++++++++++------------- > 1 file changed, 89 insertions(+), 32 deletions(-) > > --- a/include/linux/nodemask.h > +++ b/include/linux/nodemask.h > @@ -99,6 +99,48 @@ > typedef struct { DECLARE_BITMAP(bits, MAX_NUMNODES); } nodemask_t; > extern nodemask_t _unused_nodemask_arg_; > > +#if MAX_NUMNODES > 1 > +extern unsigned int nr_node_ids; > +#else > +#define nr_node_ids 1U > +#endif > + > +/* > + * We have several different "preferred sizes" for the nodemask operations, > + * depending on operation. > + * > + * For example, the bitmap scanning and operating operations have optimized > + * routines that work for the single-word case, but only when the size is > + * constant. So if MAX_NUMNODES fits in one single word, we are better off > + * using that small constant, in order to trigger the optimized bit finding. > + * That is 'small_nodemask_size'. > + * > + * The clearing and copying operations will similarly perform better with a Copying will not, because there's no nodemask_copy(). :-) > + * constant size, but we limit that size arbitrarily to four words. We call > + * this 'large_nodemask_size'. > + * > + * Finally, some operations just want the exact limit, either because they set > + * bits or just don't have any faster fixed-sized versions. We call this just > + * 'nr_nodemask_bits'. > + * > + * Note that these optional constants are always guaranteed to be at least as > + * big as 'nr_node_ids' itself is, and all our nodemask allocations are at > + * least that size. The optimization comes from being able to potentially use > + * a compile-time constant instead of a run-time generated exact number of > + * nodes. > + */ > +#if MAX_NUMNODES <= BITS_PER_LONG > + #define small_nodemask_bits ((unsigned int)MAX_NUMNODES) > + #define large_nodemask_bits ((unsigned int)MAX_NUMNODES) > +#elif MAX_NUMNODES <= 4*BITS_PER_LONG > + #define small_nodemask_bits nr_node_ids > + #define large_nodemask_bits ((unsigned int)MAX_NUMNODES) > +#else > + #define small_nodemask_bits nr_node_ids > + #define large_nodemask_bits nr_node_ids > +#endif > +#define nr_nodemask_bits nr_node_ids We don't need nr_nodemask_bits. In CPU subsystem nr_cpumask_bits exists (existed) to support dynamic allocation for cpumask_var_t if CPUMASK_OFFSTACK is enabled. And it apparently caused troubles. In nodemasks we don't have an offstack feature, and don't need the nr_nodemask_bits. Just use nr_node_ids everywhere. [...] > -#define nodes_setall(dst) __nodes_setall(&(dst), MAX_NUMNODES) > +#define nodes_setall(dst) __nodes_setall(&(dst), large_nodemask_bits) > static inline void __nodes_setall(nodemask_t *dstp, unsigned int nbits) > { > bitmap_fill(dstp->bits, nbits); > } When MAX_NUMNODES <= 4*BITS_PER_LONG, this breaks the rule that all bits beyond nr_node_ids must be clear. And that in turn may brake nodemask_weight() and others. Refer to this patch for details and correct implementation: 63355b9884b ("cpumask: be more careful with 'cpumask_setall()'") [...] > @@ -452,7 +511,6 @@ static inline unsigned int next_memory_n > return next_node(nid, node_states[N_MEMORY]); > } > > -extern unsigned int nr_node_ids; > extern unsigned int nr_online_nodes; > > static inline void node_set_online(int nid) > @@ -494,7 +552,6 @@ static inline int num_node_state(enum no > #define first_memory_node 0 > #define next_online_node(nid) (MAX_NUMNODES) > #define next_memory_node(nid) (MAX_NUMNODES) > -#define nr_node_ids 1U > #define nr_online_nodes 1U I like how you separated the nr_node_ids from the other ifdefery, and changed it to __ro_after_init. But I think it's better to fold this all into the 1st patch. Why don't we make nr_cpu_ids to be a __ro_after_init just as well? Thanks, Yury