Date: Fri, 30 Oct 2020 20:19:46 +0000
From: Dennis Zhou
To: Wonhuyk Yang
Cc: Tejun Heo, Christoph Lameter, linux-mm@kvack.org
Subject: Re: [PATCH] percpu: Reduce the number of cpu distance comparisions
Message-ID: <20201030201946.GA1061822@google.com>
References: <20201030013820.29758-1-vvghjk1234@gmail.com>
In-Reply-To: <20201030013820.29758-1-vvghjk1234@gmail.com>

Hello,

On Fri, Oct 30, 2020 at 10:38:20AM +0900, Wonhuyk Yang wrote:
> From: Wonhyuk Yang
>
> To build group_map[] and group_cnt[], we find out which group
> CPUs belong to by comparing the distance of the cpu. However,
> this includes cases where comparisons are not required.
>
> This patch uses a bitmap to record CPUs that is not classified in
> the group. CPUs that we know which group they belong to should be
> cleared from the bitmap. In result, we can reduce the number of
> unnecessary comparisons.
>
> Signed-off-by: Wonhyuk Yang
> ---
>  mm/percpu.c | 32 ++++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index 66a93f096394..d19ca484eee4 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -2669,6 +2669,7 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
>  {
>  	static int group_map[NR_CPUS] __initdata;
>  	static int group_cnt[NR_CPUS] __initdata;
> +	static struct cpumask mask __initdata;
>  	const size_t static_size = __per_cpu_end - __per_cpu_start;
>  	int nr_groups = 1, nr_units = 0;
>  	size_t size_sum, min_unit_size, alloc_size;
> @@ -2702,24 +2703,27 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
>  		upa--;
>  	max_upa = upa;
>
> +	cpumask_copy(&mask, cpu_possible_mask);
> +
>  	/* group cpus according to their proximity */
> -	for_each_possible_cpu(cpu) {
> -		group = 0;
> -	next_group:
> -		for_each_possible_cpu(tcpu) {
> -			if (cpu == tcpu)
> -				break;
> -			if (group_map[tcpu] == group && cpu_distance_fn &&
> -			    (cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE ||
> -			     cpu_distance_fn(tcpu, cpu) > LOCAL_DISTANCE)) {
> -				group++;
> -				nr_groups = max(nr_groups, group + 1);
> -				goto next_group;
> -			}
> -		}
> +	for (group = 0; !cpumask_empty(&mask); group++) {
> +		/* pop the group's first cpu */
> +		cpu = cpumask_first(&mask);
>  		group_map[cpu] = group;
>  		group_cnt[group]++;
> +		cpumask_clear_cpu(cpu, &mask);
> +
> +		for_each_cpu(tcpu, &mask) {
> +			if (!cpu_distance_fn ||
> +			    (cpu_distance_fn(cpu, tcpu) == LOCAL_DISTANCE &&
> +			     cpu_distance_fn(tcpu, cpu) == LOCAL_DISTANCE)) {
> +				group_map[tcpu] = group;
> +				group_cnt[group]++;
> +				cpumask_clear_cpu(tcpu, &mask);
> +			}
> +		}
>  	}
> +	nr_groups = group;
>
>  	/*
>  	 * Wasted space is caused by a ratio imbalance of upa to group_cnt.
> --
> 2.17.1
>

Sorry for the delay. It's been a little bit of a busy week for me and it
always takes me a moment to wrap my head around this code.

I've applied this to percpu#for-5.11.

Thanks,
Dennis
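
For anyone reading the thread later, the grouping loop in the quoted patch
can be tried out in userspace with something like the sketch below. It is
not the kernel code: a plain 64-bit word stands in for struct cpumask, and
every name prefixed with sketch_ or SKETCH_ (including the toy two-node
distance function) is an assumption made up for illustration. It also marks
CPUs outside the mask with -1, which the kernel function does not do.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS        8   /* hypothetical CPU count; must be < 64 */
#define SKETCH_LOCAL_DISTANCE 10  /* stand-in for the kernel's LOCAL_DISTANCE */

/* Hypothetical callback type playing the role of cpu_distance_fn. */
typedef int (*sketch_distance_fn)(unsigned int from, unsigned int to);

/* Toy topology (assumption): CPUs 0-3 share one node, CPUs 4-7 another. */
static int sketch_toy_distance(unsigned int from, unsigned int to)
{
	return (from / 4 == to / 4) ? SKETCH_LOCAL_DISTANCE : 20;
}

/*
 * Assign every CPU set in `possible` to a group and return the number of
 * groups. group_map[] and group_cnt[] must each hold SKETCH_NR_CPUS entries.
 */
static int sketch_group_cpus(uint64_t possible, sketch_distance_fn dist,
			     int group_map[], int group_cnt[])
{
	/* bits still set in mask are CPUs that have not been classified yet */
	uint64_t mask = possible & ((UINT64_C(1) << SKETCH_NR_CPUS) - 1);
	unsigned int cpu, tcpu;
	int group;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		group_map[cpu] = -1;	/* -1: CPU is not in the possible mask */
		group_cnt[cpu] = 0;
	}

	for (group = 0; mask != 0; group++) {
		/* pop the lowest-numbered unclassified CPU; it seeds the group */
		for (cpu = 0; !(mask & (UINT64_C(1) << cpu)); cpu++)
			;
		group_map[cpu] = group;
		group_cnt[group]++;
		mask &= ~(UINT64_C(1) << cpu);

		/* compare the seed only against CPUs that are still unclassified */
		for (tcpu = cpu + 1; tcpu < SKETCH_NR_CPUS; tcpu++) {
			uint64_t bit = UINT64_C(1) << tcpu;

			if (!(mask & bit))
				continue;
			if (!dist ||
			    (dist(cpu, tcpu) == SKETCH_LOCAL_DISTANCE &&
			     dist(tcpu, cpu) == SKETCH_LOCAL_DISTANCE)) {
				group_map[tcpu] = group;
				group_cnt[group]++;
				mask &= ~bit;
			}
		}
	}

	return group;
}
```

Calling sketch_group_cpus(0xFF, sketch_toy_distance, map, cnt) with two
8-entry int arrays splits the toy topology into two groups of four. Passing
NULL as the distance callback puts every CPU into group 0, matching the
!cpu_distance_fn branch in the patch.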