From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building
 node fallback list
To: "Ramakrishnan, Krupa" <Krupa.Ramakrishnan@amd.com>,
 "Rao, Bharata Bhasker" <bharata@amd.com>,
 "linux-mm@kvack.org" <linux-mm@kvack.org>,
 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
 "kamezawa.hiroyu@jp.fujitsu.com" <kamezawa.hiroyu@jp.fujitsu.com>,
 "lee.schermerhorn@hp.com" <lee.schermerhorn@hp.com>,
 "mgorman@suse.de" <mgorman@suse.de>,
 "Srinivasan, Sadagopan" <Sadagopan.Srinivasan@amd.com>
References: <20210830121603.1081-1-bharata@amd.com>
 <20210830121603.1081-3-bharata@amd.com>
 <13dab5ac-03a3-e9b3-ff12-f819f7711569@arm.com>
 <SN6PR12MB2765859076BFE5B667A0C4719BCC9@SN6PR12MB2765.namprd12.prod.outlook.com>
From: Anshuman Khandual <anshuman.khandual@arm.com>
Message-ID: <a051f54f-7bec-ab7b-cfac-d427b2e0e4bb@arm.com>
Date: Fri, 3 Sep 2021 09:31:39 +0530
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <SN6PR12MB2765859076BFE5B667A0C4719BCC9@SN6PR12MB2765.namprd12.prod.outlook.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>


On 8/31/21 8:56 PM, Ramakrishnan, Krupa wrote:
> The bandwidth is limited by underutilization of the cross-socket links, not by latency. Hotspotting on one node does not engage all hardware resources under our routing protocol, which results in lower bandwidth. Distributing allocations equally across nodes 0 and 1 yields the best results, as it stresses the full system capabilities.

Makes sense. Nonetheless, this patch clearly solves a problem.
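
For a rough sense of scale, the Case 1 to Case 2 bandwidth ratio can be worked out from the STREAM figures in the quoted patch description further down. This is only a small arithmetic sketch: the numbers are copied verbatim from that table, while the dictionary and function names are made up for illustration.

```python
# STREAM bandwidth in MB/s, copied from the quoted table: (Case 1, Case 2).
BANDWIDTH = {
    "COPY": (57479.6, 110791.8),
    "SCALE": (55372.9, 105685.9),
    "ADD": (50460.6, 96734.2),
    "TRIADD": (50397.6, 97119.1),
}


def case1_over_case2(table):
    """Return the Case 1 / Case 2 bandwidth ratio for each STREAM test."""
    return {test: case1 / case2 for test, (case1, case2) in table.items()}


for test, ratio in case1_over_case2(BANDWIDTH).items():
    print(f"{test:>6}: Case 1 reaches {ratio:.0%} of Case 2")
```

Every test comes out at roughly 52%, i.e. Case 1 gets only about half the remote bandwidth of Case 2.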

> 
> Thanks
> Krupa Ramakrishnan
> 
> -----Original Message-----
> From: Anshuman Khandual <anshuman.khandual@arm.com> 
> Sent: 31 August, 2021 4:58
> To: Rao, Bharata Bhasker <bharata@amd.com>; linux-mm@kvack.org; linux-kernel@vger.kernel.org
> Cc: akpm@linux-foundation.org; kamezawa.hiroyu@jp.fujitsu.com; lee.schermerhorn@hp.com; mgorman@suse.de; Ramakrishnan, Krupa <Krupa.Ramakrishnan@amd.com>; Srinivasan, Sadagopan <Sadagopan.Srinivasan@amd.com>
> Subject: Re: [FIX PATCH 2/2] mm/page_alloc: Use accumulated load when building node fallback list
> 
> On 8/30/21 5:46 PM, Bharata B Rao wrote:
>> As an example, consider a 4 node system with the following distance 
>> matrix.
>>
>> Node 0  1  2  3
>> ----------------
>> 0    10 12 32 32
>> 1    12 10 32 32
>> 2    32 32 10 12
>> 3    32 32 12 10
>>
>> For this case, the node fallback list gets built like this:
>>
>> Node  Fallback list
>> ---------------------
>> 0     0 1 2 3
>> 1     1 0 3 2
>> 2     2 3 0 1
>> 3     3 2 0 1 <-- Unexpected fallback order
>>
>> In the fallback list for nodes 2 and 3, the nodes 0 and 1 appear in 
>> the same order which results in more allocations getting satisfied 
>> from node 0 compared to node 1.
>>
>> The effect of this on remote memory bandwidth as seen by stream 
>> benchmark is shown below:
>>
>> Case 1: Bandwidth from cores on nodes 2 & 3 to memory on nodes 0 & 1
>>       (numactl -m 0,1 ./stream_lowOverhead ... --cores <from 2, 3>) 
>> Case 2: Bandwidth from cores on nodes 0 & 1 to memory on nodes 2 & 3
>>       (numactl -m 2,3 ./stream_lowOverhead ... --cores <from 0, 1>)
>>
>> ----------------------------------------
>>               BANDWIDTH (MB/s)
>>     TEST      Case 1          Case 2
>> ----------------------------------------
>>     COPY      57479.6         110791.8
>>    SCALE      55372.9         105685.9
>>      ADD      50460.6         96734.2
>>   TRIADD      50397.6         97119.1
>> ----------------------------------------
>>
>> The bandwidth drop in Case 1 occurs because most of the allocations 
>> get satisfied by node 0 as it appears first in the fallback order for 
>> both nodes 2 and 3.
> 
> I am wondering what causes this performance drop here. Would the memory access latency not be similar between {2, 3} ---> { 0 } and {2, 3} ---> { 1 }, given that both nodes {0, 1} are at the same distance from {2, 3}, i.e. 32 in the above distance matrix? Even if the preferred node changes from { 0 } to { 1 } for the accessing node { 3 }, it should not change the latency as such.
> 
> Or is the performance drop here caused by excessive allocation on node { 0 }, i.e. does it result from page allocation latency instead?
>
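
The fallback ordering in the quoted patch description can be reproduced with a small model. The sketch below is a simplified Python rendering of my reading of build_zonelists() and find_next_best_node() in mm/page_alloc.c, not the kernel code itself. Assumptions: every node has memory and CPUs, the number of nodes stands in for both MAX_NODE_LOAD and MAX_NUMNODES, and the `accumulate` flag is a hypothetical switch between assigning the load penalty and, as the subject line suggests, accumulating it.

```python
# Distance matrix from the quoted patch description.
DISTANCE = [
    [10, 12, 32, 32],
    [12, 10, 32, 32],
    [32, 32, 10, 12],
    [32, 32, 12, 10],
]

PENALTY_FOR_NODE_WITH_CPUS = 1  # assumed: every node has CPUs


def find_next_best_node(node, used, node_load, dist):
    """Pick the closest unused node; node_load only breaks distance ties."""
    if node not in used:  # the local node always comes first
        used.add(node)
        return node
    nr_nodes = len(dist)
    scale = nr_nodes * nr_nodes  # stands in for MAX_NODE_LOAD * MAX_NUMNODES
    best, min_val = None, None
    for n in range(nr_nodes):
        if n in used:
            continue
        val = dist[node][n]
        val += 1 if n < node else 0  # slight preference for the next node
        val += PENALTY_FOR_NODE_WITH_CPUS
        val = val * scale + node_load[n]  # load matters only among equals
        if min_val is None or val < min_val:
            best, min_val = n, val
    if best is not None:
        used.add(best)
    return best


def build_fallback_lists(dist, accumulate):
    """Build the fallback list of every node, sharing node_load across nodes."""
    nr_nodes = len(dist)
    node_load = [0] * nr_nodes
    lists = {}
    for local in range(nr_nodes):
        used, order = set(), []
        load, prev = nr_nodes, local
        while True:
            node = find_next_best_node(local, used, node_load, dist)
            if node is None:
                break
            # Penalise the first node of each new distance group.
            if dist[local][node] != dist[local][prev]:
                if accumulate:
                    node_load[node] += load  # accumulated load
                else:
                    node_load[node] = load  # earlier penalty is overwritten
            order.append(node)
            prev = node
            load -= 1
        lists[local] = order
    return lists


for accumulate in (False, True):
    print("accumulate =", accumulate)
    for node, order in build_fallback_lists(DISTANCE, accumulate).items():
        print(" ", node, ":", *order)
```

With accumulate=False the model reproduces the quoted table, including 3 2 0 1 for node 3: the penalty node 0 received while node 1's list was built is overwritten with a smaller value while node 2's list is built, so node 0 looks less loaded than node 1 again. With accumulate=True node 0 keeps its earlier penalty and node 3 gets 3 2 1 0, so nodes 2 and 3 no longer both fall back to node 0 first.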