Message-ID: <24a985cf-1bbf-41f9-b234-24f000164fa6@linux.alibaba.com>
Date: Thu, 20 Jun 2024 14:07:45 +0800
Subject: Re: [linus:master] [mm] d2136d749d: vm-scalability.throughput -7.1% regression
To: kernel test robot
Cc: oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org,
 Andrew Morton, "Huang, Ying", David Hildenbrand, John Hubbard,
 Kefeng Wang, Mel Gorman, Ryan Roberts, linux-mm@kvack.org,
 feng.tang@intel.com, fengwei.yin@intel.com
References: <202406201010.a1344783-oliver.sang@intel.com>
From: Baolin Wang
In-Reply-To: <202406201010.a1344783-oliver.sang@intel.com>

On 2024/6/20 10:39, kernel test robot wrote:
>
> Hello,
>
> kernel test robot noticed a -7.1% regression of vm-scalability.throughput on:
>
>
> commit: d2136d749d76af980b3accd72704eea4eab625bd ("mm: support multi-size THP numa balancing")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [still regression on linus/master 92e5605a199efbaee59fb19e15d6cc2103a04ec2]
>
>
> testcase: vm-scalability
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> parameters:
>
>     runtime: 300s
>     size: 512G
>     test: anon-cow-rand-hugetlb
>     cpufreq_governor: performance

Thanks for the report. IIUC, NUMA balancing does not scan hugetlb VMAs, so
I'm not sure how this patch could affect hugetlb CoW performance. Let me
try to reproduce it.

> If you fix the issue in a separate patch/commit (i.e.
not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot
> | Closes: https://lore.kernel.org/oe-lkp/202406201010.a1344783-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240620/202406201010.a1344783-oliver.sang@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>   gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/512G/lkp-icl-2sp2/anon-cow-rand-hugetlb/vm-scalability
>
> commit:
>   6b0ed7b3c7 ("mm: factor out the numa mapping rebuilding into a new helper")
>   d2136d749d ("mm: support multi-size THP numa balancing")
>
> 6b0ed7b3c77547d2 d2136d749d76af980b3accd7270
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>      12.02          -1.3      10.72 ±  4%  mpstat.cpu.all.sys%
>    1228757         +3.0%    1265679        proc-vmstat.pgfault
>    7392513         -7.1%    6865649        vm-scalability.throughput
>      17356         +9.4%      18986        vm-scalability.time.user_time
>       0.32 ± 22%  -36.9%       0.20 ± 17%  sched_debug.cfs_rq:/.h_nr_running.stddev
>      28657 ± 86%  -90.8%       2640 ± 19%  sched_debug.cfs_rq:/.load.stddev
>       0.28 ± 35%  -52.1%       0.13 ± 29%  sched_debug.cfs_rq:/.nr_running.stddev
>     299.88 ± 27%  -39.6%     181.04 ± 23%  sched_debug.cfs_rq:/.runnable_avg.stddev
>     284.88 ± 32%  -44.0%     159.65 ± 27%  sched_debug.cfs_rq:/.util_avg.stddev
>       0.32 ± 22%  -37.2%       0.20 ± 17%  sched_debug.cpu.nr_running.stddev
>  1.584e+10 ±  2%   -6.9%  1.476e+10 ±  3%  perf-stat.i.branch-instructions
>   11673151 ±  3%   -6.3%   10935072 ±  4%  perf-stat.i.branch-misses
>       4.90         +3.5%       5.07        perf-stat.i.cpi
>     333.40         +7.5%     358.32        perf-stat.i.cycles-between-cache-misses
>  6.787e+10 ±  2%   -6.8%  6.324e+10 ±  3%  perf-stat.i.instructions
>       0.25         -6.2%       0.24        perf-stat.i.ipc
>       4.19         +7.5%       4.51        perf-stat.overall.cpi
>     323.02         +7.4%     346.94        perf-stat.overall.cycles-between-cache-misses
>       0.24         -7.0%       0.22        perf-stat.overall.ipc
>  1.549e+10 ±  2%   -6.8%  1.444e+10 ±  3%  perf-stat.ps.branch-instructions
>  6.634e+10 ±  2%   -6.7%  6.186e+10 ±  3%  perf-stat.ps.instructions
>      17.33 ± 77%   -10.6       6.72 ±169%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
>      17.30 ± 77%   -10.6       6.71 ±169%  perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
>      17.30 ± 77%   -10.6       6.71 ±169%  perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
>      17.28 ± 77%   -10.6       6.70 ±169%  perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
>      17.27 ± 77%   -10.6       6.70 ±169%  perf-profile.calltrace.cycles-pp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
>      13.65 ± 76%    -8.4       5.29 ±168%  perf-profile.calltrace.cycles-pp.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
>      13.37 ± 76%    -8.2       5.18 ±168%  perf-profile.calltrace.cycles-pp.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault
>      13.35 ± 76%    -8.2       5.18 ±168%  perf-profile.calltrace.cycles-pp.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault
>      13.23 ± 76%    -8.1       5.13 ±168%  perf-profile.calltrace.cycles-pp.copy_mc_enhanced_fast_string.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault
>       3.59 ± 78%    -2.2       1.39 ±169%  perf-profile.calltrace.cycles-pp.__mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
>      17.35 ± 77%   -10.6       6.73 ±169%  perf-profile.children.cycles-pp.asm_exc_page_fault
>      17.32 ± 77%   -10.6       6.72 ±168%  perf-profile.children.cycles-pp.do_user_addr_fault
>      17.32 ± 77%   -10.6       6.72 ±168%  perf-profile.children.cycles-pp.exc_page_fault
>      17.30 ± 77%   -10.6       6.71 ±168%  perf-profile.children.cycles-pp.handle_mm_fault
>      17.28 ± 77%   -10.6       6.70 ±169%  perf-profile.children.cycles-pp.hugetlb_fault
>      13.65 ± 76%    -8.4       5.29 ±168%  perf-profile.children.cycles-pp.hugetlb_wp
>      13.37 ± 76%    -8.2       5.18 ±168%  perf-profile.children.cycles-pp.copy_user_large_folio
>      13.35 ± 76%    -8.2       5.18 ±168%  perf-profile.children.cycles-pp.copy_subpage
>      13.34 ± 76%    -8.2       5.17 ±168%  perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
>       3.59 ± 78%    -2.2       1.39 ±169%  perf-profile.children.cycles-pp.__mutex_lock
>      13.24 ± 76%    -8.1       5.13 ±168%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
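
For anyone cross-checking the headline numbers in the quoted table, a small
sketch follows. It assumes that the %change column is the relative difference
between the two per-commit means; that is my reading of the report, not
something the robot states. pct_change is a hypothetical helper name and is
not part of the lkp tooling.

```python
# Sanity check for the %change column in the quoted report.
# Assumption: %change = (new - base) / base * 100, computed on the means.


def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (metric, value on 6b0ed7b3c7, value on d2136d749d), copied from the report
rows = [
    ("proc-vmstat.pgfault", 1228757, 1265679),
    ("vm-scalability.throughput", 7392513, 6865649),
    ("vm-scalability.time.user_time", 17356, 18986),
]

for name, base, new in rows:
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

With the values above this reproduces the +3.0%, -7.1% and +9.4% shown in the
report. Rows whose %change has no percent sign (for example
mpstat.cpu.all.sys%) appear to be absolute differences in percentage points,
so they are left out of the check.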