From: "Huang, Ying"
To: Bharata B Rao
Subject: Re: [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing
References: <20230208073533.715-1-bharata@amd.com>
	<878rh2b5zt.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<72b6ec8b-f141-3807-d7f2-f853b0f0b76c@amd.com>
	<87zg9i9iw2.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<1547d291-1512-faae-aba5-0f84c3502be4@amd.com>
Date: Fri, 17 Feb 2023 14:03:32 +0800
In-Reply-To: (Bharata B. Rao's message of "Thu, 16 Feb 2023 14:11:20 +0530")
Message-ID: <87zg9c7rrf.fsf@yhuang6-desk2.ccr.corp.intel.com>

Bharata B Rao writes:

> On 14-Feb-23 10:25 AM, Bharata B Rao wrote:
>> On 13-Feb-23 12:00 PM, Huang, Ying wrote:
>>>> I have a microbenchmark where two sets of threads bound to two
>>>> NUMA nodes access the two different halves of memory which is
>>>> initially allocated on the 1st node.
>>>>
>>>> On a two node Zen4 system, with 64 threads in each set accessing
>>>> 8G of memory each from the initial allocation of 16G, I see that
>>>> IBS driven NUMA balancing (i.e., this patchset) takes 50% less time
>>>> to complete a fixed number of memory accesses. This could well
>>>> be the best case and real workloads/benchmarks may not get this much
>>>> uplift, but it does show the potential gain to be had.
>>>
>>> Can you find a way to show the overhead of the original implementation
>>> and your method?  Then we can compare between them?  Because you think
>>> the improvement comes from the reduced overhead.
>>
>> Sure, will measure the overhead.
>
> I used ftrace function_graph tracer to measure the amount of time (in us)
> spent in fault handling and task_work handling in both the methods when
> the above mentioned benchmark was running.
>
>                         Default         IBS
> Fault handling          29879668.71     1226770.84
> Task work handling      24878.894       10635593.82
> Sched switch handling                   78159.846
>
> Total                   29904547.6      11940524.51

Thanks!  You have shown the large overhead difference between the
original method and your method.  Can you also show the number of pages
migrated?  I think the overhead per page would be a good overhead
indicator too.

Can the overhead reduction be translated into the performance improvement?
Per my understanding, the total overhead is small compared with total
run time.

Best Regards,
Huang, Ying

> In the default case, the fault handling duration is measured
> by tracing do_numa_page() and the task_work duration is tracked
> by task_numa_work().
>
> In the IBS case, the fault handling is tracked by the NMI handler
> ibs_overflow_handler(), the task_work is tracked by task_ibs_access_work()
> and sched switch time overhead is tracked by hw_access_sched_in(). Note
> that in IBS case, not much is done in NMI handler but bulk of the work
> (page migration etc) happens in task_work context unlike the default case.
>
> The breakup in numbers is given below:
>
> Default
> =======
>                         Duration        Min     Max       Avg
> do_numa_page            29879668.71     0.08    317.166   17.16
> task_numa_work          24878.894       0.2     3424.19   388.73
> Total                   29904547.6
>
> IBS
> ===
>                         Duration        Min     Max       Avg
> ibs_overflow_handler    1226770.84      0.15    104.918   1.26
> task_ibs_access_work    10635593.82     0.21    398.428   29.81
> hw_access_sched_in      78159.846       0.15    247.922   1.29
> Total                   11940524.51
>
> Regards,
> Bharata.
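
For reference, the totals quoted above and the overhead-per-page
indicator asked about in this reply could be computed along these lines.
This is only a sketch: the durations are copied from the tables in this
thread, while the helper names (total_us, overhead_per_page) and the
pages_migrated input are hypothetical, since no migrated-page counts
have been posted yet.

```python
# Durations in microseconds, copied from the ftrace tables quoted above.
DEFAULT_US = {
    "do_numa_page": 29879668.71,
    "task_numa_work": 24878.894,
}
IBS_US = {
    "ibs_overflow_handler": 1226770.84,
    "task_ibs_access_work": 10635593.82,
    "hw_access_sched_in": 78159.846,
}


def total_us(durations):
    """Sum the per-function durations (us) of one method."""
    return sum(durations.values())


def overhead_per_page(total, pages_migrated):
    """Overhead (us) per migrated page.

    pages_migrated is a hypothetical input: the thread does not
    report it, so it has to come from a separate measurement.
    """
    if pages_migrated <= 0:
        raise ValueError("pages_migrated must be positive")
    return total / pages_migrated


default_total = total_us(DEFAULT_US)
ibs_total = total_us(IBS_US)

# Fraction of the default method's measured overhead that IBS incurs.
ibs_to_default_ratio = ibs_total / default_total

if __name__ == "__main__":
    print(f"default total: {default_total:.2f} us")
    print(f"IBS total:     {ibs_total:.2f} us")
    print(f"IBS / default: {ibs_to_default_ratio:.3f}")
```

Once the number of migrated pages is known for each run, calling
overhead_per_page() with each total would give the per-page figures to
compare.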