From: "Huang, Ying" <ying.huang@intel.com>
To: haoxin
Cc: Andrew Morton, Zi Yan, Yang Shi, Baolin Wang, Oscar Salvador,
 Matthew Wilcox, Bharata B Rao, Alistair Popple, Minchan Kim,
 Mike Kravetz, Hyeonggon Yoo <42.hyeyoo@gmail.com>
Subject: Re: [PATCH -v4 0/9] migrate_pages(): batch TLB flushing
References: <20230206063313.635011-1-ying.huang@intel.com>
Date: Wed, 08 Feb 2023 19:25:02 +0800
In-Reply-To: (haoxin's message of "Wed, 8 Feb 2023 14:21:48 +0800")
Message-ID: <87cz6kh01d.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

haoxin writes:

> On my arm64 server with 128 cores and 2 NUMA nodes,
> I used memhog as the benchmark:
>
>   numactl -m -C 5 memhog -r100000 1G
>
> The test results are as below.
>
> With this patch:
>
>   # time migratepages 8490 0 1
>
>   real    0m1.161s
>   user    0m0.000s
>   sys     0m1.161s
>
> Without this patch:
>
>   # time migratepages 8460 0 1
>
>   real    0m2.068s
>   user    0m0.001s
>   sys     0m2.068s
>
> So you can see a migration performance improvement of about *+78%*.
>
> This is the perf record info.
>
> w/o this patch:
>
> +  51.07%   0.09%  migratepages  [kernel.kallsyms]  [k] migrate_folio_extra
> +  42.43%   0.04%  migratepages  [kernel.kallsyms]  [k] folio_copy
> +  42.34%  42.34%  migratepages  [kernel.kallsyms]  [k] __pi_copy_page
> +  33.99%   0.09%  migratepages  [kernel.kallsyms]  [k] rmap_walk_anon
> +  32.35%   0.04%  migratepages  [kernel.kallsyms]  [k] try_to_migrate
> *+ 27.78%  27.78%  migratepages  [kernel.kallsyms]  [k] ptep_clear_flush*
> +   8.19%   6.64%  migratepages  [kernel.kallsyms]  [k] folio_migrate_flags
>
> w/ this patch:
>
> +  18.57%   0.13%  migratepages  [kernel.kallsyms]  [k] migrate_pages
> +  18.23%   0.07%  migratepages  [kernel.kallsyms]  [k] migrate_pages_batch
> +  16.29%   0.13%  migratepages  [kernel.kallsyms]  [k] migrate_folio_move
> +  12.73%   0.10%  migratepages  [kernel.kallsyms]  [k] move_to_new_folio
> +  12.52%   0.06%  migratepages  [kernel.kallsyms]  [k] migrate_folio_extra
>
> Therefore, this patch helps improve page migration performance.
>
> So, you can add Tested-by: Xin Hao

Thank you very much!  (For illustration, a minimal user-space sketch of
the batched flow described in the quoted cover letter is appended after
the changelog at the end of this message.)

Best Regards,
Huang, Ying

> On 2023/2/6 2:33 PM, Huang Ying wrote:
>> From: "Huang, Ying" <ying.huang@intel.com>
>>
>> Now, migrate_pages() migrates folios one by one, like the fake code
>> below:
>>
>>   for each folio
>>     unmap
>>     flush TLB
>>     copy
>>     restore map
>>
>> If multiple folios are passed to migrate_pages(), there are
>> opportunities to batch the TLB flushing and copying.  That is, we can
>> change the code to something as follows:
>>
>>   for each folio
>>     unmap
>>   for each folio
>>     flush TLB
>>   for each folio
>>     copy
>>   for each folio
>>     restore map
>>
>> The total number of TLB flushing IPIs can be reduced considerably, and
>> we may use some hardware accelerator such as DSA to accelerate the
>> folio copying.
>>
>> So in this patchset, we refactor the migrate_pages() implementation
>> and implement batched TLB flushing.  Based on this, hardware-
>> accelerated folio copying can be implemented.
>>
>> If too many folios are passed to migrate_pages(), the naive batched
>> implementation may unmap too many folios at the same time.  The
>> possibility for a task to wait for the migrated folios to be mapped
>> again then increases, so latency may be hurt.  To deal with this
>> issue, the maximum number of folios unmapped in a batch is restricted
>> to no more than HPAGE_PMD_NR pages.  That is, the influence is at the
>> same level as THP migration.
>>
>> We use the following test to measure the performance impact of the
>> patchset.
>>
>> On a 2-socket Intel server:
>>
>> - Run the pmbench memory-accessing benchmark.
>>
>> - Run `migratepages` to migrate pages of pmbench between node 0 and
>>   node 1 back and forth.
>>
>> With the patch, the number of TLB flushing IPIs is reduced by 99.1%
>> during the test, and the number of pages migrated successfully per
>> second increases by 291.7%.
>>
>> This patchset is based on v6.2-rc4.
>>
>> Changes:
>>
>> v4:
>>
>> - Fixed another bug about non-LRU folio migration.  Thanks Hyeonggon!
>>
>> v3:
>>
>> - Rebased on v6.2-rc4
>>
>> - Fixed a bug about non-LRU folio migration.  Thanks Mike!
>>
>> - Fixed some comments.  Thanks Baolin!
>>
>> - Collected reviewed-by.
>>
>> v2:
>>
>> - Rebased on v6.2-rc3
>>
>> - Fixed a type force-cast warning.  Thanks Kees!
>>
>> - Added more comments and cleaned up the code.  Thanks Andrew, Zi,
>>   Alistair, Dan!
>>
>> - Collected reviewed-by.
>>
>> From RFC to v1:
>>
>> - Rebased on v6.2-rc1
>>
>> - Fixed the deadlock issue caused by locking multiple pages
>>   synchronously, per Alistair's comments.  Thanks!
>>
>> - Fixed the autonumabench panic, per Rao's comments and fix.  Thanks!
>>
>> - Other minor fixes per comments.  Thanks!
>>
>> Best Regards,
>> Huang, Ying
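
For illustration, here is a minimal user-space sketch of the two flows
from the quoted cover letter.  It is a toy model under assumed names:
struct folio, unmap_folio(), remap_folio(), copy_folio() and flush_tlb()
are hypothetical stand-ins, not the real mm/migrate.c API, and a plain
counter stands in for the TLB-flush IPIs that the series batches.

/* batched_migrate_sketch.c - toy model of one-by-one vs. batched
 * migration.  All names here are illustrative placeholders, not the
 * kernel's mm/migrate.c interfaces.
 * Build: cc -std=c99 -o sketch batched_migrate_sketch.c
 */
#include <stdio.h>
#include <string.h>

#define NR_FOLIOS   8
#define FOLIO_BYTES 64                   /* toy payload, not a real page */

struct folio {
        char data[FOLIO_BYTES];
        int  mapped;                     /* 1 while mappings exist */
};

static int tlb_flushes;                  /* stands in for TLB-flush IPIs */

static void unmap_folio(struct folio *f) { f->mapped = 0; }
static void remap_folio(struct folio *f) { f->mapped = 1; }
static void flush_tlb(void)              { tlb_flushes++; }

static void copy_folio(struct folio *dst, const struct folio *src)
{
        memcpy(dst->data, src->data, FOLIO_BYTES);
}

/* One-by-one flow: each folio pays for its own TLB flush. */
static void migrate_one_by_one(struct folio *src, struct folio *dst, int n)
{
        for (int i = 0; i < n; i++) {
                unmap_folio(&src[i]);
                flush_tlb();             /* one flush (IPI) per folio */
                copy_folio(&dst[i], &src[i]);
                remap_folio(&dst[i]);
        }
}

/*
 * Batched flow: all unmaps share a single deferred flush.  A real
 * implementation would cap the batch at HPAGE_PMD_NR pages, as the
 * cover letter describes.
 */
static void migrate_batched(struct folio *src, struct folio *dst, int n)
{
        for (int i = 0; i < n; i++)
                unmap_folio(&src[i]);
        flush_tlb();                     /* one flush covers the batch */
        for (int i = 0; i < n; i++)
                copy_folio(&dst[i], &src[i]);
        for (int i = 0; i < n; i++)
                remap_folio(&dst[i]);
}

int main(void)
{
        struct folio src[NR_FOLIOS] = {0}, dst[NR_FOLIOS] = {0};

        migrate_one_by_one(src, dst, NR_FOLIOS);
        printf("one-by-one: %d flushes for %d folios\n",
               tlb_flushes, NR_FOLIOS);

        tlb_flushes = 0;
        migrate_batched(src, dst, NR_FOLIOS);
        printf("batched:    %d flushes for %d folios\n",
               tlb_flushes, NR_FOLIOS);
        return 0;
}

Run as-is, it prints 8 flushes for the one-by-one flow and 1 for the
batched flow; that restructuring is the effect the cover letter measures
as a 99.1% IPI reduction under pmbench.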