From: "Huang, Ying"
To: Baolin Wang
Cc: , , , , , , , , Mel Gorman
Subject: Re: [RFC PATCH] mm: Promote slow memory in advance to improve performance
Date: Tue, 23 Nov 2021 10:53:12 +0800
In-Reply-To: (Baolin Wang's message of "Mon, 22 Nov 2021 18:22:17 +0800")
Message-ID: <87ilwjbn1j.fsf@yhuang6-desk2.ccr.corp.intel.com>
Sender: owner-linux-mm@kvack.org

Baolin Wang writes:

> Some workloads access a set of data entities will follow the data locality,
> also known as locality of reference, which means the probability of accessing
> some data soon after some nearby data has been accessed.
>
> On some systems with different memory types, which will rely on the numa
> balancing to promote slow hot memory to fast memory to improve performance.
> So we can promote several sequential pages on slow memory at one time
> according to the data locality for some workloads to improve the performance.
>
> Testing with mysql can show about 5% performance improved as below.
>
> Machine: 16 CPUs, 64G DRAM, 256G AEP
>
> sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua
> --mysql-user=root --mysql-password=root --oltp-test-mode=complex
> --oltp-tables-count=65 --oltp-table-size=5000000 --threads=20 --time=600
> --report-interval=10
>
> No proactive promotion:
> transactions
> 2259245 (3765.37 per sec.)
> 2312605 (3854.31 per sec.)
> 2325907 (3876.47 per sec.)
>
> Proactive promotion bytes=16384:
> transactions
> 2419023 (4031.66 per sec.)
> 2451903 (4086.47 per sec.)
> 2441941 (4068.68 per sec.)

This is a kind of readahead: it promotes pages before we know they are
hot.  It can definitely benefit performance if we predict correctly, but
may hurt if we predict wrongly.  Is it possible to add a self-adaptive
algorithm, like the one in readahead, to adjust the fault-around window
dynamically?  A system-level knob may not be sufficient to fit all
workloads running on the system.

Best Regards,
Huang, Ying

> Suggested-by: Xunlei Pang
> Signed-off-by: Baolin Wang
> ---
> Note: This patch is based on "NUMA balancing: optimize memory placement
> for memory tiering system" [1] from Huang Ying.
>
> [1] https://lore.kernel.org/lkml/87bl2gsnrd.fsf@yhuang6-desk2.ccr.corp.intel.com/T/

[snip]
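
To make the self-adaptive idea above a bit more concrete, here is a
rough userspace sketch of one possible feedback rule.  It is not taken
from the patch or from the kernel readahead code.  struct promo_window,
promo_window_update() and the 1/2 and 1/4 thresholds are all
hypothetical names and values chosen for illustration.  The assumption
is that we can count how many of the proactively promoted pages were
actually accessed afterwards.

```c
#include <assert.h>

/*
 * Hypothetical per-task (or per-VMA) state.  Nothing here exists in
 * the patch; it only illustrates the feedback loop.
 */
struct promo_window {
	unsigned int nr_pages;	/* pages promoted ahead per hint fault */
	unsigned int min_pages;	/* lower bound for the window */
	unsigned int max_pages;	/* upper bound for the window */
};

/*
 * Adjust the window from the outcome of the last round of proactive
 * promotion:
 *   promoted - pages promoted ahead of the faulting page
 *   used     - how many of those were accessed afterwards
 *
 * Assumed policy: double the window when at least half of the
 * promoted pages were used, halve it when fewer than a quarter were
 * used, and leave it unchanged otherwise.
 */
static void promo_window_update(struct promo_window *w,
				unsigned int promoted, unsigned int used)
{
	assert(w->min_pages <= w->max_pages);

	if (!promoted)
		return;

	if (used * 2 >= promoted) {
		/* Prediction was mostly right: grow, capped at max. */
		w->nr_pages *= 2;
		if (w->nr_pages > w->max_pages)
			w->nr_pages = w->max_pages;
	} else if (used * 4 < promoted) {
		/* Prediction was mostly wrong: shrink, floored at min. */
		w->nr_pages /= 2;
		if (w->nr_pages < w->min_pages)
			w->nr_pages = w->min_pages;
	}
}
```

With a starting window of 4 pages (which would match bytes=16384 if
pages are 4 KiB, again an assumption), a round where all 4 promoted
pages are used grows the window to 8, and a following round where only
1 of 8 is used shrinks it back to 4.  Keeping the state per task or per
VMA rather than system wide is what would let different workloads
settle on different windows.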