From: "Huang, Ying"
To: Baolin Wang
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Rik van Riel,
    Mel Gorman, Peter Zijlstra, Dave Hansen, Yang Shi, Zi Yan, Wei Xu,
    osalvador, Shakeel Butt, "Zhong Jiang"
Subject: Re: [PATCH -V3 0/3] memory tiering: hot page selection
References: <20220614081635.194014-1-ying.huang@intel.com>
    <872bdaee-21a0-005b-b66c-893eb331e39a@linux.alibaba.com>
Date: Mon, 20 Jun 2022 11:24:17 +0800
In-Reply-To: <872bdaee-21a0-005b-b66c-893eb331e39a@linux.alibaba.com>
    (Baolin Wang's message of "Mon, 20 Jun 2022 11:19:23 +0800")
Message-ID: <87czf4rp9a.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

Baolin Wang writes:

> On 6/14/2022 4:16 PM, Huang Ying wrote:
>> To optimize page placement in a memory tiering system with NUMA
>> balancing, the hot pages in the slow memory nodes need to be
>> identified. Essentially, the original NUMA balancing implementation
>> selects the most recently accessed (MRU) pages to promote. But this
>> isn't a perfect algorithm for identifying the hot pages, because
>> pages with quite low access frequency may still be accessed
>> eventually, given that the NUMA balancing page table scanning period
>> can be quite long (e.g. 60 seconds). So in this patchset, we
>> implement a new hot page identification algorithm based on the
>> latency between NUMA balancing page table scanning and the hint page
>> fault, which is a kind of most frequently accessed (MFU) algorithm.
>>
>> In NUMA balancing memory tiering mode, if there are hot pages in the
>> slow memory node and cold pages in the fast memory node, we need to
>> promote/demote hot/cold pages between the fast and slow memory nodes.
>>
>> One choice is to promote/demote as fast as possible. But the CPU
>> cycles and memory bandwidth consumed by high promoting/demoting
>> throughput will hurt the latency of some workloads because of
>> access inflation and slow memory bandwidth contention.
>>
>> A way to resolve this issue is to restrict the max promoting/demoting
>> throughput. It will take longer to finish the promoting/demoting,
>> but the workload latency will be better. This is implemented in this
>> patchset as the page promotion rate limit mechanism.
>>
>> The promotion hot threshold is workload and system configuration
>> dependent. So in this patchset, a method to adjust the hot threshold
>> automatically is implemented. The basic idea is to control the number
>> of the candidate promotion pages to match the promotion rate limit.
>>
>> We used the pmbench memory accessing benchmark to test the patchset
>> on a 2-socket server system with DRAM and PMEM installed. The test
>> results are as follows,
>>
>>                   pmbench score    promote rate
>>                    (accesses/s)            MB/s
>>                   -------------    ------------
>> base                146887704.1           725.6
>> hot selection       165695601.2           544.0
>> rate limit          162814569.8           165.2
>> auto adjustment     170495294.0           136.9
>>
>> From the results above,
>>
>> With hot page selection patch [1/3], the pmbench score increases
>> about 12.8%, and promote rate (overhead) decreases about 25.0%,
>> compared with base kernel.
>>
>> With rate limit patch [2/3], pmbench score decreases about 1.7%, and
>> promote rate decreases about 69.6%, compared with hot page selection
>> patch.
>>
>> With threshold auto adjustment patch [3/3], pmbench score increases
>> about 4.7%, and promote rate decreases about 17.1%, compared with
>> rate limit patch.
>
> I did a simple test with MySQL on my machine, which contains 1 DRAM
> node (30G) and 1 PMEM node (126G).
>
> sysbench /usr/share/sysbench/oltp_read_write.lua \
> ......
> --tables=200 \
> --table-size=1000000 \
> --report-interval=10 \
> --threads=16 \
> --time=120
>
> The data below shows that the tps improves by about 5%, and I think
> this is a good start to optimize the promotion. So for this series,
> please feel free to add:
>
> Reviewed-by: Baolin Wang
> Tested-by: Baolin Wang
>
> Without this patchset:
> transactions: 2080188 (3466.48 per sec.)
>
> With this patchset:
> transactions: 2174296 (3623.40 per sec.)

Thanks a lot!

Best Regards,
Huang, Ying