From: "Huang, Ying"
To: Bharata B Rao
Cc: Peter Zijlstra
Subject: Re: [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing
References: <20230208073533.715-1-bharata@amd.com> <87cz6eb7e6.fsf@yhuang6-desk2.ccr.corp.intel.com> <3c811078-c869-452a-8e2d-ebe720d21691@amd.com>
Date: Mon, 13 Feb 2023 11:34:19 +0800
In-Reply-To: <3c811078-c869-452a-8e2d-ebe720d21691@amd.com> (Bharata B. Rao's message of "Mon, 13 Feb 2023 08:53:55 +0530")
Message-ID: <874jrqb5ms.fsf@yhuang6-desk2.ccr.corp.intel.com>

Bharata B Rao writes:

> On 2/13/2023 8:26 AM, Huang, Ying wrote:
>> Bharata B Rao writes:
>>
>>> On 2/8/2023 11:33 PM, Peter Zijlstra wrote:
>>>> On Wed, Feb 08, 2023 at 01:05:28PM +0530, Bharata B Rao wrote:
>>>>
>>>>
>>>>> - Hardware provided access information could be very useful for driving
>>>>>   hot page promotion in tiered memory systems. Need to check if this
>>>>>   requires different tuning/heuristics apart from what NUMA balancing
>>>>>   already does.
>>>>
>>>> I think Huang Ying looked at that from the Intel POV and I think the
>>>> conclusion was that it doesn't really work out. What you need is
>>>> frequency information, but the PMU doesn't really give you that. You
>>>> need to process a *ton* of PMU data in-kernel.
>>>
>>> What I am doing here is to feed the access data into NUMA balancing which
>>> already has the logic to aggregate that at task and numa group level and
>>> decide if that access is actionable in terms of migrating the page. In this
>>> context, I am not sure about the frequency information that you and Dave
>>> are mentioning. AFAIU, existing NUMA balancing takes care of taking
>>> action, IBS becomes an alternative source of access information to NUMA
>>> hint faults.
>>
>> We do need frequency information to determine whether a page is hot
>> enough to be migrated to the fast memory (promotion). What PMU provided
>> is just "recently" accessed pages, not "frequently" accessed pages. For
>> current NUMA balancing implementation, please check
>> NUMA_BALANCING_MEMORY_TIERING in should_numa_migrate_memory(). In
>> general, it estimates the page access frequency via measuring the
>> latency between page table scanning and page fault, the shorter the
>> latency, the higher the frequency. This isn't perfect, but provides a
>> starting point. You need to consider how to get frequency information
>> via PMU. For example, you may count access number for each page, aging
>> them periodically, and get hot threshold via some statistics.
>
> For the tiered memory hot page promotion case of NUMA balancing, we will
> have to maintain frequency information in software when such information
> isn't available from the hardware.

Yes. It's challenging to calculate frequency information. Please
consider how to do that.

Best Regards,
Huang, Ying
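
To make the "count accesses per page, age them periodically, derive a hot
threshold from statistics" suggestion concrete, here is a minimal user-space
sketch in C. It is not kernel code and not what the patch set does. All names
(struct page_heat, heat_record(), heat_age(), heat_threshold(), heat_is_hot())
are hypothetical, and the fixed table size, the halving decay, and the
mean-based threshold are assumptions picked only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a tiny fixed table with one slot per tracked page. */
#define NR_PAGES 8

struct page_heat {
	uint32_t count[NR_PAGES];	/* decayed access count per page */
};

/* Record one sampled access (e.g. one PMU sample) that hit page 'pfn'. */
static void heat_record(struct page_heat *h, size_t pfn)
{
	assert(pfn < NR_PAGES);
	if (h->count[pfn] < UINT32_MAX)	/* saturate instead of wrapping */
		h->count[pfn]++;
}

/* Periodic aging: halve every counter so that old accesses fade out. */
static void heat_age(struct page_heat *h)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		h->count[i] >>= 1;
}

/*
 * Hot threshold from simple statistics. Assumption: the integer mean of
 * all counters; a real policy might use a percentile or a rate limit.
 */
static uint32_t heat_threshold(const struct page_heat *h)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < NR_PAGES; i++)
		sum += h->count[i];
	return (uint32_t)(sum / NR_PAGES);
}

/* A page is a promotion candidate if its count exceeds the threshold. */
static int heat_is_hot(const struct page_heat *h, size_t pfn)
{
	assert(pfn < NR_PAGES);
	return h->count[pfn] > heat_threshold(h);
}
```

In this sketch heat_record() would be called once per sample and heat_age()
from a periodic timer. With 16 samples on one page and 1 on another, the mean
threshold is 2, so only the first page counts as hot. A single aging pass
drops the counts to 8 and 0, which is how a page that stops being accessed
eventually falls below the threshold.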