Date: Wed, 5 Feb 2025 11:05:29 -0500
From: Johannes Weiner
To: Bharata B Rao
Cc: Jonathan Cameron, Raghavendra K T, linux-mm@kvack.org,
	akpm@linux-foundation.org, lsf-pc@lists.linux-foundation.org,
	gourry@gourry.net, nehagholkar@meta.com, abhishekd@meta.com,
	ying.huang@linux.alibaba.com, nphamcs@gmail.com, feng.tang@intel.com,
	kbusch@meta.com, Hasan.Maruf@amd.com, sj@kernel.org, david@redhat.com,
	willy@infradead.org, k.shutemov@gmail.com, mgorman@techsingularity.net,
	vbabka@suse.cz, hughd@google.com, rientjes@google.com,
	shy828301@gmail.com, liam.howlett@oracle.com, peterz@infradead.org,
	mingo@redhat.com, nadav.amit@gmail.com, shivankg@amd.com, ziy@nvidia.com,
	jhubbard@nvidia.com, AneeshKumar.KizhakeVeetil@arm.com,
	linux-kernel@vger.kernel.org, jon.grimm@amd.com, santosh.shukla@amd.com,
	Michael.Day@amd.com, riel@surriel.com, weixugc@google.com,
	leesuyeon0506@gmail.com, honggyu.kim@sk.com, leillc@google.com,
	kmanaouil.dev@gmail.com, rppt@kernel.org, dave.hansen@intel.com
Subject: Re: [LSF/MM/BPF TOPIC] Unifying sources of page temperature information - what info is actually wanted?
Message-ID: <20250205160529.GB1183495@cmpxchg.org>
References: <20250123105721.424117-1-raghavendra.kt@amd.com> <20250131122803.000031aa@huawei.com> <20250131130901.00000dd1@huawei.com>

On Wed, Feb 05, 2025 at 11:54:05AM +0530, Bharata B Rao wrote:
> On 31-Jan-25 6:39 PM, Jonathan Cameron wrote:
> > On Fri, 31 Jan 2025 12:28:03 +0000
> > Jonathan Cameron wrote:
> >
> >>> Here is the list of potential discussion points:
> >> ...
> >>
> >>> 2. Possibility of maintaining single source of truth for page hotness that would
> >>> maintain hot page information from multiple sources and let other sub-systems
> >>> use that info.
> >> Hi,
> >>
> >> I was thinking of proposing a separate topic on a single source of hotness,
> >> but this question covers it so I'll add some thoughts here instead.
> >> I think we are very early, but sharing some experience and thoughts in a
> >> session may be useful.
> >
> > Thinking more on this over lunch, I think it is worth calling this out as a
> > potential session topic in its own right rather than trying to find
> > time within other sessions. Hence the title change.
> >
> > I think a session would start with a brief listing of the temperature sources
> > we have and those on the horizon to motivate what we are unifying, then
> > discussion to focus on need for such a unification + requirements
> > (maybe with a straw man).
>
> Here is a compilation of available temperature sources and how the
> hot/access data is consumed by different subsystems:

This is super useful, thanks for collecting this.

> PA-Physical address available
> VA-Virtual address available
> AA-Access time available
> NA-accessing Node info available
>
> I have left the slot blank for those which I am not sure about.
> ==================================================
> Temperature             PA      VA      AA      NA
> source
> ==================================================
> PROT_NONE faults        Y       Y       Y       Y
> --------------------------------------------------
> folio_mark_accessed()   Y               Y       Y
> --------------------------------------------------

For fma(), the VA info is available in unmap, but usually it isn't -
or doesn't meaningfully exist, as in the case of unmapped buffered IO.

I'd say it's an N.

> PTE A bit               Y       Y       N       N
> --------------------------------------------------
> Platform hints          Y       Y       Y       Y
> (AMD IBS)
> --------------------------------------------------
> Device hints            Y
> (CXL HMU)
> ==================================================

For the following table, it might be useful to add *when* the source
produces this information. Sampling frequency is a likely challenge:
consumers have different requirements, and overhead should be limited
to the minimum required to serve enabled consumers.

Here is an (incomplete) attempt - sorry about the long lines:

> And here is an attempt to compile how different subsystems
> use the above data:
> ==============================================================
> Source                  Subsystem       Consumption             Activation/Frequency
> ==============================================================
> PROT_NONE faults        NUMAB           NUMAB=1 locality based  While task is running,
> via process pgtable                     balancing               rate varies on observed
> walk                                    NUMAB=2 hot page        locality and sysctl knobs.
>                                         promotion
> ==============================================================
> folio_mark_accessed()   FS/filemap/GUP  LRU list activation     On cache access and unmap
> ==============================================================
> PTE A bit via           Reclaim:LRU     LRU list activation,    During memory pressure
> rmap walk                               deactivation/demotion
> ==============================================================
> PTE A bit via           Reclaim:MGLRU   LRU list activation,    - During memory pressure
> rmap walk and process                   deactivation/demotion   - Continuous sampling (configurable)
> pgtable walk                                                      for workingset reporting
> ==============================================================
> PTE A bit via           DAMON           LRU activation,         Continuous sampling (configurable)?
> rmap walk                               hot page promotion,     (I believe SJ is looking into
>                                         demotion etc            auto-tuning this).
> ==============================================================
> Platform hints          NUMAB           NUMAB=1 Locality based
> (AMD IBS)                               balancing and
>                                         NUMAB=2 hot page
>                                         promotion
> ==============================================================
> Device hints            NUMAB           NUMAB=2 hot page
>                                         promotion
> ==============================================================
> The last two are listed as possibilities.
>
> Feel free to correct/clarify and add more.
>
> Regards,
> Bharata.
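
To make the first table easier to reason about, here is a rough sketch
of how the per-source capabilities could be written down as data, so a
consumer can ask which sources cover the information it needs. This is
illustration only, not a proposed kernel interface: the names Info,
SOURCES and sources_providing() are made up, blank slots in the table
are treated as "not known to be available", and folio_mark_accessed()
is given VA = N as suggested above.

```python
from enum import Flag, auto


class Info(Flag):
    """What a temperature source can report about an access (hypothetical)."""
    PA = auto()  # physical address available
    VA = auto()  # virtual address available
    AA = auto()  # access time available
    NA = auto()  # accessing node info available


# Capabilities transcribed from the first table. Blank slots are
# treated as "not known to be available" (an assumption), and
# folio_mark_accessed() gets VA = N per the comment in the reply.
SOURCES = {
    "PROT_NONE faults": Info.PA | Info.VA | Info.AA | Info.NA,
    "folio_mark_accessed()": Info.PA | Info.AA | Info.NA,
    "PTE A bit": Info.PA | Info.VA,
    "Platform hints (AMD IBS)": Info.PA | Info.VA | Info.AA | Info.NA,
    "Device hints (CXL HMU)": Info.PA,
}


def sources_providing(required):
    """Return the sources whose known capabilities cover `required`."""
    return [name for name, caps in SOURCES.items()
            if (caps & required) == required]
```

For example, sources_providing(Info.PA | Info.NA) lists the sources a
consumer could use if it needs both the physical address and the
accessing node. Under the assumptions above, that excludes the PTE A
bit and the device hints.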