From: "Huang, Ying"
To: Gregory Price
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    akpm@linux-foundation.org, david@redhat.com, nphamcs@gmail.com,
    nehagholkar@meta.com, abhishekd@meta.com, Johannes Weiner, Feng Tang
Subject: Re: [PATCH 0/3] mm,TPP: Enable promotion of unmapped pagecache
In-Reply-To: (Gregory Price's message of "Fri, 8 Nov 2024 13:00:56 -0500")
References: <20240803094715.23900-1-gourry@gourry.net>
    <875xrxhs5j.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <87ikvefswp.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <87jzdi782s.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Mon, 11 Nov 2024 09:35:09 +0800
Message-ID: <871pzi5z8y.fsf@yhuang6-desk2.ccr.corp.intel.com>

Gregory Price writes:

> On Tue, Nov 05, 2024 at 10:00:59AM +0800, Huang, Ying wrote:
>> Hi, Gregory,
>>
>> >>
>> >> Several years ago, we tried to use the access time tracking
>> >> mechanism of NUMA balancing to track the access time latency of
>> >> unmapped file cache folios.  The original implementation is as
>> >> follows,
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/vishal/tiering.git/commit/?h=tiering-0.8&id=5f2e64ce75c0322602c2ec8c70b64bb69b1f1329
>> >>
>> >> What do you think about this?
>> >>
>> >
>> > Coming back around to explore this topic a bit more, dug into this old
>> > patch and the LRU patch by Keith - I'm struggling to find a good option
>> > that doesn't over-complicate or propose something contentious.
>> >
>> >
>> > I did a browse through lore and did not see any discussion on this patch
>> > or on Keith's LRU patch, so I presume discussion on this happened largely
>> > off-list.  So if you have any context as to why this wasn't RFC'd officially
>> > I would like more information.
>>
>> Thanks for doing this.  There wasn't much discussion offline.  We just
>> don't have enough time to work on the solution.
>>
>
> Exploring and testing this a little further, I brought this up to current
> folio work in 6.9 and found this solution to be unstable as-is.
>
> After some work to fix lock/reference issues, Johannes pointed out that
> __filemap_get_folio can be called from an atomic context - which means it
> may not be safe to do migrations in this context.

Sorry, I don't understand this.  The above patch changes
filemap_get_pages() and grab_cache_page_write_begin(), not
__filemap_get_folio().

> We're back to looking at something like an LRU-esque system, but now we're
> thinking about isolating the folios in folio_mark_accessed into a task-local
> list, and then process the list on resume.

If necessary, we can use a similar method for the above solution too.
And we can first filter out folios that have been accessed only once,
using folio_mark_accessed().  That is, promote a page only as follows,

- record the folio access time in folio_mark_accessed() only

- when the folio is accessed again, and "access_time - record_time <
  threshold", promote the folio.

> Basically we're thinking
>
> 1) hook folio_mark_accessed and use PG_ACTIVE/PG_ACCESSED to determine whether
>    the page is a promotion candidate.
> 2) if it is, isolate it from the LRU - which is safe because folio_mark_accessed
>    already does this elsewhere, and place it onto current->promo_queue
> 3) set_notify_resume
> 4) add logic to resume_user_mode_work() to run through current->promo_queue and
>    either promote the pages accordingly, or do folio_putback_lru on failure.

Use a task_work?

> Going to RFC this up

--
Best Regards,
Huang, Ying
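
For illustration only, the two ideas in this thread (record the access
time in folio_mark_accessed() and promote on a re-access within a
threshold; defer the actual promotion to a task-local queue) could be
modeled roughly as in the userspace C sketch below.  This is not kernel
code and not taken from either patch.  struct fake_folio, struct
promo_queue, mark_accessed(), drain_promo_queue() and PROMO_THRESHOLD
are all hypothetical names, and "promotion" is just a flag flip standing
in for migration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a folio sitting on a slow memory tier. */
struct fake_folio {
	uint64_t record_time;		/* time of the last recorded access */
	bool has_record;		/* false until the first access */
	bool queued;			/* already on the promotion queue */
	bool promoted;			/* set once "migrated" to the fast tier */
	struct fake_folio *next;	/* link in the task-local queue */
};

/* Hypothetical stand-in for a per-task queue such as current->promo_queue. */
struct promo_queue {
	struct fake_folio *head;
};

/* Illustrative threshold, in the same arbitrary unit as the timestamps. */
#define PROMO_THRESHOLD 100

/*
 * Models the folio_mark_accessed() hook.  The first access only records a
 * timestamp.  A later access that arrives within the threshold queues the
 * folio for promotion.  Assumes "now" is monotonic.
 * Returns true if the folio was queued by this call.
 */
static bool mark_accessed(struct promo_queue *q, struct fake_folio *f,
			  uint64_t now)
{
	bool hot = f->has_record && now - f->record_time < PROMO_THRESHOLD;

	f->record_time = now;
	f->has_record = true;

	if (!hot || f->promoted || f->queued)
		return false;

	f->queued = true;
	f->next = q->head;
	q->head = f;
	return true;
}

/*
 * Models the deferred step (a task_work callback or the resume path).  It
 * runs outside the access path, which is where real code could afford to
 * migrate.  Returns the number of folios promoted.
 */
static size_t drain_promo_queue(struct promo_queue *q)
{
	size_t n = 0;

	while (q->head) {
		struct fake_folio *f = q->head;

		q->head = f->next;
		f->next = NULL;
		f->queued = false;
		f->promoted = true;	/* real code would migrate the folio here */
		n++;
	}
	return n;
}
```

The access path only touches a timestamp and a list link, so the part
that may need to sleep is confined to drain_promo_queue().  A folio
accessed once, or re-accessed after the threshold has passed, is never
queued.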