From: "Huang, Ying"
To: Yu Zhao
Cc: Andrew Morton, Johannes Weiner, Mel Gorman, Michal Hocko,
	Andi Kleen, Aneesh Kumar, Barry Song <21cnbao@gmail.com>,
	Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe,
	Jesse Barnes, Jonathan Corbet, Linus Torvalds, Matthew Wilcox,
	Michael Larabel, Mike Rapoport, Rik van Riel, Vlastimil Babka,
	Will Deacon, Linux ARM, "open list:DOCUMENTATION", linux-kernel,
	Linux-MM, Kernel Page Reclaim v2, "the arch/x86 maintainers",
	Brian Geffon, Jan Alexander Steffens, Oleksandr Natalenko,
	Steven Barrett, Suleiman Souhlal, Daniel Byrne, Donald Carr,
	Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai, Sofia Trinh
Subject: Re: [PATCH v7 05/12] mm: multigenerational LRU: minimal implementation
References: <20220208081902.3550911-1-yuzhao@google.com>
	<20220208081902.3550911-6-yuzhao@google.com>
	<87bkyy56nv.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<87y2213wrl.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Thu, 24 Feb 2022 11:31:53 +0800
In-Reply-To: (Yu Zhao's message of "Wed, 23 Feb 2022 18:34:33 -0700")
Message-ID: <87h78p3pp2.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yu Zhao writes:

> On Wed, Feb 23, 2022 at 5:59 PM Huang, Ying wrote:
>>
>> Yu Zhao writes:
>>
>> > On Wed, Feb 23, 2022 at 1:28 AM Huang, Ying wrote:
>> >>
>> >> Hi, Yu,
>> >>
>> >> Yu Zhao writes:
>> >>
>> >> > To avoid confusions, the terms "promotion" and "demotion" will be
>> >> > applied to the multigenerational LRU, as a new convention; the terms
>> >> > "activation" and "deactivation" will be applied to the active/inactive
>> >> > LRU, as usual.
>> >>
>> >> In the memory tiering related commits and patchset, for example as follows,
>> >>
>> >> commit 668e4147d8850df32ca41e28f52c146025ca45c6
>> >> Author: Yang Shi
>> >> Date: Thu Sep 2 14:59:19 2021 -0700
>> >>
>> >>     mm/vmscan: add page demotion counter
>> >>
>> >> https://lore.kernel.org/linux-mm/20220221084529.1052339-1-ying.huang@intel.com/
>> >>
>> >> "demote" and "promote" is used for migrating pages between different
>> >> types of memory. Is it better for us to avoid overloading these words
>> >> too much to avoid the possible confusion?
>> >
>> > Given that LRU and migration are usually different contexts, I think
>> > we'd be fine, unless we want a third pair of terms.
>>
>> This is true before memory tiering is introduced. In systems with
>> multiple types memory (called memory tiering), LRU is used to identify
>> pages to be migrated to the slow memory node. Please take a look at
>> can_demote(), which is called in shrink_page_list().
>
> This sounds clearly two contexts to me. Promotion/demotion (move
> between generations) while pages are on LRU; or promotion/demotion
> (migration between nodes) after pages are taken off LRU.
>
> Note that promotion/demotion are not used in function names. They are
> used to describe how MGLRU works, in comparison with the
> active/inactive LRU. Memory tiering is not within this context.

Because we have already used pgdemote_* in /proc/vmstat and
"demotion_enabled" in /sys/kernel/mm/numa, and will use pgpromote_* in
/proc/vmstat, it seems better to avoid using promote/demote directly for
MGLRU in the ABI. A possible solution is to use "mglru" and
"promote/demote" together (such as "mglru_promote_*") when it is needed?

>> >> > +static int get_swappiness(struct mem_cgroup *memcg)
>> >> > +{
>> >> > +	return mem_cgroup_get_nr_swap_pages(memcg) >= MIN_LRU_BATCH ?
>> >> > +	       mem_cgroup_swappiness(memcg) : 0;
>> >> > +}
>> >>
>> >> After we introduced demotion support in Linux kernel. The anonymous
>> >> pages in the fast memory node could be demoted to the slow memory node
>> >> via the page reclaiming mechanism as in the following commit. Can you
>> >> consider that too?
>> >
>> > Sure. How do I check whether there is still space on the slow node?
>>
>> You can always check the watermark of the slow node. But now, we
>> actually don't check that (as in demote_page_list()), instead we will
>> wake up kswapd of the slow node. The intended behavior is something
>> like,
>>
>> DRAM -> PMEM -> disk
>
> I'll look into this later -- for now, it's a low priority because
> there isn't much demand. I'll bump it up if anybody is interested in
> giving it a try. Meanwhile, please feel free to cook up something if
> you are interested.

When we introduce a new feature, we shouldn't break an existing one.
That is, we shouldn't introduce a regression. I think that is a rule?

If my understanding is correct, MGLRU will skip scanning the anonymous
page list even if there's a demotion target for the node. This breaks
the demotion feature in the upstream kernel. Right?

Checking whether there is still space on the slow node is a new
feature. We can look at that later.

Best Regards,
Huang, Ying
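
To illustrate the concern above, here is a minimal userspace sketch, not
kernel code. It models the quoted get_swappiness() check next to a
hypothetical variant that keeps anonymous scanning enabled when a
demotion target exists. struct node_model, its fields, the value chosen
for MIN_LRU_BATCH, and get_swappiness_with_demotion() are all assumptions
made up for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value, for illustration only; the real constant lives in the patch. */
#define MIN_LRU_BATCH 64

/* Hypothetical stand-in for the per-node/memcg state the kernel consults. */
struct node_model {
	long nr_swap_pages;       /* free swap pages available */
	int swappiness;           /* configured swappiness, 0..100 */
	bool has_demotion_target; /* a slower memory node exists */
};

/* Same shape as the quoted check: no swap means no anon scanning. */
static int get_swappiness_quoted(const struct node_model *n)
{
	assert(n != NULL);
	return n->nr_swap_pages >= MIN_LRU_BATCH ? n->swappiness : 0;
}

/*
 * Hypothetical variant: anon pages stay reclaimable when they can be
 * demoted to a slower node, even without swap space.
 */
static int get_swappiness_with_demotion(const struct node_model *n)
{
	assert(n != NULL);
	if (n->has_demotion_target)
		return n->swappiness;
	return get_swappiness_quoted(n);
}
```

For a node with no swap, swappiness 60, and a demotion target, the first
function returns 0 while the variant returns 60. That gap is the
behavior difference being asked about; whether the slow node still has
space is deliberately left out of the sketch.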