From: "Huang, Ying"
To: Huan Yang
Cc: Michal Hocko, Tejun Heo, Zefan Li, Johannes Weiner, "Jonathan Corbet",
    Roman Gushchin, "Shakeel Butt", Muchun Song, "Andrew Morton",
    David Hildenbrand, Matthew Wilcox, Kefeng Wang, Peter Xu,
    "Vishal Moola (Oracle)", Yosry Ahmed, "Liu Shixin", Hugh Dickins
Subject: Re: [RFC 0/4] Introduce unbalance proactive reclaim
In-Reply-To: (Huan Yang's message of "Fri, 10 Nov 2023 10:44:45 +0800")
References: <20231108065818.19932-1-link@vivo.com>
    <87msvniplj.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <1e699ff2-0841-490b-a8e7-bb87170d5604@vivo.com>
    <6b539e16-c835-49ff-9fae-a65960567657@vivo.com>
    <87a5rmiewp.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Fri, 10 Nov 2023 12:00:00 +0800
Message-ID: <878r76gsvz.fsf@yhuang6-desk2.ccr.corp.intel.com>

Huan Yang writes:

> On 2023/11/10 9:19, Huang, Ying wrote:
>> Huan Yang writes:
>>
>>> On 2023/11/9 18:39, Michal Hocko wrote:
>>>> On Thu 09-11-23 18:29:03, Huan Yang wrote:
>>>>> Hi Michal Hocko,
>>>>>
>>>>> Thanks for your suggestion.
>>>>>
>>>>> On 2023/11/9 17:57, Michal Hocko wrote:
>>>>>> On Thu 09-11-23 11:38:56, Huan Yang wrote:
>>>>>> [...]
>>>>>>>> If so, is it better only to reclaim private anonymous pages
>>>>>>>> explicitly?
>>>>>>> Yes, in practice, we only proactively compress anonymous pages
>>>>>>> and do not want to touch file pages.
>>>>>> If that is the case and this is mostly application centric (which you
>>>>>> seem to be suggesting) then why don't you use madvise(MADV_PAGEOUT)
>>>>>> instead.
>>>>> Madvise may not be applicable in this scenario (IMO).
>>>>>
>>>>> This feature is aimed at a core goal, which is to compress the
>>>>> anonymous pages of frozen applications.
>>>>>
>>>>> How to detect that an application is frozen and determine which
>>>>> pages can be safely reclaimed is the responsibility of the policy
>>>>> part.
>>>>>
>>>>> Setting madvise for an application is an active behavior, while the
>>>>> above policy is a passive approach. (If I misunderstood, please let
>>>>> me know if there is a better way to set madvise.)
>>>> You are proposing an extension to the pro-active reclaim interface so
>>>> this is an active behavior pretty much by definition. So I am really not
>>>> following you here.
>>>> Your agent can simply scan the address space of the
>>>> application it is going to "freeze" and call pidfd_madvise(MADV_PAGEOUT)
>>>> on the private memory if that is really what you want/need.
>>> There is a key point here. We want to use the grouping policy of memcg
>>> to perform proactive reclamation with certain tendencies. Your
>>> suggestion is to reclaim memory by scanning the task process space.
>>> However, in the mobile field, memory is usually viewed at the
>>> granularity of an APP.
>>>
>>> Therefore, after an APP is frozen, we hope to reclaim memory uniformly
>>> according to the pre-grouped APP processes.
>>>
>>> Of course, as you suggested, madvise can also achieve this, but
>>> implementing it in the agent may be more complex. (To achieve the same
>>> goal, isn't using memcg to group all the processes of an APP and
>>> perform proactive reclamation simpler than using madvise and scanning
>>> multiple processes of an application with an agent?)
>> I still think that it's not too complex to use process_madvise() to do
>> this.  For each process of the application, the agent can read
>> /proc/PID/maps to get all anonymous address ranges, then call
>> process_madvise(MADV_PAGEOUT) to reclaim pages.  This can even filter
>> out shared anonymous pages.  Does this work for you?
>
> Thanks for this suggestion. This way can avoid touching shared
> anonymous pages, which is pretty good. But I have some doubts about
> this: CPU resources are usually limited in embedded devices, and power
> consumption must also be taken into consideration.
>
> If this approach is adopted, the agent needs to periodically scan
> frozen applications and set pageout for the address space. Is this
> active, periodic operation more complex and less suitable for embedded
> devices than reclamation based on memcg grouping features?

In the memcg-based solution, when will you start the proactive reclaiming?
You can just replace the reclaiming part of the solution, switching from
memcg proactive reclaiming to process_madvise(MADV_PAGEOUT), because you
can get the PIDs in a memcg.  Is it possible?

> In addition, without LRU, it is difficult to control the reclamation
> of only partially cold anonymous page data of frozen applications. For
> example, if I only want to proactively reclaim 100MB of anonymous
> pages and issue the proactive reclamation interface, we can use the
> LRU feature to only reclaim 100MB of cold anonymous pages. However,
> this cannot be achieved through madvise. (If I have misunderstood
> something, please correct me.)

IIUC, it should be OK to reclaim all private anonymous pages of an
application in your specific use case?  If you really want to restrict
the number of pages reclaimed, that's possible too.  You can restrict the
size of the address range passed to process_madvise(MADV_PAGEOUT), and
check the RSS of the application.  The accuracy of the number reclaimed
isn't good.  But I think that it should be OK in practice?

BTW: how do you know the number of pages to be reclaimed proactively in
the memcg proactive reclaiming based solution?

--
Best Regards,
Huang, Ying
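
The process_madvise() approach described in this reply can be sketched in
Python as below.  This is only an illustration under stated assumptions,
not code from the thread: the helper names private_anon_ranges() and
cap_ranges() are hypothetical, and the rule used to recognise private
anonymous mappings (private permission bit, inode 0, and either no
pathname or a [heap], [stack] or [anon:...] pathname) is a simplifying
heuristic.  The sketch only selects and caps address ranges from
/proc/PID/maps text; it does not issue the process_madvise(MADV_PAGEOUT)
call itself.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start, end) virtual addresses, end exclusive


def private_anon_ranges(maps_text: str) -> List[Range]:
    """Return the private anonymous ranges listed in /proc/PID/maps text.

    Assumption (heuristic, not from the thread): a mapping counts as
    private anonymous if it is private ('p'), has inode 0, and has no
    pathname or a [heap], [stack] or [anon:...] pathname.
    """
    ranges = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # blank or malformed line
        addr, perms, _offset, _dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        if len(perms) < 4 or perms[3] != "p":
            continue  # shared mapping, e.g. shared anonymous memory
        if inode != "0":
            continue  # file-backed mapping
        named_anon = path in ("[heap]", "[stack]") or path.startswith("[anon:")
        if path and not named_anon:
            continue  # special mappings such as [vdso]
        start, end = (int(part, 16) for part in addr.split("-"))
        ranges.append((start, end))
    return ranges


def cap_ranges(ranges: Iterable[Range], budget: int) -> List[Range]:
    """Trim ranges so that their total length is at most `budget` bytes.

    `budget` should be a multiple of the page size so that every
    returned range stays page aligned.
    """
    capped = []
    for start, end in ranges:
        if budget <= 0:
            break
        length = min(end - start, budget)
        capped.append((start, start + length))
        budget -= length
    return capped
```

An agent could obtain the PIDs of a memcg from its cgroup.procs file,
read /proc/PID/maps for each one, and hand the capped ranges to
process_madvise() with MADV_PAGEOUT.  As noted above, the cap limits the
address range advised, not the exact number of pages reclaimed.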