Date: Sun, 23 Jan 2022 14:48:35 -0800 (PST)
From: David Rientjes <rientjes@google.com>
To: SeongJae Park, Johannes Weiner, Dave Hansen
Cc: Andrew Morton, Jonathan.Cameron@huawei.com, amit@kernel.org,
    benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com,
    dwmw@amazon.com, elver@google.com, foersleo@amazon.de,
    gthelen@google.com, markubo@amazon.de, shakeelb@google.com,
    baolin.wang@linux.alibaba.com, guoqing.jiang@linux.dev,
    xhao@linux.alibaba.com, hanyihao@vivo.com, changbin.du@gmail.com,
    kuba@kernel.org, rongwei.wang@linux.alibaba.com,
    rikard.falkeborn@gmail.com, geert@linux-m68k.org, kilobyte@angband.pl,
    linux-damon@amazon.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PLAN] Some humble ideas for DAMON future works
In-Reply-To: <20220119133110.24901-1-sj@kernel.org>
Message-ID: <7afca3b5-626a-8356-aa73-b378f5aa7a3c@google.com>
References: <20220119133110.24901-1-sj@kernel.org>

On Wed, 19 Jan 2022, SeongJae Park wrote:

> User-space Policy or In-kernel Policy?  Both.
> =============================================
>
> When discussing kernel-involved system efficiency optimizations, I see
> two kinds of people who have slightly different opinions.  The first
> party prefers to implement only simple but efficient mechanisms in the
> kernel and export them to user space, so that users can build smart
> user space policies.  Meanwhile, the second party prefers that the
> kernel just works.  I agree with both parties.
>

Thanks for starting this discussion, SeongJae, and kicking it off with all
of your roadmap thoughts.  It's very helpful.
I would love for this to turn into an active discussion amongst those
people who are currently looking into using DAMON for their set of
interests, and also those who are investigating how its current set of
support can be adapted for their use cases.

For discussion on where the kernel/userspace boundary lies for policy
decisions, I think it depends heavily on (1) the specific subcomponent of
the mm subsystem being discussed (I don't think this boundary will be the
same for all areas, and it can/will evolve over time), and (2) the
difference between the base out-of-the-box behavior that Linux provides
for everybody and the elaborate support that some users need for
efficiency or performance.  This is going to be very different for things
like hugepage optimizations and memory compaction, for example.

> I think the first opinion makes sense, as there is some valuable
> information that only user space can know.  I think only such
> approaches can achieve the ultimate efficiency in those cases.
>
> I also agree with the second party, though, because there could be some
> people who don't have the special information that only their
> applications know, or the resources to do the additional work.
> In-kernel simple policies will still be beneficial for some users, even
> though they are sub-optimal compared to a highly tuned user space
> policy, as long as they provide some extent of efficiency gain and no
> regressions for most cases.
>
> I'd like to help both.  For that reason, I made DAMON an in-kernel
> mechanism for both user and kernel-space policies.  It provides a
> highly tunable general user space interface to help the first party.
> It also provides in-kernel policies built on top of DAMON using its
> kernel-space API for specific common use cases, with conservative
> default parameters that are assumed to incur no regression but some
> extent of benefit in most cases, namely DAMON-based proactive
> reclamation.  I will continue pursuing the two ways.
>

Are you referring only to root userspace here or are you including
non-root userspace?

Imagine a process that is willing to accept the cpu overhead of doing thp
collapse for portions of its memory in process context rather than waiting
for khugepaged, and that we had a mechanism (discussed later) for doing
that in the kernel.  The non-root user in this case would need the ability
to monitor regions of its own heap, for example, and disregard others.
The malloc implementation wants to answer the question "what regions of
my heap are accessed very frequently?" so that we can do hugepage
optimizations.

Do you see that the user will have the ability to fork off a DAMON context
to do this monitoring for their own heap?  kdamond could be attached to a
cpu cgroup to charge the cpu overhead of doing this monitoring, and the
time spent applying any actions to that memory, to that workload on a
multi-tenant machine.

I think it would be useful to discuss the role of non-root userspace for
future DAMON support.

> Imaginable DAMON-based Policies
> ===============================
>
> I'd like to start by listing some imaginable data access-aware
> operation policies that I hope will eventually be made.  The list will
> hopefully shed light on how DAMON should be evolved to efficiently
> support the policies.
>
> DAMON-based Proactive LRU-pages (de)Activation
> ----------------------------------------------
>
> The reclamation mechanism, which selects reclaim targets using the
> active/inactive LRU lists, sometimes doesn't work well.  According to
> my previous work, providing access pattern-based hints can
> significantly improve performance under memory pressure[1,2].
>
> Proactive reclamation is known to be useful for many memory intensive
> systems, and now we have a DAMON-based implementation of it[3].
> However, proactive reclamation wouldn't be so welcome on systems with a
> high cost of I/O.  Also, even when a system runs proactive reclamation,
> memory pressure can still occasionally be triggered.
>
> My idea for helping this situation is manipulating the order of pages
> in the LRU lists using DAMON-provided monitoring results.  That is,
> have DAMON proactively find hot/cold memory regions and move pages of
> the hot regions to the head of the active list, while moving pages of
> the cold regions to the tail of the inactive list.  This helps eventual
> reclamation under memory pressure evict cold pages first, and so incurs
> fewer additional page faults.
>

Let's add Johannes Weiner into this discussion as well, since we had
previously discussed persistent background ordering of the lru lists based
on hotness and coldness of memory.  That discussion happened before DAMON
was merged upstream; now that DAMON has landed, it is likely an area that
he's interested in.

One gotcha with the above might be the handling of MADV_FREE memory that
we want to lazily free under memory pressure.  Userspace has indicated
that we can free this memory whenever necessary, so the kernel
implementation moves this memory to the inactive lru regardless of any
hotness or coldness of the memory.
In other words, this memory *can* have very high access frequencies in the
short term and then be madvised with MADV_FREE by userspace to free if we
encounter memory pressure.  It seems like this needs to override the
DAMON-provided monitoring results, since userspace just knows better in
certain scenarios.

> [1] https://www.usenix.org/conference/hotstorage19/presentation/park
> [2] https://linuxplumbersconf.org/event/4/contributions/548/
> [3] https://docs.kernel.org/admin-guide/mm/damon/reclaim.html
>
> DAMON-based THP Coalesce/Split
> ------------------------------
>
> THP is known to significantly improve performance, but also to increase
> memory footprint[1].  We can minimize the memory overhead while
> preserving the performance benefit by asking DAMON to provide
> MADV_HUGEPAGE-like hints for hot memory regions of >= 2MiB size, and
> MADV_NOHUGEPAGE-like hints for cold memory regions.  Our experimental
> user space policy implementation[2] of this idea removes 76.15% of THP
> memory waste while preserving 51.25% of THP speedup in total.
>

This is a very interesting area to explore, and turns out to be very
timely as well.  We'll soon be proposing the MADV_COLLAPSE support that we
discussed here[1] and that was well received.

One thought here is that with DAMON we can create a scheme to apply a
DAMOS_COLLAPSE action on very hot memory in the monitoring region that
would simply call into the new MADV_COLLAPSE code, allowing us to do a
synchronous collapse in process context.  With the current DAMON support,
this seems very straight-forward once we have MADV_COLLAPSE.
[1] https://lore.kernel.org/all/d098c392-273a-36a4-1a29-59731cdf5d3d@google.com/

> [1] https://www.usenix.org/conference/osdi16/technical-sessions/presentation/kwon
> [2] https://damonitor.github.io/doc/html/v34/vm/damon/eval.html
>
> DAMON-based Tiered Memory (Pro|De)motion
> ----------------------------------------
>
> In tiered memory systems utilizing DRAM and PMEM[1], we can promote hot
> pages to DRAM and demote cold pages to PMEM using DAMON.  A patch
> allowing access-aware demotion user space policy development has
> already been submitted[2] by Baolin.
>

Thanks for this, it's very useful.  Is it possible to point to any data on
how responsive the promotion side can be to recent memory accesses?  It
seems like we'll need to promote that memory quite quickly to not suffer
long-lived performance degradations if we're treating DRAM and PMEM as
schedulable memory.

DAMON provides us with a framework so that we have complete control over
the efficiency of scanning PMEM for possible promotion candidates.  But
I'd be very interested in seeing any data from Baolin (or anybody else) on
just how responsive the promotion side can be.

> [1] https://www.intel.com/content/www/us/en/products/details/memory-storage/optane-memory.html
> [2] https://lore.kernel.org/linux-mm/cover.1640171137.git.baolin.wang@linux.alibaba.com/
>
> DAMON-based Proactive Compaction
> --------------------------------
>
> Compaction uses the migration scanner to find migration source pages.
> Hot pages are more likely to be unmovable than cold pages, so it would
> be better to try migrating cold pages first.  DAMON could be used here.
> That is, proactively monitor accesses via DAMON and start compaction so
> that the migration scanner scans cold memory ranges first.  I should
> admit I'm not familiar with the compaction code and I have no PoC data
> for this, just the groundless idea, though.
>

Is compaction enlightenment for DAMON a high priority at this point, or
would AutoNUMA be a more interesting candidate?

Today, AutoNUMA works with a sliding window, setting page tables to have
PROT_NONE permissions so that we induce a page fault and can determine
which cpu is accessing potentially remote memory (task_numa_work()).  If
that's happening, we can migrate the memory to the home NUMA node so that
we can avoid those remote memory accesses and the increased latency they
induce.

Idea: if we enlightened task_numa_work() to prioritize hot memory using
DAMON, it *seems* like this would be most effective, rather than relying
on a sliding window.  We want to migrate memory that is frequently being
accessed to reduce the remote memory access latency; we only get a minimal
improvement (mostly just node balancing) for memory that is rarely
accessed.

I'm somewhat surprised this isn't one of the highest priorities, actually,
for being enlightened with DAMON support, so it feels like I'm missing
something obvious.  Let's also add Dave Hansen into the thread for the
above two sections (memory tiering and AutoNUMA) because I know he's
thought about both.

> How We Can Implement These
> --------------------------
>
> Implementing most of the above mentioned policies wouldn't be too
> difficult because we have DAMON-based Operation Schemes (DAMOS).  That
> is, we will need to implement some more DAMOS actions, one for each
> policy.  Some existing kernel functions can be reused.  Such actions
> would include LRU (de)activation, THP coalesce/split hints, memory
> (pro|de)motion, and cold-pages-first scanning compaction.  Then,
> supporting those actions with the user space interface will allow
> implementing user space policies.  If we find reasonably good default
> DAMOS parameters and some kernel-side control mechanism, we can further
> make those into kernel policies in the form of, say, builtin modules.
>
> How DAMON Should Be Evolved For Supporting Those
> ================================================
>
> Let's discuss what kind of changes in DAMON will be needed to
> efficiently support the above mentioned policies.
>
> Simultaneously Monitoring Different Types of Address Spaces
> -----------------------------------------------------------
>
> It would be better to be able to run all the above mentioned policies
> simultaneously on a single system.  As some policies, such as LRU-pages
> (de)activation, would better run on the physical address space, while
> others, such as THP coalesce/split, would need to run on virtual
> address spaces, DAMON should support concurrently monitoring different
> address spaces.  We can always do this by creating one DAMON context
> for each address space and running them.  However, as the address
> spaces will conflict, the contexts will interfere with each other.  The
> current idea for avoiding this is allowing multiple DAMON contexts to
> run on a single thread, forcing them to have the same monitoring
> contexts.
>
> Online Parameter Updates
> ------------------------
>
> Someone might also want to dynamically turn on/off and/or tune each
> policy.  This is impossible with current DAMON, because it prohibits
> updating any parameter while it is running.  We disallow online
> parameter updates mainly because we want to avoid doing additional
> synchronization between the running kdamond and the parameters updater.
> The idea for supporting the use case while avoiding the additional
> synchronization is allowing users to pause DAMON and update parameters
> while it is paused.
>
> A Better DAMON interface
> ------------------------
>
> DAMON currently exposes its major functionality to user space via
> debugfs.  After all, DAMON is not only for debugging.
> Also, this makes the interface depend on debugfs unnecessarily, and it
> is considered unreliable.  Also, the interface is quite inflexible for
> future extension.  I admit it was not a good choice.
>
> It would be better to implement another reliable and easily extensible
> interface, and deprecate the debugfs interface.  The idea is exposing
> the interface via sysfs using hierarchical kobjects under mm_kobject.
> For example, the usage would be something like below:
>
>     # cd /sys/kernel/mm/damon
>     # echo 1 > nr_kdamonds
>     # echo 1 > kdamond_1/contexts/nr_contexts
>     # echo va > kdamond_1/contexts/context_1/target_type
>     # echo 1 > kdamond_1/contexts/context_1/targets/nr_targets
>     # echo $(pidof <workload>) > \
>           kdamond_1/contexts/context_1/targets/target_1/pid
>     # echo Y > monitor_on
>
> The underlying files hierarchy could be something like below.
>
> /sys/kernel/mm/damon/
> │ monitor_on
> │ kdamonds/
> │ │ nr_kdamonds
> │ │ kdamond_1/
> │ │ │ kdamond_pid
> │ │ │ contexts/
> │ │ │ │ nr_contexts
> │ │ │ │ context_1/
> │ │ │ │ │ target_type (va | pa)
> │ │ │ │ │ attrs/
> │ │ │ │ │ │ intervals/sampling,aggr,update
> │ │ │ │ │ │ nr_regions/min,max
> │ │ │ │ │ targets/
> │ │ │ │ │ │ nr_targets
> │ │ │ │ │ │ target_1/
> │ │ │ │ │ │ │ pid
> │ │ │ │ │ │ │ init_regions/
> │ │ │ │ │ │ │ │ region1/
> │ │ │ │ │ │ │ │ │ start,end
> │ │ │ │ │ │ │ │ ...
> │ │ │ │ │ │ ...
> │ │ │ │ │ schemes/
> │ │ │ │ │ │ nr_schemes
> │ │ │ │ │ │ scheme_1/
> │ │ │ │ │ │ │ action
> │ │ │ │ │ │ │ target_access_pattern/
> │ │ │ │ │ │ │ │ sz/min,max
> │ │ │ │ │ │ │ │ nr_accesses/min,max
> │ │ │ │ │ │ │ │ age/min,max
> │ │ │ │ │ │ │ quotas/
> │ │ │ │ │ │ │ │ ms,bytes,reset_interval
> │ │ │ │ │ │ │ │ prioritization_weights/
> │ │ │ │ │ │ │ │ │ sz,nr_accesses,age
> │ │ │ │ │ │ │ watermarks/
> │ │ │ │ │ │ │ │ metric,check_interval,high,mid,low
> │ │ │ │ │ │ │ stats/
> │ │ │ │ │ │ │ │ quota_exceeds
> │ │ │ │ │ │ │ │ tried/nr,sz
> │ │ │ │ │ │ │ │ applied/nr,sz
> │ │ │ │ │ │ │ ...
> │ │ │ │ ...
> │ │ ...
>
> More DAMON Future Works
> =======================
>
> In addition to the above mentioned things, there are many works to do.
> It would be better to extend DAMON with more use cases and address
> space support, including page granularity, idleness only, read/write
> only, page cache only, and cgroups monitoring support.
>

Cgroup support is very interesting so that we do not need to constantly
maintain a list of target_ids when a job forks new processes.  We've
discussed the potential for passing a cgroup inode as the target rather
than a pid for virtual address monitoring, which would operate over the
set of processes attached to that cgroup hierarchy.  Is this what you
imagine for cgroup support, or something more elaborate (or something
different entirely :)?

> Also it would be valuable to improve the accuracy of monitoring, using
> some adaptive monitoring attribute tuning or some new fancy idea[1].
>
> DAMOS could also be improved by utilizing its own autotuning feature,
> for example by monitoring PSI and other metrics related to the given
> action.
>
> [1] https://linuxplumbersconf.org/event/11/contributions/984/

I'd like to add another topic here: DAMON-based monitoring for virtualized
workloads.  Today, it seems like you'd need to run DAMON in the guest to
be able to describe its working set.  Monitoring the hypervisor process is
inadequate because it will reveal the first access to guest-owned memory
but not the accesses done by the guest itself.  So it seems like the
*current* support for virtual address monitoring is insufficient unless
the guest is enlightened to do DAMON monitoring itself.

What about unenlightened guests?  One idea is a third DAMON monitoring
mode that monitors accesses in the EPT.
Have you thought about this before, or about other ways to monitor memory
accesses for an *unenlightened* guest?  Would love to have a discussion on
this.