Date: Sun, 23 Jan 2022 14:48:35 -0800 (PST)
From: David Rientjes <rientjes@google.com>
To: SeongJae Park, Johannes Weiner, Dave Hansen
Cc: Andrew Morton, Jonathan.Cameron@huawei.com, amit@kernel.org,
    benh@kernel.crashing.org, corbet@lwn.net, david@redhat.com,
    dwmw@amazon.com, elver@google.com, foersleo@amazon.de,
    gthelen@google.com, markubo@amazon.de, shakeelb@google.com,
    baolin.wang@linux.alibaba.com, guoqing.jiang@linux.dev,
    xhao@linux.alibaba.com, hanyihao@vivo.com, changbin.du@gmail.com,
    kuba@kernel.org, rongwei.wang@linux.alibaba.com,
    rikard.falkeborn@gmail.com, geert@linux-m68k.org, kilobyte@angband.pl,
    linux-damon@amazon.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PLAN] Some humble ideas for DAMON future works
In-Reply-To: <20220119133110.24901-1-sj@kernel.org>
Message-ID: <7afca3b5-626a-8356-aa73-b378f5aa7a3c@google.com>
References: <20220119133110.24901-1-sj@kernel.org>

On Wed, 19 Jan 2022, SeongJae Park wrote:

> User-space Policy or In-kernel Policy?  Both.
> =============================================
>
> When discussing kernel-involved system efficiency optimizations, I see
> two kinds of people who have slightly different opinions.  The first
> party prefers to implement only simple but efficient mechanisms in the
> kernel and export them to user space, so that users can build smart
> user space policies.  Meanwhile, the second party prefers that the
> kernel just works.  I agree with both parties.
>

Thanks for starting this discussion, SeongJae, and kicking it off with all
of your roadmap thoughts.  It's very helpful.
I would love for this to turn into an active discussion amongst those
people who are currently looking into using DAMON for their set of
interests, and also those who are investigating how its current set of
support can be adapted for their use cases.

For discussion on where the kernel/userspace boundary lies for policy
decisions, I think it depends heavily on (1) the specific subcomponent of
the mm subsystem being discussed (I don't think this boundary will be the
same for all areas, and it can/will evolve over time), and (2) the
difference between the base out-of-the-box behavior that Linux provides
for everybody and the elaborate support that some users need for
efficiency or performance.  This is going to be very different for things
like hugepage optimizations and memory compaction, for example.

> I think the first opinion makes sense, as there is some valuable
> information that only user space can know.  I think only such
> approaches can achieve the ultimate efficiency in those cases.
>
> I also agree with the second party, though, because there could be some
> people who don't have the special information that only their
> applications know, or the resources to do the additional work.
> In-kernel simple policies will still be beneficial for some users, even
> though they are sub-optimal compared to a highly tuned user space
> policy, as long as they provide some extent of efficiency gain and no
> regressions for most cases.
>
> I'd like to help both.  For that reason, I made DAMON an in-kernel
> mechanism for both user and kernel-space policies.  It provides a
> highly tunable general user space interface to help the first party.
> It also provides in-kernel policies built on top of DAMON using its
> kernel-space API for specific common use cases, with conservative
> default parameters that are assumed to incur no regression but some
> extent of benefit in most cases, namely DAMON-based proactive
> reclamation.  I will continue pursuing the two ways.
>

Are you referring only to root userspace here or are you including
non-root userspace?

Imagine a process that is willing to accept the cpu overhead of doing thp
collapse for portions of its memory in process context rather than waiting
for khugepaged, and that we had a mechanism (discussed later) for doing
that in the kernel.  The non-root user in this case would need the ability
to monitor regions of its own heap, for example, and disregard others.
The malloc implementation wants to answer the question "what regions of
my heap are accessed very frequently?" so that we can do hugepage
optimizations.

Do you see that the user will have the ability to fork off a DAMON context
to do this monitoring for their own heap?  kdamond could be attached to a
cpu cgroup to charge the cpu overhead of doing this monitoring, and the
time spent applying any actions to that memory, to that workload on a
multi-tenant machine.

I think it would be useful to discuss the role of non-root userspace for
future DAMON support.

> Imaginable DAMON-based Policies
> ===============================
>
> I'd like to start by listing some imaginable data access-aware
> operation policies that I hope will eventually be made.  The list will
> hopefully shed light on how DAMON should be evolved to efficiently
> support the policies.
>
> DAMON-based Proactive LRU-pages (de)Activation
> ----------------------------------------------
>
> The reclamation mechanism, which selects reclaim targets using the
> active/inactive LRU lists, sometimes doesn't work well.  According to
> my previous work, providing access pattern-based hints can
> significantly improve performance under memory pressure[1,2].
>
> Proactive reclamation is known to be useful for many memory intensive
> systems, and now we have a DAMON-based implementation of it[3].
> However, proactive reclamation wouldn't be so welcome on systems with a
> high cost of I/O.  Also, even when a system runs proactive reclamation,
> memory pressure can still occasionally be triggered.
>
> My idea for helping this situation is manipulating the order of pages
> in the LRU lists using DAMON-provided monitoring results.  That is,
> have DAMON proactively find hot/cold memory regions and move pages of
> the hot regions to the head of the active list, while moving pages of
> the cold regions to the tail of the inactive list.  This helps eventual
> reclamation under memory pressure evict cold pages first, and so incurs
> fewer additional page faults.
>

Let's add Johannes Weiner into this discussion as well, since we had
previously discussed persistent background ordering of the lru lists based
on hotness and coldness of memory.  That discussion happened before DAMON
was merged upstream; now that DAMON has landed, it is likely an area that
he's interested in.

One gotcha with the above might be the handling of MADV_FREE memory that
we want to lazily free under memory pressure.  Userspace has indicated
that we can free this memory whenever necessary, so the kernel
implementation moves this memory to the inactive lru regardless of any
hotness or coldness of the memory.
In other words, this memory *can* have very high access frequencies in the
short term and then be madvised with MADV_FREE by userspace to free if we
encounter memory pressure.  It seems like this needs to override the
DAMON-provided monitoring results, since userspace just knows better in
certain scenarios.

> [1] https://www.usenix.org/conference/hotstorage19/presentation/park
> [2] https://linuxplumbersconf.org/event/4/contributions/548/
> [3] https://docs.kernel.org/admin-guide/mm/damon/reclaim.html
>
> DAMON-based THP Coalesce/Split
> ------------------------------
>
> THP is known to significantly improve performance, but also to increase
> memory footprint[1].  We can minimize the memory overhead while
> preserving the performance benefit by asking DAMON to provide
> MADV_HUGEPAGE-like hints for hot memory regions of >= 2MiB size, and
> MADV_NOHUGEPAGE-like hints for cold memory regions.  Our experimental
> user space policy implementation[2] of this idea removes 76.15% of THP
> memory waste while preserving 51.25% of THP speedup in total.
>

This is a very interesting area to explore, and turns out to be very
timely as well.  We'll soon be proposing the MADV_COLLAPSE support that we
discussed here[1] and that was well received.

One thought here is that with DAMON we can create a scheme to apply a
DAMOS_COLLAPSE action on very hot memory in the monitoring region that
would simply call into the new MADV_COLLAPSE code, allowing us to do a
synchronous collapse in process context.  With the current DAMON support,
this seems very straight-forward once we have MADV_COLLAPSE.
[1] https://lore.kernel.org/all/d098c392-273a-36a4-1a29-59731cdf5d3d@google.com/

> [1] https://www.usenix.org/conference/osdi16/technical-sessions/presentation/kwon
> [2] https://damonitor.github.io/doc/html/v34/vm/damon/eval.html
>
> DAMON-based Tiered Memory (Pro|De)motion
> ----------------------------------------
>
> In tiered memory systems utilizing DRAM and PMEM[1], we can promote hot
> pages to DRAM and demote cold pages to PMEM using DAMON.  A patch
> allowing access-aware demotion user space policy development has
> already been submitted[2] by Baolin.
>

Thanks for this, it's very useful.  Is it possible to point to any data on
how responsive the promotion side can be to recent memory accesses?  It
seems like we'll need to promote that memory quite quickly to not suffer
long-lived performance degradations if we're treating DRAM and PMEM as
schedulable memory.

DAMON provides us with a framework so that we have complete control over
the efficiency of scanning PMEM for possible promotion candidates.  But
I'd be very interested in seeing any data from Baolin (or anybody else) on
just how responsive the promotion side can be.

> [1] https://www.intel.com/content/www/us/en/products/details/memory-storage/optane-memory.html
> [2] https://lore.kernel.org/linux-mm/cover.1640171137.git.baolin.wang@linux.alibaba.com/
>
> DAMON-based Proactive Compaction
> --------------------------------
>
> Compaction uses the migration scanner to find migration source pages.
> Hot pages are more likely to be unmovable than cold pages, so it would
> be better to try migrating cold pages first.  DAMON could be used here.
> That is, proactively monitor accesses via DAMON and start compaction so
> that the migration scanner scans cold memory ranges first.  I should
> admit I'm not familiar with the compaction code and I have no PoC data
> for this, just the groundless idea, though.
>

Is compaction enlightenment for DAMON a high priority at this point, or
would AutoNUMA be a more interesting candidate?

Today, AutoNUMA works with a sliding window, setting page tables to have
PROT_NONE permissions so that we induce a page fault and can determine
which cpu is accessing potentially remote memory (task_numa_work()).  If
that's happening, we can migrate the memory to the home NUMA node so that
we can avoid those remote memory accesses and the increased latency they
induce.

Idea: if we enlightened task_numa_work() to prioritize hot memory using
DAMON, it *seems* like this would be most effective, rather than relying
on a sliding window.  We want to migrate memory that is frequently being
accessed to reduce the remote memory access latency; we only get a minimal
improvement (mostly just node balancing) for memory that is rarely
accessed.

I'm somewhat surprised this isn't one of the highest priorities, actually,
for being enlightened with DAMON support, so it feels like I'm missing
something obvious.  Let's also add Dave Hansen into the thread for the
above two sections (memory tiering and AutoNUMA) because I know he's
thought about both.

> How We Can Implement These
> --------------------------
>
> Implementing most of the above mentioned policies wouldn't be too
> difficult because we have DAMON-based Operation Schemes (DAMOS).  That
> is, we will need to implement some more DAMOS actions, one for each
> policy.  Some existing kernel functions can be reused.  Such actions
> would include LRU (de)activation, THP coalesce/split hints, memory
> (pro|de)motion, and cold-pages-first scanning compaction.  Then,
> supporting those actions with the user space interface will allow
> implementing user space policies.  If we find reasonably good default
> DAMOS parameters and some kernel-side control mechanism, we can further
> make those into kernel policies in the form of, say, builtin modules.
>
> How DAMON Should Be Evolved For Supporting Those
> ================================================
>
> Let's discuss what kind of changes in DAMON will be needed to
> efficiently support the above mentioned policies.
>
> Simultaneously Monitoring Different Types of Address Spaces
> -----------------------------------------------------------
>
> It would be better to be able to run all the above mentioned policies
> simultaneously on a single system.  As some policies, such as LRU-pages
> (de)activation, would better run on the physical address space, while
> others, such as THP coalesce/split, would need to run on virtual
> address spaces, DAMON should support concurrently monitoring different
> address spaces.  We can always do this by creating one DAMON context
> for each address space and running them.  However, as the address
> spaces will conflict, the contexts will interfere with each other.  The
> current idea for avoiding this is allowing multiple DAMON contexts to
> run on a single thread, forcing them to have the same monitoring
> contexts.
>
> Online Parameter Updates
> ------------------------
>
> Someone might also want to dynamically turn on/off and/or tune each
> policy.  This is impossible with current DAMON, because it prohibits
> updating any parameter while it is running.  We disallow online
> parameter updates mainly because we want to avoid doing additional
> synchronization between the running kdamond and the parameters updater.
> The idea for supporting the use case while avoiding the additional
> synchronization is allowing users to pause DAMON and update parameters
> while it is paused.
>
> A Better DAMON interface
> ------------------------
>
> DAMON currently exposes its major functionality to user space via
> debugfs.  After all, DAMON is not only for debugging.
> Also, this makes the interface depend on debugfs unnecessarily, and it
> is considered unreliable.  Also, the interface is quite inflexible for
> future extension.  I admit it was not a good choice.
>
> It would be better to implement another reliable and easily extensible
> interface, and deprecate the debugfs interface.  The idea is exposing
> the interface via sysfs using hierarchical kobjects under mm_kobject.
> For example, the usage would be something like below:
>
>     # cd /sys/kernel/mm/damon
>     # echo 1 > nr_kdamonds
>     # echo 1 > kdamond_1/contexts/nr_contexts
>     # echo va > kdamond_1/contexts/context_1/target_type
>     # echo 1 > kdamond_1/contexts/context_1/targets/nr_targets
>     # echo $(pidof <workload>) > \
>           kdamond_1/contexts/context_1/targets/target_1/pid
>     # echo Y > monitor_on
>
> The underlying files hierarchy could be something like below.
>
> /sys/kernel/mm/damon/
> │ monitor_on
> │ kdamonds/
> │ │ nr_kdamonds
> │ │ kdamond_1/
> │ │ │ kdamond_pid
> │ │ │ contexts/
> │ │ │ │ nr_contexts
> │ │ │ │ context_1/
> │ │ │ │ │ target_type (va | pa)
> │ │ │ │ │ attrs/
> │ │ │ │ │ │ intervals/sampling,aggr,update
> │ │ │ │ │ │ nr_regions/min,max
> │ │ │ │ │ targets/
> │ │ │ │ │ │ nr_targets
> │ │ │ │ │ │ target_1/
> │ │ │ │ │ │ │ pid
> │ │ │ │ │ │ │ init_regions/
> │ │ │ │ │ │ │ │ region1/
> │ │ │ │ │ │ │ │ │ start,end
> │ │ │ │ │ │ │ │ ...
> │ │ │ │ │ │ ...
> │ │ │ │ │ schemes/
> │ │ │ │ │ │ nr_schemes
> │ │ │ │ │ │ scheme_1/
> │ │ │ │ │ │ │ action
> │ │ │ │ │ │ │ target_access_pattern/
> │ │ │ │ │ │ │ │ sz/min,max
> │ │ │ │ │ │ │ │ nr_accesses/min,max
> │ │ │ │ │ │ │ │ age/min,max
> │ │ │ │ │ │ │ quotas/
> │ │ │ │ │ │ │ │ ms,bytes,reset_interval
> │ │ │ │ │ │ │ │ prioritization_weights/
> │ │ │ │ │ │ │ │ │ sz,nr_accesses,age
> │ │ │ │ │ │ │ watermarks/
> │ │ │ │ │ │ │ │ metric,check_interval,high,mid,low
> │ │ │ │ │ │ │ stats/
> │ │ │ │ │ │ │ │ quota_exceeds
> │ │ │ │ │ │ │ │ tried/nr,sz
> │ │ │ │ │ │ │ │ applied/nr,sz
> │ │ │ │ │ │ │ ...
> │ │ │ │ ...
> │ │ ...
>
> More DAMON Future Works
> =======================
>
> In addition to the above mentioned things, there are many works to do.
> It would be better to extend DAMON with more use cases and address
> space support, including page granularity, idleness only, read/write
> only, page cache only, and cgroups monitoring support.
>

Cgroup support is very interesting so that we do not need to constantly
maintain a list of target_ids when a job forks new processes.  We've
discussed the potential for passing a cgroup inode as the target rather
than a pid for virtual address monitoring, which would operate over the
set of processes attached to that cgroup hierarchy.  Is this what you
imagine for cgroup support, or something more elaborate (or something
different entirely :)?

> Also it would be valuable to improve the accuracy of monitoring, using
> some adaptive monitoring attribute tuning or some new fancy idea[1].
>
> DAMOS could also be improved by utilizing its own autotuning feature,
> for example by monitoring PSI and other metrics related to the given
> action.
>
> [1] https://linuxplumbersconf.org/event/11/contributions/984/

I'd like to add another topic here: DAMON-based monitoring for virtualized
workloads.  Today, it seems like you'd need to run DAMON in the guest to
be able to describe its working set.  Monitoring the hypervisor process is
inadequate because it will reveal the first access to guest-owned memory
but not the accesses done by the guest itself.  So it seems like the
*current* support for virtual address monitoring is insufficient unless
the guest is enlightened to do DAMON monitoring itself.

What about unenlightened guests?  One idea is a third DAMON monitoring
mode that monitors accesses in the EPT.
Have you thought about this before, or about other ways to monitor memory
accesses for an *unenlightened* guest?  Would love to have a discussion on
this.