From: "Prasad, Aravinda" <aravinda.prasad@intel.com>
To: SeongJae Park <sj@kernel.org>
Cc: "damon@lists.linux.dev" <damon@lists.linux.dev>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"s2322819@ed.ac.uk" <s2322819@ed.ac.uk>,
"Kumar, Sandeep4" <sandeep4.kumar@intel.com>,
"Huang, Ying" <ying.huang@intel.com>,
"Hansen, Dave" <dave.hansen@intel.com>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"Subramoney, Sreenivas" <sreenivas.subramoney@intel.com>,
"Kervinen, Antti" <antti.kervinen@intel.com>,
"Kanevskiy, Alexander" <alexander.kanevskiy@intel.com>
Subject: RE: [PATCH v2 0/3] mm/damon: Profiling enhancements for DAMON
Date: Tue, 19 Mar 2024 10:56:42 +0000
Message-ID: <MW5PR11MB5907F44D802C298E8182A1B6F22C2@MW5PR11MB5907.namprd11.prod.outlook.com>
In-Reply-To: <20240319052054.100167-1-sj@kernel.org>
> -----Original Message-----
> From: SeongJae Park <sj@kernel.org>
> Sent: Tuesday, March 19, 2024 10:51 AM
> To: Prasad, Aravinda <aravinda.prasad@intel.com>
> Cc: damon@lists.linux.dev; linux-mm@kvack.org; sj@kernel.org; linux-
> kernel@vger.kernel.org; s2322819@ed.ac.uk; Kumar, Sandeep4
> <sandeep4.kumar@intel.com>; Huang, Ying <ying.huang@intel.com>;
> Hansen, Dave <dave.hansen@intel.com>; Williams, Dan J
> <dan.j.williams@intel.com>; Subramoney, Sreenivas
> <sreenivas.subramoney@intel.com>; Kervinen, Antti
> <antti.kervinen@intel.com>; Kanevskiy, Alexander
> <alexander.kanevskiy@intel.com>
> Subject: Re: [PATCH v2 0/3] mm/damon: Profiling enhancements for DAMON
>
> Hi Aravinda,
>
>
> Thank you for posting this new revision!
>
> I remember telling you, in my reply to the previous revision of this
> patch[1], that I did not see any significant high-level problems, but I have
> a concern now. Sorry for not raising this earlier, but let me explain my
> humble concerns before it gets even later.
Sure, no problem. We can discuss. I will get back to you with a detailed note.
Regards,
Aravinda
>
> On Mon, 18 Mar 2024 18:58:45 +0530 Aravinda Prasad
> <aravinda.prasad@intel.com> wrote:
>
> > DAMON randomly samples one or more pages in every region and tracks
> > accesses to them using the ACCESSED bit in PTE (or PMD for 2MB pages).
> > When the region size is large (e.g., several GBs), which is common for
> > large footprint applications, detecting whether the region is accessed
> > or not completely depends on whether the pages that are actively
> > accessed in the region are picked during random sampling.
> > If such pages are not picked for sampling, DAMON fails to identify the
> > region as accessed. However, increasing the sampling rate or
> > increasing the number of regions increases CPU overheads of kdamond.
>
> DAMON uses sampling because it considers a region as accessed if a portion of
> the region that is big enough to be detected via sampling is accessed. If a
> region has some pages that are really accessed, but the proportion is too
> small to be found via sampling, I think DAMON could say the overall access to
> the region is only modest and could even be ignored. In my humble opinion,
> this fits the definition of a DAMON region: a memory address range that is
> constructed of pages having similar access frequency.
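
The sampling argument above can be put into numbers with a toy model. This is only a sketch under loud assumptions, not DAMON's actual implementation: one uniformly random page is checked per region per sampling interval, every hot page is touched within every sampling interval, and the picks are independent. The function name and the example hot fractions are made up for illustration; the 5ms and 200ms values are the intervals from the evaluation in the cover letter.

```python
def detection_probability(hot_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` uniformly random page picks is hot.

    Toy model only: assumes independent picks and that every hot page is
    touched within each sampling interval.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be within [0, 1]")
    return 1.0 - (1.0 - hot_fraction) ** samples


# 200 ms aggregation interval / 5 ms sampling interval = 40 checks per region.
SAMPLES_PER_AGGREGATION = 200 // 5

for hot_fraction in (0.1, 0.01, 0.001):
    p = detection_probability(hot_fraction, SAMPLES_PER_AGGREGATION)
    print(f"hot fraction {hot_fraction}: P(at least one hit) = {p:.3f}")
```

Under this model, the smaller the hot portion of a region, the less likely it is that any check within an aggregation interval lands on a hot page.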
>
> >
> > This patch proposes profiling different levels of the
> > application's page table tree to detect whether a region is
> > accessed or not. This patch set is based on the observation that, when
> > the accessed bit for a page is set, the accessed bits at the higher
> > levels of the page table tree (PMD/PUD/PGD) corresponding to the path
> > of the page table walk are also set. Hence, it is efficient to check
> > the accessed bits at the higher levels of the page table tree to
> > detect whether a region is accessed or not. For example, if the access
> > bit for a PUD entry is set, then one or more pages in the 1GB PUD
> > subtree is accessed as each PUD entry covers 1GB mapping. Hence,
> > instead of sampling thousands of 4K/2M pages to detect accesses in a
> > large region, sampling at the higher level of page table tree is faster and
> > efficient.
>
> For the above reason, I am concerned that this could bias DAMON's
> monitoring results toward reporting more accesses than really happen.
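
Both the proposal and the concern above follow from how much memory a single entry covers at each page table level. The sketch below assumes x86-64 with 4 KiB base pages and a region aligned to the entry size; the dictionary and function names are hypothetical and the 100 GiB region is only an example.

```python
# Bytes mapped by one entry at each level (x86-64, 4 KiB base pages).
LEVEL_COVERAGE = {
    "PTE": 4 << 10,    # 4 KiB page
    "PMD": 2 << 20,    # 2 MiB
    "PUD": 1 << 30,    # 1 GiB
    "PGD": 512 << 30,  # 512 GiB with 4-level paging (P4D entry with 5-level)
}


def entries_spanned(region_bytes: int, level: str) -> int:
    """Number of entries at `level` needed to cover an aligned region."""
    coverage = LEVEL_COVERAGE[level]
    return -(-region_bytes // coverage)  # ceiling division


region = 100 << 30  # example: a 100 GiB region
for level in LEVEL_COVERAGE:
    print(f"{level}: {entries_spanned(region, level)} entries")

# A single touched 4 KiB page sets the accessed bit on its whole walk path,
# so this is the smallest hot fraction that marks a 1 GiB PUD entry accessed.
min_hot_fraction_per_pud = LEVEL_COVERAGE["PTE"] / LEVEL_COVERAGE["PUD"]
print(min_hot_fraction_per_pud)
```

Checking the higher levels needs far fewer entries, but an entry reported as accessed says only that at least one page somewhere beneath it was touched.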
>
> >
> > This patch set is based on 6.8-rc5 kernel (commit: f48159f8,
> > mm-unstable tree)
> >
> > Changes since v1 [1]
> > ====================
> >
> > - Added support for 5-level page table tree
> > - Split the patch to mm infrastructure changes and DAMON enhancements
> > - Code changes as per comments on v1
> > - Added kerneldoc comments
> >
> > [1] https://lkml.org/lkml/2023/12/15/272
> >
> > Evaluation:
> >
> > - MASIM benchmark with 1GB, 10GB, 100GB footprint with 10% hot data
> > and 5TB with 10GB hot data.
> > - DAMON: 5ms sampling, 200ms aggregation interval. Rest all
> > parameters set to default value.
> > - DAMON+PTP: Page table profiling applied to DAMON with the above
> > parameters.
> >
> > Profiling efficiency in detecting hot data:
> >
> > Footprint    1GB     10GB    100GB   5TB
> > ---------------------------------------------
> > DAMON        >90%    <50%    ~0%     0%
> > DAMON+PTP    >90%    >90%    >90%    >90%
>
> The sampling interval is the time interval that is assumed to be large enough
> for the workload to make a meaningful amount of accesses within the interval.
> Hence, a meaningful sampling interval depends on the workload's
> characteristics and the system's memory bandwidth.
>
> Here, the size of the hot memory region is about 100MB, 1GB, 10GB, and
> 10GB for the four cases, respectively. And you set the sampling interval to
> 5ms. Let's assume the system can access, say, 50 GB per second, and hence it
> can access only up to 250 MB per 5ms. So, in the case of the 1GB
> footprint, the whole hot memory region could be accessed while DAMON is
> waiting for the next sampling interval. Hence, DAMON would be able to see most
> accesses via sampling. But for the 100GB footprint case, only 250MB / 10GB =
> about 2.5% of the hot memory region would be accessed within one
> sampling interval. DAMON cannot see all the accesses, and hence the
> precision could be low.
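
The arithmetic in the paragraph above can be reproduced with a short sketch. The 50 GB/s bandwidth is the assumed figure from the quoted text, not a measurement, and the helper name is hypothetical. Decimal units (1 GB = 10^9 bytes) are used to match the 250 MB figure.

```python
GB = 10**9
MB = 10**6


def max_touched_fraction(hot_bytes: int, bandwidth_bytes_per_s: int,
                         interval_ms: int) -> float:
    """Upper bound on the share of the hot region touchable in one interval."""
    touched_bytes = bandwidth_bytes_per_s * interval_ms / 1000
    return min(1.0, touched_bytes / hot_bytes)


BANDWIDTH = 50 * GB        # assumed: 50 GB per second
SAMPLING_INTERVAL_MS = 5   # sampling interval used in the evaluation

# Hot region size per footprint, as stated in the quoted text.
hot_regions = {"1GB": 100 * MB, "10GB": 1 * GB, "100GB": 10 * GB, "5TB": 10 * GB}

for footprint, hot_bytes in hot_regions.items():
    fraction = max_touched_fraction(hot_bytes, BANDWIDTH, SAMPLING_INTERVAL_MS)
    print(f"{footprint} footprint: at most {fraction:.1%} of hot data per interval")
```

This is an upper bound only: it assumes the workload spends the entire bandwidth on distinct hot pages.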
>
> I don't know the exact memory bandwidth of the system, but to detect the 10 GB
> hot region with a 5ms sampling interval, the system should be able to access
> 2GB of memory per millisecond, or about 2TB of memory per second. I think
> systems with such memory bandwidth are not that common.
>
> I see you also explored a configuration with a higher aggregation
> interval. But because each sampling checks only accesses made within the
> sampling interval, that might not help in this setup. I'm wondering if you
> also explored increasing the sampling interval.
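
The two quantities discussed above, the bandwidth needed for a given sampling interval and the sampling interval needed for a given bandwidth, can be sketched as follows. The helper names are hypothetical, and the 50 GB/s value is again the assumed bandwidth from the quoted text.

```python
GB = 10**9


def required_bandwidth(hot_bytes: int, interval_ms: int) -> float:
    """Bytes per second needed to touch the whole hot region in one interval."""
    return hot_bytes * 1000 / interval_ms


def required_interval_ms(hot_bytes: int, bandwidth_bytes_per_s: int) -> float:
    """Shortest interval (ms) in which the whole hot region could be touched."""
    return hot_bytes * 1000 / bandwidth_bytes_per_s


hot = 10 * GB  # hot region of the 100GB and 5TB cases

print(required_bandwidth(hot, 5) / 10**12, "TB/s needed at a 5 ms interval")
print(required_interval_ms(hot, 50 * GB), "ms needed at an assumed 50 GB/s")
```

Both helpers are the same relation (bytes = bandwidth x time) solved for a different unknown.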
>
> Sorry again for not raising this concern earlier. But I think we may need
> to discuss this first.
>
> [1] https://lkml.kernel.org/r/20231215201159.73845-1-sj@kernel.org
>
>
> Thanks,
> SJ
>
>
> >
> > CPU overheads (in billion cycles) for kdamond:
> >
> > Footprint    1GB     10GB    100GB   5TB
> > ---------------------------------------------
> > DAMON        1.15    19.53   3.52    9.55
> > DAMON+PTP    0.83    3.20    1.27    2.55
> >
> > A detailed explanation and evaluation can be found in the arXiv paper:
> > https://arxiv.org/pdf/2311.10275.pdf
> >
> >
> > Aravinda Prasad (3):
> > mm/damon: mm infrastructure support
> > mm/damon: profiling enhancement
> > mm/damon: documentation updates
> >
> > Documentation/mm/damon/design.rst | 42 ++++++
> > arch/x86/include/asm/pgtable.h | 20 +++
> > arch/x86/mm/pgtable.c | 28 +++-
> > include/linux/mmu_notifier.h | 36 +++++
> > include/linux/pgtable.h | 79 ++++++++++
> > mm/damon/vaddr.c | 233 ++++++++++++++++++++++++++++--
> > 6 files changed, 424 insertions(+), 14 deletions(-)
> >
> > --
> > 2.21.3