From: SeongJae Park
To: SeongJae Park
Subject: Re: [PATCH v24 00/14] Subject: Introduce Data Access MONitor (DAMON)
Date: Wed, 24 Feb 2021 14:30:05 +0100
Message-ID: <20210224133005.9265-1-sjpark@amazon.com>
In-Reply-To: <20210204153150.15948-1-sjpark@amazon.com>

On Thu, 4 Feb 2021 16:31:36 +0100 SeongJae Park wrote:

> From: SeongJae Park
> [...]
>
> Introduction
> ============
>
> DAMON is a data access monitoring framework for the Linux kernel. The core
> mechanisms of DAMON called 'region based sampling' and 'adaptive regions
> adjustment' (refer to 'mechanisms.rst' in the 11th patch of this patchset for
> the detail) make it
>
> - accurate (The monitored information is useful for DRAM level memory
>   management. It might not appropriate for Cache-level accuracy, though.),
> - light-weight (The monitoring overhead is low enough to be applied online
>   while making no impact on the performance of the target workloads.), and
> - scalable (the upper-bound of the instrumentation overhead is controllable
>   regardless of the size of target workloads.).
>
> Using this framework, therefore, several memory management mechanisms such as
> reclamation and THP can be optimized to aware real data access patterns.
> Experimental access pattern aware memory management optimization works that
> incurring high instrumentation overhead will be able to have another try.
>
> Though DAMON is for kernel subsystems, it can be easily exposed to the user
> space by writing a DAMON-wrapper kernel subsystem. Then, user space users who
> have some special workloads will be able to write personalized tools or
> applications for deeper understanding and specialized optimizations of their
> systems.
>

I realized I haven't yet introduced a good, intuitive example use case of
DAMON for profiling, though DAMON is not only for profiling. One
straightforward and realistic use of DAMON as a profiling tool is to record the
monitoring results together with callstacks and visualize both on a shared
timeline.

For example, the link below shows such a visualization for a realistic
workload, namely 'fft' in the SPLASH-2X benchmark suite. From it, you can see
that there are three memory access burst phases in the workload, and that
'FFT1DOnce.cons::prop.2()' looks responsible for the first and second hot
phases, while 'Transpose()' is responsible for the last one. Now the
programmer can take a close look at those functions and optimize the code
(e.g., by adding madvise() or mlock() calls).

    https://damonitor.github.io/temporal/damon_callstack.png

We used this approach for 'mlock()'-based optimization of a range of other
realistic benchmark workloads. The optimized versions achieved up to about
2.5x performance improvement under memory pressure[1].

Note: I made the uppermost two figures in the above 'fft' visualization
(working set size and access frequency of each memory region by time) via the
DAMON user space tool[2], while the lowermost one (callstack by time) was made
using perf and speedscope[3]. We have no decent and fully automated tool for
that yet (one will be implemented soon, maybe under perf as a perf-script[4]),
but you could reproduce it with the commands below.

    $ # run the workload
    $ sudo damo record $(pidof ) &
    $ sudo perf record -g -p $(pidof )
    $ # after your workload finished (you should also finish perf on your own)
    $ damo report wss --sortby time --plot wss.pdf
    $ damo report heats --heatmap freq.pdf
    $ sudo perf script | speedscope -
    $ # open wss.pdf and freq.pdf with your favorite pdf viewer

[1] https://linuxplumbersconf.org/event/4/contributions/548/attachments/311/590/damon_ksummit19.pdf
[2] https://lore.kernel.org/linux-mm/20201215115448.25633-8-sjpark@amazon.com/
[3] https://www.speedscope.app/
[4] https://lore.kernel.org/linux-mm/20210107120729.22328-1-sjpark@amazon.com/
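
Until an automated tool exists, the manual steps above could be collected into
a small dry-run helper. The following is only a minimal sketch, assuming a
POSIX shell. The function name 'print_profile_steps' is hypothetical and not
part of damo or perf. It only prints the commands for a given PID so that they
can be reviewed before being run by hand, and it assumes perf attaches to an
already running process via its '-p' option.

```shell
#!/bin/sh
# Hypothetical helper (not part of damo or perf): print the recording and
# reporting commands for one already-running PID.  Nothing is executed here.

print_profile_steps() {
    pid="$1"
    # Reject anything that is not a plain decimal PID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: print_profile_steps PID" >&2
            return 1
            ;;
    esac
    # Same steps as in the mail: record with damo and perf, then report.
    cat <<EOF
sudo damo record $pid &
sudo perf record -g -p $pid
damo report wss --sortby time --plot wss.pdf
damo report heats --heatmap freq.pdf
sudo perf script | speedscope -
EOF
}
```

For example, 'print_profile_steps "$(pidof fft)"' would print the five
commands for a running 'fft' process, assuming that is the process name and
only one such process exists.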