From: SeongJae Park
Subject: Plans around DAMON: perf integration and a new page reclaim mechanism
Date: Wed, 2 Dec 2020 09:27:31 +0100
Message-ID: <20201202082731.24828-1-sjpark@amazon.com>

Hello,


This mail describes what DAMON is, what I am trying to do with it, where the
project is now, and what I will do next.  I hope to hear some comments that
help refine the plans.

What DAMON is
-------------

DAMON[1] is a scalable kernel framework for data access monitoring.  For the
scalability, it guarantees a user-configurable upper bound on the monitoring
overhead while providing best-effort accuracy.  Kernel programmers can
therefore easily write various data access monitoring-based subsystems in the
kernel space using DAMON.  Some of those subsystems would export an interface
to user space so that users can also benefit from them.

[1] https://damonitor.github.io

What I am trying to do
----------------------

DAMON is actually a part of my project called Data Access-aware Operating
System (DAOS).  As the name implies, I want to improve the performance and
efficiency of systems using fine-grained data access patterns.  The
optimizations are for both kernel and user spaces.  We will therefore modify
or create kernel mechanisms, export some of those to user space, and implement
user space libraries and tools.  The figure below shows the layers and
components of the project.

    ---------------------------------------------------------------------------
    Primitives:     PTE Accessed bit, PG_idle, rmap, (Intel CMT), ...
    Framework:      DAMON
    Features:       DAMOS, virtual addr, physical addr, ...
    Applications:   DAMON-debugfs, (DARC), ...
    ^^^^^^^^^^^^^^^^^^^^^^^    KERNEL SPACE    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Raw Interface:  debugfs, (sysfs), (damonfs), tracepoints, (sys_damon), ...

    vvvvvvvvvvvvvvvvvvvvvvv    USER SPACE      vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    Library:        (libdamon), ...
    Tools:          DAMO, (perf), ...
    ---------------------------------------------------------------------------

The components in parentheses are not implemented yet but are in our future
plan.  IOW, those are the TODO tasks of the DAOS project.
DAMOS, DARC and DAMO are explained in the following sections.

Where the project is and how it arrived there
---------------------------------------------

The project is motivated by the growing number of memory-intensive systems.
Working set sizes are continuously growing while the DRAM capacity of a single
system cannot keep up.  Fortunately, new memory devices such as NVRAM are
evolving.  This trend has triggered a number of data access pattern-aware
system optimization works.  Most of those works showed impressive results, but
they share a common problem: many of their access pattern extraction schemes
are impractical or incur high overhead.

Therefore I started designing a way to extract the fine-grained information in
an efficient and scalable way, and named it DAMON.  It has shown low overhead
and good accuracy in many environments, including realistic benchmarks[1] and
a huge real-world production system[2].

To allow rough but effective re-implementation of the previous works using
DAMON without writing code, I implemented a feature called DAMON-based
Operation Schemes (DAMOS).  Using this, I implemented two well-known
access-aware memory management schemes (access-aware THP[3] and proactive
reclamation[4]) in three lines of configuration[1] and achieved an impressive
memory footprint reduction while preserving most of the performance.

The results were presented at several venues including KernelSummit'19[5],
MIDDLEWARE Industry'19[6], LWN[7], an internal event at Google, and
KernelSummit'20[8].

The patches have been posted to LKML since January and have received many
reviews.  As of now, the 22nd version of the DAMON patchset[9], the 15th
version of the DAMOS patchset[10], and the 8th version of a patchset[11] for a
few more works are available.

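To make the idea of an operation scheme more concrete, below is a minimal
Python sketch.  It is not the DAMOS implementation and does not reproduce its
configuration format.  It only assumes that a scheme is a set of ranges over
region size, access frequency and age plus an action name, and that every
matching scheme is applied to each monitored region.  The names `Region`,
`Scheme` and `plan_actions`, the field names, and the units are all
hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A monitored address range with its observed statistics (hypothetical)."""
    start: int        # start address in bytes
    end: int          # end address in bytes, exclusive
    nr_accesses: int  # observed access frequency, arbitrary unit
    age: int          # how long the frequency stayed similar, arbitrary unit

    @property
    def size(self) -> int:
        return self.end - self.start


@dataclass
class Scheme:
    """Hypothetical scheme: apply `action` to regions inside all three ranges."""
    min_size: int
    max_size: int
    min_accesses: int
    max_accesses: int
    min_age: int
    max_age: int
    action: str

    def matches(self, region: Region) -> bool:
        # All bounds are inclusive in this sketch.
        return (self.min_size <= region.size <= self.max_size
                and self.min_accesses <= region.nr_accesses <= self.max_accesses
                and self.min_age <= region.age <= self.max_age)


def plan_actions(regions, schemes):
    """Return a (region, action) pair for every scheme that matches a region."""
    return [(region, scheme.action)
            for region in regions
            for scheme in schemes
            if scheme.matches(region)]


if __name__ == "__main__":
    # One proactive-reclamation-like scheme: regions that were not accessed
    # at all for at least 5 age units are marked with a "pageout" action.
    schemes = [Scheme(4096, 1 << 40, 0, 0, 5, 1 << 30, "pageout")]
    regions = [
        Region(0, 4 << 20, nr_accesses=0, age=10),         # cold and old
        Region(4 << 20, 8 << 20, nr_accesses=50, age=1),   # hot
    ]
    for region, action in plan_actions(regions, schemes):
        print(hex(region.start), hex(region.end), action)
```

In this sketch a whole policy is just a short list of `Scheme` entries, which
is the sense in which a few lines of configuration can stand in for
hand-written code.
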
[1] https://damonitor.github.io/doc/html/next/vm/damon/eval.html
[2] https://lore.kernel.org/linux-mm/20201117143021.11883-1-sjpark@amazon.com/
[3] https://www.usenix.org/system/files/conference/osdi16/osdi16-kwon.pdf
[4] https://research.google/pubs/pub48551/
[5] https://linuxplumbersconf.org/event/4/contributions/548/
[6] https://dl.acm.org/citation.cfm?id=3368125
[7] https://lwn.net/Articles/812707/
[8] https://www.linuxplumbersconf.org/event/7/contributions/659/
[9] https://lore.kernel.org/linux-mm/20201020085940.13875-1-sjpark@amazon.com/
[10] https://lore.kernel.org/linux-mm/20201006123931.5847-1-sjpark@amazon.com/
[11] https://lore.kernel.org/linux-mm/20200831104730.28970-1-sjpark@amazon.com/

What I will do next
-------------------

In the long term, I will continue the works mentioned in the 'What I am trying
to do' section.  IOW, I will implement the parentheses-wrapped components in
the above figure.  In the short term, I'd like to start with the two things
below.

1. Integration of DAMON user space tool in perf

The DAMON patchset introduces a kernel space DAMON application called
damon-dbgfs as a static kernel module.  It exposes the DAMON interface to user
space via debugfs and provides a feature for recording the monitoring results,
so that users can use DAMON as a profiler or as a data access-aware
optimization framework (using the DAMOS feature).  For easier use of the
debugfs interface, the patchset also introduces a user space tool named DAMON
Operator (DAMO).  It wraps the debugfs interface with a human-friendly
interface and provides a few useful visualizations of the monitoring results.

Since DAMON was first presented, many people have asked whether it is
integrated in perf or can be controlled via perf.  As perf is a must-have tool
for system admins, integrating DAMON in perf would make for a much better user
experience.  For this reason, I want to integrate DAMO in perf as yet another
subcommand.

For example, users will be able to use DAMON in the below way:

    # perf damon start $(pidof $my_workload)     /* Starts monitoring */
    # perf record -e damon:damon_aggregated      /* DAMON's tracepoint */
    # perf damon record $(pidof $my_workload)    /* shortcut for above two */
    # perf damon report

2. DAMON-based Page Reclamation

Page reclamation has been considered harmful, but the trend mentioned above in
the motivation part implies a change of the situation.  The simplest but
reasonable choice under the trend is configuring fast swap devices such as
NVRAM or zram.

Pseudo-LRU, the current page replacement algorithm of the kernel, has worked
well in many real world production systems, but its overhead will become more
visible in systems that reclaim frequently.  I also noticed it before[1].
After all, concerns about the algorithm have long existed[2].

I'd like to propose another Data Access-aware ReClamation algorithm (DARC)
which can be implemented on the DAMON framework.  The design is not fixed yet,
but the abstract idea is as follows.  Once memory pressure is recognized, it
monitors the memory access pattern of the system and selects eviction targets
based on both access frequency and recency.  In detail, it would account the
age of each region based on access frequency; the age gradually increases but
becomes zero if a big access frequency change to the region is detected.
Then, it selects pages in regions having the lowest access frequency for the
longest time as the first eviction candidates.

Rather than just replacing the pseudo-LRU based reclamation, I'd like to
implement it as an optional proactive reclamation feature.  In detail, it will
have three watermarks for each zone, which are tunable via sysfs.  The lowest
watermark will be higher than the high watermark of the original reclaim
logic.  DARC will start if the available memory becomes lower than the middle
watermark, and stop if the available memory becomes >highest watermark or