Date: Tue, 2 Jul 2024 19:27:54 +0000
From: Roman Gushchin
To: Huan Yang
Cc: Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
    Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand,
    Ryan Roberts, Chris Li, Dan Schatzberg, Kairui Song,
    cgroups@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Christian Brauner,
    opensource.kernel@vivo.com
Subject: Re: [RFC PATCH 0/4] Introduce PMC(PER-MEMCG-CACHE)
References: <20240702084423.1717904-1-link@vivo.com>
In-Reply-To: <20240702084423.1717904-1-link@vivo.com>

On Tue, Jul 02, 2024 at 04:44:03PM +0800, Huan Yang wrote:
> This patchset would like to talk about an idea: PMC (PER-MEMCG-CACHE).
>
> Background
> ===
>
> Modern computer systems always have performance gaps between hardware,
> such as the performance differences between CPU, memory, and disk.
> Due to the principle of locality of reference in data access:
>
> Programs often access data that has been accessed before
> Programs access the next set of data after accessing a particular piece of data
> As a result:
> 1. CPU cache is used to speed up the access of already accessed data
>    in memory
> 2. Disk prefetching techniques are used to prepare the next set of data
>    to be accessed in advance (to avoid direct disk access)
> The basic utilization of locality greatly enhances computer performance.
>
> PMC (per-MEMCG-cache) is similar, utilizing a principle of locality to enhance
> program performance.
>
> In modern computers, especially in smartphones, services are provided to
> users on a per-application basis (such as Camera, Chat, etc.),
> where an application is composed of multiple processes working together to
> provide services.
>
> The basic unit for managing resources in a computer is the process,
> which in turn uses threads to share memory and accomplish tasks.
> Memory is shared among threads within a process.
>
> However, modern computers have the following issues, with a locality deficiency:
>
> 1. Different forms of memory exist and are not interconnected (anonymous
>    pages, file pages, special memory such as DMA-BUF, various memory
>    allocations in kernel mode, etc.)
> 2. Memory isolation exists between processes, and apart from specific
>    shared memory, they do not communicate with each other.
> 3. During the transition of functionality within an application, a process
>    usually releases memory, while another process requests memory, and in
>    this process, memory has to be obtained from the lowest level through
>    competition.
>
> For example, consider a camera application:
>
> Camera applications typically provide photo capture services as well as photo
> preview services.
> The photo capture process usually utilizes DMA-BUF to facilitate the sharing
> of image data between the CPU and DMA devices.
> When it comes to image preview, multiple algorithm processes are typically
> involved in processing the image data, which may also involve heap memory
> and other resources.
>
> During the switch between photo capture and preview, the application typically
> needs to release DMA-BUF memory and then the algorithms need to allocate
> heap memory. The flow of system memory during this process is managed by
> the PCP-BUDDY system.
>
> However, the PCP and BUDDY systems are shared, and subsequently requested
> memory may not be available due to previously allocated memory being used
> (such as for file reading), requiring a competitive (memory reclamation)
> process to obtain it.
>
> So, if it is possible to allow the released memory to be allocated with
> high priority within the application, then this can meet the locality
> requirement, improve performance, and avoid unnecessary memory reclaim.
>
> PMC solutions are similar to PCP, as they both establish cache pools according
> to certain rules.
>
> Why base it on MEMCG?
> ===
>
> The MEMCG container can allocate selected processes to a MEMCG based on certain
> grouping strategies (typical examples include grouping by app or UID).
> Processes within the same MEMCG can then be used for statistics, upper limit
> restrictions, and reclamation control.
>
> All processes within a MEMCG are considered as a single memory unit,
> sharing memory among themselves.
> As a result, when one process releases
> memory, another process within the same group can obtain it with the
> highest priority, fully utilizing the locality of memory allocation
> characteristics within the MEMCG (such as APP grouping).
>
> In addition, MEMCG provides feature interfaces that can be dynamically toggled
> and are fully controllable by the policy. This provides greater flexibility
> and does not impact performance when not enabled (controlled through static key).
>
>
> About the PMC implementation
> ===
> Here, a cache switch is provided for each MEMCG (not on root).
> When the user enables the cache, processes within the MEMCG will share memory
> through this cache.
>
> The cache pool is positioned before the PCP. All order-0 pages released by
> processes in the MEMCG will be released to the cache pool first, and when
> memory is requested, it will also be obtained from the cache pool first.
>
> `memory.cache` is the sole entry point for controlling PMC. Here are some
> nested keys to control PMC:
> 1. "enable=[y|n]" to enable or disable the targeted MEMCG's cache
> 2. "keys=nid=%d,watermark=%u,reaper_time=%u,limit=%u" to control an already
>    enabled PMC's behavior.
>    a) `nid` targets a node whose keys are changed; otherwise all nodes
>       are changed.
>    b) `watermark` controls cache behavior: pages are cached only when,
>       during memory release, the zone's free pages exceed the zone's high
>       watermark plus this watermark. (unit: bytes, default 50MB,
>       min 10MB per-node-all-zone)
>    c) `reaper_time` controls the reaper interval; when it elapses, all
>       cached memory in this MEMCG is reaped. (unit: us, default 5s,
>       0 disables it)
>    d) `limit` limits the maximum memory used by the cache pool. (unit: bytes,
>       default 100MB, max 500MB per-node-all-zone)
>
> Performance
> ===
> PMC is based on MEMCG and requires performance measurement through the
> sharing of complex workloads between application processes.
> Therefore, at the moment, we are unable to provide a better testing solution
> for this patchset.
>
> Here is the internal testing situation we provide, using the camera
> application as an example. (1-NODE-1-ZONE-8GRAM)
>
> Test Case: Capture in rear portrait HDR mode
> 1. Test mode: rear portrait HDR mode. This scene needs more than 800M of RAM,
>    with memory types including dmabuf(470M), PSS(150M) and APU(200M)
> 2. Test steps: take a photo, then click the thumbnail to view the full image
>
> The overall time from clicking the shutter button to showing the whole
> image improves by 500ms, and the total slowpath cost of all camera threads
> is reduced from 958ms to 495ms.
> Especially for shot2shot in this mode, the preview delay of each frame
> improves significantly.

Hello Huan,

thank you for sharing your work.

Some high-level thoughts:
1) Naming is hard, but it took me quite a while to realize that you're talking
about free memory. Cache is obviously an overloaded term, but per-memcg-cache
can mean absolutely anything (pagecache? cpu cache? ...), so maybe it's not
the best choice.
2) Overall, the idea of having a per-memcg free memory pool makes sense to me,
especially if we talk about 2MB or 1GB pages (or order > 0 in general).
3) You absolutely have to integrate the reclaim mechanism with the generic
memory reclaim mechanism, which is driven by memory pressure.
4) You claim a ~50% performance win in your workload, which is a lot. It's not
clear to me where it's coming from. It's hard to believe the page
allocation/release paths are taking 50% of the cpu time. Please clarify.

There are a lot of other questions, and you highlighted some of them below
(and these are indeed the right questions to ask), but let's start with
something.

Thanks
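
To make the question in point 4 concrete, here is a minimal sketch of the arithmetic behind the quoted numbers. It is only an illustration: the function names are made up for this example, and the 2000 ms baseline is a hypothetical placeholder, since the posting reports a 500ms improvement but no absolute shutter-to-image latency.

```python
def relative_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction by which a cost shrank, e.g. 0.25 means 25% less."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (before_ms - after_ms) / before_ms


def end_to_end_gain(saved_ms: float, baseline_ms: float) -> float:
    """Fraction of the whole operation's latency that was saved."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return saved_ms / baseline_ms


# Quoted figures: slowpath cost summed over all camera threads.
slowpath_cut = relative_reduction(958, 495)
print(f"slowpath reduction: {slowpath_cut:.1%}")

# HYPOTHETICAL baseline: the posting gives no absolute latency,
# so 2000 ms is a placeholder and not a measured value.
assumed_baseline_ms = 2000
print(f"end-to-end gain: {end_to_end_gain(500, assumed_baseline_ms):.1%}")
```

With the quoted figures the slowpath reduction comes out at roughly 48%. That describes allocation slowpath time summed over threads; it says nothing yet about overall cpu time or end-to-end latency, which would need the missing baseline.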