Date: Tue, 23 Aug 2022 13:51:39 +0200
From: Michal Hocko
To: Zhaoyang Huang
Cc: Suren Baghdasaryan, Tejun Heo, Shakeel Butt, "zhaoyang.huang", Johannes Weiner, Linux MM, LKML, Cgroups, Ke Wang, Zefan Li, Roman Gushchin, Muchun Song
Subject: Re: [RFC PATCH] memcg: use root_mem_cgroup when css is inherited
References: <1660908562-17409-1-git-send-email-zhaoyang.huang@unisoc.com>

On Tue 23-08-22 17:20:59, Zhaoyang Huang wrote:
> On Tue, Aug 23, 2022 at 4:33 PM Michal Hocko wrote:
> >
> > On Tue 23-08-22 14:03:04, Zhaoyang Huang wrote:
> > > On Tue, Aug 23, 2022 at 1:21 PM Michal Hocko wrote:
> > > >
> > > > On Tue 23-08-22 10:31:57, Zhaoyang Huang wrote:
> > [...]
> > > > > I would like to quote the comments from google side for more details
> > > > > which can also be observed from different vendors.
> > > > > "Also be advised that when you enable memcg v2 you will be using
> > > > > per-app memcg configuration which implies noticeable overhead because
> > > > > every app will have its own group. For example pagefault path will
> > > > > regress by about 15%. And obviously there will be some memory overhead
> > > > > as well. That's the reason we don't enable them in Android by
> > > > > default."
> > > >
> > > > This should be reported and investigated. Because per-application memcg
> > > > vs. memcg in general shouldn't make much of a difference from the
> > > > performance side.
> > > > I can see a potential performance impact for no-memcg
> > > > vs. memcg case but even then 15% is quite a lot.
> > > Less efficiency on memory reclaim caused by multi-LRU should be one of
> > > the reason, which has been proved by comparing per-app memcg on/off.
> > > Besides, theoretically workingset could also broken as LRU is too
> > > short to compose workingset.
> >
> > Do you have any data to back these claims? Is this something that could
> > be handled on the configuration level? E.g. by applying low limit
> > protection to keep the workingset in the memory?
> I don't think so. IMO, workingset works when there are pages evicted
> from LRU and then refault which provide refault distance for pages.
> Applying memcg's protection will have all LRU out of evicted which
> make the mechanism fail.

It is really hard to help you out without any actual data. The idea,
though, was to use the low limit protection to adaptively configure the
respective memcgs and so reduce refaults. You already have refault data
at hand, so increasing the low limit for frequently refaulting memcgs
would reduce the thrashing.

[...]
> > A.cgroup.controllers = memory
> > A.cgroup.subtree_control = memory
> >
> > A/B.cgroup.controllers = memory
> > A/B.cgroup.subtree_control = memory
> > A/B/B1.cgroup.controllers = memory
> >
> > A/C.cgroup.controllers = memory
> > A/C.cgroup.subtree_control = ""
> > A/C/C1.cgroup.controllers = ""
> Yes for above hierarchy and configuration.
> >
> > Is your concern that C1 is charged to A/C or that you cannot actually make
> > A/C.cgroup.controllers = "" because you want to maintain memory in A?
> > Because that would be breaking the internal node constrain rule AFAICS.
> No. I just want to keep memory on B.

That would require A to be without controllers, which is not possible
due to the hierarchical constraint.
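
To make the constraint concrete, here is a toy model (Python, an
illustration only, not kernel code). It assumes the documented cgroup v2
rule that a non-root cgroup's cgroup.controllers is whatever its parent
lists in cgroup.subtree_control, and that a cgroup can only enable
controllers it has itself. The names ROOT_CONTROLLERS, parent,
controllers_of and violations are made up for this sketch, and the
"no internal process" rule is not modelled.

```python
# Toy model of cgroup v2 controller distribution; an illustration, not kernel code.

ROOT_CONTROLLERS = {"memory"}  # assumption: only the memory controller is of interest


def parent(path):
    """Parent of a '/'-separated cgroup path; "" stands for the root cgroup."""
    return path.rpartition("/")[0]


def controllers_of(path, subtree_control):
    """cgroup.controllers of `path`: whatever its parent enabled for its children."""
    if path == "":
        return set(ROOT_CONTROLLERS)
    return set(subtree_control.get(parent(path), set()))


def violations(subtree_control):
    """Paths whose subtree_control enables a controller they do not have themselves."""
    return [
        path
        for path in sorted(subtree_control)
        if subtree_control[path] - controllers_of(path, subtree_control)
    ]


# The hierarchy quoted above, with memory enabled from the root down.
quoted = {"": {"memory"}, "A": {"memory"}, "A/B": {"memory"}, "A/C": set()}
print(controllers_of("A/B/B1", quoted))  # B1 gets the memory controller
print(controllers_of("A/C/C1", quoted))  # C1 does not, so it is charged to A/C
print(violations(quoted))                # the configuration is consistent

# "Memory on B only": A is given no controllers but still has to enable memory for B.
wanted = {"": set(), "A": {"memory"}, "A/B": {"memory"}}
print(violations(wanted))                # A is reported as inconsistent
```

In this model B can only see the memory controller if A enables it in
its subtree_control, and A can only do that if A has the controller
itself.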
> > Or maybe you just really want a different hierarchy where
> > A == root_cgroup and want the memory acocunted in B
> > (root/B.cgroup.controllers = memory) but not in C (root/C.cgroup.controllers = "")?
> Yes.
> >
> > That would mean that C memory would be maintained on the global (root
> > memcg) LRUs which is the only internal node which is allowed to have
> > resources because it is special.
> Exactly. I would like to have all groups like C which have no parent's
> subtree_control = memory charge memory to root. Under this
> implementation, memory under enabled group will be protected by
> min/low while other groups' memory share the same LRU to have
> workingset things take effect.

One way to achieve that would be to shape the hierarchy the following way

            root
          /      \
no_memcg[1]      memcg[2]
 ||||||||          |||||
app_cgroups      app_cgroups

with
no_memcg.subtree_control = ""
memcg.subtree_control = memory

no?

You haven't really described why you need a per-application freezer
cgroup, but I suspect you want to selectively freeze applications. Is
there any obstacle to having a dedicated frozen cgroup and migrating the
tasks to be frozen there?
-- 
Michal Hocko
SUSE Labs