Date: Fri, 21 Jul 2023 09:18:24 -1000
From: Tejun Heo
To: Yosry Ahmed
Cc: Johannes Weiner, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, "Matthew Wilcox (Oracle)", Zefan Li, Yu Zhao, Luis Chamberlain, Kees Cook, Iurii Zaikin, "T.J. Mercier", Greg Thelen, linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org
Subject: Re: [RFC PATCH 0/8] memory recharging for offline memcgs
References: <20230720070825.992023-1-yosryahmed@google.com> <20230720153515.GA1003248@cmpxchg.org>

Hello,

On Fri, Jul 21, 2023 at 11:47:49AM -0700, Yosry Ahmed wrote:
> On Fri, Jul 21, 2023 at 11:26 AM Tejun Heo wrote:
> > On Fri, Jul 21, 2023 at 11:15:21AM -0700, Yosry Ahmed wrote:
> > > On Thu, Jul 20, 2023 at 3:31 PM Tejun Heo wrote:
> > > > memory at least in our case. The sharing across them comes down to things
> > > > like some common library pages which don't really account for much these
> > > > days.
> > >
> > > Keep in mind that even a single page charged to a memcg and used by
> > > another memcg is sufficient to result in a zombie memcg.
> >
> > I mean, yeah, that's a separate issue or rather a subset which isn't all
> > that controversial. That can be deterministically solved by reparenting to
> > the parent like how slab is handled. I think the "deterministic" part is
> > important here. As you said, even a single page can pin a dying cgroup.
>
> There are serious flaws with reparenting that I mentioned above. We do
> it for kernel memory, but that's because we really have no other
> choice. Oftentimes the memory is not reclaimable and we cannot find an
> owner for it. This doesn't mean it's the right answer for user memory.
>
> The semantics are new compared to normal charging (as opposed to
> recharging, as I explain below). There is an extra layer of
> indirection that we did not (as far as I know) measure the impact of.
> Parents end up with pages that they never used and we have no
> observability into where it came from. Most importantly, over time
> user memory will keep accumulating at the root, reducing the accuracy
> and usefulness of accounting, effectively an accounting leak and
> reduction of capacity. Memory that is not attributed to any user, aka
> system overhead.

That really sounds like the setup is missing cgroup layers tracking
persistent resources. Most of the problems you describe can be solved by
adding cgroup layers at the right spots which would usually align with the
logical structure of the system, right?

...

> I believe recharging is being mis-framed here :)
>
> Recharging semantics are not new, it is a shortcut to a process that
> is already happening that is focused on offline memcgs. Let's take a
> step back.

Yeah, it does sound better when viewed that way. I'm still not sure what
extra problems it solves tho. We experienced similar problems but AFAIK all
of them came down to needing the appropriate hierarchical structure to
capture how resources are being used on systems.

Thanks.

--
tejun
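
For illustration only, the two positions in this thread can be contrasted
with a toy model. This is a sketch under loud assumptions, not kernel code:
Memcg, offline_and_reparent and run are hypothetical names, a cgroup is
reduced to a page counter, and "reparenting" is modelled as moving a dying
cgroup's remaining pages to its nearest online ancestor. With jobs placed
directly under the root, leftover pages pile up at the root (the accounting
concern raised in the quoted text). With an intermediate persistent layer,
here called "service", the same pages stay attributed to that layer (the
layering suggested in the reply).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memcg:
    """Hypothetical stand-in for a memory cgroup: just a page counter."""
    name: str
    parent: Optional["Memcg"] = None
    local_pages: int = 0
    online: bool = True


def offline_and_reparent(cg: Memcg) -> Memcg:
    """Take cg offline and move its pages to the nearest online ancestor."""
    if cg.parent is None:
        raise ValueError("the root cgroup cannot go offline")
    cg.online = False
    target = cg.parent
    # Skip ancestors that are themselves offline; the root is always online.
    while not target.online and target.parent is not None:
        target = target.parent
    target.local_pages += cg.local_pages
    cg.local_pages = 0
    return target


def run(with_service_layer: bool) -> dict:
    """Create three short-lived jobs, each leaving 10 pages behind."""
    root = Memcg("root")
    # Assumption: "service" is a persistent layer that outlives its jobs.
    service = Memcg("service", parent=root) if with_service_layer else None
    for i in range(3):
        job = Memcg(f"job{i}", parent=service or root, local_pages=10)
        offline_and_reparent(job)
    return {
        "root": root.local_pages,
        "service": service.local_pages if service else None,
    }


if __name__ == "__main__":
    print("flat:   ", run(with_service_layer=False))
    print("layered:", run(with_service_layer=True))
```

The model says nothing about the cost of the extra indirection or about
observability, which are the other objections raised in the quoted text; it
only shows where the leftover pages end up in each layout.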