Date: Fri, 21 Jul 2023 08:26:28 -1000
From: Tejun Heo
To: Yosry Ahmed
Cc: Johannes Weiner, Andrew Morton, Michal Hocko, Roman Gushchin,
 Shakeel Butt, Muchun Song, "Matthew Wilcox (Oracle)", Zefan Li, Yu Zhao,
 Luis Chamberlain, Kees Cook, Iurii Zaikin, "T.J. Mercier", Greg Thelen,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org
Subject: Re: [RFC PATCH 0/8] memory recharging for offline memcgs
References: <20230720070825.992023-1-yosryahmed@google.com>
 <20230720153515.GA1003248@cmpxchg.org>

Hello,

On Fri, Jul 21, 2023 at 11:15:21AM -0700, Yosry Ahmed wrote:
> On Thu, Jul 20, 2023 at 3:31 PM Tejun Heo wrote:
> > memory at least in our case. The sharing across them comes down to things
> > like some common library pages which don't really account for much these
> > days.
>
> Keep in mind that even a single page charged to a memcg and used by
> another memcg is sufficient to result in a zombie memcg.

I mean, yeah, that's a separate issue or rather a subset which isn't all
that controversial. That can be deterministically solved by reparenting to
the parent like how slab is handled. I think the "deterministic" part is
important here. As you said, even a single page can pin a dying cgroup.

> > > Keep in mind that the environment is dynamic, workloads are constantly
> > > coming and going. Even if find the perfect nesting to appropriately
> > > scope resources, some rescheduling may render the hierarchy obsolete
> > > and require us to start over.
> >
> > Can you please go into more details on how much memory is shared for what
> > across unrelated dynamic workloads? That sounds different from other use
> > cases.
>
> I am trying to collect more information from our fleet, but the
> application restarting in a different cgroup is not what is happening
> in our case. It is not easy to find out exactly what is going on on

This is the point that Johannes raised but I don't think the current
proposal would make things more deterministic. From what I can see, it
actually pushes it towards even less predictability. Currently, yeah, some
pages may end up in cgroups which aren't the majority user but it at least
is clear how that would happen. The proposed change adds layers of
indeterministic behaviors on top. I don't think that's the direction we want
to go.

> machines and where the memory is coming from due to the
> indeterministic nature of charging. The goal of this proposal is to
> let the kernel handle leftover memory in zombie memcgs because it is
> not always obvious to userspace what's going on (like it's not obvious
> to me now where exactly is the sharing happening :) ).
>
> One thing to note is that in some cases, maybe a userspace bug or
> failed cleanup is a reason for the zombie memcgs. Ideally, this
> wouldn't happen, but it would be nice to have a fallback mechanism in
> the kernel if it does.

I'm not disagreeing on that. Our handling of pages owned by dying cgroups
isn't great but I don't think the proposed change is an acceptable solution.

Thanks.

-- 
tejun