From: "T.J. Mercier"
Date: Thu, 6 Feb 2025 11:37:31 -0800
Subject: Re: [PATCH] memcg: add hierarchical effective limits for v2
To: Shakeel Butt
Cc: Michal Koutný, Tejun Heo, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
References: <20250205222029.2979048-1-shakeel.butt@linux.dev>

On Thu, Feb 6, 2025 at 11:09 AM Shakeel Butt wrote:
>
> On Thu, Feb 06, 2025 at 04:57:39PM +0100, Michal Koutný wrote:
> > Hello Shakeel.
> >
> > On Wed, Feb 05, 2025 at 02:20:29PM -0800, Shakeel Butt wrote:
> > > Memcg-v1 exposes hierarchical_[memory|memsw]_limit counters in its
> > > memory.stat file which applications can use to get their effective limit
> > > which is the minimum of limits of itself and all of its ancestors.
> >
> > I was fan of equal idea too [1]. The referenced series also tackles
> > change notifications (to make this complete for apps that really want to
> > scale based on the actual limit). I ceased to like it when I realized
> > there can be hierarchies when the effective value cannot be effectively
> > :) determined [2].
> >
> > > This is pretty useful in environments where cgroup namespace is used
> > > and the application does not have access to the full view of the
> > > cgroup hierarchy. Let's expose effective limits for memcg v2 as well.
> >
> > Also, the case for this exposition was never strongly built.
> > Why isn't PSI enough in your case?
> >
>
> Hi Michal,
>
> Oh I totally forgot about your series. In my use-case, it is not about
> dynamically knowning how much they can expand and adjust themselves but
> rather knowing statically upfront what resources they have been given.
> More concretely, these are workloads which used to completely occupy a
> single machine, though within containers but without limits. These
> workloads used to look at machine level metrics at startup on how much
> resources are available.
>
> Now these workloads are being moved to multi-tenant environment but
> still the machine is partitioned statically between the workloads. So,
> these workloads need to know upfront how much resources are allocated to
> them upfront and the way the cgroup hierarchy is setup, that information
> is a bit above the tree.
>
> I hope this clarifies the motivation behind this change i.e. the target
> is not dynamic load balancing but rather upfront static knowledge.
>
> thanks,
> Shakeel
>

We've been thinking of using memcg to both protect (memory.min) and
limit (via memcg OOM) memory-hungry apps (games), while informing such
apps of their upper limit so they know how much they can allocate
before risking being killed. Visibility of the cgroup hierarchy isn't
an issue for us, but having a single file to read, instead of walking
up the tree with multiple reads to calculate an effective limit, would
be nice. Partial memcg activation in the hierarchy *is* an issue, but
walking up to the closest ancestor with memcg activated is better than
reading all the way up.
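
For reference, the userspace walk described above could look roughly
like the sketch below. It is only an illustration, not code from the
patch: the function name effective_memory_max and the default mount
point /sys/fs/cgroup are assumptions, and it treats a level without a
memory.max file (memcg not activated there, or the root cgroup) as
imposing no limit of its own.

```python
import os
from typing import Optional


def effective_memory_max(cgroup_dir: str,
                         root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the smallest memory.max between cgroup_dir and root.

    Returns None when no level sets a finite limit. The helper name and
    the default root are assumptions made for this sketch.
    """
    root = os.path.abspath(root)
    current = os.path.abspath(cgroup_dir)
    limit: Optional[int] = None
    while True:
        try:
            with open(os.path.join(current, "memory.max")) as f:
                raw = f.read().strip()
        except FileNotFoundError:
            # No memcg interface file at this level (memcg not activated
            # here, or this is the root cgroup): nothing to fold in.
            raw = "max"
        if raw != "max":
            value = int(raw)
            limit = value if limit is None else min(limit, value)
        # Stop at the cgroup root, or at "/" if cgroup_dir was not under root.
        if current == root or current == os.path.dirname(current):
            break
        current = os.path.dirname(current)
    return limit
```

That is one read per level, which is the cost a single effective-limit
file would remove. Inside a cgroup namespace the walk can only go as
far as the namespace root, so any limit set above it stays invisible
to this approach.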