Date: Tue, 16 Jul 2024 20:00:52 +0200
From: Michal Hocko
To: Tejun Heo
Cc: David Finkel, Muchun Song, Andrew Morton, core-services@vimeo.com,
    Jonathan Corbet, Roman Gushchin, Shakeel Butt, Shuah Khan,
    Johannes Weiner, Zefan Li, cgroups@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kselftest@vger.kernel.org
Subject: Re: [PATCH] mm, memcg: cg2 memory{.swap,}.peak write handlers
References: <20240715203625.1462309-1-davidf@vimeo.com>
    <20240715203625.1462309-2-davidf@vimeo.com>

On Tue 16-07-24 06:44:11, Tejun Heo wrote:
> Hello,
>
> On Tue, Jul 16, 2024 at 03:48:17PM +0200, Michal Hocko wrote:
> ...
> > > This behavior is particularly useful for work scheduling systems that
> > > need to track memory usage of worker processes/cgroups per-work-item.
> > > Since memory can't be squeezed like CPU can (the OOM-killer has
> > > opinions), these systems need to track the peak memory usage to compute
> > > system/container fullness when binpacking workitems.
>
> Swap still has bad reps but there's nothing drastically worse about it than
> page cache. ie. If you're under memory pressure, you get thrashing one way
> or another. If there's no swap, the system is just memlocking anon memory
> even when they are a lot colder than page cache, so I'm skeptical that no
> swap + mostly anon + kernel OOM kills is a good strategy in general
> especially given that the system behavior is not very predictable under OOM
> conditions.

Completely agree on this!

> > As mentioned down the email thread, I consider usefulness of peak value
> > rather limited. It is misleading when memory is reclaimed. But
> > fundamentally I do not oppose to unifying the write behavior to reset
> > values.
>
> The removal of resets was intentional. The problem was that it wasn't clear
> who owned those counters and there's no way of telling who reset what when.
> It was easy to accidentally end up with multiple entities that think they
> can get timed measurement by resetting.

Yes, I understand and agree with you. Generally speaking, the peak value
is of very limited use. On the other hand, we already have it in v2, and
if it allows a reliable way to scale the workload (which seems to be the
case here), then resetting the value sounds like a cheaper option than
tearing down the memcg and creating it again (with all the dead cgroups
headache that might follow). The reset interface doesn't cost much from
the maintenance POV, and if somebody wants to use it, they surely need to
find a way to coordinate.

> So, in general, I don't think this is a great idea. There are shortcomings
> to how memory.peak behaves in that its meaningfulness quickly declines over
> time. This is expected and the rationale behind adding memory.peak, IIRC,
> was that it was difficult to tell the memory usage of a short-lived cgroup.
>
> If we want to allow peak measurement of time periods, I wonder whether we
> could do something similar to pressure triggers - ie. let users register
> watchers so that each user can define their own watch periods. This is more
> involved but more useful and less error-inducing than adding reset to a
> single counter.

I would rather not get back to that unless we have many more users that
really need it. The absolute value of memory consumption is a long-lived
concept that doesn't make much sense most of the time. People just tend
to still use it because it is much simpler to compare two different
values than something as dynamic as PSI-like metrics.

-- 
Michal Hocko
SUSE Labs