From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4055DC001DE for ; Wed, 26 Jul 2023 02:56:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9EE078D0002; Tue, 25 Jul 2023 22:56:26 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 977AD6B0078; Tue, 25 Jul 2023 22:56:26 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7F0B08D0002; Tue, 25 Jul 2023 22:56:26 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 6A6DD6B0075 for ; Tue, 25 Jul 2023 22:56:26 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 2F7ADA0702 for ; Wed, 26 Jul 2023 02:56:26 +0000 (UTC) X-FDA: 81052249572.04.F4E8A10 Received: from out-55.mta1.migadu.com (out-55.mta1.migadu.com [95.215.58.55]) by imf11.hostedemail.com (Postfix) with ESMTP id 1E24C40021 for ; Wed, 26 Jul 2023 02:56:23 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=CWY0VoBd; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf11.hostedemail.com: domain of roman.gushchin@linux.dev designates 95.215.58.55 as permitted sender) smtp.mailfrom=roman.gushchin@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1690340184; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=M8HcyBJhNbmAFYQ4vhr/4v6PwhJI/M2KUGD3rBet++I=; b=Up91woU2Mc1iz7RVZpq5Fee2oikb7DmSARg7KTLh5RE4sRU3TtDPVhGjv9ZjNQCraXch7c 2ZClxOofoUCnVtv4aSAxKnLjFYmO/QLCcCpsYk9jjgOoujP2pgJMT16w3Gay3T/0nmvRXh U11i59+CuFWCAxEAJoBc7a7GuGs/ps8= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=CWY0VoBd; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf11.hostedemail.com: domain of roman.gushchin@linux.dev designates 95.215.58.55 as permitted sender) smtp.mailfrom=roman.gushchin@linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1690340184; a=rsa-sha256; cv=none; b=skf8bYCpAB+A7mlDiWns4IMn/lcnzlb3P1pKGqQoUghKh9DwC7iZ0mHkH3GUNWmMSkAOE0 xZb88C4rhMQRhDHxI5d51EfAMGItM23hXO6rRcYd4NM2TY/m7byqFryywnGxC4t/KEeZXL cdMzxd2LIE+SExIvs6rPakCmxq3WzGk= Date: Tue, 25 Jul 2023 19:56:13 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1690340181; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=M8HcyBJhNbmAFYQ4vhr/4v6PwhJI/M2KUGD3rBet++I=; b=CWY0VoBdxv90fE2Kx4slvk7SKhNU64RiS4zo/67lqAHvkomMISjpuadrAKmiZtx3178RPo yfNQlTWX6efUd8GwjE2kygFoh9/mTxSUiV5Wph+uzlk+0wfwPoF0KJq+FVvK2fSuFHgH1v ppQqROFo3fA+qYCo3RRaBLzBqJjPyxU= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Roman Gushchin To: Abel Wu Cc: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Johannes Weiner , Michal Hocko , Shakeel Butt , Muchun Song , Andrew Morton , David Ahern , Yosry Ahmed , "Matthew Wilcox (Oracle)" , Yu Zhao , Kefeng Wang , Yafang Shao , Kuniyuki Iwashima , Martin KaFai Lau , Alexander Mikhalitsyn , Breno Leitao , David Howells , Jason Xing , Xin Long , Michal Hocko , Alexei Starovoitov , open list , "open list:NETWORKING [GENERAL]" , "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)" , "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)" Subject: Re: Re: [PATCH RESEND net-next 1/2] net-memcg: Scopify the indicators of sockmem pressure Message-ID: References: <20230711124157.97169-1-wuyun.abel@bytedance.com> <58e75f44-16e3-a40a-4c8a-0f61bbf393f9@bytedance.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <58e75f44-16e3-a40a-4c8a-0f61bbf393f9@bytedance.com> X-Migadu-Flow: FLOW_OUT X-Rspamd-Queue-Id: 1E24C40021 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: dbu5f58waya58enggk6zpmk6n445k1he X-HE-Tag: 1690340183-717128 X-HE-Meta: U2FsdGVkX19F5fQXiO+RQg6wa6pwfYhLz7+dL5LmHT3YglVniMv3QMnIoCVZO5RD6sgH9/KCTaA0L4UqpOtTVTXyAo0ytMucfzoc98yxsDC+Mk5gdm0mWkSYFSMhSr8nkLhC8aeh9L2nUtA8qLDmqMeyUfgy1eJk9JBe68UJ8lP4vbnITuXKVRcZ9PLZUX6bES3zdW3cfa7u9muj2re1poORbwcPnaLg2AI8aSmbJZfTZM7uOIQldSoFuQBUzdpJWKdYEfDcBBTkvJ0P1mOsQ3W4bQd5mX8SW6o7M4ei5DdpXMCYPHx5khCNw4VdIAmmLs+6NC8ZKFC/4BbaUFU/apf45fArMrPhIhTMWOavTTP1Y/aNkm77sC9uBzy3YT96JuTjt7hj80zGIOlrl/KAoPyUYh3MNOQmU3YNDZfwqoepluCTiFSte4c8EsAawoE5gljEFvL77gW43EgdSvB0wGbZt3QFodfYoMgWMqmoP4rz7RnIUk6gr1CXgx0H9uvIcUO7T7WNSDNS75Pos6G6ydCE1s9yBIyDq5sN3MV6W2YaH7F/cWhHwQITvwcBJXG5/FGhc31hxLddBloqs9zIOcqxYYxovtqQOugLQfFr4uKYwPshrloJWxTtc2y0sBS/3zr6+fx+hphTIa5fvHASf6SRGm7CVV5VNAlMlje888vlO44Sqg0/4Xuo03jbE8z1u7a/9AKSxbgjb+ubvDAlQgPunQ4vZJFJdpDDv4EmjF5xY15GT61uQRCBdVhkDGupuTLqvWvRN51vg4R7wRZLzksue4H2FOpJqaQpfvt1H0qNAf/bNlsaLT6dVWzQwmg5Wz++Ram2BrdUI0KHRuhJrvOVmkp+SCN5YsOqlFmWIDIbgnKD3+VdIbH/fSV25sRb5DT9b1gM035Ea0uSK6nop+BI4sup/LvkZ9s+rGzgHZseCdewiEHNWHNn5U5fX45zAuHDhJHaTawo+xfSoQC FgujEpDM 6OLrXSEo9I40Jj/J848unOYupNzRUYK0nILAU2hkk+HE6IyyfqFD/53lFTiztD0NELG6AYTcxQbdbtj92qReF62WL2cFi+l97pbjtf3RJvAm+oz4MKP4YRAt9Ce6nuCtC5BQiSnp7jxGCfrZ+nkdvxBSOaanuX6psSz61v0gnG/dO61iNuxIzr4VtnZkfZBjKuTDLcAyfvI7oFDhzqc8/21rNJA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Jul 24, 2023 at 11:47:02AM +0800, Abel Wu wrote: > Hi Roman, thanks for taking time to have a look! > > On 7/22/23 8:20 AM, Roman Gushchin wrote: > > On Tue, Jul 11, 2023 at 08:41:43PM +0800, Abel Wu wrote: > > > Now there are two indicators of socket memory pressure sit inside > > > struct mem_cgroup, socket_pressure and tcpmem_pressure. > > > > Hi Abel! > > > > > When in legacy mode aka. cgroupv1, the socket memory is charged > > > into a separate counter memcg->tcpmem rather than ->memory, so > > > the reclaim pressure of the memcg has nothing to do with socket's > > > pressure at all. > > > > But we still might set memcg->socket_pressure and propagate the pressure, > > right? > > Yes, but the pressure comes from memcg->socket_pressure does not mean > pressure in socket memory in cgroupv1, which might lead to premature > reclamation or throttling on socket memory allocation. As the following > example shows: > > ->memory ->tcpmem > limit 10G 10G > usage 9G 4G > pressure true false Yes, now it makes sense to me. Thank you for the explanation. Then I'd organize the patchset in the following way: 1) cgroup v1-only fix to not throttle tcpmem based on the vmpressure 2) a formal code refactoring > > the memcg's memory limits are both set to 10G, and the ->memory part > is suffering from reclaim pressure while ->tcpmem still has much room > for use. I have no idea why should treat the ->tcpmem as under pressure > in this scenario, am I missed something? > > > If you're changing this, you need to provide a bit more data on why it's > > a good idea. I'm not saying the current status is perfect, but I think we need > > a bit more justification for this change. > > > > > While for default mode, the ->tcpmem is simply > > > not used. > > > > > > So {socket,tcpmem}_pressure are only used in default/legacy mode > > > respectively. This patch fixes the pieces of code that make mixed > > > use of both. > > > > > > Signed-off-by: Abel Wu > > > --- > > > include/linux/memcontrol.h | 4 ++-- > > > mm/vmpressure.c | 8 ++++++++ > > > 2 files changed, 10 insertions(+), 2 deletions(-) > > > > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > > > index 5818af8eca5a..5860c7f316b9 100644 > > > --- a/include/linux/memcontrol.h > > > +++ b/include/linux/memcontrol.h > > > @@ -1727,8 +1727,8 @@ void mem_cgroup_sk_alloc(struct sock *sk); > > > void mem_cgroup_sk_free(struct sock *sk); > > > static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg) > > > { > > > - if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure) > > > - return true; > > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > > + return !!memcg->tcpmem_pressure; > > > > So here you can have something like > > if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) { > > do { > > if (time_before(jiffies, READ_ONCE(memcg->socket_pressure))) > > return true; > > } while ((memcg = parent_mem_cgroup(memcg))); > > } else { > > return !!READ_ONCE(memcg->socket_pressure); > > } > > Yes, this looks better. > > > > > And, please, add a bold comment here or nearby the socket_pressure definition > > that it has a different semantics in the legacy and default modes. > > Agreed. > > > > > Overall I think it's a good idea to clean these things up and thank you > > for working on this. But I wonder if we can make the next step and leave only > > one mechanism for both cgroup v1 and v2 instead of having this weird setup > > where memcg->socket_pressure is set differently from different paths on cgroup > > v1 and v2. > > There is some difficulty in unifying the mechanism for both cgroup > designs. Throttling socket memory allocation when memcg is under > pressure only makes sense when socket memory and other usages are > sharing the same limit, which is not true for cgroupv1. Thoughts? I see... Generally speaking cgroup v1 is considered frozen, so we can leave it as it is, except when it creates an unnecessary complexity in the code. I'm curious, was your work driven by some real-world problem or a desire to clean up the code? Both are valid reasons of course. Thanks!