Message-ID: <58e75f44-16e3-a40a-4c8a-0f61bbf393f9@bytedance.com>
Date: Mon, 24 Jul 2023 11:47:02 +0800
Subject: Re: Re: [PATCH RESEND net-next 1/2] net-memcg: Scopify the indicators of sockmem pressure
From: Abel Wu
To: Roman Gushchin
Cc: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
    Andrew Morton, David Ahern, Yosry Ahmed, "Matthew Wilcox (Oracle)",
    Yu Zhao, Kefeng Wang, Yafang Shao, Kuniyuki Iwashima,
    Martin KaFai Lau, Alexander Mikhalitsyn, Breno Leitao, David Howells,
    Jason Xing, Xin Long, Michal Hocko, Alexei Starovoitov, open list,
    "open list:NETWORKING [GENERAL]",
    "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)",
    "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
References: <20230711124157.97169-1-wuyun.abel@bytedance.com>

Hi Roman, thanks for taking the time to have a look!

On 7/22/23 8:20 AM, Roman Gushchin wrote:
> On Tue, Jul 11, 2023 at 08:41:43PM +0800, Abel Wu wrote:
>> Now there are two indicators of socket memory pressure sit inside
>> struct mem_cgroup, socket_pressure and tcpmem_pressure.
>
> Hi Abel!
>
>> When in legacy mode aka. cgroupv1, the socket memory is charged
>> into a separate counter memcg->tcpmem rather than ->memory, so
>> the reclaim pressure of the memcg has nothing to do with socket's
>> pressure at all.
>
> But we still might set memcg->socket_pressure and propagate the pressure,
> right?

Yes, but in cgroupv1 the pressure indicated by memcg->socket_pressure
does not mean pressure on socket memory, so honoring it might lead to
premature reclamation or throttling of socket memory allocation, as
the following example shows:

                ->memory    ->tcpmem
    limit       10G         10G
    usage       9G          4G
    pressure    true        false

Both limits of the memcg are set to 10G, and the ->memory part is
suffering from reclaim pressure while ->tcpmem still has plenty of
room. I don't see why we should treat ->tcpmem as under pressure in
this scenario. Am I missing something?

> If you're changing this, you need to provide a bit more data on why it's
> a good idea.
> I'm not saying the current status is perfect, but I think we need
> a bit more justification for this change.
>
>> While for default mode, the ->tcpmem is simply
>> not used.
>>
>> So {socket,tcpmem}_pressure are only used in default/legacy mode
>> respectively. This patch fixes the pieces of code that make mixed
>> use of both.
>>
>> Signed-off-by: Abel Wu
>> ---
>>   include/linux/memcontrol.h | 4 ++--
>>   mm/vmpressure.c            | 8 ++++++++
>>   2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index 5818af8eca5a..5860c7f316b9 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -1727,8 +1727,8 @@ void mem_cgroup_sk_alloc(struct sock *sk);
>>   void mem_cgroup_sk_free(struct sock *sk);
>>   static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
>>   {
>> -    if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
>> -        return true;
>> +    if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
>> +        return !!memcg->tcpmem_pressure;
>
> So here you can have something like
>    if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
>        do {
>            if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
>                return true;
>        } while ((memcg = parent_mem_cgroup(memcg)));
>    } else {
>        return !!READ_ONCE(memcg->socket_pressure);
>    }

Yes, this looks better.

>
> And, please, add a bold comment here or nearby the socket_pressure definition
> that it has a different semantics in the legacy and default modes.

Agreed.

>
> Overall I think it's a good idea to clean these things up and thank you
> for working on this. But I wonder if we can make the next step and leave only
> one mechanism for both cgroup v1 and v2 instead of having this weird setup
> where memcg->socket_pressure is set differently from different paths on cgroup
> v1 and v2.

There is some difficulty in unifying the mechanism for both cgroup designs.
Throttling socket memory allocation when the memcg is under pressure only
makes sense when socket memory and other usages share the same limit,
which is not true for cgroupv1. Thoughts?

Thanks & Best,
	Abel
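
For illustration only, here is a minimal userspace sketch of the per-mode
check discussed in this thread. It is not kernel code: memcg_model,
model_time_before() and model_under_socket_pressure() are hypothetical
names made up for this sketch, the on_dfl argument stands in for
cgroup_subsys_on_dfl(memory_cgrp_subsys), and "now" stands in for jiffies.
It assumes the legacy branch consults only tcpmem_pressure, as in the
posted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel layout. */
struct memcg_model {
	struct memcg_model *parent;
	unsigned long socket_pressure;	/* v2: deadline until which pressure is assumed */
	bool tcpmem_pressure;		/* v1: state of the separate tcpmem counter */
};

/* Wraparound-tolerant "a is before b", in the spirit of time_before(). */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * on_dfl == false (cgroup v1): only the memcg's own tcpmem_pressure counts.
 * on_dfl == true  (cgroup v2): reclaim pressure on the memcg or on any
 * ancestor counts, because socket memory shares the ->memory limit.
 */
static bool model_under_socket_pressure(const struct memcg_model *memcg,
					bool on_dfl, unsigned long now)
{
	if (!on_dfl)
		return memcg->tcpmem_pressure;

	for (; memcg; memcg = memcg->parent)
		if (model_time_before(now, memcg->socket_pressure))
			return true;

	return false;
}
```

With the state from the example table (->memory under reclaim pressure,
->tcpmem not), the legacy-mode call returns false, while the same state
evaluated in default mode returns true.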