From: Eric Dumazet
Date: Wed, 10 May 2023 13:24:53 +0200
Subject: Re: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper size
To: "Zhang, Cathy"
Cc: Shakeel Butt, Linux MM, Cgroups, Paolo Abeni, "davem@davemloft.net",
    "kuba@kernel.org", "Brandeburg, Jesse", "Srinivas, Suresh", "Chen, Tim C",
    "You, Lizhen", "eric.dumazet@gmail.com", "netdev@vger.kernel.org"

On Wed, May 10, 2023 at 1:11 PM Zhang, Cathy wrote:
>
> Hi Shakeel, Eric and all,
>
> How about adding a memory pressure check in sk_mem_uncharge()
> to decide whether to keep part of the memory? That could help avoid
> both the issue you fixed and the problem we found on systems with
> more CPUs.
>
> The code draft is like this:
>
> static inline void sk_mem_uncharge(struct sock *sk, int size)
> {
>         int reclaimable;
>         int reclaim_threshold = SK_RECLAIM_THRESHOLD;
>
>         if (!sk_has_account(sk))
>                 return;
>         sk->sk_forward_alloc += size;
>
>         if (mem_cgroup_sockets_enabled && sk->sk_memcg &&
>             mem_cgroup_under_socket_pressure(sk->sk_memcg)) {
>                 sk_mem_reclaim(sk);
>                 return;
>         }
>
>         reclaimable = sk->sk_forward_alloc - sk_unused_reserved_mem(sk);
>
>         if (reclaimable > reclaim_threshold) {
>                 reclaimable -= reclaim_threshold;
>                 __sk_mem_reclaim(sk, reclaimable);
>         }
> }
>
> I've run a test with the new code and the result looks good: it does
> not introduce latency, and RPS is the same.
>

It will not work for sockets that are idle after a burst.
If we restore per-socket caches, we will need a shrinker.
Trust me, we do not want that kind of big hammer, crushing latencies.

Have you tried increasing batch sizes?

Any kind of cache (even per-cpu) might need some adjustment
when core count or expected traffic is increasing.
This was somewhat hinted at in
commit 1813e51eece0ad6f4aacaeb738e7cced46feb470
Author: Shakeel Butt
Date:   Thu Aug 25 00:05:06 2022 +0000

    memcg: increase MEMCG_CHARGE_BATCH to 64


diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 222d7370134c73e59fdbdf598ed8d66897dbbf1d..0418229d30c25d114132a1ed46ac01358cf21424 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -334,7 +334,7 @@ struct mem_cgroup {
  * TODO: maybe necessary to use big numbers in big irons or dynamic based of the
  * workload.
  */
-#define MEMCG_CHARGE_BATCH 64U
+#define MEMCG_CHARGE_BATCH 128U

 extern struct mem_cgroup *root_mem_cgroup;

diff --git a/include/net/sock.h b/include/net/sock.h
index 656ea89f60ff90d600d16f40302000db64057c64..82f6a288be650f886b207e6a5e62a1d5dda808b0 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1433,8 +1433,8 @@ sk_memory_allocated(const struct sock *sk)
        return proto_memory_allocated(sk->sk_prot);
 }

-/* 1 MB per cpu, in page units */
-#define SK_MEMORY_PCPU_RESERVE (1 << (20 - PAGE_SHIFT))
+/* 2 MB per cpu, in page units */
+#define SK_MEMORY_PCPU_RESERVE (1 << (21 - PAGE_SHIFT))

 static inline void
 sk_memory_allocated_add(struct sock *sk, int amt)

> > -----Original Message-----
> > From: Shakeel Butt
> > Sent: Wednesday, May 10, 2023 12:10 AM
> > To: Eric Dumazet; Linux MM <linux-mm@kvack.org>; Cgroups
> > Cc: Zhang, Cathy; Paolo Abeni; davem@davemloft.net; kuba@kernel.org;
> > Brandeburg, Jesse; Srinivas, Suresh; Chen, Tim C; You, Lizhen;
> > eric.dumazet@gmail.com; netdev@vger.kernel.org
> > Subject: Re: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper
> > size
> >
> > +linux-mm & cgroup
> >
> > Thread: https://lore.kernel.org/all/20230508020801.10702-1-cathy.zhang@intel.com/
> >
> > On Tue, May 9, 2023 at 8:43 AM Eric Dumazet wrote:
> > >
> > [...]
> > > Some mm experts should chime in, this is not a networking issue.
> >
> > Most of the MM folks are busy at LSFMM this week. I will take a look at
> > this soon.