From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 12 Oct 2023 01:04:03 -0700
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
    Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long,
    kernel-team@cloudflare.com, Wei Xu, Greg Thelen, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20231010032117.1577496-1-yosryahmed@google.com>
    <20231010032117.1577496-4-yosryahmed@google.com>
    <20231011003646.dt5rlqmnq6ybrlnd@google.com>

On Wed, Oct 11, 2023 at 8:13 PM Yosry Ahmed wrote:
>
> On Wed, Oct 11, 2023 at 5:46 AM Shakeel Butt wrote:
> >
> > On Tue, Oct 10, 2023 at 6:48 PM Yosry Ahmed wrote:
> > >
> > > On Tue, Oct 10, 2023 at 5:36 PM Shakeel Butt wrote:
> > > >
> > > > On Tue, Oct 10, 2023 at 03:21:47PM -0700, Yosry Ahmed wrote:
> > > > [...]
> > > > >
> > > > > I tried this on a machine with 72 cpus (also ixion), running both
> > > > > netserver and netperf in /sys/fs/cgroup/a/b/c/d as follows:
> > > > > # echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a
> > > > > # echo "+memory" > /sys/fs/cgroup/a/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/c/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c/d
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # ./netserver -6
> > > > >
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # for i in $(seq 10); do ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE --
> > > > > -m 10K; done
> > > >
> > > > You are missing '&' at the end. Use something like below:
> > > >
> > > > #!/bin/bash
> > > > for i in {1..22}
> > > > do
> > > >    /data/tmp/netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K &
> > > > done
> > > > wait
> > > >
> > >
> > > Oh sorry I missed the fact that you are running instances in parallel, my bad.
> > >
> > > So I ran 36 instances on a machine with 72 cpus. I did this 10 times
> > > and got an average from all instances for all runs to reduce noise:
> > >
> > > #!/bin/bash
> > >
> > > ITER=10
> > > NR_INSTANCES=36
> > >
> > > for i in $(seq $ITER); do
> > >   echo "iteration $i"
> > >   for j in $(seq $NR_INSTANCES); do
> > >     echo "iteration $i" >> "out$j"
> > >     ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K >> "out$j" &
> > >   done
> > >   wait
> > > done
> > >
> > > cat out* | grep 540000 | awk '{sum += $5} END {print sum/NR}'
> > >
> > > Base: 22169 mbps
> > > Patched: 21331.9 mbps
> > >
> > > The difference is ~3.7% in my runs. I am not sure what's different.
> > > Perhaps it's the number of runs?
> >
> > My base kernel is next-20231009 and I am running experiments with
> > hyperthreading disabled.
>
> Using next-20231009 and a similar 44 core machine with hyperthreading
> disabled, I ran 22 instances of netperf in parallel and got the
> following numbers from averaging 20 runs:
>
> Base: 33076.5 mbps
> Patched: 31410.1 mbps
>
> That's about 5% diff. I guess the number of iterations helps reduce
> the noise? I am not sure.
>
> Please also keep in mind that in this case all netperf instances are
> in the same cgroup and at a 4-level depth. I imagine in a practical
> setup processes would be a little more spread out, which means less
> common ancestors, so less contended atomic operations.

(Resending this reply because the previous one was not sent in plain text.)

I was curious, so I ran the same test in a cgroup 2 levels deep
(i.e. /sys/fs/cgroup/a/b), which is a much more common setup in my
experience. Here are the numbers:

Base: 40198.0 mbps
Patched: 38629.7 mbps

The regression is reduced to ~3.9%.

What's more interesting is that going from a level 2 cgroup to a level
4 cgroup is already a big hit, with or without this patch:

Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
Patched: 38629.7 -> 31410.1 mbps (~18.7% regression)

So going from level 2 to level 4 is already a significant regression for
other reasons (e.g. hierarchical charging), and this patch only makes it
marginally worse. I think this puts the numbers into better perspective
than comparing values at level 4 alone. What do you think?
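
For anyone who wants to double-check the percentages, here is a rough
sketch of the arithmetic. regression_pct is just a hypothetical helper
name I made up for illustration (it is not part of netperf), and the
inputs are the averaged mbps figures quoted above:

```shell
#!/bin/bash
# Sketch only: relative throughput drop, in percent, from a baseline
# value to a new value. regression_pct is a hypothetical helper name.
regression_pct() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Cost of the patch at a fixed cgroup depth (base -> patched).
echo "level 2, base -> patched: $(regression_pct 40198.0 38629.7)%"
echo "level 4, base -> patched: $(regression_pct 33076.5 31410.1)%"

# Cost of the extra depth on a fixed kernel (level 2 -> level 4).
echo "base, level 2 -> 4:       $(regression_pct 40198.0 33076.5)%"
echo "patched, level 2 -> 4:    $(regression_pct 38629.7 31410.1)%"
```

The first pair isolates what the patch costs at a given depth, the
second pair isolates what the two extra cgroup levels cost on a given
kernel, which is the comparison I am arguing for above.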