From: Lance Yang
Date: Mon, 26 Aug 2024 10:15:38 +0800
Subject: Re: [BUG] cgroupv2/blk: inconsistent I/O behavior in Cgroup v2 with set device wbps and wiops
To: Yu Kuai
Cc: 21cnbao@gmail.com, a.hindborg@samsung.com, axboe@kernel.dk, baolin.wang@linux.alibaba.com, boqun.feng@gmail.com, cgroups@vger.kernel.org, david@redhat.com, fujita.tomonori@lab.ntt.co.jp, josef@toxicpanda.com, libang.li@antgroup.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mkoutny@suse.com, paolo.valente@unimore.it, tj@kernel.org, vbabka@kernel.org, "yukuai (C)"
References: <7c3499ac-faa7-cc0c-2d90-b8291fce5492@huaweicloud.com> <20240823120510.61853-1-ioworker0@gmail.com> <36c4744a-3827-f6d7-664a-8ee2b7d0e281@huaweicloud.com>
In-Reply-To: <36c4744a-3827-f6d7-664a-8ee2b7d0e281@huaweicloud.com>

Hi Kuai,

Thanks a lot for following up on this!

On Mon, Aug 26, 2024 at 9:31 AM Yu Kuai wrote:
>
> Hi,
>
> On 2024/08/23 20:05, Lance Yang wrote:
> > My bad, I got tied up with some stuff :(
> >
> > Hmm... tried your debug patch today, but my test results are different from
> > yours. So let's take a look at direct IO with raw disk first.
> >
> > ```
> > $ lsblk
> > NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
> > sda      8:0    0   90G  0 disk
> > ├─sda1   8:1    0    1G  0 part /boot/efi
> > └─sda2   8:2    0 88.9G  0 part /
> > sdb      8:16   0   10G  0 disk
> >
> > $ cat /sys/block/sda/queue/scheduler
> > none [mq-deadline]
> >
> > $ cat /sys/block/sda/queue/rotational
> > 0
> >
> > $ cat /sys/block/sdb/queue/rotational
> > 0
> >
> > $ cat /sys/block/sdb/queue/scheduler
> > none [mq-deadline]
> >
> > $ cat /boot/config-6.11.0-rc3+ |grep CONFIG_CGROUP_
> > # CONFIG_CGROUP_FAVOR_DYNMODS is not set
> > CONFIG_CGROUP_WRITEBACK=y
> > CONFIG_CGROUP_SCHED=y
> > CONFIG_CGROUP_PIDS=y
> > CONFIG_CGROUP_RDMA=y
> > CONFIG_CGROUP_FREEZER=y
> > CONFIG_CGROUP_HUGETLB=y
> > CONFIG_CGROUP_DEVICE=y
> > CONFIG_CGROUP_CPUACCT=y
> > CONFIG_CGROUP_PERF=y
> > CONFIG_CGROUP_BPF=y
> > CONFIG_CGROUP_MISC=y
> > # CONFIG_CGROUP_DEBUG is not set
> > CONFIG_CGROUP_NET_PRIO=y
> > CONFIG_CGROUP_NET_CLASSID=y
> >
> > $ cd /sys/fs/cgroup/test/ && cat cgroup.controllers
> > cpu io memory pids
> >
> > $ cat io.weight
> > default 100
> >
> > $ cat io.prio.class
> > no-change
> > ```
> >
> > With wiops, the result is as follows:
> >
> > ```
> > $ echo "8:16 wbps=10485760 wiops=100000" > io.max
> >
> > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.05893 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000
> > ```
> >
> > And without wiops, the result is quite different:
> >
> > ```
> > $ echo "8:16 wbps=10485760 wiops=max" > io.max
> >
> > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.08187 s, 10.3 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 2880 ffff0000c74659c0
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 6992 ffff00014f621b80
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 92528 ffff00014f620dc0
>
> I don't know why the IO size from the fs layer is different in this case.

Me neither...

>
> > ```
> >
> > Then, I retested for ext4 as you did.
> >
> > ```
> > $ lsblk
> > NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
> > sda      8:0    0   90G  0 disk
> > ├─sda1   8:1    0    1G  0 part /boot/efi
> > └─sda2   8:2    0 88.9G  0 part /
> > sdb      8:16   0   10G  0 disk
> >
> > $ df -T /data
> > Filesystem     Type 1K-blocks     Used Available Use% Mounted on
> > /dev/sda2      ext4  91222760 54648704  31894224  64% /
> > ```
> >
> > With wiops, the result is as follows:
> >
> > ```
> > $ echo "8:0 wbps=10485760 wiops=100000" > io.max
> >
> > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.06227 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000
> > ```
> >
> > And without wiops, the result is also quite different:
> >
> > ```
> > $ echo "8:0 wbps=10485760 wiops=max" > io.max
> >
> > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.03759 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 2904 ffff0000c4e9f2c0
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 5984 ffff0000c4e9e000
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 7496 ffff0000c4e9e3c0
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9eb40
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f540
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9e780
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9ea00
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f900
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 4096 ffff0000c4e9e8c0
>
> While ext4 is the same. And I wouldn't say the result is different here.

Perhaps there is other subtle stuff at play since ext4 is the same?

> > [
> > ```
> >
> > Hmm... I still have two questions here:
> > 1. Is wbps an average value?
>
> Yes.
>
> > 2. What's the difference between setting 'max' and setting a very high
> >    value for 'wiops'?
>
> The only difference is that:
>
> - If there is no iops limit, split IO will be dispatched directly;
> - If there is an iops limit, split IO will be throttled again. The iops
>   limit is high; however, blk-throtl is FIFO, so split IO has to wait for
>   the former request to be throttled by bps first before the iops limit
>   is checked for the split IO.

Thanks a lot again for the lesson!
Lance

>
> Thanks,
> Kuai
>
> >
> > Thanks a lot again for your time!
> > Lance
> > .
> >
>
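
P.S. As a rough cross-check of the numbers above, here is a small shell sketch (not part of the debug patch). It assumes the value printed after `bio start` is a size in 512-byte sectors, which the thread does not state, and the function names `sum_bio_bytes` and `expected_seconds` are made up for illustration. Under that assumption each dmesg listing adds up to the 52428800 bytes that dd reports, and 52428800 / 10485760 matches the roughly 5 s seen in every run, which fits wbps being an average.

```sh
#!/bin/sh
# Assumption: the number after "bio start" is a size in 512-byte sectors.
SECTOR_BYTES=512

# Read dmesg lines on stdin, add up the "bio start <n>" values, print bytes.
sum_bio_bytes() {
    awk -v sector="$SECTOR_BYTES" '
        /bio start/ {
            # The size is the field right after the word "start".
            for (i = 1; i < NF; i++)
                if ($i == "start") total += $(i + 1)
        }
        END { printf "%d\n", total * sector }
    '
}

# Whole seconds needed to write <bytes> under an average limit of <bps>.
expected_seconds() {
    echo $(( $1 / $2 ))
}
```

Feeding only the lines from a single run into `sum_bio_bytes` (older lines in the ring buffer would be counted too) should give the dd byte count, and `expected_seconds 52428800 10485760` gives the whole-second duration to compare against dd's elapsed time.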