From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC92EC5320E for ; Mon, 26 Aug 2024 01:32:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C4E4F8D0040; Sun, 25 Aug 2024 21:32:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BFEB28D001C; Sun, 25 Aug 2024 21:32:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AECEE8D0040; Sun, 25 Aug 2024 21:32:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 90E6D8D001C for ; Sun, 25 Aug 2024 21:32:01 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 36F37140A43 for ; Mon, 26 Aug 2024 01:32:01 +0000 (UTC) X-FDA: 82492670442.02.790C14A Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by imf29.hostedemail.com (Postfix) with ESMTP id 992A512000C for ; Mon, 26 Aug 2024 01:31:56 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=none; spf=pass (imf29.hostedemail.com: domain of yukuai1@huaweicloud.com designates 45.249.212.51 as permitted sender) smtp.mailfrom=yukuai1@huaweicloud.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724635834; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pDlQiXfZbjiUZev3sBXFrEpqG8SyHIh45sg4QgKogb0=; b=5axYIo89lQKG7Stz/OqsPBdFfrime6/fbGLXE0W1vR+HKn99/Z6lbUphqk4/QsEwBrJ1aR lcDSkdqZH1agQfAamcDPgYh+y/fCqyjdoBFKPDJx80C+q608+UjBWDtyYG7KuYS1jkqPEA 0zL351nOlMW856Ur7uDVJJ+Tu0tbO2Q= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724635834; a=rsa-sha256; cv=none; b=jKwi1zBQXhTfZbb2rvxPa27gR7AdWVj/EqoC1eFVRxKO3+o3Mzdw0PF+nABUnTMh5NT7GF KtV7VjYcGlSNp2KMmIyYwupBW554pEcZd1Xmk6HSDXjMLa0WPTvHFIUbEhvwY/DkgHxCd0 0yBDI3NYEOcIAkh1p9JxnglgaWvi2e4= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=none; spf=pass (imf29.hostedemail.com: domain of yukuai1@huaweicloud.com designates 45.249.212.51 as permitted sender) smtp.mailfrom=yukuai1@huaweicloud.com; dmarc=none Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4WsY6v1zGdz4f3kvn for ; Mon, 26 Aug 2024 09:31:35 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 994181A0568 for ; Mon, 26 Aug 2024 09:31:50 +0800 (CST) Received: from [10.174.176.73] (unknown [10.174.176.73]) by APP4 (Coremail) with SMTP id gCh0CgBnj4UE28tme4DrCg--.7908S3; Mon, 26 Aug 2024 09:31:50 +0800 (CST) Subject: Re: [BUG] cgroupv2/blk: inconsistent I/O behavior in Cgroup v2 with set device wbps and wiops To: Lance Yang , yukuai1@huaweicloud.com Cc: 21cnbao@gmail.com, a.hindborg@samsung.com, axboe@kernel.dk, baolin.wang@linux.alibaba.com, boqun.feng@gmail.com, cgroups@vger.kernel.org, david@redhat.com, fujita.tomonori@lab.ntt.co.jp, josef@toxicpanda.com, libang.li@antgroup.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mkoutny@suse.com, paolo.valente@unimore.it, tj@kernel.org, vbabka@kernel.org, "yukuai (C)" References: <7c3499ac-faa7-cc0c-2d90-b8291fce5492@huaweicloud.com> <20240823120510.61853-1-ioworker0@gmail.com> From: Yu Kuai Message-ID: <36c4744a-3827-f6d7-664a-8ee2b7d0e281@huaweicloud.com> Date: Mon, 26 Aug 2024 09:31:48 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <20240823120510.61853-1-ioworker0@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-CM-TRANSID:gCh0CgBnj4UE28tme4DrCg--.7908S3 X-Coremail-Antispam: 1UD129KBjvJXoW3XryrKrWkCw4ruryfZFykuFg_yoW7tw4fpr y5ZF4UXrs5Gr929w4kAw40yFsYqw47J3y5ZF9Ikr1xC34fZ3srJryDAF4agFWYqa13GF47 Aws7GryIyF4DJ3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBF14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK02 1l84ACjcxK6xIIjxv20xvE14v26F1j6w1UM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4U JVWxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gc CE3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E 2Ix0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJV W8JwACjcxG0xvEwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2Y2ka 0xkIwI1lc7I2V7IY0VAS07AlzVAYIcxG8wCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x 0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E 7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcV C0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF 04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7 CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pRHUDLUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-Stat-Signature: yzwj6f3a3yonq81etcq3jfk9swiq3hqe X-Rspamd-Queue-Id: 992A512000C X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1724635916-481535 X-HE-Meta: U2FsdGVkX19D7GQMh3s5dEC9xstMMR3lOR+unZTuSHZbKrqWV5VdoS8jOA3UwtDZ3pH9z+WUz45hFB8zBoa3HwBaGBU6rwAykrGH85pocgHlLIDkK94DERBLaeVGn5OZCdw45ua1sdLN7P7zS54VzV6Awe6LvAFPyBqJz+1pLfILbhU3pr+jHlQG01oxvdZrSuHta8Lo76mHVgxnnhrTazneS7OSBhccweZGahxLXrNARZZ+v2szMW2+5Q+tl+jo5NcObXz/JdwzIfhfkcLyb35l7j2wvNGE4C4vxH+Yzp9ti8oinmcXBu52T/eKy9UCbnpYzJkAxreFfhZYxOY/q8BaGyosYosmoTYvukI5+XMqcEpvwcleCF13Dj2LBUIbPJYHdV/reoAoQJFk04sp+8HAf4vbC8wU/biDPjDa1yMk4FL+QtRHWjToEXT5jizh9kc9fvZP4gyLcCdZMXyMByEvO3JtdwT8RUxP21jKQzsQJTEaJ72pfoIe+0R/ku+IYoD+YuWwVCjkQXXszeHToyOMCjA0P6RYWycJzN2TjkcWD/SNqtRoV0f3/IjqpmEYxSkpe47Zys9SyjCCKtKNIsEPbjhU0u2IIbPaCTXh2yoZAx+WD9IhudZhUpqR0lGdtPz12xM+ZIjqTFNBssHuj3calCCjVTbw6jK+x+KtVey5bL5ogQOuAMKATDt4omM2QU08/rOuzTl3Zg8DAdC2pdAzOq+swC4fS+h3tOq3IAdVWXYwR1MyW3HM/Z2ubGNIPpCFYhJnXHVkBqZCoK+6/c5lngpq6jmxwPlnhekZwpFSvAOiPnvosLNfEUfrfU4aQX/HGRZAjclW/gp77lRvtgqoZwD1Ax1tcqsTLR+PDlIPqGkkLtb6egh0pqx7mfanR5mfTrWKyjKtUn5HIory1ZWWwQexNEKWL6t4W1kThQOGhUUsCOtp+bYdsO3v4eqiV+BYgg11jtt9KfyoJ2m Bvi7fDam jYI8nTqlW5JjkMGG9JohyfEnM4iAuC/VdtSzUr9zIFbbOtp9bSLJLg/UoQoeOrSbuNC2RZf02zjfjBY7h4udoNMzJzdBfJoJEuAJuzAMuTDmW7YLQxmQBm2FVX8/z6Fy5h9UgjwbqAYBCS+iOeEM9Dv+g00QWbk+7orZnASa0mkc59PMW2Qh3NXbrS/rzoMLd4GHe1jnOE9C6jjtA2t6z3kRmqYbsAqAVct3S X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi, 在 2024/08/23 20:05, Lance Yang 写道: > My bad, I got tied up with some stuff :( > > Hmm... tried your debug patch today, but my test results are different from > yours. So let's take a look at direct IO with raw disk first. > > ``` > $ lsblk > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS > sda 8:0 0 90G 0 disk > ├─sda1 8:1 0 1G 0 part /boot/efi > └─sda2 8:2 0 88.9G 0 part / > sdb 8:16 0 10G 0 disk > > $ cat /sys/block/sda/queue/scheduler > none [mq-deadline] > > $ cat /sys/block/sda/queue/rotational > 0 > > $ cat /sys/block/sdb/queue/rotational > 0 > > $ cat /sys/block/sdb/queue/scheduler > none [mq-deadline] > > $ cat /boot/config-6.11.0-rc3+ |grep CONFIG_CGROUP_ > # CONFIG_CGROUP_FAVOR_DYNMODS is not set > CONFIG_CGROUP_WRITEBACK=y > CONFIG_CGROUP_SCHED=y > CONFIG_CGROUP_PIDS=y > CONFIG_CGROUP_RDMA=y > CONFIG_CGROUP_FREEZER=y > CONFIG_CGROUP_HUGETLB=y > CONFIG_CGROUP_DEVICE=y > CONFIG_CGROUP_CPUACCT=y > CONFIG_CGROUP_PERF=y > CONFIG_CGROUP_BPF=y > CONFIG_CGROUP_MISC=y > # CONFIG_CGROUP_DEBUG is not set > CONFIG_CGROUP_NET_PRIO=y > CONFIG_CGROUP_NET_CLASSID=y > > $ cd /sys/fs/cgroup/test/ && cat cgroup.controllers > cpu io memory pids > > $ cat io.weight > default 100 > > $ cat io.prio.class > no-change > ``` > > With wiops, the result is as follows: > > ``` > $ echo "8:16 wbps=10485760 wiops=100000" > io.max > > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct > 1+0 records in > 1+0 records out > 52428800 bytes (52 MB, 50 MiB) copied, 5.05893 s, 10.4 MB/s > > $ dmesg -T > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000 > > And without wiops, the result is quite different: > > ``` > $ echo "8:16 wbps=10485760 wiops=max" > io.max > > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct > 1+0 records in > 1+0 records out > 52428800 bytes (52 MB, 50 MiB) copied, 5.08187 s, 10.3 MB/s > > $ dmesg -T > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 2880 ffff0000c74659c0 > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 6992 ffff00014f621b80 > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 92528 ffff00014f620dc0 I don't know why IO size from fs layer is different in this case. > ``` > > Then, I retested for ext4 as you did. > > ``` > $ lsblk > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS > sda 8:0 0 90G 0 disk > ├─sda1 8:1 0 1G 0 part /boot/efi > └─sda2 8:2 0 88.9G 0 part / > sdb 8:16 0 10G 0 disk > > $ df -T /data > Filesystem Type 1K-blocks Used Available Use% Mounted on > /dev/sda2 ext4 91222760 54648704 31894224 64% / > ``` > > With wiops, the result is as follows: > > ``` > $ echo "8:0 wbps=10485760 wiops=100000" > io.max > > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct > 1+0 records in > 1+0 records out > 52428800 bytes (52 MB, 50 MiB) copied, 5.06227 s, 10.4 MB/s > > $ dmesg -T > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0 > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000 > > And without wiops, the result is also quite different: > > ``` > $ echo "8:0 wbps=10485760 wiops=max" > io.max > > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct > 1+0 records in > 1+0 records out > 52428800 bytes (52 MB, 50 MiB) copied, 5.03759 s, 10.4 MB/s > > $ dmesg -T > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 2904 ffff0000c4e9f2c0 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 5984 ffff0000c4e9e000 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 7496 ffff0000c4e9e3c0 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9eb40 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f540 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9e780 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9ea00 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f900 > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 4096 ffff0000c4e9e8c0 While ext4 is the same. And I won't say result is different here. > [ > ``` > > Hmm... I still hava two questions here: > 1. Is wbps an average value? Yes. > 2. What's the difference between setting 'max' and setting a very high value for 'wiops'? The only difference is that: - If there is no iops limit, splited IO will be dispatched directly; - If there is iops limit, splited IO will be throttled again. iops is high, however, blk-throtl is FIFO, splited IO will have to wait for formal request to be throttled by bps first before checking the iops limit for splited IO. Thanks, Kuai > > Thanks a lot again for your time! > Lance > . >