From: Kemeng Shi <shikemeng@huaweicloud.com>
To: willy@infradead.org, akpm@linux-foundation.org
Cc: tj@kernel.org, jack@suse.cz, hcochran@kernelspring.com,
axboe@kernel.dk, mszeredi@redhat.com,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/4] mm: correct calculation of wb's bg_thresh in cgroup domain
Date: Thu, 25 Apr 2024 21:17:22 +0800
Message-ID: <20240425131724.36778-3-shikemeng@huaweicloud.com>
In-Reply-To: <20240425131724.36778-1-shikemeng@huaweicloud.com>
wb_calc_thresh() is meant to calculate a wb's share of bg_thresh in the
global domain. To calculate a wb's share of bg_thresh in a cgroup domain,
it is more reasonable to use __wb_calc_thresh(), which is how
balance_dirty_pages() calculates dirty_thresh in the cgroup domain.
Consider the following domain hierarchy:
                global domain (> 20G)
                /                 \
      cgroup domain1(10G)     cgroup domain2(10G)
                |                 |
bdi            wb1               wb2
Assume wb1 and wb2 have the same bandwidth.
We have global domain bg_thresh > 2G and cgroup domain bg_thresh 1G.
Then we have:
wb's thresh in global domain = 2G * (wb bandwidth) / (system bandwidth)
                             = 2G * 1/2 = 1G
wb's thresh in cgroup domain = 1G * (wb bandwidth) / (system bandwidth)
                             = 1G * 1/2 = 0.5G
As a result, wb1 and wb2 are each limited to 0.5G, so the system as a
whole is limited to 1G, which is less than the global domain bg_thresh
of 2G.
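The arithmetic above can be sketched as a small userspace model. This is
only an illustration, not kernel code: share_of_thresh(),
cgroup_wb_thresh_before() and cgroup_wb_thresh_after() are hypothetical
helpers, sizes are in MiB, and bandwidth is in arbitrary units. The
"after" case assumes that, with the fix, the wb's share is taken within
its own cgroup domain, where it is the only writer in this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code: scale a threshold by a
 * wb's share of the bandwidth of some domain. Sizes are in MiB.
 */
static uint64_t share_of_thresh(uint64_t thresh_mib, uint64_t wb_bw,
				uint64_t domain_bw)
{
	assert(domain_bw > 0 && wb_bw <= domain_bw);
	return thresh_mib * wb_bw / domain_bw;
}

/* Before the fix: the cgroup bg_thresh is scaled by the system-wide share. */
static uint64_t cgroup_wb_thresh_before(uint64_t cg_bg_thresh_mib,
					uint64_t wb_bw, uint64_t system_bw)
{
	return share_of_thresh(cg_bg_thresh_mib, wb_bw, system_bw);
}

/*
 * After the fix (assumption): the share is taken within the wb's own
 * cgroup domain, where wb1 is the only writer in the example.
 */
static uint64_t cgroup_wb_thresh_after(uint64_t cg_bg_thresh_mib,
				       uint64_t wb_bw, uint64_t cgroup_bw)
{
	return share_of_thresh(cg_bg_thresh_mib, wb_bw, cgroup_bw);
}
```

With a cgroup bg_thresh of 1024 MiB, a wb bandwidth of 1 and a system
bandwidth of 2, cgroup_wb_thresh_before() gives 512 MiB per wb, matching
the 0.5G above. Under the stated assumption, cgroup_wb_thresh_after()
with a cgroup bandwidth of 1 gives the full 1024 MiB.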
Test as follows:
# make it easier to observe the issue
echo 300000 > /proc/sys/vm/dirty_expire_centisecs
echo 100 > /proc/sys/vm/dirty_writeback_centisecs
# run fio in wb1
cd /sys/fs/cgroup
echo "+memory +io" > cgroup.subtree_control
mkdir group1
cd group1
echo 10G > memory.high
echo 10G > memory.max
echo $$ > cgroup.procs
mkfs.ext4 -F /dev/vdb
mount /dev/vdb /bdi1/
fio -name test -filename=/bdi1/file -size=600M -ioengine=libaio -bs=4K \
    -iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0
# run fio in wb2 with a new shell
cd /sys/fs/cgroup
mkdir group2
cd group2
echo 10G > memory.high
echo 10G > memory.max
echo $$ > cgroup.procs
mkfs.ext4 -F /dev/vdc
mount /dev/vdc /bdi2/
fio -name test -filename=/bdi2/file -size=600M -ioengine=libaio -bs=4K \
    -iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0
Before the fix, the written pages of wb1 and wb2 reported by
tools/writeback/wb_monitor.py keep growing. After the fix, few written
pages accumulate.
There is no obvious change in the fio results.
Fixes: 74d369443325 ("writeback: Fix performance regression in wb_over_bg_thresh()")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/page-writeback.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 2a3b68aae336..14893b20d38c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2137,7 +2137,7 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
 		if (mdtc->dirty > mdtc->bg_thresh)
 			return true;
 
-		thresh = wb_calc_thresh(mdtc->wb, mdtc->bg_thresh);
+		thresh = __wb_calc_thresh(mdtc, mdtc->bg_thresh);
 		if (thresh < 2 * wb_stat_error())
 			reclaimable = wb_stat_sum(wb, WB_RECLAIMABLE);
 		else
--
2.30.0