From: Jim Zhao
To: jack@suse.cz, shikemeng@huaweicloud.com
Cc: akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, willy@infradead.org, jimzhao.ai@gmail.com
Subject: Re: [PATCH v2] mm/page-writeback: Raise wb_thresh to prevent write blocking with strictlimit
Date: Tue, 19 Nov 2024 20:29:22 +0800
Message-Id: <20241119122922.3939538-1-jimzhao.ai@gmail.com>
In-Reply-To: <20241119114444.3925495-1-jimzhao.ai@gmail.com>
References: <20241119114444.3925495-1-jimzhao.ai@gmail.com>

Thanks, Jan, I just sent patch v2, could you please review it?

And I found the debug info in the bdi stats.
The BdiDirtyThresh value may be greater than DirtyThresh, and after
applying this patch, the value of BdiDirtyThresh could become even larger.

without patch:
---
root@ubuntu:/sys/kernel/debug/bdi/8:0# cat stats
BdiWriteback:                0 kB
BdiReclaimable:             96 kB
BdiDirtyThresh:        1346824 kB
DirtyThresh:            673412 kB
BackgroundThresh:       336292 kB
BdiDirtied:              19872 kB
BdiWritten:              19776 kB
BdiWriteBandwidth:           0 kBps
b_dirty:                     0
b_io:                        0
b_more_io:                   0
b_dirty_time:                0
bdi_list:                    1
state:                       1

with patch:
---
root@ubuntu:/sys/kernel/debug/bdi/8:0# cat stats
BdiWriteback:               96 kB
BdiReclaimable:            192 kB
BdiDirtyThresh:        3090736 kB
DirtyThresh:            650716 kB
BackgroundThresh:       324960 kB
BdiDirtied:             472512 kB
BdiWritten:             470592 kB
BdiWriteBandwidth:      106268 kBps
b_dirty:                     2
b_io:                        0
b_more_io:                   0
b_dirty_time:                0
bdi_list:                    1
state:                       1

@kemeng, is this normal behavior or an issue?

Thanks,
Jim Zhao

> With the strictlimit flag, wb_thresh acts as a hard limit in
> balance_dirty_pages() and wb_position_ratio(). When device write
> operations are inactive, wb_thresh can drop to 0, causing writes to be
> blocked. The issue occasionally occurs in fuse fs, particularly with
> network backends, the write thread is blocked frequently during a period.
> To address it, this patch raises the minimum wb_thresh to a controllable
> level, similar to the non-strictlimit case.
>
> Signed-off-by: Jim Zhao
> ---
> Changes in v2:
> 1. Consolidate all wb_thresh bumping logic in __wb_calc_thresh for consistency;
> 2. Replace the limit variable with thresh for calculating the bump value,
>    as __wb_calc_thresh is also used to calculate the background threshold;
> 3. Add domain_dirty_avail in wb_calc_thresh to get dtc->dirty.
> ---
>  mm/page-writeback.c | 48 ++++++++++++++++++++++-----------------------
>  1 file changed, 23 insertions(+), 25 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index e5a9eb795f99..8b13bcb42de3 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -917,7 +917,9 @@ static unsigned long __wb_calc_thresh(struct dirty_throttle_control *dtc,
>  				      unsigned long thresh)
>  {
>  	struct wb_domain *dom = dtc_dom(dtc);
> +	struct bdi_writeback *wb = dtc->wb;
>  	u64 wb_thresh;
> +	u64 wb_max_thresh;
>  	unsigned long numerator, denominator;
>  	unsigned long wb_min_ratio, wb_max_ratio;
>  
> @@ -931,11 +933,27 @@ static unsigned long __wb_calc_thresh(struct dirty_throttle_control *dtc,
>  	wb_thresh *= numerator;
>  	wb_thresh = div64_ul(wb_thresh, denominator);
>  
> -	wb_min_max_ratio(dtc->wb, &wb_min_ratio, &wb_max_ratio);
> +	wb_min_max_ratio(wb, &wb_min_ratio, &wb_max_ratio);
>  
>  	wb_thresh += (thresh * wb_min_ratio) / (100 * BDI_RATIO_SCALE);
> -	if (wb_thresh > (thresh * wb_max_ratio) / (100 * BDI_RATIO_SCALE))
> -		wb_thresh = thresh * wb_max_ratio / (100 * BDI_RATIO_SCALE);
> +
> +	/*
> +	 * It's very possible that wb_thresh is close to 0 not because the
> +	 * device is slow, but that it has remained inactive for long time.
> +	 * Honour such devices a reasonable good (hopefully IO efficient)
> +	 * threshold, so that the occasional writes won't be blocked and active
> +	 * writes can rampup the threshold quickly.
> +	 */
> +	if (thresh > dtc->dirty) {
> +		if (unlikely(wb->bdi->capabilities & BDI_CAP_STRICTLIMIT))
> +			wb_thresh = max(wb_thresh, (thresh - dtc->dirty) / 100);
> +		else
> +			wb_thresh = max(wb_thresh, (thresh - dtc->dirty) / 8);
> +	}
> +
> +	wb_max_thresh = thresh * wb_max_ratio / (100 * BDI_RATIO_SCALE);
> +	if (wb_thresh > wb_max_thresh)
> +		wb_thresh = wb_max_thresh;
>  
>  	return wb_thresh;
>  }
> @@ -944,6 +962,7 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh)
>  {
>  	struct dirty_throttle_control gdtc = { GDTC_INIT(wb) };
>  
> +	domain_dirty_avail(&gdtc, true);
>  	return __wb_calc_thresh(&gdtc, thresh);
>  }
>  
> @@ -1120,12 +1139,6 @@ static void wb_position_ratio(struct dirty_throttle_control *dtc)
>  	if (unlikely(wb->bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
>  		long long wb_pos_ratio;
>  
> -		if (dtc->wb_dirty < 8) {
> -			dtc->pos_ratio = min_t(long long, pos_ratio * 2,
> -					   2 << RATELIMIT_CALC_SHIFT);
> -			return;
> -		}
> -
>  		if (dtc->wb_dirty >= wb_thresh)
>  			return;
>  
> @@ -1196,14 +1209,6 @@ static void wb_position_ratio(struct dirty_throttle_control *dtc)
>  	 */
>  	if (unlikely(wb_thresh > dtc->thresh))
>  		wb_thresh = dtc->thresh;
> -	/*
> -	 * It's very possible that wb_thresh is close to 0 not because the
> -	 * device is slow, but that it has remained inactive for long time.
> -	 * Honour such devices a reasonable good (hopefully IO efficient)
> -	 * threshold, so that the occasional writes won't be blocked and active
> -	 * writes can rampup the threshold quickly.
> -	 */
> -	wb_thresh = max(wb_thresh, (limit - dtc->dirty) / 8);
>  	/*
>  	 * scale global setpoint to wb's:
>  	 *	wb_setpoint = setpoint * wb_thresh / thresh
> @@ -1459,17 +1464,10 @@ static void wb_update_dirty_ratelimit(struct dirty_throttle_control *dtc,
>  	 * balanced_dirty_ratelimit = task_ratelimit * write_bw / dirty_rate).
>  	 * Hence, to calculate "step" properly, we have to use wb_dirty as
>  	 * "dirty" and wb_setpoint as "setpoint".
> -	 *
> -	 * We rampup dirty_ratelimit forcibly if wb_dirty is low because
> -	 * it's possible that wb_thresh is close to zero due to inactivity
> -	 * of backing device.
>  	 */
>  	if (unlikely(wb->bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
>  		dirty = dtc->wb_dirty;
> -		if (dtc->wb_dirty < 8)
> -			setpoint = dtc->wb_dirty + 1;
> -		else
> -			setpoint = (dtc->wb_thresh + dtc->wb_bg_thresh) / 2;
> +		setpoint = (dtc->wb_thresh + dtc->wb_bg_thresh) / 2;
>  	}
>  
>  	if (dirty < setpoint) {
> -- 
> 2.20.1
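
For reference, the bump-and-clamp step in the quoted __wb_calc_thresh()
hunk can be modelled in user space roughly as below. This is only a sketch
for reasoning about the arithmetic, not kernel code: model_wb_thresh() and
MODEL_RATIO_SCALE are hypothetical names, the scale value 10000 is an
assumption about BDI_RATIO_SCALE rather than something taken from the
patch, and the proportional share computed earlier in the function is
passed in as the initial wb_thresh.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's BDI_RATIO_SCALE; not taken from the patch. */
#define MODEL_RATIO_SCALE 10000ULL

/*
 * Hypothetical user-space model of the bump-and-clamp step.
 * wb_thresh, thresh and dirty are page counts; max_ratio is a percentage
 * multiplied by MODEL_RATIO_SCALE (so 100% is 100 * MODEL_RATIO_SCALE).
 */
static uint64_t model_wb_thresh(uint64_t wb_thresh, uint64_t thresh,
				uint64_t dirty, uint64_t max_ratio,
				int strictlimit)
{
	uint64_t max_thresh;

	/* Raise an idle device's threshold to a share of the remaining room. */
	if (thresh > dirty) {
		uint64_t min_thresh = (thresh - dirty) / (strictlimit ? 100 : 8);

		if (wb_thresh < min_thresh)
			wb_thresh = min_thresh;
	}

	/* Apply the max_ratio cap after the bump, as the patch does. */
	max_thresh = thresh * max_ratio / (100 * MODEL_RATIO_SCALE);
	if (wb_thresh > max_thresh)
		wb_thresh = max_thresh;

	return wb_thresh;
}
```

With thresh = 1000, dirty = 200 and max_ratio at 100%, an idle strictlimit
device ends up with 8 pages in this model and a non-strictlimit one with
100. Because the cap is applied last, the bump can never push the result
above thresh * max_ratio.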