Date: Mon, 10 May 2021 12:44:53 +0200
From: Jan Kara
To: Andrew Morton
Cc: Chi Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tj@kernel.org, Howard Cochran, Miklos Szeredi, Jens Axboe, Jan Kara
Subject: Re: [PATCH] mm/page-writeback: Fix performance when BDI's share of ratio is 0.
Message-ID: <20210510104453.GE11100@quack2.suse.cz>
References: <20210428225046.16301-1-wuchi.zero@gmail.com> <20210509163633.ced3588cb92984c0d3835fc3@linux-foundation.org>
In-Reply-To: <20210509163633.ced3588cb92984c0d3835fc3@linux-foundation.org>

On Sun 09-05-21 16:36:33, Andrew Morton wrote:
> On Thu, 29 Apr 2021 06:50:46 +0800 Chi Wu wrote:
>
> > Fix performance when BDI's share of ratio is 0.
> >
> > The issue is similar to commit 74d369443325 ("writeback: Fix
> > performance regression in wb_over_bg_thresh()").
> >
> > Balance_dirty_pages and the writeback worker will also disagree on
> > whether writeback when a BDI uses BDI_CAP_STRICTLIMIT and BDI's share
> > of the thresh ratio is zero.
> >
> > For example, A thread on cpu0 writes 32 pages and then
> > balance_dirty_pages, it will wake up background writeback and pauses
> > because wb_dirty > wb->wb_thresh = 0 (share of thresh ratio is zero).
> > A thread may runs on cpu0 again because scheduler prefers pre_cpu.
> > Then writeback worker may runs on other cpus(1,2..) which causes the
> > value of wb_stat(wb, WB_RECLAIMABLE) in wb_over_bg_thresh is 0 and does
> > not writeback and returns.
> >
> > Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
> > worker who will do nothing. It remains stuck in this state until the
> > writeback worker hit the right dirty cpu or the dirty pages expire.
> >
> > The fix that we should get the wb_stat_sum radically when thresh is low.
>
> (optimistically Cc's various people who might remember how this code works)

Thanks for forwarding Andrew!

> > Signed-off-by: Chi Wu
>
> Thanks.  I'll add it for some testing and hopefully someone will find
> the time to review this.

Thanks for the patch! It looks good to me, good catch! Feel free to add:

Reviewed-by: Jan Kara

								Honza

> > ---
> >  mm/page-writeback.c | 20 ++++++++++++++++----
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index 0062d5c57d41..bd7052295246 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1945,6 +1945,8 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
> >  	struct dirty_throttle_control * const gdtc = &gdtc_stor;
> >  	struct dirty_throttle_control * const mdtc = mdtc_valid(&mdtc_stor) ?
> >  						     &mdtc_stor : NULL;
> > +	unsigned long reclaimable;
> > +	unsigned long thresh;
> >
> >  	/*
> >  	 * Similar to balance_dirty_pages() but ignores pages being written
> > @@ -1957,8 +1959,13 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
> >  	if (gdtc->dirty > gdtc->bg_thresh)
> >  		return true;
> >
> > -	if (wb_stat(wb, WB_RECLAIMABLE) >
> > -	    wb_calc_thresh(gdtc->wb, gdtc->bg_thresh))
> > +	thresh = wb_calc_thresh(gdtc->wb, gdtc->bg_thresh);
> > +	if (thresh < 2 * wb_stat_error())
> > +		reclaimable = wb_stat_sum(wb, WB_RECLAIMABLE);
> > +	else
> > +		reclaimable = wb_stat(wb, WB_RECLAIMABLE);
> > +
> > +	if (reclaimable > thresh)
> >  		return true;
> >
> >  	if (mdtc) {
> > @@ -1972,8 +1979,13 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
> >  		if (mdtc->dirty > mdtc->bg_thresh)
> >  			return true;
> >
> > -		if (wb_stat(wb, WB_RECLAIMABLE) >
> > -		    wb_calc_thresh(mdtc->wb, mdtc->bg_thresh))
> > +		thresh = wb_calc_thresh(mdtc->wb, mdtc->bg_thresh);
> > +		if (thresh < 2 * wb_stat_error())
> > +			reclaimable = wb_stat_sum(wb, WB_RECLAIMABLE);
> > +		else
> > +			reclaimable = wb_stat(wb, WB_RECLAIMABLE);
> > +
> > +		if (reclaimable > thresh)
> >  			return true;
> >  	}
> >
> > --
> > 2.17.1
-- 
Jan Kara
SUSE Labs, CR
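
For readers who want to see why the approximate read in the quoted patch can report 0 while pages are dirty, here is a minimal userspace sketch of a batched per-CPU counter. It is only an illustration, not kernel code: every toy_* name, NR_CPUS and BATCH is hypothetical, and the error bound of NR_CPUS * BATCH is an assumption made for this model rather than a statement of what wb_stat_error() returns. The cheap read looks only at the folded global value, while the exact sum also walks the per-CPU deltas, which is the distinction the patch relies on between wb_stat() and wb_stat_sum().

```c
#include <assert.h>

#define NR_CPUS 4   /* hypothetical CPU count for this model */
#define BATCH   64  /* hypothetical per-CPU batch size */

/* Toy stand-in for a batched per-CPU counter (not the kernel's type). */
struct toy_counter {
	long count;          /* folded global value, cheap to read */
	long pcpu[NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/* Add n on the given CPU; fold into the global value once a batch fills. */
static void toy_add(struct toy_counter *c, int cpu, long n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->pcpu[cpu] += n;
	if (c->pcpu[cpu] >= BATCH || c->pcpu[cpu] <= -BATCH) {
		c->count += c->pcpu[cpu];
		c->pcpu[cpu] = 0;
	}
}

/* Approximate read: ignores unfolded per-CPU deltas (like wb_stat()). */
static long toy_read(const struct toy_counter *c)
{
	return c->count > 0 ? c->count : 0;
}

/* Exact read: walks every per-CPU delta (like wb_stat_sum()). */
static long toy_sum(const struct toy_counter *c)
{
	long sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->pcpu[cpu];
	return sum > 0 ? sum : 0;
}

/* Assumed worst-case drift of toy_read() in this model. */
static long toy_stat_error(void)
{
	return (long)NR_CPUS * BATCH;
}

/* Check before the patch: always trusts the approximate read. */
static int toy_over_thresh_old(const struct toy_counter *c, long thresh)
{
	return toy_read(c) > thresh;
}

/* Check after the patch: pays for the exact sum when thresh is small. */
static int toy_over_thresh_new(const struct toy_counter *c, long thresh)
{
	long reclaimable;

	if (thresh < 2 * toy_stat_error())
		reclaimable = toy_sum(c);
	else
		reclaimable = toy_read(c);
	return reclaimable > thresh;
}
```

With 32 pages dirtied on CPU 0 and a threshold of 0, as in the quoted example, toy_over_thresh_old() sees a folded value of 0 and reports nothing to write back, while toy_over_thresh_new() sums the per-CPU delta and reports that the threshold is exceeded. The exact sum costs a walk over all CPUs, which is presumably why the patch only takes that path when the threshold is within the counter's error margin.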