From: Jan Kara <jack@suse.cz>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-mm@kvack.org,
Jan Kara <jack@suse.cz>,
stable@vger.kernel.org
Subject: Re: [PATCH] blk-wbt: Fix detection of dirty-throttled tasks
Date: Tue, 6 Feb 2024 10:04:59 +0100 [thread overview]
Message-ID: <20240206090459.6a6qpb6lug3nw57g@quack3> (raw)
In-Reply-To: <20240123175826.21452-1-jack@suse.cz>
On Tue 23-01-24 18:58:26, Jan Kara wrote:
> The detection of dirty-throttled tasks in blk-wbt has been subtly broken
> since its beginning in 2016. Namely if we are doing cgroup writeback and
> the throttled task is not in the root cgroup, balance_dirty_pages() will
> set dirty_sleep for the non-root bdi_writeback structure. However
> blk-wbt checks dirty_sleep only in the root cgroup bdi_writeback
> structure. Thus detection of recently throttled tasks is not working in
> this case (we noticed this when we switched to cgroup v2 and suddently
> writeback was slow).
>
> Since blk-wbt has no easy way to get to proper bdi_writeback and
> furthermore its intention has always been to work on the whole device
> rather than on individual cgroups, just move the dirty_sleep timestamp
> from bdi_writeback to backing_dev_info. That fixes the checking for
> recently throttled task and saves memory for everybody as a bonus.
>
> CC: stable@vger.kernel.org
> Fixes: b57d74aff9ab ("writeback: track if we're sleeping on progress in balance_dirty_pages()")
> Signed-off-by: Jan Kara <jack@suse.cz>
Ping Jens?
Honza
> ---
> block/blk-wbt.c | 4 ++--
> include/linux/backing-dev-defs.h | 7 +++++--
> mm/backing-dev.c | 2 +-
> mm/page-writeback.c | 2 +-
> 4 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-wbt.c b/block/blk-wbt.c
> index 5ba3cd574eac..0c0e270a8265 100644
> --- a/block/blk-wbt.c
> +++ b/block/blk-wbt.c
> @@ -163,9 +163,9 @@ static void wb_timestamp(struct rq_wb *rwb, unsigned long *var)
> */
> static bool wb_recent_wait(struct rq_wb *rwb)
> {
> - struct bdi_writeback *wb = &rwb->rqos.disk->bdi->wb;
> + struct backing_dev_info *bdi = rwb->rqos.disk->bdi;
>
> - return time_before(jiffies, wb->dirty_sleep + HZ);
> + return time_before(jiffies, bdi->last_bdp_sleep + HZ);
> }
>
> static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index ae12696ec492..ad17739a2e72 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -141,8 +141,6 @@ struct bdi_writeback {
> struct delayed_work dwork; /* work item used for writeback */
> struct delayed_work bw_dwork; /* work item used for bandwidth estimate */
>
> - unsigned long dirty_sleep; /* last wait */
> -
> struct list_head bdi_node; /* anchored at bdi->wb_list */
>
> #ifdef CONFIG_CGROUP_WRITEBACK
> @@ -179,6 +177,11 @@ struct backing_dev_info {
> * any dirty wbs, which is depended upon by bdi_has_dirty().
> */
> atomic_long_t tot_write_bandwidth;
> + /*
> + * Jiffies when last process was dirty throttled on this bdi. Used by
> + * blk-wbt.
> + */
> + unsigned long last_bdp_sleep;
>
> struct bdi_writeback wb; /* the root writeback info for this bdi */
> struct list_head wb_list; /* list of all wbs */
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 1e3447bccdb1..e039d05304dd 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -436,7 +436,6 @@ static int wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi,
> INIT_LIST_HEAD(&wb->work_list);
> INIT_DELAYED_WORK(&wb->dwork, wb_workfn);
> INIT_DELAYED_WORK(&wb->bw_dwork, wb_update_bandwidth_workfn);
> - wb->dirty_sleep = jiffies;
>
> err = fprop_local_init_percpu(&wb->completions, gfp);
> if (err)
> @@ -921,6 +920,7 @@ int bdi_init(struct backing_dev_info *bdi)
> INIT_LIST_HEAD(&bdi->bdi_list);
> INIT_LIST_HEAD(&bdi->wb_list);
> init_waitqueue_head(&bdi->wb_waitq);
> + bdi->last_bdp_sleep = jiffies;
>
> return cgwb_bdi_init(bdi);
> }
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index cd4e4ae77c40..cc37fa7f3364 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1921,7 +1921,7 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
> break;
> }
> __set_current_state(TASK_KILLABLE);
> - wb->dirty_sleep = now;
> + bdi->last_bdp_sleep = jiffies;
> io_schedule_timeout(pause);
>
> current->dirty_paused_when = now + pause;
> --
> 2.35.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2024-02-06 9:05 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-23 17:58 Jan Kara
2024-02-06 9:04 ` Jan Kara [this message]
2024-02-06 16:44 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240206090459.6a6qpb6lug3nw57g@quack3 \
--to=jack@suse.cz \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox