* [PATCHv3 0/4] zram: introduce writeback bio batching
@ 2025-11-15 2:34 Sergey Senozhatsky
From: Sergey Senozhatsky @ 2025-11-15 2:34 UTC
To: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang
Cc: Brian Geffon, Fengyu Lian, linux-kernel, linux-mm, linux-block,
Sergey Senozhatsky
This is a rework of Yuwen's patch [1] that adds writeback bio
batching support to zram, which can improve the throughput of
writeback operations.
[1] https://lore.kernel.org/linux-block/tencent_0FBBFC8AE0B97BC63B5D47CE1FF2BABFDA09@qq.com/
v2->v3:
- addressed Minchan's feedback
- changed GFP flags, dropped SYNC req, minor code tweaks.
Sergey Senozhatsky (3):
zram: add writeback batch size device attr
zram: take write lock in wb limit store handlers
zram: drop wb_limit_lock
Yuwen Chen (1):
zram: introduce writeback bio batching support
drivers/block/zram/zram_drv.c | 410 +++++++++++++++++++++++++++-------
drivers/block/zram/zram_drv.h | 2 +-
2 files changed, 325 insertions(+), 87 deletions(-)
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-15 2:34 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-15 2:34 UTC
To: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang
Cc: Brian Geffon, Fengyu Lian, linux-kernel, linux-mm, linux-block,
	Sergey Senozhatsky, Minchan Kim

From: Yuwen Chen <ywen.chen@foxmail.com>

Currently, zram writeback supports only a single in-flight bio,
waiting for its completion before post-processing the next pp-slot.
This works, in general, but has certain throughput limitations.
Implement batched (multiple) bio writeback support to take advantage
of parallel request processing and better request scheduling.

For the time being the writeback batch size (the maximum number of
in-flight bio requests) is set to 32 for all devices. A follow-up
patch adds a writeback_batch_size device attribute, so the batch
size becomes run-time configurable.

Please refer to [1] and [2] for benchmarks.

[1] https://lore.kernel.org/linux-block/tencent_B2DC37E3A2AED0E7F179365FCB5D82455B08@qq.com
[2] https://lore.kernel.org/linux-block/tencent_0FBBFC8AE0B97BC63B5D47CE1FF2BABFDA09@qq.com

[senozhatsky: significantly reworked the initial patch so that the
approach and implementation resemble current zram post-processing
code]

Signed-off-by: Yuwen Chen <ywen.chen@foxmail.com>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Co-developed-by: Richard Chang <richardycc@google.com>
Suggested-by: Minchan Kim <minchan@google.com>
---
 drivers/block/zram/zram_drv.c | 343 +++++++++++++++++++++++++++-------
 1 file changed, 277 insertions(+), 66 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a43074657531..84e72c3bb280 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -500,6 +500,24 @@ static ssize_t idle_store(struct device *dev,
 }
 
 #ifdef CONFIG_ZRAM_WRITEBACK
+struct zram_wb_ctl {
+	struct list_head idle_reqs;
+	struct list_head inflight_reqs;
+
+	atomic_t num_inflight;
+	struct completion done;
+};
+
+struct zram_wb_req {
+	unsigned long blk_idx;
+	struct page *page;
+	struct zram_pp_slot *pps;
+	struct bio_vec bio_vec;
+	struct bio bio;
+
+	struct list_head entry;
+};
+
 static ssize_t writeback_limit_enable_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t len)
 {
@@ -734,20 +752,207 @@ static void read_from_bdev_async(struct zram *zram, struct page *page,
 	submit_bio(bio);
 }
 
-static int zram_writeback_slots(struct zram *zram, struct zram_pp_ctl *ctl)
+static void release_wb_req(struct zram_wb_req *req)
+{
+	__free_page(req->page);
+	kfree(req);
+}
+
+static void release_wb_ctl(struct zram_wb_ctl *wb_ctl)
+{
+	/* We should never have inflight requests at this point */
+	WARN_ON(!list_empty(&wb_ctl->inflight_reqs));
+
+	while (!list_empty(&wb_ctl->idle_reqs)) {
+		struct zram_wb_req *req;
+
+		req = list_first_entry(&wb_ctl->idle_reqs,
+				       struct zram_wb_req, entry);
+		list_del(&req->entry);
+		release_wb_req(req);
+	}
+
+	kfree(wb_ctl);
+}
+
+/* XXX: should be a per-device sysfs attr */
+#define ZRAM_WB_REQ_CNT		32
+
+static struct zram_wb_ctl *init_wb_ctl(void)
+{
+	struct zram_wb_ctl *wb_ctl;
+	int i;
+
+	wb_ctl = kmalloc(sizeof(*wb_ctl), GFP_KERNEL);
+	if (!wb_ctl)
+		return NULL;
+
+	INIT_LIST_HEAD(&wb_ctl->idle_reqs);
+	INIT_LIST_HEAD(&wb_ctl->inflight_reqs);
+	atomic_set(&wb_ctl->num_inflight, 0);
+	init_completion(&wb_ctl->done);
+
+	for (i = 0; i < ZRAM_WB_REQ_CNT; i++) {
+		struct zram_wb_req *req;
+
+		/*
+		 * This is a fatal condition only if we couldn't allocate
+		 * any requests at all. Otherwise we just work with the
+		 * requests that we have successfully allocated, so that
+		 * writeback can still proceed, even if there is only one
+		 * request on the idle list.
+		 */
+		req = kzalloc(sizeof(*req), GFP_KERNEL | __GFP_NOWARN);
+		if (!req)
+			break;
+
+		req->page = alloc_page(GFP_KERNEL | __GFP_NOWARN);
+		if (!req->page) {
+			kfree(req);
+			break;
+		}
+
+		list_add(&req->entry, &wb_ctl->idle_reqs);
+	}
+
+	/* We couldn't allocate any requests, so writeback is not possible */
+	if (list_empty(&wb_ctl->idle_reqs))
+		goto release_wb_ctl;
+
+	return wb_ctl;
+
+release_wb_ctl:
+	release_wb_ctl(wb_ctl);
+	return NULL;
+}
+
+static void zram_account_writeback_rollback(struct zram *zram)
 {
+	spin_lock(&zram->wb_limit_lock);
+	if (zram->wb_limit_enable)
+		zram->bd_wb_limit += 1UL << (PAGE_SHIFT - 12);
+	spin_unlock(&zram->wb_limit_lock);
+}
+
+static void zram_account_writeback_submit(struct zram *zram)
+{
+	spin_lock(&zram->wb_limit_lock);
+	if (zram->wb_limit_enable && zram->bd_wb_limit > 0)
+		zram->bd_wb_limit -= 1UL << (PAGE_SHIFT - 12);
+	spin_unlock(&zram->wb_limit_lock);
+}
+
+static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
+{
+	u32 index;
+	int err;
+
+	index = req->pps->index;
+	release_pp_slot(zram, req->pps);
+	req->pps = NULL;
+
+	err = blk_status_to_errno(req->bio.bi_status);
+	if (err) {
+		/*
+		 * Failed wb requests should not be accounted in wb_limit
+		 * (if enabled).
+		 */
+		zram_account_writeback_rollback(zram);
+		return err;
+	}
+
+	atomic64_inc(&zram->stats.bd_writes);
+	zram_slot_lock(zram, index);
+	/*
+	 * We release slot lock during writeback so slot can change under us:
+	 * slot_free() or slot_free() and zram_write_page(). In both cases
+	 * slot loses ZRAM_PP_SLOT flag. No concurrent post-processing can
+	 * set ZRAM_PP_SLOT on such slots until current post-processing
+	 * finishes.
+	 */
+	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
+		goto out;
+
+	zram_free_page(zram, index);
+	zram_set_flag(zram, index, ZRAM_WB);
+	zram_set_handle(zram, index, req->blk_idx);
+	atomic64_inc(&zram->stats.pages_stored);
+
+out:
+	zram_slot_unlock(zram, index);
+	return 0;
+}
+
+static void zram_writeback_endio(struct bio *bio)
+{
+	struct zram_wb_ctl *wb_ctl = bio->bi_private;
+
+	if (atomic_dec_return(&wb_ctl->num_inflight) == 0)
+		complete(&wb_ctl->done);
+}
+
+static void zram_submit_wb_request(struct zram *zram,
+				   struct zram_wb_ctl *wb_ctl,
+				   struct zram_wb_req *req)
+{
+	/*
+	 * wb_limit (if enabled) should be adjusted before submission,
+	 * so that we don't over-submit.
+	 */
+	zram_account_writeback_submit(zram);
+	atomic_inc(&wb_ctl->num_inflight);
+	list_add_tail(&req->entry, &wb_ctl->inflight_reqs);
+	submit_bio(&req->bio);
+}
+
+static struct zram_wb_req *select_idle_req(struct zram_wb_ctl *wb_ctl)
+{
+	struct zram_wb_req *req;
+
+	req = list_first_entry_or_null(&wb_ctl->idle_reqs,
+				       struct zram_wb_req, entry);
+	if (req)
+		list_del(&req->entry);
+	return req;
+}
+
+static int zram_wb_wait_for_completion(struct zram *zram,
+				       struct zram_wb_ctl *wb_ctl)
+{
+	int ret = 0;
+
+	if (atomic_read(&wb_ctl->num_inflight))
+		wait_for_completion_io(&wb_ctl->done);
+
+	reinit_completion(&wb_ctl->done);
+	while (!list_empty(&wb_ctl->inflight_reqs)) {
+		struct zram_wb_req *req;
+		int err;
+
+		req = list_first_entry(&wb_ctl->inflight_reqs,
+				       struct zram_wb_req, entry);
+		list_move(&req->entry, &wb_ctl->idle_reqs);
+
+		err = zram_writeback_complete(zram, req);
+		if (err)
+			ret = err;
+	}
+
+	return ret;
+}
+
+static int zram_writeback_slots(struct zram *zram,
+				struct zram_pp_ctl *ctl,
+				struct zram_wb_ctl *wb_ctl)
+{
+	struct zram_wb_req *req = NULL;
 	unsigned long blk_idx = 0;
-	struct page *page = NULL;
 	struct zram_pp_slot *pps;
-	struct bio_vec bio_vec;
-	struct bio bio;
+	struct blk_plug io_plug;
 	int ret = 0, err;
-	u32 index;
-
-	page = alloc_page(GFP_KERNEL);
-	if (!page)
-		return -ENOMEM;
+	u32 index = 0;
 
+	blk_start_plug(&io_plug);
 	while ((pps = select_pp_slot(ctl))) {
 		spin_lock(&zram->wb_limit_lock);
 		if (zram->wb_limit_enable && !zram->bd_wb_limit) {
@@ -757,6 +962,26 @@ static int zram_writeback_slots(struct zram *zram, struct zram_pp_ctl *ctl)
 		}
 		spin_unlock(&zram->wb_limit_lock);
 
+		while (!req) {
+			req = select_idle_req(wb_ctl);
+			if (req)
+				break;
+
+			blk_finish_plug(&io_plug);
+			err = zram_wb_wait_for_completion(zram, wb_ctl);
+			blk_start_plug(&io_plug);
+			/*
+			 * BIO errors are not fatal, we continue and simply
+			 * attempt to writeback the remaining objects (pages).
+			 * At the same time we need to signal user-space that
+			 * some writes (at least one, but also could be all of
+			 * them) were not successful and we do so by returning
+			 * the most recent BIO error.
+			 */
+			if (err)
+				ret = err;
+		}
+
 		if (!blk_idx) {
 			blk_idx = alloc_block_bdev(zram);
 			if (!blk_idx) {
@@ -765,7 +990,6 @@ static int zram_writeback_slots(struct zram *zram, struct zram_pp_ctl *ctl)
 			}
 		}
 
-		index = pps->index;
 		zram_slot_lock(zram, index);
 		/*
 		 * scan_slots() sets ZRAM_PP_SLOT and releases slot lock, so
@@ -775,67 +999,46 @@ static int zram_writeback_slots(struct zram *zram, struct zram_pp_ctl *ctl)
 		 */
 		if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
 			goto next;
-		if (zram_read_from_zspool(zram, page, index))
+		if (zram_read_from_zspool(zram, req->page, index))
 			goto next;
 		zram_slot_unlock(zram, index);
 
-		bio_init(&bio, zram->bdev, &bio_vec, 1,
-			 REQ_OP_WRITE | REQ_SYNC);
-		bio.bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
-		__bio_add_page(&bio, page, PAGE_SIZE, 0);
-
 		/*
-		 * XXX: A single page IO would be inefficient for write
-		 * but it would be not bad as starter.
+		 * From now on pp-slot is owned by the req, remove it from
+		 * its pp bucket.
 		 */
-		err = submit_bio_wait(&bio);
-		if (err) {
-			release_pp_slot(zram, pps);
-			/*
-			 * BIO errors are not fatal, we continue and simply
-			 * attempt to writeback the remaining objects (pages).
-			 * At the same time we need to signal user-space that
-			 * some writes (at least one, but also could be all of
-			 * them) were not successful and we do so by returning
-			 * the most recent BIO error.
-			 */
-			ret = err;
-			continue;
-		}
+		list_del_init(&pps->entry);
 
-		atomic64_inc(&zram->stats.bd_writes);
-		zram_slot_lock(zram, index);
-		/*
-		 * Same as above, we release slot lock during writeback so
-		 * slot can change under us: slot_free() or slot_free() and
-		 * reallocation (zram_write_page()). In both cases slot loses
-		 * ZRAM_PP_SLOT flag. No concurrent post-processing can set
-		 * ZRAM_PP_SLOT on such slots until current post-processing
-		 * finishes.
-		 */
-		if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
-			goto next;
+		req->blk_idx = blk_idx;
+		req->pps = pps;
+		bio_init(&req->bio, zram->bdev, &req->bio_vec, 1, REQ_OP_WRITE);
+		req->bio.bi_iter.bi_sector = req->blk_idx * (PAGE_SIZE >> 9);
+		req->bio.bi_end_io = zram_writeback_endio;
+		req->bio.bi_private = wb_ctl;
+		__bio_add_page(&req->bio, req->page, PAGE_SIZE, 0);
 
-		zram_free_page(zram, index);
-		zram_set_flag(zram, index, ZRAM_WB);
-		zram_set_handle(zram, index, blk_idx);
+		zram_submit_wb_request(zram, wb_ctl, req);
 		blk_idx = 0;
-		atomic64_inc(&zram->stats.pages_stored);
-		spin_lock(&zram->wb_limit_lock);
-		if (zram->wb_limit_enable && zram->bd_wb_limit > 0)
-			zram->bd_wb_limit -= 1UL << (PAGE_SHIFT - 12);
-		spin_unlock(&zram->wb_limit_lock);
+		req = NULL;
+		continue;
+
 next:
 		zram_slot_unlock(zram, index);
 		release_pp_slot(zram, pps);
-
 		cond_resched();
 	}
 
-	if (blk_idx)
-		free_block_bdev(zram, blk_idx);
-	if (page)
-		__free_page(page);
+	/*
+	 * Selected idle req, but never submitted it due to some error or
+	 * wb limit.
	 */
+	if (req)
+		release_wb_req(req);
+
+	blk_finish_plug(&io_plug);
+	err = zram_wb_wait_for_completion(zram, wb_ctl);
+	if (err)
+		ret = err;
 
 	return ret;
 }
@@ -948,7 +1151,8 @@ static ssize_t writeback_store(struct device *dev,
 	struct zram *zram = dev_to_zram(dev);
 	u64 nr_pages = zram->disksize >> PAGE_SHIFT;
 	unsigned long lo = 0, hi = nr_pages;
-	struct zram_pp_ctl *ctl = NULL;
+	struct zram_pp_ctl *pp_ctl = NULL;
+	struct zram_wb_ctl *wb_ctl = NULL;
 	char *args, *param, *val;
 	ssize_t ret = len;
 	int err, mode = 0;
@@ -970,8 +1174,14 @@ static ssize_t writeback_store(struct device *dev,
 		goto release_init_lock;
 	}
 
-	ctl = init_pp_ctl();
-	if (!ctl) {
+	pp_ctl = init_pp_ctl();
+	if (!pp_ctl) {
+		ret = -ENOMEM;
+		goto release_init_lock;
+	}
+
+	wb_ctl = init_wb_ctl();
+	if (!wb_ctl) {
 		ret = -ENOMEM;
 		goto release_init_lock;
 	}
@@ -1000,7 +1210,7 @@ static ssize_t writeback_store(struct device *dev,
 				goto release_init_lock;
 			}
 
-			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			scan_slots_for_writeback(zram, mode, lo, hi, pp_ctl);
 			break;
 		}
 
@@ -1011,7 +1221,7 @@ static ssize_t writeback_store(struct device *dev,
 				goto release_init_lock;
 			}
 
-			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			scan_slots_for_writeback(zram, mode, lo, hi, pp_ctl);
 			break;
 		}
 
@@ -1022,7 +1232,7 @@ static ssize_t writeback_store(struct device *dev,
 				goto release_init_lock;
 			}
 
-			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			scan_slots_for_writeback(zram, mode, lo, hi, pp_ctl);
 			continue;
 		}
 
@@ -1033,17 +1243,18 @@ static ssize_t writeback_store(struct device *dev,
 				goto release_init_lock;
 			}
 
-			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			scan_slots_for_writeback(zram, mode, lo, hi, pp_ctl);
 			continue;
 		}
 	}
 
-	err = zram_writeback_slots(zram, ctl);
+	err = zram_writeback_slots(zram, pp_ctl, wb_ctl);
 	if (err)
 		ret = err;
 
release_init_lock:
-	release_pp_ctl(zram, ctl);
+	release_pp_ctl(zram, pp_ctl);
+	release_wb_ctl(wb_ctl);
 	atomic_set(&zram->pp_in_progress, 0);
 	up_read(&zram->init_lock);

--
2.52.0.rc1.455.g30608eb744-goog
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-17 15:19 Brian Geffon

From: Brian Geffon @ 2025-11-17 15:19 UTC
To: Sergey Senozhatsky
Cc: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang, Fengyu Lian,
	linux-kernel, linux-mm, linux-block, Minchan Kim

On Fri, Nov 14, 2025 at 9:35 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> From: Yuwen Chen <ywen.chen@foxmail.com>
>
> Currently, zram writeback supports only a single in-flight bio,
> waiting for its completion before post-processing the next pp-slot.
> This works, in general, but has certain throughput limitations.
> Implement batched (multiple) bio writeback support to take advantage
> of parallel request processing and better request scheduling.

[..]

> +		req->blk_idx = blk_idx;
> +		req->pps = pps;
> +		bio_init(&req->bio, zram->bdev, &req->bio_vec, 1, REQ_OP_WRITE);
> +		req->bio.bi_iter.bi_sector = req->blk_idx * (PAGE_SIZE >> 9);
> +		req->bio.bi_end_io = zram_writeback_endio;
> +		req->bio.bi_private = wb_ctl;
> +		__bio_add_page(&req->bio, req->page, PAGE_SIZE, 0);

Out of curiosity, why are we doing 1 page per bio? Why are we not
adding BIO_MAX_VECS before submitting? And then, why are we not
chaining? Do the block layer maintainers have thoughts?

[..]
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 2:08 Yuwen Chen

From: Yuwen Chen @ 2025-11-18 2:08 UTC
To: bgeffon
Cc: akpm, licayy, linux-block, linux-kernel, linux-mm, minchan, minchan,
	richardycc, senozhatsky, ywen.chen

On Mon, 17 Nov 2025 10:19:22 -0500, Sergey Senozhatsky wrote:
> -		index = pps->index;
> 		zram_slot_lock(zram, index);
> 		/*
> 		 * scan_slots() sets ZRAM_PP_SLOT and releases slot lock, so
> @@ -775,67 +999,46 @@ static int zram_writeback_slots(struct z
> 		 */
> 		if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
> 			goto next;
> -		if (zram_read_from_zspool(zram, page, index))
> +		if (zram_read_from_zspool(zram, req->page, index))
> 			goto next;
> 		zram_slot_unlock(zram, index);

I tested the reorganized patch on my end and found that it doesn't
work properly: index here is used uninitialized.
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 2:14 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-18 2:14 UTC
To: Yuwen Chen
Cc: bgeffon, akpm, licayy, linux-block, linux-kernel, linux-mm, minchan,
	minchan, richardycc, senozhatsky

On (25/11/18 10:08), Yuwen Chen wrote:
> On Mon, 17 Nov 2025 10:19:22 -0500, Sergey Senozhatsky wrote:
> > -		index = pps->index;
> > 		zram_slot_lock(zram, index);

D'oh, indeed. Thanks for spotting this.
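[note: the fix is presumably just restoring the deleted assignment
before the slot lock is taken; a sketch, not the actual v4 hunk:

		index = pps->index;
		zram_slot_lock(zram, index);

Without it, zram_writeback_slots() locks and tests slot 0 on every
iteration (index is initialized to 0) instead of the slot that
select_pp_slot() actually returned.]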
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 3:18 Yuwen Chen

From: Yuwen Chen @ 2025-11-18 3:18 UTC
To: senozhatsky
Cc: akpm, bgeffon, licayy, linux-block, linux-kernel, linux-mm, minchan,
	minchan, richardycc, ywen.chen

On Mon, 17 Nov 2025 10:19:22 -0500, Sergey Senozhatsky wrote:
> +static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
> +{
> +	u32 index;
> +	int err;
> +
> +	index = req->pps->index;
> +	release_pp_slot(zram, req->pps);
> +	req->pps = NULL;
> +
> +	err = blk_status_to_errno(req->bio.bi_status);
> +	if (err) {
> +		/*
> +		 * Failed wb requests should not be accounted in wb_limit
> +		 * (if enabled).
> +		 */
> +		zram_account_writeback_rollback(zram);

The index may be leaked here.

> +		return err;
> +	}
> +
> +	atomic64_inc(&zram->stats.bd_writes);
> +	zram_slot_lock(zram, index);
> +	/*
> +	 * We release slot lock during writeback so slot can change under us:
> +	 * slot_free() or slot_free() and zram_write_page(). In both cases
> +	 * slot loses ZRAM_PP_SLOT flag. No concurrent post-processing can
> +	 * set ZRAM_PP_SLOT on such slots until current post-processing
> +	 * finishes.
> +	 */
> +	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
> +		goto out;

The index may be leaked here, too.

> +
> +	zram_free_page(zram, index);
> +	zram_set_flag(zram, index, ZRAM_WB);
> +	zram_set_handle(zram, index, req->blk_idx);
> +	atomic64_inc(&zram->stats.pages_stored);
> +
> +out:
> +	zram_slot_unlock(zram, index);
> +	return 0;
> +}
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 3:36 Yuwen Chen

From: Yuwen Chen @ 2025-11-18 3:36 UTC
To: senozhatsky
Cc: akpm, bgeffon, licayy, linux-block, linux-kernel, linux-mm, minchan,
	minchan, richardycc, ywen.chen

On Tue, 18 Nov 2025 11:18:56 +0800, Yuwen Chen wrote:
>> +	/*
>> +	 * We release slot lock during writeback so slot can change under us:
>> +	 * slot_free() or slot_free() and zram_write_page(). In both cases
>> +	 * slot loses ZRAM_PP_SLOT flag. No concurrent post-processing can
>> +	 * set ZRAM_PP_SLOT on such slots until current post-processing
>> +	 * finishes.
>> +	 */
>> +	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
>> +		goto out;
> The index may be leaked here, too.

To be precise, it is blk_idx that may be leaked.
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 3:47 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-18 3:47 UTC
To: Yuwen Chen
Cc: senozhatsky, akpm, bgeffon, licayy, linux-block, linux-kernel,
	linux-mm, minchan, minchan, richardycc

On (25/11/18 11:36), Yuwen Chen wrote:
> On Tue, 18 Nov 2025 11:18:56 +0800, Yuwen Chen wrote:
> >> +	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
> >> +		goto out;
> > The index may be leaked here, too.
>
> To be precise, it is blk_idx that may be leaked.

Ah, I see what you mean. Agreed. I guess I'll just send out v4 later
today, given that we have two fixups in the queue already.

---
 drivers/block/zram/zram_drv.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 0a5f11e0c523..8e91dc071b65 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -886,6 +886,7 @@ static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
 	 * (if enabled).
 	 */
 	zram_account_writeback_rollback(zram);
+	free_block_bdev(zram, req->blk_idx);
 	return err;
 }
 
@@ -898,8 +899,10 @@ static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
 	 * set ZRAM_PP_SLOT on such slots until current post-processing
 	 * finishes.
 	 */
-	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
+	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT)) {
+		free_block_bdev(zram, req->blk_idx);
 		goto out;
+	}
 
 	zram_free_page(zram, index);
 	zram_set_flag(zram, index, ZRAM_WB);
--
2.52.0.rc1.455.g30608eb744-goog
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 3:48 Yuwen Chen

From: Yuwen Chen @ 2025-11-18 3:48 UTC
To: bgeffon
Cc: akpm, licayy, linux-block, linux-kernel, linux-mm, minchan, minchan,
	richardycc, senozhatsky, ywen.chen

On Mon, 17 Nov 2025 10:19:22 -0500, Brian Geffon wrote:
> Out of curiosity, why are we doing 1 page per bio? Why are we not
> adding BIO_MAX_VECS before submitting? And then, why are we not
> chaining? Do the block layer maintainers have thoughts?

Mainly because the zram backing device is quite special: during a
writeback operation the target blocks are rarely contiguous, so the
probability of merging consecutive writes is low. Using BIO_MAX_VECS
would make the logic considerably more complex.

Thank you very much!
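[note: to illustrate the point, packing multiple pages into one bio
only pays off when the backing-device blocks are physically adjacent,
which the bitmap-based alloc_block_bdev() allocator rarely produces
during writeback. A self-contained sketch of the merge test such
logic would need; all names here are hypothetical and not part of
this series:

#include <stdbool.h>

/* Stand-in for BIO_MAX_VECS; illustrative only. */
#define WB_MAX_VECS	256

struct wb_batch {
	unsigned long last_blk_idx;	/* last backing-dev block in the bio */
	unsigned int nr_pages;		/* pages already added to the bio */
};

/* A page may join the current bio only if its block directly follows. */
static bool wb_can_merge(const struct wb_batch *b, unsigned long blk_idx)
{
	if (b->nr_pages >= WB_MAX_VECS)
		return false;
	return blk_idx == b->last_blk_idx + 1;
}

Every miss forces a submit-and-restart, so with scattered blocks such
batching degenerates to one page per bio anyway.]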
* Re: [PATCHv3 1/4] zram: introduce writeback bio batching support
@ 2025-11-18 7:00 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-18 7:00 UTC
To: Sergey Senozhatsky
Cc: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang, Brian Geffon,
	Fengyu Lian, linux-kernel, linux-mm, linux-block, Minchan Kim

On (25/11/15 11:34), Sergey Senozhatsky wrote:
[..]
> +static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
> +{
> +	u32 index;
> +	int err;
> +
> +	index = req->pps->index;
> +	release_pp_slot(zram, req->pps);
> +	req->pps = NULL;

It's too early to release the pp_slot here. Will fix in v4.
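[note: one possible shape of the fix, shown only as a sketch and not
the actual v4 patch, is to keep the pp-slot (and therefore the
ZRAM_PP_SLOT flag) until the bio outcome has been recorded, folding
in the free_block_bdev() fixup from earlier in the thread:

static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
{
	u32 index = req->pps->index;
	int err = blk_status_to_errno(req->bio.bi_status);

	if (err) {
		/* failed writes don't consume the wb_limit budget */
		zram_account_writeback_rollback(zram);
		free_block_bdev(zram, req->blk_idx);
		goto out;
	}

	atomic64_inc(&zram->stats.bd_writes);
	zram_slot_lock(zram, index);
	if (zram_test_flag(zram, index, ZRAM_PP_SLOT)) {
		zram_free_page(zram, index);
		zram_set_flag(zram, index, ZRAM_WB);
		zram_set_handle(zram, index, req->blk_idx);
		atomic64_inc(&zram->stats.pages_stored);
	} else {
		/* slot was freed or rewritten under us, drop the bdev block */
		free_block_bdev(zram, req->blk_idx);
	}
	zram_slot_unlock(zram, index);
out:
	/* release the pp-slot only once the outcome is known */
	release_pp_slot(zram, req->pps);
	req->pps = NULL;
	return err;
}

Releasing the pp-slot early clears ZRAM_PP_SLOT, so the flag test
above would always fail and a successful writeback would never be
recorded.]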
* [PATCHv3 2/4] zram: add writeback batch size device attr
@ 2025-11-15 2:34 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-15 2:34 UTC
To: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang
Cc: Brian Geffon, Fengyu Lian, linux-kernel, linux-mm, linux-block,
	Sergey Senozhatsky

Introduce writeback_batch_size device attribute so that the maximum
number of in-flight writeback bio requests can be configured at
run-time per-device. This essentially enables batched bio writeback.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c | 48 ++++++++++++++++++++++++++++++-----
 drivers/block/zram/zram_drv.h |  1 +
 2 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 84e72c3bb280..e6fecea2e3bf 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -588,6 +588,42 @@ static ssize_t writeback_limit_show(struct device *dev,
 	return sysfs_emit(buf, "%llu\n", val);
 }
 
+static ssize_t writeback_batch_size_store(struct device *dev,
+					  struct device_attribute *attr,
+					  const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	u32 val;
+	ssize_t ret = -EINVAL;
+
+	if (kstrtouint(buf, 10, &val))
+		return ret;
+
+	if (!val)
+		val = 1;
+
+	down_read(&zram->init_lock);
+	zram->wb_batch_size = val;
+	up_read(&zram->init_lock);
+	ret = len;
+
+	return ret;
+}
+
+static ssize_t writeback_batch_size_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	u32 val;
+	struct zram *zram = dev_to_zram(dev);
+
+	down_read(&zram->init_lock);
+	val = zram->wb_batch_size;
+	up_read(&zram->init_lock);
+
+	return sysfs_emit(buf, "%u\n", val);
+}
+
 static void reset_bdev(struct zram *zram)
 {
 	if (!zram->backing_dev)
@@ -775,10 +811,7 @@ static void release_wb_ctl(struct zram_wb_ctl *wb_ctl)
 	kfree(wb_ctl);
 }
 
-/* XXX: should be a per-device sysfs attr */
-#define ZRAM_WB_REQ_CNT		32
-
-static struct zram_wb_ctl *init_wb_ctl(void)
+static struct zram_wb_ctl *init_wb_ctl(struct zram *zram)
 {
 	struct zram_wb_ctl *wb_ctl;
 	int i;
@@ -792,7 +825,7 @@ static struct zram_wb_ctl *init_wb_ctl(void)
 	atomic_set(&wb_ctl->num_inflight, 0);
 	init_completion(&wb_ctl->done);
 
-	for (i = 0; i < ZRAM_WB_REQ_CNT; i++) {
+	for (i = 0; i < zram->wb_batch_size; i++) {
 		struct zram_wb_req *req;
 
 		/*
@@ -1180,7 +1213,7 @@ static ssize_t writeback_store(struct device *dev,
 		goto release_init_lock;
 	}
 
-	wb_ctl = init_wb_ctl();
+	wb_ctl = init_wb_ctl(zram);
 	if (!wb_ctl) {
 		ret = -ENOMEM;
 		goto release_init_lock;
 	}
@@ -2821,6 +2854,7 @@ static DEVICE_ATTR_RW(backing_dev);
 static DEVICE_ATTR_WO(writeback);
 static DEVICE_ATTR_RW(writeback_limit);
 static DEVICE_ATTR_RW(writeback_limit_enable);
+static DEVICE_ATTR_RW(writeback_batch_size);
 #endif
 #ifdef CONFIG_ZRAM_MULTI_COMP
 static DEVICE_ATTR_RW(recomp_algorithm);
@@ -2842,6 +2876,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_writeback.attr,
 	&dev_attr_writeback_limit.attr,
 	&dev_attr_writeback_limit_enable.attr,
+	&dev_attr_writeback_batch_size.attr,
 #endif
 	&dev_attr_io_stat.attr,
 	&dev_attr_mm_stat.attr,
@@ -2903,6 +2938,7 @@ static int zram_add(void)
 	init_rwsem(&zram->init_lock);
 #ifdef CONFIG_ZRAM_WRITEBACK
+	zram->wb_batch_size = 32;
 	spin_lock_init(&zram->wb_limit_lock);
 #endif
 
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 6cee93f9c0d0..1a647f42c1a4 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -129,6 +129,7 @@ struct zram {
 	struct file *backing_dev;
 	spinlock_t wb_limit_lock;
 	bool wb_limit_enable;
+	u32 wb_batch_size;
 	u64 bd_wb_limit;
 	struct block_device *bdev;
 	unsigned long *bitmap;
--
2.52.0.rc1.455.g30608eb744-goog
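[note: a minimal user-space usage sketch in C; the path assumes a
zram0 device and the attribute name this patch adds:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *attr = "/sys/block/zram0/writeback_batch_size";
	const char *val = "64";	/* allow up to 64 in-flight writeback bios */
	int fd = open(attr, O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, val, strlen(val)) < 0)
		perror("write");
	close(fd);
	return 0;
}

A store of 0 is clamped to 1 by writeback_batch_size_store(), so the
batching can be reduced to single-bio behavior but never to zero
requests.]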
* [PATCHv3 3/4] zram: take write lock in wb limit store handlers
@ 2025-11-15 2:34 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-15 2:34 UTC
To: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang
Cc: Brian Geffon, Fengyu Lian, linux-kernel, linux-mm, linux-block,
	Sergey Senozhatsky

Device attr store handlers should take zram init_lock for writing.
While at it, fix up coding style.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e6fecea2e3bf..76daf1e53859 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -519,7 +519,8 @@ struct zram_wb_req {
 };
 
 static ssize_t writeback_limit_enable_store(struct device *dev,
-		struct device_attribute *attr, const char *buf, size_t len)
+					    struct device_attribute *attr,
+					    const char *buf, size_t len)
 {
 	struct zram *zram = dev_to_zram(dev);
 	u64 val;
@@ -528,18 +529,19 @@ static ssize_t writeback_limit_enable_store(struct device *dev,
 	if (kstrtoull(buf, 10, &val))
 		return ret;
 
-	down_read(&zram->init_lock);
+	down_write(&zram->init_lock);
 	spin_lock(&zram->wb_limit_lock);
 	zram->wb_limit_enable = val;
 	spin_unlock(&zram->wb_limit_lock);
-	up_read(&zram->init_lock);
+	up_write(&zram->init_lock);
 	ret = len;
 
 	return ret;
 }
 
 static ssize_t writeback_limit_enable_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+					   struct device_attribute *attr,
+					   char *buf)
 {
 	bool val;
 	struct zram *zram = dev_to_zram(dev);
@@ -554,7 +556,8 @@ static ssize_t writeback_limit_enable_show(struct device *dev,
 }
 
 static ssize_t writeback_limit_store(struct device *dev,
-		struct device_attribute *attr, const char *buf, size_t len)
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
 {
 	struct zram *zram = dev_to_zram(dev);
 	u64 val;
@@ -563,11 +566,11 @@ static ssize_t writeback_limit_store(struct device *dev,
 	if (kstrtoull(buf, 10, &val))
 		return ret;
 
-	down_read(&zram->init_lock);
+	down_write(&zram->init_lock);
 	spin_lock(&zram->wb_limit_lock);
 	zram->bd_wb_limit = val;
 	spin_unlock(&zram->wb_limit_lock);
-	up_read(&zram->init_lock);
+	up_write(&zram->init_lock);
 	ret = len;
 
 	return ret;
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCHv3 4/4] zram: drop wb_limit_lock
@ 2025-11-15 2:34 Sergey Senozhatsky

From: Sergey Senozhatsky @ 2025-11-15 2:34 UTC
To: Andrew Morton, Minchan Kim, Yuwen Chen, Richard Chang
Cc: Brian Geffon, Fengyu Lian, linux-kernel, linux-mm, linux-block,
	Sergey Senozhatsky

We don't need wb_limit_lock. Writeback limit setters take zram
init_lock for writing (exclusive), while wb_limit modifications
happen only from a single task that holds zram init_lock for
reading. No concurrent wb_limit modifications are possible (we
permit only one post-processing task at a time). Add lockdep
assertions to wb_limit mutators.

While at it, fix up coding style.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c | 22 +++++-----------------
 drivers/block/zram/zram_drv.h |  1 -
 2 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 76daf1e53859..bc268670f852 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -530,9 +530,7 @@ static ssize_t writeback_limit_enable_store(struct device *dev,
 		return ret;
 
 	down_write(&zram->init_lock);
-	spin_lock(&zram->wb_limit_lock);
 	zram->wb_limit_enable = val;
-	spin_unlock(&zram->wb_limit_lock);
 	up_write(&zram->init_lock);
 	ret = len;
 
@@ -547,9 +545,7 @@ static ssize_t writeback_limit_enable_show(struct device *dev,
 	struct zram *zram = dev_to_zram(dev);
 
 	down_read(&zram->init_lock);
-	spin_lock(&zram->wb_limit_lock);
 	val = zram->wb_limit_enable;
-	spin_unlock(&zram->wb_limit_lock);
 	up_read(&zram->init_lock);
 
 	return sysfs_emit(buf, "%d\n", val);
@@ -567,9 +563,7 @@ static ssize_t writeback_limit_store(struct device *dev,
 		return ret;
 
 	down_write(&zram->init_lock);
-	spin_lock(&zram->wb_limit_lock);
 	zram->bd_wb_limit = val;
-	spin_unlock(&zram->wb_limit_lock);
 	up_write(&zram->init_lock);
 	ret = len;
 
@@ -577,15 +571,13 @@ static ssize_t writeback_limit_store(struct device *dev,
 }
 
 static ssize_t writeback_limit_show(struct device *dev,
-		struct device_attribute *attr, char *buf)
+				    struct device_attribute *attr, char *buf)
 {
 	u64 val;
 	struct zram *zram = dev_to_zram(dev);
 
 	down_read(&zram->init_lock);
-	spin_lock(&zram->wb_limit_lock);
 	val = zram->bd_wb_limit;
-	spin_unlock(&zram->wb_limit_lock);
 	up_read(&zram->init_lock);
 
 	return sysfs_emit(buf, "%llu\n", val);
@@ -864,18 +856,18 @@ static struct zram_wb_ctl *init_wb_ctl(struct zram *zram)
 
 static void zram_account_writeback_rollback(struct zram *zram)
 {
-	spin_lock(&zram->wb_limit_lock);
+	lockdep_assert_held_read(&zram->init_lock);
+
 	if (zram->wb_limit_enable)
 		zram->bd_wb_limit += 1UL << (PAGE_SHIFT - 12);
-	spin_unlock(&zram->wb_limit_lock);
 }
 
 static void zram_account_writeback_submit(struct zram *zram)
 {
-	spin_lock(&zram->wb_limit_lock);
+	lockdep_assert_held_read(&zram->init_lock);
+
 	if (zram->wb_limit_enable && zram->bd_wb_limit > 0)
 		zram->bd_wb_limit -= 1UL << (PAGE_SHIFT - 12);
-	spin_unlock(&zram->wb_limit_lock);
 }
 
 static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
@@ -990,13 +982,10 @@ static int zram_writeback_slots(struct zram *zram,
 
 	blk_start_plug(&io_plug);
 	while ((pps = select_pp_slot(ctl))) {
-		spin_lock(&zram->wb_limit_lock);
 		if (zram->wb_limit_enable && !zram->bd_wb_limit) {
-			spin_unlock(&zram->wb_limit_lock);
 			ret = -EIO;
 			break;
 		}
-		spin_unlock(&zram->wb_limit_lock);
 
 		while (!req) {
 			req = select_idle_req(wb_ctl);
@@ -2942,7 +2931,6 @@ static int zram_add(void)
 	init_rwsem(&zram->init_lock);
 #ifdef CONFIG_ZRAM_WRITEBACK
 	zram->wb_batch_size = 32;
-	spin_lock_init(&zram->wb_limit_lock);
 #endif
 
 	/* gendisk structure */
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 1a647f42c1a4..c6d94501376c 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -127,7 +127,6 @@ struct zram {
 	bool claim;	/* Protected by disk->open_mutex */
 #ifdef CONFIG_ZRAM_WRITEBACK
 	struct file *backing_dev;
-	spinlock_t wb_limit_lock;
 	bool wb_limit_enable;
 	u32 wb_batch_size;
 	u64 bd_wb_limit;
--
2.52.0.rc1.455.g30608eb744-goog