* [PATCH 0/2] zram: introduce compressed data writeback
@ 2025-11-28 17:04 Sergey Senozhatsky
2025-11-28 17:04 ` [PATCH 1/2] " Sergey Senozhatsky
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-11-28 17:04 UTC (permalink / raw)
To: Andrew Morton, Richard Chang
Cc: Brian Geffon, Minchan Kim, linux-kernel, linux-mm, linux-block,
Sergey Senozhatsky
As writeback becomes more common there is another shortcoming
that needs to be addressed - compressed data writeback. Currently
zram does uncompressed data writeback which is not optimal due to
potential CPU and battery wastage. This series changes suboptimal
uncompressed writeback to a more optimal compressed data writeback.
Richard Chang (1):
zram: introduce compressed data writeback
Sergey Senozhatsky (1):
zram: rename zram_free_page()
drivers/block/zram/zram_drv.c | 260 ++++++++++++++++++++++++++--------
1 file changed, 202 insertions(+), 58 deletions(-)
--
2.52.0.487.g5c8c507ade-goog
* [PATCH 1/2] zram: introduce compressed data writeback
2025-11-28 17:04 [PATCH 0/2] zram: introduce compressed data writeback Sergey Senozhatsky
@ 2025-11-28 17:04 ` Sergey Senozhatsky
2025-11-29 9:07 ` kernel test robot
2025-11-29 9:55 ` Barry Song
2025-11-28 17:04 ` [PATCH 2/2] zram: rename zram_free_page() Sergey Senozhatsky
2025-12-01 7:39 ` [PATCH 0/2] zram: introduce compressed data writeback Sergey Senozhatsky
2 siblings, 2 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-11-28 17:04 UTC (permalink / raw)
To: Andrew Morton, Richard Chang
Cc: Brian Geffon, Minchan Kim, linux-kernel, linux-mm, linux-block,
Sergey Senozhatsky, Minchan Kim
From: Richard Chang <richardycc@google.com>
zram stores all written back slots raw, which implies that
during writeback zram first has to decompress slots (except
for ZRAM_HUGE slots, which are raw already). The problem
with this approach is that not every written back page gets
read back (either via read() or via page-fault), which means
that zram basically wastes CPU cycles and battery decompressing
such slots. This changes with introduction of decompression
on demand, in other words decompression on read()/page-fault.
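For illustration only, and not part of the patch: a minimal userspace sketch of the decompress-on-demand idea described above. All names here are hypothetical (struct slot, toy_compress(), store_page(), writeback_slot(), read_slot()), and a toy run-length scheme stands in for a real compressor. It only aims to show that writeback copies the stored bytes untouched, and that the decompression cost is paid on read-back, if a read-back ever happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096u

/* Hypothetical slot: stands in for a zram table entry plus its pool object. */
struct slot {
	bool written_back;	/* plays the role of ZRAM_WB */
	bool huge;		/* plays the role of ZRAM_HUGE: stored raw */
	uint32_t size;		/* stored object size, preserved across writeback */
	uint32_t blk_idx;	/* backing-device block, valid once written back */
	uint8_t mem[PAGE_SZ];	/* in-memory object, released on writeback */
};

static unsigned int decompress_calls;	/* how often decompression was paid for */

/* Toy RLE of (run, value) pairs; returns 0 if not smaller than a page. */
static size_t toy_compress(const uint8_t *src, uint8_t *dst)
{
	size_t in = 0, out = 0;

	while (in < PAGE_SZ) {
		uint8_t val = src[in];
		size_t run = 1;

		while (in + run < PAGE_SZ && run < 255 && src[in + run] == val)
			run++;
		if (out + 2 >= PAGE_SZ)
			return 0;
		dst[out++] = (uint8_t)run;
		dst[out++] = val;
		in += run;
	}
	return out;
}

/* Expands (run, value) pairs; fails unless exactly one page is produced. */
static int toy_decompress(const uint8_t *src, size_t size, uint8_t *dst)
{
	size_t out = 0;

	decompress_calls++;
	for (size_t in = 0; in + 1 < size; in += 2) {
		size_t run = src[in];

		if (run == 0 || out + run > PAGE_SZ)
			return -1;
		memset(dst + out, src[in + 1], run);
		out += run;
	}
	return out == PAGE_SZ ? 0 : -1;
}

/* Store a page in memory, compressed when possible, raw ("huge") otherwise. */
static void store_page(struct slot *s, const uint8_t *page)
{
	size_t clen = toy_compress(page, s->mem);

	s->written_back = false;
	s->huge = (clen == 0);
	if (s->huge)
		memcpy(s->mem, page, PAGE_SZ);
	s->size = s->huge ? PAGE_SZ : (uint32_t)clen;
}

/* Writeback copies the stored bytes as they are; nothing is decompressed. */
static void writeback_slot(struct slot *s, uint8_t *bdev, uint32_t blk_idx)
{
	assert(!s->written_back);
	memcpy(bdev + (size_t)blk_idx * PAGE_SZ, s->mem, s->size);
	memset(s->mem, 0, sizeof(s->mem));	/* models freeing the pool object */
	s->written_back = true;
	s->blk_idx = blk_idx;			/* size and huge are kept for read-back */
}

/* Read path: decompression happens here, and only for non-huge slots. */
static int read_slot(const struct slot *s, const uint8_t *bdev, uint8_t *out)
{
	const uint8_t *src = s->written_back ?
		bdev + (size_t)s->blk_idx * PAGE_SZ : s->mem;

	if (s->huge) {
		memcpy(out, src, PAGE_SZ);
		return 0;
	}
	return toy_decompress(src, s->size, out);
}
```

In this sketch a slot that is written back and never read leaves decompress_calls untouched, which is the saving the commit message is after. The size and huge fields have to survive writeback because the read path needs them to interpret the block.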
One caveat of decompression on demand is that async read
is completed in IRQ context, while zram decompression is
sleepable. To work around this, read-back decompression
is offloaded to a preemptible context - system high-prio
work-queue.
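For illustration only, and not part of the patch: a single-threaded userspace sketch of the deferral pattern described above. The names (struct work, struct workqueue, wq_queue(), wq_run(), struct rb_req, read_endio()) are hypothetical stand-ins, not kernel APIs. The completion callback, which models the context that must not sleep, only links the request into a queue; the handler that is allowed to sleep runs later, when the queue is drained.

```c
#include <stddef.h>

/* Hypothetical work item, loosely modelled on the idea of a work_struct. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

/* Hypothetical FIFO queue of pending work items. */
struct workqueue {
	struct work *head;
	struct work **tail;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void wq_init(struct workqueue *wq)
{
	wq->head = NULL;
	wq->tail = &wq->head;
}

/* Cheap and non-blocking: safe to call from the completion callback. */
static void wq_queue(struct workqueue *wq, struct work *w)
{
	w->next = NULL;
	*wq->tail = w;
	wq->tail = &w->next;
}

/* Models the worker: runs queued handlers in FIFO order, returns the count. */
static unsigned int wq_run(struct workqueue *wq)
{
	struct work *w = wq->head;
	unsigned int n = 0;

	wq_init(wq);
	while (w) {
		struct work *next = w->next;	/* handler may reuse w */

		w->fn(w);
		w = next;
		n++;
	}
	return n;
}

/* Hypothetical read-back request with an embedded work item. */
struct rb_req {
	struct work work;
	struct workqueue *wq;
	int status;	/* 0 on success, negative on error */
	int done;	/* set once the request is completed */
};

/* Deferred handler: this is where sleepable decompression would go. */
static void deferred_decompress(struct work *w)
{
	struct rb_req *req = container_of(w, struct rb_req, work);

	req->status = 0;
	req->done = 1;
}

/* Completion callback: on success it only queues work, it never blocks. */
static void read_endio(struct rb_req *req, int io_error)
{
	if (io_error) {
		/* Nothing to decompress, complete the request right away. */
		req->status = io_error;
		req->done = 1;
		return;
	}
	req->work.fn = deferred_decompress;
	wq_queue(req->wq, &req->work);
}
```

A failed read completes immediately because there is nothing to decompress; a successful one stays pending until wq_run() is called. Embedding the work item in the request lets the handler recover its request with container_of() without a separate lookup.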
Signed-off-by: Richard Chang <richardycc@google.com>
Co-developed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@google.com>
Suggested-by: Brian Geffon <bgeffon@google.com>
---
drivers/block/zram/zram_drv.c | 240 +++++++++++++++++++++++++++-------
1 file changed, 192 insertions(+), 48 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 5759823d6314..eef6c0a675b5 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -57,8 +57,8 @@ static size_t huge_class_size;
static const struct block_device_operations zram_devops;
static void zram_free_page(struct zram *zram, size_t index);
-static int zram_read_from_zspool(struct zram *zram, struct page *page,
- u32 index);
+static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
+ u32 index);
#define slot_dep_map(zram, index) (&(zram)->table[(index)].dep_map)
@@ -522,6 +522,22 @@ struct zram_wb_req {
struct list_head entry;
};
+struct zram_rb_req {
+ struct work_struct work;
+ struct zram *zram;
+ struct page *page;
+ /* The read bio for backing device */
+ struct bio *bio;
+ unsigned long blk_idx;
+ union {
+ /* The original bio to complete (async read) */
+ struct bio *parent;
+ /* error status (sync read) */
+ int error;
+ };
+ u32 index;
+};
+
static ssize_t writeback_limit_enable_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t len)
@@ -780,18 +796,6 @@ static void zram_release_bdev_block(struct zram *zram, unsigned long blk_idx)
atomic64_dec(&zram->stats.bd_count);
}
-static void read_from_bdev_async(struct zram *zram, struct page *page,
- unsigned long entry, struct bio *parent)
-{
- struct bio *bio;
-
- bio = bio_alloc(zram->bdev, 1, parent->bi_opf, GFP_NOIO);
- bio->bi_iter.bi_sector = entry * (PAGE_SIZE >> 9);
- __bio_add_page(bio, page, PAGE_SIZE, 0);
- bio_chain(bio, parent);
- submit_bio(bio);
-}
-
static void release_wb_req(struct zram_wb_req *req)
{
__free_page(req->page);
@@ -886,8 +890,9 @@ static void zram_account_writeback_submit(struct zram *zram)
static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
{
- u32 index = req->pps->index;
- int err;
+ u32 size, index = req->pps->index;
+ int err, prio;
+ bool huge;
err = blk_status_to_errno(req->bio.bi_status);
if (err) {
@@ -914,9 +919,22 @@ static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
goto out;
}
+ /*
+ * ZRAM_WB slots get freed, we need to preserve data required for
+ * read decompression.
+ */
+ size = zram_get_obj_size(zram, index);
+ prio = zram_get_priority(zram, index);
+ huge = zram_test_flag(zram, index, ZRAM_HUGE);
+
zram_free_page(zram, index);
zram_set_flag(zram, index, ZRAM_WB);
+ if (huge)
+ zram_set_flag(zram, index, ZRAM_HUGE);
zram_set_handle(zram, index, req->blk_idx);
+ zram_set_obj_size(zram, index, size);
+ zram_set_priority(zram, index, prio);
+
atomic64_inc(&zram->stats.pages_stored);
out:
@@ -1050,7 +1068,7 @@ static int zram_writeback_slots(struct zram *zram,
*/
if (!zram_test_flag(zram, index, ZRAM_PP_SLOT))
goto next;
- if (zram_read_from_zspool(zram, req->page, index))
+ if (zram_read_from_zspool_raw(zram, req->page, index))
goto next;
zram_slot_unlock(zram, index);
@@ -1313,24 +1331,123 @@ static ssize_t writeback_store(struct device *dev,
return ret;
}
-struct zram_work {
- struct work_struct work;
- struct zram *zram;
- unsigned long entry;
- struct page *page;
- int error;
-};
+static int decompress_bdev_page(struct zram *zram, struct page *page, u32 index)
+{
+ struct zcomp_strm *zstrm;
+ unsigned int size;
+ int ret, prio;
+ void *src;
+
+ zram_slot_lock(zram, index);
+ /* Since slot was unlocked we need to make sure it's still ZRAM_WB */
+ if (!zram_test_flag(zram, index, ZRAM_WB)) {
+ zram_slot_unlock(zram, index);
+ /* We read some stale data, zero it out */
+ memset_page(page, 0, 0, PAGE_SIZE);
+ return -EIO;
+ }
+
+ if (zram_test_flag(zram, index, ZRAM_HUGE)) {
+ zram_slot_unlock(zram, index);
+ return 0;
+ }
+
+ size = zram_get_obj_size(zram, index);
+ prio = zram_get_priority(zram, index);
+
+ zstrm = zcomp_stream_get(zram->comps[prio]);
+ src = kmap_local_page(page);
+ ret = zcomp_decompress(zram->comps[prio], zstrm, src, size,
+ zstrm->local_copy);
+ if (!ret)
+ copy_page(src, zstrm->local_copy);
+ kunmap_local(src);
+ zcomp_stream_put(zstrm);
+ zram_slot_unlock(zram, index);
+
+ return ret;
+}
+
+static void zram_deferred_decompress(struct work_struct *w)
+{
+ struct zram_rb_req *req = container_of(w, struct zram_rb_req, work);
+ struct page *page = bio_first_page_all(req->bio);
+ struct zram *zram = req->zram;
+ u32 index = req->index;
+ int ret;
+
+ ret = decompress_bdev_page(zram, page, index);
+ if (ret)
+ req->parent->bi_status = BLK_STS_IOERR;
+
+ /* Decrement parent's ->remaining */
+ bio_endio(req->parent);
+ bio_put(req->bio);
+ kfree(req);
+}
+
+static void zram_async_read_endio(struct bio *bio)
+{
+ struct zram_rb_req *req = bio->bi_private;
+
+ if (bio->bi_status) {
+ req->parent->bi_status = bio->bi_status;
+ bio_endio(req->parent);
+ bio_put(bio);
+ kfree(req);
+ return;
+ }
-static void zram_sync_read(struct work_struct *work)
+ /*
+ * zram decompression is sleepable, so we need to defer it to
+ * a preemptible context.
+ */
+ INIT_WORK(&req->work, zram_deferred_decompress);
+ queue_work(system_highpri_wq, &req->work);
+}
+
+static void read_from_bdev_async(struct zram *zram, struct page *page,
+ u32 index, unsigned long blk_idx,
+ struct bio *parent)
{
- struct zram_work *zw = container_of(work, struct zram_work, work);
+ struct zram_rb_req *req;
+ struct bio *bio;
+
+ req = kmalloc(sizeof(*req), GFP_NOIO);
+ if (!req)
+ return;
+
+ bio = bio_alloc(zram->bdev, 1, parent->bi_opf, GFP_NOIO);
+ if (!bio) {
+ kfree(req);
+ return;
+ }
+
+ req->zram = zram;
+ req->index = index;
+ req->blk_idx = blk_idx;
+ req->bio = bio;
+ req->parent = parent;
+
+ bio->bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
+ bio->bi_private = req;
+ bio->bi_end_io = zram_async_read_endio;
+
+ __bio_add_page(bio, page, PAGE_SIZE, 0);
+ bio_inc_remaining(parent);
+ submit_bio(bio);
+}
+
+static void zram_sync_read(struct work_struct *w)
+{
+ struct zram_rb_req *req = container_of(w, struct zram_rb_req, work);
struct bio_vec bv;
struct bio bio;
- bio_init(&bio, zw->zram->bdev, &bv, 1, REQ_OP_READ);
- bio.bi_iter.bi_sector = zw->entry * (PAGE_SIZE >> 9);
- __bio_add_page(&bio, zw->page, PAGE_SIZE, 0);
- zw->error = submit_bio_wait(&bio);
+ bio_init(&bio, req->zram->bdev, &bv, 1, REQ_OP_READ);
+ bio.bi_iter.bi_sector = req->blk_idx * (PAGE_SIZE >> 9);
+ __bio_add_page(&bio, req->page, PAGE_SIZE, 0);
+ req->error = submit_bio_wait(&bio);
}
/*
@@ -1338,39 +1455,41 @@ static void zram_sync_read(struct work_struct *work)
* chained IO with parent IO in same context, it's a deadlock. To avoid that,
* use a worker thread context.
*/
-static int read_from_bdev_sync(struct zram *zram, struct page *page,
- unsigned long entry)
+static int read_from_bdev_sync(struct zram *zram, struct page *page, u32 index,
+ unsigned long blk_idx)
{
- struct zram_work work;
+ struct zram_rb_req req;
- work.page = page;
- work.zram = zram;
- work.entry = entry;
+ req.page = page;
+ req.zram = zram;
+ req.blk_idx = blk_idx;
- INIT_WORK_ONSTACK(&work.work, zram_sync_read);
- queue_work(system_dfl_wq, &work.work);
- flush_work(&work.work);
- destroy_work_on_stack(&work.work);
+ INIT_WORK_ONSTACK(&req.work, zram_sync_read);
+ queue_work(system_dfl_wq, &req.work);
+ flush_work(&req.work);
+ destroy_work_on_stack(&req.work);
- return work.error;
+ if (!req.error)
+ return decompress_bdev_page(zram, page, index);
+ return req.error;
}
-static int read_from_bdev(struct zram *zram, struct page *page,
- unsigned long entry, struct bio *parent)
+static int read_from_bdev(struct zram *zram, struct page *page, u32 index,
+ unsigned long blk_idx, struct bio *parent)
{
atomic64_inc(&zram->stats.bd_reads);
if (!parent) {
if (WARN_ON_ONCE(!IS_ENABLED(ZRAM_PARTIAL_IO)))
return -EIO;
- return read_from_bdev_sync(zram, page, entry);
+ return read_from_bdev_sync(zram, page, index, blk_idx);
}
- read_from_bdev_async(zram, page, entry, parent);
+ read_from_bdev_async(zram, page, index, blk_idx, parent);
return 0;
}
#else
static inline void reset_bdev(struct zram *zram) {};
-static int read_from_bdev(struct zram *zram, struct page *page,
- unsigned long entry, struct bio *parent)
+static int read_from_bdev(struct zram *zram, struct page *page, u32 index,
+ unsigned long blk_idx, struct bio *parent)
{
return -EIO;
}
@@ -1977,6 +2096,31 @@ static int read_compressed_page(struct zram *zram, struct page *page, u32 index)
return ret;
}
+static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
+ u32 index)
+{
+ struct zcomp_strm *zstrm;
+ unsigned long handle;
+ unsigned int size;
+ void *src;
+
+ handle = zram_get_handle(zram, index);
+ size = zram_get_obj_size(zram, index);
+
+ /*
+ * We need to get stream just for ->local_copy buffer, in
+ * case if object spans two physical pages. No decompression
+ * takes place here, as we read raw compressed data.
+ */
+ zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
+ src = zs_obj_read_begin(zram->mem_pool, handle, zstrm->local_copy);
+ memcpy_to_page(page, 0, src, size);
+ zs_obj_read_end(zram->mem_pool, handle, src);
+ zcomp_stream_put(zstrm);
+
+ return 0;
+}
+
/*
* Reads (decompresses if needed) a page from zspool (zsmalloc).
* Corresponding ZRAM slot should be locked.
@@ -2012,7 +2156,7 @@ static int zram_read_page(struct zram *zram, struct page *page, u32 index,
* device.
*/
zram_slot_unlock(zram, index);
- ret = read_from_bdev(zram, page, blk_idx, parent);
+ ret = read_from_bdev(zram, page, index, blk_idx, parent);
}
/* Should NEVER happen. Return bio error if it does. */
--
2.52.0.487.g5c8c507ade-goog
* [PATCH 2/2] zram: rename zram_free_page()
2025-11-28 17:04 [PATCH 0/2] zram: introduce compressed data writeback Sergey Senozhatsky
2025-11-28 17:04 ` [PATCH 1/2] " Sergey Senozhatsky
@ 2025-11-28 17:04 ` Sergey Senozhatsky
2025-12-01 7:39 ` [PATCH 0/2] zram: introduce compressed data writeback Sergey Senozhatsky
2 siblings, 0 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-11-28 17:04 UTC (permalink / raw)
To: Andrew Morton, Richard Chang
Cc: Brian Geffon, Minchan Kim, linux-kernel, linux-mm, linux-block,
Sergey Senozhatsky
We don't free a page in zram_free_page(); not all slots even
have memory associated with them (e.g. ZRAM_SAME). We free
the slot (or reset it), so rename the function accordingly.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
drivers/block/zram/zram_drv.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index eef6c0a675b5..d8054b3cafaa 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -56,7 +56,7 @@ static size_t huge_class_size;
static const struct block_device_operations zram_devops;
-static void zram_free_page(struct zram *zram, size_t index);
+static void zram_slot_free(struct zram *zram, u32 index);
static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
u32 index);
@@ -927,7 +927,7 @@ static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
prio = zram_get_priority(zram, index);
huge = zram_test_flag(zram, index, ZRAM_HUGE);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_set_flag(zram, index, ZRAM_WB);
if (huge)
zram_set_flag(zram, index, ZRAM_HUGE);
@@ -1966,7 +1966,7 @@ static void zram_meta_free(struct zram *zram, u64 disksize)
/* Free all pages that are still in this zram device */
for (index = 0; index < num_pages; index++)
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zs_destroy_pool(zram->mem_pool);
vfree(zram->table);
@@ -1998,7 +1998,7 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize)
return true;
}
-static void zram_free_page(struct zram *zram, size_t index)
+static void zram_slot_free(struct zram *zram, u32 index)
{
unsigned long handle;
@@ -2197,7 +2197,7 @@ static int write_same_filled_page(struct zram *zram, unsigned long fill,
u32 index)
{
zram_slot_lock(zram, index);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_set_flag(zram, index, ZRAM_SAME);
zram_set_handle(zram, index, fill);
zram_slot_unlock(zram, index);
@@ -2235,7 +2235,7 @@ static int write_incompressible_page(struct zram *zram, struct page *page,
kunmap_local(src);
zram_slot_lock(zram, index);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_set_flag(zram, index, ZRAM_HUGE);
zram_set_handle(zram, index, handle);
zram_set_obj_size(zram, index, PAGE_SIZE);
@@ -2300,7 +2300,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
zcomp_stream_put(zstrm);
zram_slot_lock(zram, index);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_set_handle(zram, index, handle);
zram_set_obj_size(zram, index, comp_len);
zram_slot_unlock(zram, index);
@@ -2522,7 +2522,7 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
zs_obj_write(zram->mem_pool, handle_new, zstrm->buffer, comp_len_new);
zcomp_stream_put(zstrm);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_set_handle(zram, index, handle_new);
zram_set_obj_size(zram, index, comp_len_new);
zram_set_priority(zram, index, prio);
@@ -2725,7 +2725,7 @@ static void zram_bio_discard(struct zram *zram, struct bio *bio)
while (n >= PAGE_SIZE) {
zram_slot_lock(zram, index);
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_slot_unlock(zram, index);
atomic64_inc(&zram->stats.notify_free);
index++;
@@ -2833,7 +2833,7 @@ static void zram_slot_free_notify(struct block_device *bdev,
return;
}
- zram_free_page(zram, index);
+ zram_slot_free(zram, index);
zram_slot_unlock(zram, index);
}
--
2.52.0.487.g5c8c507ade-goog
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-11-28 17:04 ` [PATCH 1/2] " Sergey Senozhatsky
@ 2025-11-29 9:07 ` kernel test robot
2025-12-01 9:10 ` Sergey Senozhatsky
2025-11-29 9:55 ` Barry Song
1 sibling, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-11-29 9:07 UTC (permalink / raw)
To: Sergey Senozhatsky, Andrew Morton, Richard Chang
Cc: oe-kbuild-all, Linux Memory Management List, Brian Geffon,
Minchan Kim, linux-kernel, linux-block, Sergey Senozhatsky
Hi Sergey,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on next-20251128]
[cannot apply to axboe/for-next linus/master v6.18-rc7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sergey-Senozhatsky/zram-introduce-compressed-data-writeback/20251129-010716
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20251128170442.2988502-2-senozhatsky%40chromium.org
patch subject: [PATCH 1/2] zram: introduce compressed data writeback
config: x86_64-randconfig-011-20251129 (https://download.01.org/0day-ci/archive/20251129/202511291628.NZif1jdx-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251129/202511291628.NZif1jdx-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511291628.NZif1jdx-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/block/zram/zram_drv.c:2099:12: warning: 'zram_read_from_zspool_raw' defined but not used [-Wunused-function]
2099 | static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
vim +/zram_read_from_zspool_raw +2099 drivers/block/zram/zram_drv.c
2098
> 2099 static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
2100 u32 index)
2101 {
2102 struct zcomp_strm *zstrm;
2103 unsigned long handle;
2104 unsigned int size;
2105 void *src;
2106
2107 handle = zram_get_handle(zram, index);
2108 size = zram_get_obj_size(zram, index);
2109
2110 /*
2111 * We need to get stream just for ->local_copy buffer, in
2112 * case if object spans two physical pages. No decompression
2113 * takes place here, as we read raw compressed data.
2114 */
2115 zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
2116 src = zs_obj_read_begin(zram->mem_pool, handle, zstrm->local_copy);
2117 memcpy_to_page(page, 0, src, size);
2118 zs_obj_read_end(zram->mem_pool, handle, src);
2119 zcomp_stream_put(zstrm);
2120
2121 return 0;
2122 }
2123
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-11-28 17:04 ` [PATCH 1/2] " Sergey Senozhatsky
2025-11-29 9:07 ` kernel test robot
@ 2025-11-29 9:55 ` Barry Song
2025-12-01 3:56 ` Sergey Senozhatsky
1 sibling, 1 reply; 11+ messages in thread
From: Barry Song @ 2025-11-29 9:55 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, Richard Chang, Brian Geffon, Minchan Kim,
linux-kernel, linux-mm, linux-block, Minchan Kim
On Sat, Nov 29, 2025 at 1:06 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> From: Richard Chang <richardycc@google.com>
>
Hi Richard, Sergey,
Thanks a lot for developing this. For years, people have been looking for
compressed data writeback to reduce I/O, such as compacting multiple compressed
blocks into a single page on block devices. I guess this patchset hasn’t reached
that point yet, right?
> zram stores all written back slots raw, which implies that
> during writeback zram first has to decompress slots (except
> for ZRAM_HUGE slots, which are raw already). The problem
> with this approach is that not every written back page gets
> read back (either via read() or via page-fault), which means
> that zram basically wastes CPU cycles and battery decompressing
> such slots. This changes with introduction of decompression
If a page is swapped out and never read again, does that actually indicate
a memory leak in userspace?
So the main benefit of this patch so far is actually avoiding decompression
for "leaked" anon pages, which might still have a pointer but are
never accessed again?
> on demand, in other words decompression on read()/page-fault.
Thanks
Barry
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-11-29 9:55 ` Barry Song
@ 2025-12-01 3:56 ` Sergey Senozhatsky
2025-12-01 8:59 ` Barry Song
0 siblings, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-12-01 3:56 UTC (permalink / raw)
To: Barry Song
Cc: Sergey Senozhatsky, Andrew Morton, Richard Chang, Brian Geffon,
Minchan Kim, linux-kernel, linux-mm, linux-block, Minchan Kim
Hi Barry,
On (25/11/29 17:55), Barry Song wrote:
> On Sat, Nov 29, 2025 at 1:06 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > From: Richard Chang <richardycc@google.com>
> >
>
> Hi Richard, Sergey,
>
> Thanks a lot for developing this. For years, people have been looking for
> compressed data writeback to reduce I/O, such as compacting multiple compressed
> blocks into a single page on block devices. I guess this patchset hasn’t reached
> that point yet, right?
Right.
> > zram stores all written back slots raw, which implies that
> > during writeback zram first has to decompress slots (except
> > for ZRAM_HUGE slots, which are raw already). The problem
> > with this approach is that not every written back page gets
> > read back (either via read() or via page-fault), which means
> > that zram basically wastes CPU cycles and battery decompressing
> > such slots. This changes with introduction of decompression
>
> If a page is swapped out and never read again, does that actually indicate
> a memory leak in userspace?
No, it just means that there is no page-fault on that page. E.g. we
swapped out an unused browser tab and never come back to it within the
session: e.g. user closed the tab/app, or logged out of session, or
rebooted the device, or simply powered off (desktop/laptop).
* Re: [PATCH 0/2] zram: introduce compressed data writeback
2025-11-28 17:04 [PATCH 0/2] zram: introduce compressed data writeback Sergey Senozhatsky
2025-11-28 17:04 ` [PATCH 1/2] " Sergey Senozhatsky
2025-11-28 17:04 ` [PATCH 2/2] zram: rename zram_free_page() Sergey Senozhatsky
@ 2025-12-01 7:39 ` Sergey Senozhatsky
2 siblings, 0 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-12-01 7:39 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, Richard Chang, Brian Geffon, Minchan Kim,
linux-kernel, linux-mm, linux-block
On (25/11/29 02:04), Sergey Senozhatsky wrote:
> As writeback becomes more common there is another shortcoming
> that needs to be addressed - compressed data writeback. Currently
> zram does uncompressed data writeback which is not optimal due to
> potential CPU and battery wastage. This series changes suboptimal
> uncompressed writeback to a more optimal compressed data writeback.
JFI, v2 is coming.
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-12-01 3:56 ` Sergey Senozhatsky
@ 2025-12-01 8:59 ` Barry Song
2025-12-01 9:09 ` Sergey Senozhatsky
0 siblings, 1 reply; 11+ messages in thread
From: Barry Song @ 2025-12-01 8:59 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, Richard Chang, Brian Geffon, Minchan Kim,
linux-kernel, linux-mm, linux-block, Minchan Kim
On Mon, Dec 1, 2025 at 11:56 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
[...]
> > > zram stores all written back slots raw, which implies that
> > > during writeback zram first has to decompress slots (except
> > > for ZRAM_HUGE slots, which are raw already). The problem
> > > with this approach is that not every written back page gets
> > > read back (either via read() or via page-fault), which means
> > > that zram basically wastes CPU cycles and battery decompressing
> > > such slots. This changes with introduction of decompression
> >
> > If a page is swapped out and never read again, does that actually indicate
> > a memory leak in userspace?
>
> No, it just means that there is no page-fault on that page. E.g. we
> swapped out an unused browser tab and never come back to it within the
> session: e.g. user closed the tab/app, or logged out of session, or
> rebooted the device, or simply powered off (desktop/laptop).
Thanks, Sergey. That makes sense to me. On Android, users don’t have a
close button, yet apps can still be OOM-killed; those pages are never
swapped in.
Thanks
Barry
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-12-01 8:59 ` Barry Song
@ 2025-12-01 9:09 ` Sergey Senozhatsky
2025-12-01 18:00 ` Barry Song
0 siblings, 1 reply; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-12-01 9:09 UTC (permalink / raw)
To: Barry Song
Cc: Sergey Senozhatsky, Andrew Morton, Richard Chang, Brian Geffon,
Minchan Kim, linux-kernel, linux-mm, linux-block, Minchan Kim
On (25/12/01 16:59), Barry Song wrote:
> On Mon, Dec 1, 2025 at 11:56 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> [...]
> > > > zram stores all written back slots raw, which implies that
> > > > during writeback zram first has to decompress slots (except
> > > > for ZRAM_HUGE slots, which are raw already). The problem
> > > > with this approach is that not every written back page gets
> > > > read back (either via read() or via page-fault), which means
> > > > that zram basically wastes CPU cycles and battery decompressing
> > > > such slots. This changes with introduction of decompression
> > >
> > > If a page is swapped out and never read again, does that actually indicate
> > > a memory leak in userspace?
> >
> > No, it just means that there is no page-fault on that page. E.g. we
> > swapped out an unused browser tab and never come back to it within the
> > session: e.g. user closed the tab/app, or logged out of session, or
> > rebooted the device, or simply powered off (desktop/laptop).
>
> Thanks, Sergey. That makes sense to me. On Android, users don’t have a
> close button, yet apps can still be OOM-killed; those pages are never
> swapped in.
I see. I suppose on android you still can swipe up and terminate
un-needed apps, wouldn't this be the same? Well, apart from that,
zram is not android-specific, some distros use it on desktops/laptops
as well.
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-11-29 9:07 ` kernel test robot
@ 2025-12-01 9:10 ` Sergey Senozhatsky
0 siblings, 0 replies; 11+ messages in thread
From: Sergey Senozhatsky @ 2025-12-01 9:10 UTC (permalink / raw)
To: kernel test robot
Cc: Sergey Senozhatsky, Andrew Morton, Richard Chang, oe-kbuild-all,
Linux Memory Management List, Brian Geffon, Minchan Kim,
linux-kernel, linux-block
On (25/11/29 17:07), kernel test robot wrote:
> Hi Sergey,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
> [also build test WARNING on next-20251128]
> [cannot apply to axboe/for-next linus/master v6.18-rc7]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Sergey-Senozhatsky/zram-introduce-compressed-data-writeback/20251129-010716
> base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link: https://lore.kernel.org/r/20251128170442.2988502-2-senozhatsky%40chromium.org
> patch subject: [PATCH 1/2] zram: introduce compressed data writeback
> config: x86_64-randconfig-011-20251129 (https://download.01.org/0day-ci/archive/20251129/202511291628.NZif1jdx-lkp@intel.com/config)
> compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251129/202511291628.NZif1jdx-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202511291628.NZif1jdx-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
> >> drivers/block/zram/zram_drv.c:2099:12: warning: 'zram_read_from_zspool_raw' defined but not used [-Wunused-function]
> 2099 | static int zram_read_from_zspool_raw(struct zram *zram, struct page *page,
> | ^~~~~~~~~~~~~~~~~~~~~~~~~
Fixed in v2. Thanks.
* Re: [PATCH 1/2] zram: introduce compressed data writeback
2025-12-01 9:09 ` Sergey Senozhatsky
@ 2025-12-01 18:00 ` Barry Song
0 siblings, 0 replies; 11+ messages in thread
From: Barry Song @ 2025-12-01 18:00 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Andrew Morton, Richard Chang, Brian Geffon, Minchan Kim,
linux-kernel, linux-mm, linux-block, Minchan Kim
On Mon, Dec 1, 2025 at 5:09 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (25/12/01 16:59), Barry Song wrote:
> > On Mon, Dec 1, 2025 at 11:56 AM Sergey Senozhatsky
> > <senozhatsky@chromium.org> wrote:
> > [...]
> > > > > zram stores all written back slots raw, which implies that
> > > > > during writeback zram first has to decompress slots (except
> > > > > for ZRAM_HUGE slots, which are raw already). The problem
> > > > > with this approach is that not every written back page gets
> > > > > read back (either via read() or via page-fault), which means
> > > > > that zram basically wastes CPU cycles and battery decompressing
> > > > > such slots. This changes with introduction of decompression
> > > >
> > > > If a page is swapped out and never read again, does that actually indicate
> > > > a memory leak in userspace?
> > >
> > > No, it just means that there is no page-fault on that page. E.g. we
> > > swapped out an unused browser tab and never come back to it within the
> > > session: e.g. user closed the tab/app, or logged out of session, or
> > > rebooted the device, or simply powered off (desktop/laptop).
> >
> > Thanks, Sergey. That makes sense to me. On Android, users don’t have a
> > close button, yet apps can still be OOM-killed; those pages are never
> > swapped in.
>
> I see. I suppose on android you still can swipe up and terminate
> un-needed apps, wouldn't this be the same? Well, apart from that,
That’s true, although it’s not typical user behavior :-)
> zram is not android-specific, some distros use it on desktops/laptops
> as well.
Yes, absolutely.