From: Brian Geffon <bgeffon@google.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
linux-kernel@vger.kernel.org,
Suleiman Souhlal <suleiman@google.com>,
linux-mm@kvack.org, Brian Geffon <bgeffon@google.com>
Subject: [RFC PATCH 1/1] zram: Allow rw_page when page isn't written back.
Date: Mon, 8 Aug 2022 12:50:06 -0400 [thread overview]
Message-ID: <20220808165006.2451180-2-bgeffon@google.com> (raw)
In-Reply-To: <20220808165006.2451180-1-bgeffon@google.com>
Today when a zram device has a backing device we change the ops to
a new set which does not expose a rw_page method. This prevents the
upper layers from trying to issue a synchronous rw. This has the
downside that we penalize every rw even when it could possibly
still be performed as a synchronous rw.
This change will always expose a rw_page function and if the page
has been written back it will return -EOPNOTSUPP which will force the
upper layers to try again with bio.
To safely allow a synchronous read to proceed for pages which have not
yet written back we introduce a new flag ZRAM_NO_WB. On the first
synchronous read if the page is not written back we will set the
ZRAM_NO_WB flag. This flag, which is never cleared, prevents writeback
from ever happening to that page.
This approach works because in the case of zram as a swap backing device
the page is going to be removed from zram shortly thereafter so
preventing writeback is fine. However, if zram is being used as a
generic block device then this might prevent writeback of the page.
Signed-off-by: Brian Geffon <bgeffon@google.com>
---
drivers/block/zram/zram_drv.c | 65 +++++++++++++++++++++--------------
drivers/block/zram/zram_drv.h | 1 +
2 files changed, 41 insertions(+), 25 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 92cb929a45b7..196392353bd3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -52,9 +52,6 @@ static unsigned int num_devices = 1;
static size_t huge_class_size;
static const struct block_device_operations zram_devops;
-#ifdef CONFIG_ZRAM_WRITEBACK
-static const struct block_device_operations zram_wb_devops;
-#endif
static void zram_free_page(struct zram *zram, size_t index);
static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
@@ -439,7 +436,6 @@ static void reset_bdev(struct zram *zram)
filp_close(zram->backing_dev, NULL);
zram->backing_dev = NULL;
zram->bdev = NULL;
- zram->disk->fops = &zram_devops;
kvfree(zram->bitmap);
zram->bitmap = NULL;
}
@@ -543,17 +539,6 @@ static ssize_t backing_dev_store(struct device *dev,
zram->backing_dev = backing_dev;
zram->bitmap = bitmap;
zram->nr_pages = nr_pages;
- /*
- * With writeback feature, zram does asynchronous IO so it's no longer
- * synchronous device so let's remove synchronous io flag. Othewise,
- * upper layer(e.g., swap) could wait IO completion rather than
- * (submit and return), which will cause system sluggish.
- * Furthermore, when the IO function returns(e.g., swap_readpage),
- * upper layer expects IO was done so it could deallocate the page
- * freely but in fact, IO is going on so finally could cause
- * use-after-free when the IO is really done.
- */
- zram->disk->fops = &zram_wb_devops;
up_write(&zram->init_lock);
pr_info("setup backing device %s\n", file_name);
@@ -722,7 +707,8 @@ static ssize_t writeback_store(struct device *dev,
if (zram_test_flag(zram, index, ZRAM_WB) ||
zram_test_flag(zram, index, ZRAM_SAME) ||
- zram_test_flag(zram, index, ZRAM_UNDER_WB))
+ zram_test_flag(zram, index, ZRAM_UNDER_WB) ||
+ zram_test_flag(zram, index, ZRAM_NO_WB))
goto next;
if (mode & IDLE_WRITEBACK &&
@@ -1226,6 +1212,10 @@ static void zram_free_page(struct zram *zram, size_t index)
goto out;
}
+ if (zram_test_flag(zram, index, ZRAM_NO_WB)) {
+ zram_clear_flag(zram, index, ZRAM_NO_WB);
+ }
+
/*
* No memory is allocated for same element filled pages.
* Simply clear same page flag.
@@ -1654,6 +1644,40 @@ static int zram_rw_page(struct block_device *bdev, sector_t sector,
index = sector >> SECTORS_PER_PAGE_SHIFT;
offset = (sector & (SECTORS_PER_PAGE - 1)) << SECTOR_SHIFT;
+#ifdef CONFIG_ZRAM_WRITEBACK
+ /*
+ * With writeback feature, zram does asynchronous IO so it's no longer
+ * synchronous device so let's remove synchronous io flag. Othewise,
+ * upper layer(e.g., swap) could wait IO completion rather than
+ * (submit and return), which will cause system sluggish.
+ * Furthermore, when the IO function returns(e.g., swap_readpage),
+ * upper layer expects IO was done so it could deallocate the page
+ * freely but in fact, IO is going on so finally could cause
+ * use-after-free when the IO is really done.
+ *
+ * If the page is not currently written back then we may proceed to
+ * read the page synchronously, otherwise, we must fail with
+ * -EOPNOTSUPP to force the upper layers to use a normal bio.
+ */
+ zram_slot_lock(zram, index);
+ if (zram_test_flag(zram, index, ZRAM_WB) ||
+ zram_test_flag(zram, index, ZRAM_UNDER_WB)) {
+ zram_slot_unlock(zram, index);
+ /* We cannot proceed with synchronous read */
+ return -EOPNOTSUPP;
+ }
+
+ /*
+ * Don't allow the page to be written back while we read it,
+ * this flag is never cleared. It shouldn't be a problem that
+ * we don't clear this flag because in the case of swap this
+ * page will be removed shortly after this read anyway.
+ */
+ if (op == REQ_OP_READ)
+ zram_set_flag(zram, index, ZRAM_NO_WB);
+ zram_slot_unlock(zram, index);
+#endif
+
bv.bv_page = page;
bv.bv_len = PAGE_SIZE;
bv.bv_offset = 0;
@@ -1827,15 +1851,6 @@ static const struct block_device_operations zram_devops = {
.owner = THIS_MODULE
};
-#ifdef CONFIG_ZRAM_WRITEBACK
-static const struct block_device_operations zram_wb_devops = {
- .open = zram_open,
- .submit_bio = zram_submit_bio,
- .swap_slot_free_notify = zram_slot_free_notify,
- .owner = THIS_MODULE
-};
-#endif
-
static DEVICE_ATTR_WO(compact);
static DEVICE_ATTR_RW(disksize);
static DEVICE_ATTR_RO(initstate);
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 158c91e54850..20e4c6a579e0 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -50,6 +50,7 @@ enum zram_pageflags {
ZRAM_UNDER_WB, /* page is under writeback */
ZRAM_HUGE, /* Incompressible page */
ZRAM_IDLE, /* not accessed page since last idle marking */
+ ZRAM_NO_WB, /* Do not allow page to be written back */
__NR_ZRAM_PAGEFLAGS,
};
--
2.37.1.559.g78731f0fdb-goog
next prev parent reply other threads:[~2022-08-08 16:50 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-08 16:50 [RFC PATCH 0/1] " Brian Geffon
2022-08-08 16:50 ` Brian Geffon [this message]
2022-08-09 1:38 ` [RFC PATCH 1/1] " Sergey Senozhatsky
2022-08-10 19:18 ` Brian Geffon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220808165006.2451180-2-bgeffon@google.com \
--to=bgeffon@google.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=ngupta@vflare.org \
--cc=senozhatsky@chromium.org \
--cc=suleiman@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox