Date: Mon, 24 Mar 2025 16:09:36 +0900
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim, Brian Geffon
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: Re: [RFC PATCH] zram: introduce writeback_ext interface
References: <20250324065029.3077450-1-senozhatsky@chromium.org>
In-Reply-To: <20250324065029.3077450-1-senozhatsky@chromium.org>

On (25/03/24 15:45), Sergey Senozhatsky wrote:
>
> Alternatively I can just extend writeback and say something like
> "starting from 6.16 it also accepts indices ranges" in zram's
> documentation.

See below.  Maybe this is actually fine and we don't need a new
sysfs file.

Subject: [RFC PATCH] zram: modernize writeback interface

Extend page writeback with page indices ranges.
The previous interface could write only one page at a time:

  echo page_index=100 > zram0/writeback
  ...
  echo page_index=200 > zram0/writeback
  echo page_index=500 > zram0/writeback
  ...
  echo page_index=700 > zram0/writeback

One obvious downside is that it increases the number of syscalls.
A less obvious but significantly more important downside is that
when given only one page to post-process zram cannot perform
an optimal target selection.  This becomes critical when
writeback_limit is enabled, because under writeback_limit we want
to guarantee the highest memory savings.

The new interface supports multiple page indices ranges in a single
call:

  echo page_index_range=100-200 \
       page_index_range=500-700 > zram0/writeback

This gives zram a chance to apply an optimal target selection
strategy on each iteration of the writeback loop.

Apart from that the new interface unifies parameter passing and
resembles other "modern" zram device attributes (e.g. recompression),
while the old interface used a mixed scheme: valueless parameters
for mode and a key=value format for page_index.  We still support
the "old" valueless format for compatibility reasons.

Signed-off-by: Sergey Senozhatsky
---
 Documentation/admin-guide/blockdev/zram.rst |   8 +
 drivers/block/zram/zram_drv.c               | 321 +++++++++++++-------
 2 files changed, 224 insertions(+), 105 deletions(-)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 9bdb30901a93..753543751c28 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -369,6 +369,14 @@ they could write a page index into the interface::
 
 	echo "page_index=1251" > /sys/block/zramX/writeback
 
+Starting from 6.16, this interface supports `page_index_range` parameters
+(multiple ranges can be provided at once), which specify `LOW-HIGH` ranges
+of pages to be written-back. This reduces the number of syscalls, but more
+importantly this enables optimal post-processing target selection strategy.
+Usage example::
+
+	echo "page_index_range=1-100" > /sys/block/zramX/writeback
+
 If there are lots of write IO with flash device, potentially, it has
 flash wearout problem so that admin needs to design write limitation
 to guarantee storage health for entire product life.
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fda7d8624889..f267d562518c 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -734,114 +734,19 @@ static void read_from_bdev_async(struct zram *zram, struct page *page,
 	submit_bio(bio);
 }
 
-#define PAGE_WB_SIG "page_index="
-
-#define PAGE_WRITEBACK			0
-#define HUGE_WRITEBACK			(1<<0)
-#define IDLE_WRITEBACK			(1<<1)
-#define INCOMPRESSIBLE_WRITEBACK	(1<<2)
-
-static int scan_slots_for_writeback(struct zram *zram, u32 mode,
-				    unsigned long nr_pages,
-				    unsigned long index,
-				    struct zram_pp_ctl *ctl)
+static int zram_writeback_slots(struct zram *zram, struct zram_pp_ctl *ctl)
 {
-	for (; nr_pages != 0; index++, nr_pages--) {
-		bool ok = true;
-
-		zram_slot_lock(zram, index);
-		if (!zram_allocated(zram, index))
-			goto next;
-
-		if (zram_test_flag(zram, index, ZRAM_WB) ||
-		    zram_test_flag(zram, index, ZRAM_SAME))
-			goto next;
-
-		if (mode & IDLE_WRITEBACK &&
-		    !zram_test_flag(zram, index, ZRAM_IDLE))
-			goto next;
-		if (mode & HUGE_WRITEBACK &&
-		    !zram_test_flag(zram, index, ZRAM_HUGE))
-			goto next;
-		if (mode & INCOMPRESSIBLE_WRITEBACK &&
-		    !zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
-			goto next;
-
-		ok = place_pp_slot(zram, ctl, index);
-next:
-		zram_slot_unlock(zram, index);
-		if (!ok)
-			break;
-	}
-
-	return 0;
-}
-
-static ssize_t writeback_store(struct device *dev,
-		struct device_attribute *attr, const char *buf, size_t len)
-{
-	struct zram *zram = dev_to_zram(dev);
-	unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
-	struct zram_pp_ctl *ctl = NULL;
+	unsigned long blk_idx = 0;
+	struct page *page = NULL;
 	struct zram_pp_slot *pps;
-	unsigned long index = 0;
-	struct bio bio;
 	struct bio_vec bio_vec;
-	struct page *page = NULL;
-	ssize_t ret = len;
-	int mode, err;
-	unsigned long blk_idx = 0;
-
-	if (sysfs_streq(buf, "idle"))
-		mode = IDLE_WRITEBACK;
-	else if (sysfs_streq(buf, "huge"))
-		mode = HUGE_WRITEBACK;
-	else if (sysfs_streq(buf, "huge_idle"))
-		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
-	else if (sysfs_streq(buf, "incompressible"))
-		mode = INCOMPRESSIBLE_WRITEBACK;
-	else {
-		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
-			return -EINVAL;
-
-		if (kstrtol(buf + sizeof(PAGE_WB_SIG) - 1, 10, &index) ||
-				index >= nr_pages)
-			return -EINVAL;
-
-		nr_pages = 1;
-		mode = PAGE_WRITEBACK;
-	}
-
-	down_read(&zram->init_lock);
-	if (!init_done(zram)) {
-		ret = -EINVAL;
-		goto release_init_lock;
-	}
-
-	/* Do not permit concurrent post-processing actions. */
-	if (atomic_xchg(&zram->pp_in_progress, 1)) {
-		up_read(&zram->init_lock);
-		return -EAGAIN;
-	}
-
-	if (!zram->backing_dev) {
-		ret = -ENODEV;
-		goto release_init_lock;
-	}
+	struct bio bio;
+	int ret, err;
+	u32 index;
 
 	page = alloc_page(GFP_KERNEL);
-	if (!page) {
-		ret = -ENOMEM;
-		goto release_init_lock;
-	}
-
-	ctl = init_pp_ctl();
-	if (!ctl) {
-		ret = -ENOMEM;
-		goto release_init_lock;
-	}
-
-	scan_slots_for_writeback(zram, mode, nr_pages, index, ctl);
+	if (!page)
+		return -ENOMEM;
 
 	while ((pps = select_pp_slot(ctl))) {
 		spin_lock(&zram->wb_limit_lock);
@@ -929,10 +834,216 @@ static ssize_t writeback_store(struct device *dev,
 
 	if (blk_idx)
 		free_block_bdev(zram, blk_idx);
-
-release_init_lock:
 	if (page)
 		__free_page(page);
+
+	return ret;
+}
+
+#define PAGE_WRITEBACK			0
+#define HUGE_WRITEBACK			(1 << 0)
+#define IDLE_WRITEBACK			(1 << 1)
+#define INCOMPRESSIBLE_WRITEBACK	(1 << 2)
+
+static int parse_page_index(char *val, unsigned long nr_pages,
+			    unsigned long *lo, unsigned long *hi)
+{
+	int ret;
+
+	ret = kstrtoul(val, 10, lo);
+	if (ret)
+		return ret;
+	*hi = *lo + 1;
+	if (*lo >= nr_pages || *hi > nr_pages)
+		return -ERANGE;
+	return 0;
+}
+
+static int parse_page_index_range(char *val, unsigned long nr_pages,
+				  unsigned long *lo, unsigned long *hi)
+{
+	char *delim;
+	int ret;
+
+	delim = strchr(val, '-');
+	if (!delim)
+		return -EINVAL;
+
+	*delim = 0x00;
+	ret = kstrtoul(val, 10, lo);
+	if (ret)
+		return ret;
+	if (*lo >= nr_pages)
+		return -ERANGE;
+
+	ret = kstrtoul(delim + 1, 10, hi);
+	if (ret)
+		return ret;
+	if (*hi >= nr_pages || *lo > *hi)
+		return -ERANGE;
+	*hi += 1;
+	return 0;
+}
+
+static int parse_mode(char *val, u32 *mode)
+{
+	*mode = 0;
+
+	if (!strcmp(val, "idle"))
+		*mode = IDLE_WRITEBACK;
+	if (!strcmp(val, "huge"))
+		*mode = HUGE_WRITEBACK;
+	if (!strcmp(val, "huge_idle"))
+		*mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
+	if (!strcmp(val, "incompressible"))
+		*mode = INCOMPRESSIBLE_WRITEBACK;
+
+	if (*mode == 0)
+		return -EINVAL;
+	return 0;
+}
+
+static int scan_slots_for_writeback(struct zram *zram, u32 mode,
+				    unsigned long lo, unsigned long hi,
+				    struct zram_pp_ctl *ctl)
+{
+	u32 index = lo;
+
+	while (index < hi) {
+		bool ok = true;
+
+		zram_slot_lock(zram, index);
+		if (!zram_allocated(zram, index))
+			goto next;
+
+		if (zram_test_flag(zram, index, ZRAM_WB) ||
+		    zram_test_flag(zram, index, ZRAM_SAME))
+			goto next;
+
+		if (mode & IDLE_WRITEBACK &&
+		    !zram_test_flag(zram, index, ZRAM_IDLE))
+			goto next;
+		if (mode & HUGE_WRITEBACK &&
+		    !zram_test_flag(zram, index, ZRAM_HUGE))
+			goto next;
+		if (mode & INCOMPRESSIBLE_WRITEBACK &&
+		    !zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
+			goto next;
+
+		ok = place_pp_slot(zram, ctl, index);
+next:
+		zram_slot_unlock(zram, index);
+		if (!ok)
+			break;
+		index++;
+	}
+
+	return 0;
+}
+
+static ssize_t writeback_store(struct device *dev,
+			       struct device_attribute *attr,
+			       const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	u64 nr_pages = zram->disksize >> PAGE_SHIFT;
+	unsigned long lo = 0, hi = nr_pages;
+	struct zram_pp_ctl *ctl = NULL;
+	char *args, *param, *val;
+	ssize_t ret = len;
+	int err, mode = 0;
+
+	down_read(&zram->init_lock);
+	if (!init_done(zram)) {
+		up_read(&zram->init_lock);
+		return -EINVAL;
+	}
+
+	/* Do not permit concurrent post-processing actions. */
+	if (atomic_xchg(&zram->pp_in_progress, 1)) {
+		up_read(&zram->init_lock);
+		return -EAGAIN;
+	}
+
+	if (!zram->backing_dev) {
+		ret = -ENODEV;
+		goto release_init_lock;
+	}
+
+	ctl = init_pp_ctl();
+	if (!ctl) {
+		ret = -ENOMEM;
+		goto release_init_lock;
+	}
+
+	args = skip_spaces(buf);
+	while (*args) {
+		args = next_arg(args, &param, &val);
+
+		/*
+		 * Workaround to support the old writeback interface.
+		 *
+		 * The old writeback interface has a minor inconsistency and
+		 * requires key=value only for page_index parameter, while
+		 * writeback mode is a valueless parameter.
+		 *
+		 * This is not the case anymore and now all parameters are
+		 * required to have values, however, we need to support the
+		 * legacy writeback interface that's why we check if we can
+		 * recognize a valueless parameter as a (legacy) writeback
+		 * mode.
+		 */
+		if (!val || !*val) {
+			err = parse_mode(param, &mode);
+			if (err) {
+				ret = err;
+				goto release_init_lock;
+			}
+
+			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			break;
+		}
+
+		if (!strcmp(param, "type")) {
+			err = parse_mode(val, &mode);
+			if (err) {
+				ret = err;
+				goto release_init_lock;
+			}
+
+			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			break;
+		}
+
+		if (!strcmp(param, "page_index")) {
+			err = parse_page_index(val, nr_pages, &lo, &hi);
+			if (err) {
+				ret = err;
+				goto release_init_lock;
+			}
+
+			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			break;
+		}
+
+		/* There can be several page index ranges */
+		if (!strcmp(param, "page_index_range")) {
+			err = parse_page_index_range(val, nr_pages, &lo, &hi);
+			if (err) {
+				ret = err;
+				goto release_init_lock;
+			}
+
+			scan_slots_for_writeback(zram, mode, lo, hi, ctl);
+			continue;
+		}
+	}
+
+	err = zram_writeback_slots(zram, ctl);
+	if (err)
+		ret = err;
+
+release_init_lock:
 	release_pp_ctl(zram, ctl);
 	atomic_set(&zram->pp_in_progress, 0);
 	up_read(&zram->init_lock);
-- 
2.49.0.395.g12beb8f557-goog
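
For anyone who wants to poke at the LOW-HIGH semantics without a zram
device, the range parsing in parse_page_index_range() can be mirrored in
userspace.  This is only a sketch and not part of the patch: parse_ul()
and parse_range() are hypothetical names, and strtoul() stands in for
kstrtoul(), which is somewhat more lenient (it tolerates a leading '+'
and a single trailing newline, for example).  As in the patch, the
inclusive LOW-HIGH pair is turned into a half-open [lo, hi) interval.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for kstrtoul(): parse a base-10 unsigned long
 * and reject empty strings, signs and trailing garbage.
 */
static int parse_ul(const char *s, unsigned long *out)
{
	char *end;

	if (*s < '0' || *s > '9')
		return -EINVAL;

	errno = 0;
	*out = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end != '\0')
		return -EINVAL;
	return 0;
}

/*
 * Userspace mirror of the patch's parse_page_index_range().
 * Parses "LOW-HIGH" (inclusive) in place and returns [*lo, *hi).
 * Note that val is modified: the '-' is overwritten with NUL.
 */
static int parse_range(char *val, unsigned long nr_pages,
		       unsigned long *lo, unsigned long *hi)
{
	char *delim = strchr(val, '-');
	int ret;

	if (!delim)
		return -EINVAL;
	*delim = '\0';

	ret = parse_ul(val, lo);
	if (ret)
		return ret;
	if (*lo >= nr_pages)
		return -ERANGE;

	ret = parse_ul(delim + 1, hi);
	if (ret)
		return ret;
	if (*hi >= nr_pages || *lo > *hi)
		return -ERANGE;

	/* Convert the inclusive upper bound to an exclusive one. */
	*hi += 1;
	return 0;
}
```

With nr_pages = 1000, "100-200" yields lo = 100 and hi = 201, while
"500-400" and "0-1000" are rejected with -ERANGE and a value without
a '-' is rejected with -EINVAL.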