* [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs
@ 2022-11-19 0:51 Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 01/20] mm: add bdi_set_strict_limit() function Stefan Roesch
` (19 more replies)
0 siblings, 20 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:51 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
At Meta, network block devices (nbd) are used to implement remote block
storage. In testing and during production it has been observed that
these network block devices can consume a huge portion of the dirty
writeback cache and writeback can take a considerable time.
To be able to give stricter limits, I'm proposing the following changes:
1) Introduce strictlimit knob
Currently the max_ratio knob exists to limit the dirty_memory. However
this knob only applies once (dirty_ratio + dirty_background_ratio) / 2
has been reached.
With the BDI_CAP_STRICTLIMIT flag, the max_ratio can be applied without
reaching that limit. This change exposes that knob.
This knob can also be useful for NFS, fuse filesystems and USB devices.
2) Use parts per 1000000 for internal calculations
The max_ratio is based on percentages. With current machine sizes, a
single percentage step can be very large (1% of 256GB of main memory is
already 2.56GB). This change uses parts per 1000000 instead of
percentages for the internal calculations.
3) Introduce two new sysfs knobs: min_bytes and max_bytes.
Currently all calculations are based on ratios, but for a user it is often
more convenient to specify a limit in bytes. The new knobs will not
store byte values; instead they will translate the byte value to a
corresponding ratio. As the internal values are now parts per 1000000, the
ratio is closer to the specified value. However, the value should be
seen as an approximation, as it can fluctuate over time.
4) Introduce two new sysfs knobs: min_ratio_fine and max_ratio_fine.
The granularity of the existing sysfs bdi knobs min_ratio and max_ratio
is based on percentage values. The new sysfs bdi knobs min_ratio_fine
and max_ratio_fine allow specifying the ratio in parts per 1 million.
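To make the granularity argument concrete, here is a minimal user-space
sketch. It takes the 256GB example at face value and treats it as 256 GiB;
the helper name bytes_per_ratio_step() is hypothetical and not part of the
series.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of this series): number of bytes covered
 * by one step of a ratio that is expressed in parts per `scale` of
 * `total_bytes`. Integer division truncates the result.
 */
static uint64_t bytes_per_ratio_step(uint64_t total_bytes, uint64_t scale)
{
	return total_bytes / scale;
}
```

Called with total_bytes = 256ULL << 30, a scale of 100 yields steps of
roughly 2.56 GiB, while a scale of 1000000 yields steps of roughly 268 KiB.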
Changes:
V4:
- Introduced two new sysfs knobs min_ratio_fine and max_ratio_fine to allow
setting ratios with smaller granularity
- Refreshed to 6.1-rc5
- removed bdi_set_strict_limit export
- removed bdi_get_max_bytes export
- removed bdi_set_max_bytes export
- change granularity to part of 1000000
- changed function signature of bdi_get_max_bytes() to return u64
- Fixed commit message of
"mm: split off __bdi_set_max_ratio() function"
- changed check in bdi_check_pages_limit()
V3:
- change signature of function bdi_ratio_from_pages to take an unsigned long
parameter
- use div64_u64 function for division to support 32 bit platforms
- Refreshed to 6.1-rc2
V2:
- Refreshed to 6.1-rc1
- Use part of 1000, instead of part of 10000
- Reformat cover letter
Stefan Roesch (20):
mm: add bdi_set_strict_limit() function
mm: add knob /sys/class/bdi/<bdi>/strict_limit
mm: document /sys/class/bdi/<bdi>/strict_limit knob
mm: use part per 1000000 for bdi ratios.
mm: add bdi_get_max_bytes() function
mm: split off __bdi_set_max_ratio() function
mm: add bdi_set_max_bytes() function.
mm: add knob /sys/class/bdi/<bdi>/max_bytes
mm: document /sys/class/bdi/<bdi>/max_bytes knob
mm: add bdi_get_min_bytes() function.
mm: split off __bdi_set_min_ratio() function
mm: add bdi_set_min_bytes() function
mm: add /sys/class/bdi/<bdi>/min_bytes knob
mm: document /sys/class/bdi/<bdi>/min_bytes knob
mm: add bdi_set_max_ratio_no_scale() function
mm: add /sys/class/bdi/<bdi>/max_ratio_fine knob
mm: document /sys/class/bdi/<bdi>/max_ratio_fine knob
mm: add bdi_set_min_ratio_no_scale() function
mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob
mm: document /sys/class/bdi/<bdi>/min_ratio_fine knob
Documentation/ABI/testing/sysfs-class-bdi | 68 +++++++++++
include/linux/backing-dev.h | 10 ++
mm/backing-dev.c | 133 +++++++++++++++++++++-
mm/page-writeback.c | 130 +++++++++++++++++++--
4 files changed, 329 insertions(+), 12 deletions(-)
base-commit: ab290eaddc4c41b237b9a366fa6a5527be890b84
--
2.30.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC PATCH v4 01/20] mm: add bdi_set_strict_limit() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
@ 2022-11-19 0:51 ` Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 02/20] mm: add knob /sys/class/bdi/<bdi>/strict_limit Stefan Roesch
` (18 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:51 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds the bdi_set_strict_limit function to be able to set/unset the
BDI_CAP_STRICTLIMIT flag.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 15 +++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 439815cc1ab9..9c984ffc8a0a 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -104,6 +104,7 @@ static inline unsigned long wb_stat_error(void)
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
/*
* Flags in backing_dev_info::capability
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 7e9d8d857ecc..3745b886722f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -698,6 +698,21 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio)
}
EXPORT_SYMBOL(bdi_set_max_ratio);
+int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit)
+{
+ if (strict_limit > 1)
+ return -EINVAL;
+
+ spin_lock_bh(&bdi_lock);
+ if (strict_limit)
+ bdi->capabilities |= BDI_CAP_STRICTLIMIT;
+ else
+ bdi->capabilities &= ~BDI_CAP_STRICTLIMIT;
+ spin_unlock_bh(&bdi_lock);
+
+ return 0;
+}
+
static unsigned long dirty_freerun_ceiling(unsigned long thresh,
unsigned long bg_thresh)
{
--
2.30.2
* [RFC PATCH v4 02/20] mm: add knob /sys/class/bdi/<bdi>/strict_limit
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 01/20] mm: add bdi_set_strict_limit() function Stefan Roesch
@ 2022-11-19 0:51 ` Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 03/20] mm: document /sys/class/bdi/<bdi>/strict_limit knob Stefan Roesch
` (17 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:51 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
Add a new knob /sys/class/bdi/<bdi>/strict_limit. This new knob
allows setting or clearing the BDI_CAP_STRICTLIMIT flag in the bdi
capabilities.
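As an illustration of how user space might drive this knob, here is a
minimal sketch. The helper name write_strict_limit() is hypothetical, and
the caller is assumed to supply the full path to the strict_limit file of
the bdi it wants to change.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of this series): write 0 or 1 to a
 * strict_limit-style sysfs file at `path`.
 * Returns 0 on success and -1 on any failure.
 */
static int write_strict_limit(const char *path, int enable)
{
	FILE *f;
	int ok;

	/* The kernel side rejects values above 1 with -EINVAL; check early. */
	if (enable != 0 && enable != 1)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", enable) > 0;
	/* A sysfs store error typically surfaces when the buffer is flushed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Reading the file back afterwards should show 0 or 1, matching what
strict_limit_show() emits.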
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/backing-dev.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c30419a5e119..a0899cce72ef 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -209,11 +209,40 @@ static ssize_t stable_pages_required_show(struct device *dev,
}
static DEVICE_ATTR_RO(stable_pages_required);
+static ssize_t strict_limit_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ unsigned int strict_limit;
+ ssize_t ret;
+
+ ret = kstrtouint(buf, 10, &strict_limit);
+ if (ret < 0)
+ return ret;
+
+ ret = bdi_set_strict_limit(bdi, strict_limit);
+ if (!ret)
+ ret = count;
+
+ return ret;
+}
+
+static ssize_t strict_limit_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+
+ return sysfs_emit(buf, "%d\n",
+ !!(bdi->capabilities & BDI_CAP_STRICTLIMIT));
+}
+static DEVICE_ATTR_RW(strict_limit);
+
static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
&dev_attr_stable_pages_required.attr,
+ &dev_attr_strict_limit.attr,
NULL,
};
ATTRIBUTE_GROUPS(bdi_dev);
--
2.30.2
* [RFC PATCH v4 03/20] mm: document /sys/class/bdi/<bdi>/strict_limit knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 01/20] mm: add bdi_set_strict_limit() function Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 02/20] mm: add knob /sys/class/bdi/<bdi>/strict_limit Stefan Roesch
@ 2022-11-19 0:51 ` Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 04/20] mm: use part per 1000000 for bdi ratios Stefan Roesch
` (16 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:51 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This documents the new /sys/class/bdi/<bdi>/strict_limit knob.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
Documentation/ABI/testing/sysfs-class-bdi | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-class-bdi b/Documentation/ABI/testing/sysfs-class-bdi
index 6d2a2fc189dd..68b5d4018c2f 100644
--- a/Documentation/ABI/testing/sysfs-class-bdi
+++ b/Documentation/ABI/testing/sysfs-class-bdi
@@ -55,6 +55,17 @@ Description:
mount that is prone to get stuck, or a FUSE mount which cannot
be trusted to play fair.
+ (read-write)
+What: /sys/class/bdi/<bdi>/strict_limit
+Date: October 2022
+Contact: Stefan Roesch <shr@devkernel.io>
+Description:
+ Forces per-BDI checks for the share of given device in the write-back
+ cache even before the global background dirty limit is reached. This
+ is useful in situations where the global limit is much higher than
+ affordable for given relatively slow (or untrusted) device. Turning
+ strictlimit on has no visible effect if max_ratio is equal to 100%.
+
(read-write)
What: /sys/class/bdi/<bdi>/stable_pages_required
Date: January 2008
--
2.30.2
* [RFC PATCH v4 04/20] mm: use part per 1000000 for bdi ratios.
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (2 preceding siblings ...)
2022-11-19 0:51 ` [RFC PATCH v4 03/20] mm: document /sys/class/bdi/<bdi>/strict_limit knob Stefan Roesch
@ 2022-11-19 0:51 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 05/20] mm: add bdi_get_max_bytes() function Stefan Roesch
` (15 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:51 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
To get finer granularity for ratio calculations, use parts per million
instead of percentages. This is especially important if we want to
automatically convert byte values to ratios. Otherwise the values that
are actually used can be quite different. This is also important for
machines with more main memory (1% of 256GB is already 2.56GB).
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 3 +++
mm/backing-dev.c | 6 +++---
mm/page-writeback.c | 15 +++++++++------
3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 9c984ffc8a0a..1b50c028e5ad 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -102,6 +102,9 @@ static inline unsigned long wb_stat_error(void)
#endif
}
+/* BDI ratio is expressed as part per 1000000 for finer granularity. */
+#define BDI_RATIO_SCALE 10000
+
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index a0899cce72ef..90fa517123dc 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -178,7 +178,7 @@ static ssize_t min_ratio_store(struct device *dev,
return ret;
}
-BDI_SHOW(min_ratio, bdi->min_ratio)
+BDI_SHOW(min_ratio, bdi->min_ratio / BDI_RATIO_SCALE)
static ssize_t max_ratio_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
@@ -197,7 +197,7 @@ static ssize_t max_ratio_store(struct device *dev,
return ret;
}
-BDI_SHOW(max_ratio, bdi->max_ratio)
+BDI_SHOW(max_ratio, bdi->max_ratio / BDI_RATIO_SCALE)
static ssize_t stable_pages_required_show(struct device *dev,
struct device_attribute *attr,
@@ -809,7 +809,7 @@ int bdi_init(struct backing_dev_info *bdi)
kref_init(&bdi->refcnt);
bdi->min_ratio = 0;
- bdi->max_ratio = 100;
+ bdi->max_ratio = 100 * BDI_RATIO_SCALE;
bdi->max_prop_frac = FPROP_FRAC_BASE;
INIT_LIST_HEAD(&bdi->bdi_list);
INIT_LIST_HEAD(&bdi->wb_list);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3745b886722f..dd98b2654302 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -197,7 +197,7 @@ static void wb_min_max_ratio(struct bdi_writeback *wb,
min *= this_bw;
min = div64_ul(min, tot_bw);
}
- if (max < 100) {
+ if (max < 100 * BDI_RATIO_SCALE) {
max *= this_bw;
max = div64_ul(max, tot_bw);
}
@@ -655,6 +655,8 @@ int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
unsigned int delta;
int ret = 0;
+ min_ratio *= BDI_RATIO_SCALE;
+
spin_lock_bh(&bdi_lock);
if (min_ratio > bdi->max_ratio) {
ret = -EINVAL;
@@ -665,7 +667,7 @@ int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
bdi->min_ratio = min_ratio;
} else {
delta = min_ratio - bdi->min_ratio;
- if (bdi_min_ratio + delta < 100) {
+ if (bdi_min_ratio + delta < 100 * BDI_RATIO_SCALE) {
bdi_min_ratio += delta;
bdi->min_ratio = min_ratio;
} else {
@@ -684,6 +686,7 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio)
if (max_ratio > 100)
return -EINVAL;
+ max_ratio *= BDI_RATIO_SCALE;
spin_lock_bh(&bdi_lock);
if (bdi->min_ratio > max_ratio) {
@@ -775,15 +778,15 @@ static unsigned long __wb_calc_thresh(struct dirty_throttle_control *dtc)
fprop_fraction_percpu(&dom->completions, dtc->wb_completions,
&numerator, &denominator);
- wb_thresh = (thresh * (100 - bdi_min_ratio)) / 100;
+ wb_thresh = (thresh * (100 * BDI_RATIO_SCALE - bdi_min_ratio)) / (100 * BDI_RATIO_SCALE);
wb_thresh *= numerator;
wb_thresh = div64_ul(wb_thresh, denominator);
wb_min_max_ratio(dtc->wb, &wb_min_ratio, &wb_max_ratio);
- wb_thresh += (thresh * wb_min_ratio) / 100;
- if (wb_thresh > (thresh * wb_max_ratio) / 100)
- wb_thresh = thresh * wb_max_ratio / 100;
+ wb_thresh += (thresh * wb_min_ratio) / (100 * BDI_RATIO_SCALE);
+ if (wb_thresh > (thresh * wb_max_ratio) / (100 * BDI_RATIO_SCALE))
+ wb_thresh = thresh * wb_max_ratio / (100 * BDI_RATIO_SCALE);
return wb_thresh;
}
--
2.30.2
* [RFC PATCH v4 05/20] mm: add bdi_get_max_bytes() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (3 preceding siblings ...)
2022-11-19 0:51 ` [RFC PATCH v4 04/20] mm: use part per 1000000 for bdi ratios Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 06/20] mm: split off __bdi_set_max_ratio() function Stefan Roesch
` (14 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds a function to return the specified value for max_bytes. It
converts the stored max_ratio of the bdi to the corresponding bytes
value. It introduces the bdi_get_bytes helper function to do the
conversion. This is an approximation as it is based on the value that is
returned by global_dirty_limits(), which can change. The helper function
will also be used by the min_bytes bdi knob.
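The conversion described above can be modelled in user space roughly as
follows. This is a sketch only: model_bdi_get_bytes() is a hypothetical
name, the dirty threshold and page size are passed in as parameters because
global_dirty_limits() is kernel-internal, and 64-bit intermediates are a
choice of this sketch rather than something taken from the patch.

```c
#include <stdint.h>

#define BDI_RATIO_SCALE 10000	/* same value as defined in this series */

/*
 * Hypothetical user-space model of the ratio-to-bytes conversion.
 * `ratio` is in parts per 1000000 (that is, 100 * BDI_RATIO_SCALE means 100%).
 * `dirty_thresh_pages` stands in for the value global_dirty_limits() returns.
 */
static uint64_t model_bdi_get_bytes(uint64_t dirty_thresh_pages,
				    uint64_t page_size, unsigned int ratio)
{
	return dirty_thresh_pages * page_size * ratio / BDI_RATIO_SCALE / 100;
}
```

With a threshold of 1000000 pages of 4096 bytes, a ratio of 10000 (1%)
maps to 40960000 bytes. Because the threshold changes over time, the same
stored ratio can map to a different byte value on a later read.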
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 17 +++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 1b50c028e5ad..473686c32775 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -105,6 +105,7 @@ static inline unsigned long wb_stat_error(void)
/* BDI ratio is expressed as part per 1000000 for finer granularity. */
#define BDI_RATIO_SCALE 10000
+u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index dd98b2654302..719404e0d03d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -650,6 +650,18 @@ void wb_domain_exit(struct wb_domain *dom)
*/
static unsigned int bdi_min_ratio;
+static u64 bdi_get_bytes(unsigned int ratio)
+{
+ unsigned long background_thresh;
+ unsigned long dirty_thresh;
+ u64 bytes;
+
+ global_dirty_limits(&background_thresh, &dirty_thresh);
+ bytes = (dirty_thresh * PAGE_SIZE * ratio) / BDI_RATIO_SCALE / 100;
+
+ return bytes;
+}
+
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
{
unsigned int delta;
@@ -701,6 +713,11 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio)
}
EXPORT_SYMBOL(bdi_set_max_ratio);
+u64 bdi_get_max_bytes(struct backing_dev_info *bdi)
+{
+ return bdi_get_bytes(bdi->max_ratio);
+}
+
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit)
{
if (strict_limit > 1)
--
2.30.2
* [RFC PATCH v4 06/20] mm: split off __bdi_set_max_ratio() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (4 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 05/20] mm: add bdi_get_max_bytes() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 07/20] mm: add bdi_set_max_bytes() function Stefan Roesch
` (13 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This splits off __bdi_set_max_ratio() from bdi_set_max_ratio().
__bdi_set_max_ratio() will also be called from bdi_set_max_bytes(),
which will be introduced in the next patch.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/page-writeback.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 719404e0d03d..e74ef596dc27 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -692,14 +692,10 @@ int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
return ret;
}
-int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio)
+static int __bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio)
{
int ret = 0;
- if (max_ratio > 100)
- return -EINVAL;
- max_ratio *= BDI_RATIO_SCALE;
-
spin_lock_bh(&bdi_lock);
if (bdi->min_ratio > max_ratio) {
ret = -EINVAL;
@@ -711,6 +707,14 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned max_ratio)
return ret;
}
+
+int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio)
+{
+ if (max_ratio > 100)
+ return -EINVAL;
+
+ return __bdi_set_max_ratio(bdi, max_ratio * BDI_RATIO_SCALE);
+}
EXPORT_SYMBOL(bdi_set_max_ratio);
u64 bdi_get_max_bytes(struct backing_dev_info *bdi)
--
2.30.2
* [RFC PATCH v4 07/20] mm: add bdi_set_max_bytes() function.
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (5 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 06/20] mm: split off __bdi_set_max_ratio() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 08/20] mm: add knob /sys/class/bdi/<bdi>/max_bytes Stefan Roesch
` (12 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This introduces the bdi_set_max_bytes() function. The max_bytes function
does not store the max_bytes value. Instead it converts the max_bytes
value into the corresponding ratio value.
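The bytes-to-ratio direction can be sketched in user space as well. The
names model_ratio_from_bytes() and MODEL_PAGE_SHIFT are hypothetical, 4 KiB
pages are assumed, and the dirty threshold is again passed in as a
parameter instead of being read from global_dirty_limits().

```c
#include <stdint.h>

#define BDI_RATIO_SCALE 10000	/* same value as defined in this series */
#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Hypothetical user-space model of the bytes-to-ratio conversion.
 * Returns the ratio in parts per 1000000 of `dirty_thresh_pages`.
 * The caller must pass a non-zero threshold.
 */
static uint64_t model_ratio_from_bytes(uint64_t bytes,
				       uint64_t dirty_thresh_pages)
{
	/* Byte values are rounded down to whole pages first. */
	uint64_t pages = bytes >> MODEL_PAGE_SHIFT;

	return pages * 100ULL * BDI_RATIO_SCALE / dirty_thresh_pages;
}
```

With a threshold of 1000000 pages, 1 GiB maps to a ratio of 262144, or
about 26.2%. Values smaller than one page round down to a ratio of 0,
which is one reason the byte knobs are best read as approximations.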
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 37 +++++++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 473686c32775..ea6c993433d5 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -108,6 +108,7 @@ static inline unsigned long wb_stat_error(void)
u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+int bdi_set_max_bytes(struct backing_dev_info *bdi, u64 max_bytes);
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
/*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index e74ef596dc27..20ae9adeb22f 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -13,6 +13,7 @@
*/
#include <linux/kernel.h>
+#include <linux/math64.h>
#include <linux/export.h>
#include <linux/spinlock.h>
#include <linux/fs.h>
@@ -650,6 +651,28 @@ void wb_domain_exit(struct wb_domain *dom)
*/
static unsigned int bdi_min_ratio;
+static int bdi_check_pages_limit(unsigned long pages)
+{
+ unsigned long max_dirty_pages = global_dirtyable_memory();
+
+ if (pages > max_dirty_pages)
+ return -EINVAL;
+
+ return 0;
+}
+
+static unsigned long bdi_ratio_from_pages(unsigned long pages)
+{
+ unsigned long background_thresh;
+ unsigned long dirty_thresh;
+ unsigned long ratio;
+
+ global_dirty_limits(&background_thresh, &dirty_thresh);
+ ratio = div64_u64(pages * 100ULL * BDI_RATIO_SCALE, dirty_thresh);
+
+ return ratio;
+}
+
static u64 bdi_get_bytes(unsigned int ratio)
{
unsigned long background_thresh;
@@ -722,6 +745,20 @@ u64 bdi_get_max_bytes(struct backing_dev_info *bdi)
return bdi_get_bytes(bdi->max_ratio);
}
+int bdi_set_max_bytes(struct backing_dev_info *bdi, u64 max_bytes)
+{
+ int ret;
+ unsigned long pages = max_bytes >> PAGE_SHIFT;
+ unsigned long max_ratio;
+
+ ret = bdi_check_pages_limit(pages);
+ if (ret)
+ return ret;
+
+ max_ratio = bdi_ratio_from_pages(pages);
+ return __bdi_set_max_ratio(bdi, max_ratio);
+}
+
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit)
{
if (strict_limit > 1)
--
2.30.2
* [RFC PATCH v4 08/20] mm: add knob /sys/class/bdi/<bdi>/max_bytes
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (6 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 07/20] mm: add bdi_set_max_bytes() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 09/20] mm: document /sys/class/bdi/<bdi>/max_bytes knob Stefan Roesch
` (11 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds the new knob max_bytes to specify a dirty memory limit for the
corresponding bdi. The specified bytes value is converted to a ratio.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/backing-dev.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 90fa517123dc..95d3229fc81f 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -199,6 +199,34 @@ static ssize_t max_ratio_store(struct device *dev,
}
BDI_SHOW(max_ratio, bdi->max_ratio / BDI_RATIO_SCALE)
+static ssize_t max_bytes_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+
+ return sysfs_emit(buf, "%llu\n", bdi_get_max_bytes(bdi));
+}
+
+static ssize_t max_bytes_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ u64 bytes;
+ ssize_t ret;
+
+ ret = kstrtoull(buf, 10, &bytes);
+ if (ret < 0)
+ return ret;
+
+ ret = bdi_set_max_bytes(bdi, bytes);
+ if (!ret)
+ ret = count;
+
+ return ret;
+}
+static DEVICE_ATTR_RW(max_bytes);
+
static ssize_t stable_pages_required_show(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -241,6 +269,7 @@ static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
+ &dev_attr_max_bytes.attr,
&dev_attr_stable_pages_required.attr,
&dev_attr_strict_limit.attr,
NULL,
--
2.30.2
* [RFC PATCH v4 09/20] mm: document /sys/class/bdi/<bdi>/max_bytes knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (7 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 08/20] mm: add knob /sys/class/bdi/<bdi>/max_bytes Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 10/20] mm: add bdi_get_min_bytes() function Stefan Roesch
` (10 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This documents the new /sys/class/bdi/<bdi>/max_bytes knob.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
Documentation/ABI/testing/sysfs-class-bdi | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-class-bdi b/Documentation/ABI/testing/sysfs-class-bdi
index 68b5d4018c2f..580f723de049 100644
--- a/Documentation/ABI/testing/sysfs-class-bdi
+++ b/Documentation/ABI/testing/sysfs-class-bdi
@@ -56,6 +56,20 @@ Description:
be trusted to play fair.
(read-write)
+
+What: /sys/class/bdi/<bdi>/max_bytes
+Date: October 2022
+Contact: Stefan Roesch <shr@devkernel.io>
+Description:
+ Allows limiting a particular device to use not more than the
+ given 'max_bytes' of the write-back cache. This is useful in
+ situations where we want to avoid one device taking all or
+ most of the write-back cache. For example in case of an NFS
+ mount that is prone to get stuck, a FUSE mount which cannot be
+ trusted to play fair, or a nbd device.
+
+ (read-write)
+
What: /sys/class/bdi/<bdi>/strict_limit
Date: October 2022
Contact: Stefan Roesch <shr@devkernel.io>
--
2.30.2
* [RFC PATCH v4 10/20] mm: add bdi_get_min_bytes() function.
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (8 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 09/20] mm: document /sys/class/bdi/<bdi>/max_bytes knob Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 11/20] mm: split off __bdi_set_min_ratio() function Stefan Roesch
` (9 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds a function to return the specified value for min_bytes. It
converts the stored min_ratio of the bdi to the corresponding bytes
value. This is an approximation as it is based on the value that is
returned by global_dirty_limits(), which can change. The returned
value can therefore differ from the value that was used when min_bytes was set.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 5 +++++
2 files changed, 6 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index ea6c993433d5..8e04567727e6 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -105,6 +105,7 @@ static inline unsigned long wb_stat_error(void)
/* BDI ratio is expressed as part per 1000000 for finer granularity. */
#define BDI_RATIO_SCALE 10000
+u64 bdi_get_min_bytes(struct backing_dev_info *bdi);
u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 20ae9adeb22f..c47824464f4c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -740,6 +740,11 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio)
}
EXPORT_SYMBOL(bdi_set_max_ratio);
+u64 bdi_get_min_bytes(struct backing_dev_info *bdi)
+{
+ return bdi_get_bytes(bdi->min_ratio);
+}
+
u64 bdi_get_max_bytes(struct backing_dev_info *bdi)
{
return bdi_get_bytes(bdi->max_ratio);
--
2.30.2
* [RFC PATCH v4 11/20] mm: split off __bdi_set_min_ratio() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (9 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 10/20] mm: add bdi_get_min_bytes() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 12/20] mm: add bdi_set_min_bytes() function Stefan Roesch
` (8 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This splits off the __bdi_set_min_ratio() function from the
bdi_set_min_ratio() function. The __bdi_set_min_ratio() function will
also be called from the bdi_set_min_bytes() function, which will be
introduced in the next patch.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/page-writeback.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c47824464f4c..cefee7210d83 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -685,7 +685,7 @@ static u64 bdi_get_bytes(unsigned int ratio)
return bytes;
}
-int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
+static int __bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
{
unsigned int delta;
int ret = 0;
@@ -731,6 +731,11 @@ static int __bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ra
return ret;
}
+int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
+{
+ return __bdi_set_min_ratio(bdi, min_ratio * BDI_RATIO_SCALE);
+}
+
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio)
{
if (max_ratio > 100)
--
2.30.2
* [RFC PATCH v4 12/20] mm: add bdi_set_min_bytes() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (10 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 11/20] mm: split off __bdi_set_min_ratio() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 13/20] mm: add /sys/class/bdi/<bdi>/min_bytes knob Stefan Roesch
` (7 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This introduces the bdi_set_min_bytes() function. The min_bytes function
does not store the min_bytes value. Instead it converts the min_bytes
value into the corresponding ratio value.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 14 ++++++++++++++
2 files changed, 15 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 8e04567727e6..572669758c7f 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -109,6 +109,7 @@ u64 bdi_get_min_bytes(struct backing_dev_info *bdi);
u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+int bdi_set_min_bytes(struct backing_dev_info *bdi, u64 min_bytes);
int bdi_set_max_bytes(struct backing_dev_info *bdi, u64 max_bytes);
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index cefee7210d83..3d151e7a9b6c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -750,6 +750,20 @@ u64 bdi_get_min_bytes(struct backing_dev_info *bdi)
return bdi_get_bytes(bdi->min_ratio);
}
+int bdi_set_min_bytes(struct backing_dev_info *bdi, u64 min_bytes)
+{
+ int ret;
+ unsigned long pages = min_bytes >> PAGE_SHIFT;
+ unsigned long min_ratio;
+
+ ret = bdi_check_pages_limit(pages);
+ if (ret)
+ return ret;
+
+ min_ratio = bdi_ratio_from_pages(pages);
+ return __bdi_set_min_ratio(bdi, min_ratio);
+}
+
u64 bdi_get_max_bytes(struct backing_dev_info *bdi)
{
return bdi_get_bytes(bdi->max_ratio);
--
2.30.2
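The conversion described in this patch (bytes to pages, a limit check, then pages to a ratio) can be sketched in userspace as below. This is only an illustration under stated assumptions: bdi_check_pages_limit() and bdi_ratio_from_pages() are not part of this diff, so the sketch_* helpers are hypothetical stand-ins. They assume 4 KiB pages, a ratio expressed in parts per million, and a dirty threshold that is passed in explicitly rather than read from the kernel.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define SKETCH_RATIO_MAX  1000000ull	/* assumption: parts per million */

/* Hypothetical check: reject a request larger than the dirty threshold. */
static int sketch_check_pages_limit(uint64_t pages, uint64_t thresh_pages)
{
	if (thresh_pages == 0 || pages > thresh_pages)
		return -EINVAL;
	return 0;
}

/* Hypothetical conversion: pages as parts per million of the threshold. */
static uint64_t sketch_ratio_from_pages(uint64_t pages, uint64_t thresh_pages)
{
	return pages * SKETCH_RATIO_MAX / thresh_pages;
}

/* Same shape as bdi_set_min_bytes(): only the derived ratio is kept. */
static int sketch_min_bytes_to_ratio(uint64_t min_bytes, uint64_t thresh_pages,
				     uint64_t *ratio)
{
	uint64_t pages = min_bytes >> SKETCH_PAGE_SHIFT;	/* rounds down */
	int ret = sketch_check_pages_limit(pages, thresh_pages);

	if (ret)
		return ret;

	*ratio = sketch_ratio_from_pages(pages, thresh_pages);
	return 0;
}
```

With these assumptions, 64 MiB against a threshold of 1048576 pages (4 GiB) yields a ratio of 15625. Because only the ratio is kept, a value read back later is recomputed from it and need not match the bytes that were written.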
* [RFC PATCH v4 13/20] mm: add /sys/class/bdi/<bdi>/min_bytes knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (11 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 12/20] mm: add bdi_set_min_bytes() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 14/20] mm: document " Stefan Roesch
` (6 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
bdi has two existing knobs to limit the amount of dirty memory:
min_ratio and max_ratio. However, the granularity of these knobs is
limited, and it is often more convenient to specify limits in bytes.
This change adds the min_bytes knob.
It does not store the min_bytes value; instead it converts the min_bytes
value to a ratio. The value is therefore an approximation rather than an
absolute value.
It also maintains the sum over all the bdi min_ratio values stored in
the variable bdi_min_ratio.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/backing-dev.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 95d3229fc81f..3fab79061ade 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -199,6 +199,34 @@ static ssize_t max_ratio_store(struct device *dev,
}
BDI_SHOW(max_ratio, bdi->max_ratio / BDI_RATIO_SCALE)
+static ssize_t min_bytes_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+
+ return sysfs_emit(buf, "%llu\n", bdi_get_min_bytes(bdi));
+}
+
+static ssize_t min_bytes_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ u64 bytes;
+ ssize_t ret;
+
+ ret = kstrtoull(buf, 10, &bytes);
+ if (ret < 0)
+ return ret;
+
+ ret = bdi_set_min_bytes(bdi, bytes);
+ if (!ret)
+ ret = count;
+
+ return ret;
+}
+DEVICE_ATTR_RW(min_bytes);
+
static ssize_t max_bytes_show(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -269,6 +297,7 @@ static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
+ &dev_attr_min_bytes.attr,
&dev_attr_max_bytes.attr,
&dev_attr_stable_pages_required.attr,
&dev_attr_strict_limit.attr,
--
2.30.2
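The commit message above mentions that the sum of all bdi min_ratio values is kept in bdi_min_ratio. A minimal userspace sketch of that kind of bookkeeping follows. It is not code from this series: the struct, the function name, and the running-total variable are hypothetical. Locking is omitted, and the cap of 1000000 on the total is an assumption based on the parts-per-million scale described in the cover letter.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_RATIO_MAX 1000000u	/* assumption: parts per million */

struct sketch_bdi {
	uint32_t min_ratio;
	uint32_t max_ratio;	/* assumed to be <= SKETCH_RATIO_MAX */
};

/* Running total of min_ratio over all devices (no locking in this sketch). */
static uint32_t sketch_min_ratio_sum;

static int sketch_set_min_ratio(struct sketch_bdi *bdi, uint32_t min_ratio)
{
	uint32_t delta;

	/* A minimum above the device's own maximum makes no sense. */
	if (min_ratio > bdi->max_ratio)
		return -EINVAL;

	if (min_ratio < bdi->min_ratio) {
		/* Lowering a reservation always succeeds. */
		sketch_min_ratio_sum -= bdi->min_ratio - min_ratio;
	} else {
		/* Raising one must keep the total below 100%. */
		delta = min_ratio - bdi->min_ratio;
		if (sketch_min_ratio_sum + delta >= SKETCH_RATIO_MAX)
			return -EINVAL;
		sketch_min_ratio_sum += delta;
	}

	bdi->min_ratio = min_ratio;
	return 0;
}
```

Tracking only the delta means the total stays consistent whether a reservation grows or shrinks, and a rejected request leaves both the device and the total untouched.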
* [RFC PATCH v4 14/20] mm: document /sys/class/bdi/<bdi>/min_bytes knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (12 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 13/20] mm: add /sys/class/bdi/<bdi>/min_bytes knob Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 15/20] mm: add bdi_set_max_ratio_no_scale() function Stefan Roesch
` (5 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This documents the new /sys/class/bdi/<bdi>/min_bytes knob.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
Documentation/ABI/testing/sysfs-class-bdi | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-class-bdi b/Documentation/ABI/testing/sysfs-class-bdi
index 580f723de049..bec996e29565 100644
--- a/Documentation/ABI/testing/sysfs-class-bdi
+++ b/Documentation/ABI/testing/sysfs-class-bdi
@@ -57,6 +57,21 @@ Description:
(read-write)
+What: /sys/class/bdi/<bdi>/min_bytes
+Date: October 2022
+Contact: Stefan Roesch <shr@devkernel.io>
+Description:
+ Under normal circumstances each device is given a part of the
+ total write-back cache that relates to its current average
+ writeout speed in relation to the other devices.
+
+ The 'min_bytes' parameter allows assigning a minimum
+ share of the write-back cache to a particular device,
+ expressed in bytes.
+ For example, this is useful for providing a minimum QoS.
+
+ (read-write)
+
What: /sys/class/bdi/<bdi>/max_bytes
Date: October 2022
Contact: Stefan Roesch <shr@devkernel.io>
--
2.30.2
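Since the knob is documented as read-write and takes a decimal byte count, a userspace tool could drive it roughly as sketched below. The helper names are hypothetical. The only assumption about the kernel side is the one visible in this series: the store handler parses a base-10 unsigned 64-bit value and the show handler prints one.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write a decimal value to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure. */
static int write_u64_attr(const char *path, uint64_t value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%" PRIu64 "\n", value) > 0;
	/* Buffered data is flushed on close, so a rejected value can
	 * surface here rather than in fprintf(). */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/* Read the attribute back. Returns 0 on success, -1 on failure. */
static int read_u64_attr(const char *path, uint64_t *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = fscanf(f, "%" SCNu64, value) == 1;
	fclose(f);

	return ok ? 0 : -1;
}
```

A caller would pass a path such as /sys/class/bdi/<bdi>/min_bytes with <bdi> replaced by a real device name, and would normally need sufficient privileges. Reading the value back after writing is a reasonable habit here, because the kernel stores a ratio and the reported byte count is derived from it.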
* [RFC PATCH v4 15/20] mm: add bdi_set_max_ratio_no_scale() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (13 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 14/20] mm: document " Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 16/20] mm: add /sys/class/bdi/<bdi>/max_ratio_fine knob Stefan Roesch
` (4 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This introduces bdi_set_max_ratio_no_scale(). It takes the ratio at the
finest granularity, without scaling. This function is used by the new
sysfs knob max_ratio_fine.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 11 ++++++++---
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 572669758c7f..d9acbb22ff25 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -109,6 +109,7 @@ u64 bdi_get_min_bytes(struct backing_dev_info *bdi);
u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+int bdi_set_max_ratio_no_scale(struct backing_dev_info *bdi, unsigned int max_ratio);
int bdi_set_min_bytes(struct backing_dev_info *bdi, u64 min_bytes);
int bdi_set_max_bytes(struct backing_dev_info *bdi, u64 max_bytes);
int bdi_set_strict_limit(struct backing_dev_info *bdi, unsigned int strict_limit);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3d151e7a9b6c..f44ade72966c 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -719,6 +719,9 @@ static int __bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ra
{
int ret = 0;
+ if (max_ratio > 100 * BDI_RATIO_SCALE)
+ return -EINVAL;
+
spin_lock_bh(&bdi_lock);
if (bdi->min_ratio > max_ratio) {
ret = -EINVAL;
@@ -731,6 +734,11 @@ static int __bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ra
return ret;
}
+int bdi_set_max_ratio_no_scale(struct backing_dev_info *bdi, unsigned int max_ratio)
+{
+ return __bdi_set_max_ratio(bdi, max_ratio);
+}
+
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
{
return __bdi_set_min_ratio(bdi, min_ratio * BDI_RATIO_SCALE);
@@ -738,9 +746,6 @@ int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio)
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio)
{
- if (max_ratio > 100)
- return -EINVAL;
-
return __bdi_set_max_ratio(bdi, max_ratio * BDI_RATIO_SCALE);
}
EXPORT_SYMBOL(bdi_set_max_ratio);
--
2.30.2
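The effect of moving the range check into the shared helper can be illustrated with the userspace sketch below. It is a simplified model, not the kernel code: the sketch_* names are hypothetical, the min_ratio comparison and locking are left out, and BDI_RATIO_SCALE is assumed to be 10000 so that 100% equals 1000000 internal units, as the cover letter's "part of 1000000" wording suggests.

```c
#include <errno.h>
#include <stdint.h>

#define BDI_RATIO_SCALE 10000u	/* assumption: 100% == 1000000 units */

/* Stand-in for the shared helper: validates internal units only. */
static int sketch_set_max_internal(uint32_t *stored, uint32_t max_ratio)
{
	if (max_ratio > 100 * BDI_RATIO_SCALE)
		return -EINVAL;

	*stored = max_ratio;
	return 0;
}

/* Percent entry point: scale up, then rely on the shared check. */
static int sketch_set_max_percent(uint32_t *stored, uint32_t percent)
{
	return sketch_set_max_internal(stored, percent * BDI_RATIO_SCALE);
}

/* Fine-grained entry point: the value is already in internal units. */
static int sketch_set_max_fine(uint32_t *stored, uint32_t ppm)
{
	return sketch_set_max_internal(stored, ppm);
}
```

In this model both entry points share one bound, so 101 percent and 1000001 parts per million are rejected alike. One property of the sketch worth keeping in mind: the multiplication happens in 32 bits before the check, so a very large percent input (429497, for example) wraps to 2704 and is accepted. Whether the real code path is exposed to this depends on details not shown in this diff.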
* [RFC PATCH v4 16/20] mm: add /sys/class/bdi/<bdi>/max_ratio_fine knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (14 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 15/20] mm: add bdi_set_max_ratio_no_scale() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 17/20] mm: document " Stefan Roesch
` (3 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds the max_ratio_fine knob. The knob takes its value in parts
per million rather than in percent.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/backing-dev.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 3fab79061ade..94c2382367cf 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -199,6 +199,25 @@ static ssize_t max_ratio_store(struct device *dev,
}
BDI_SHOW(max_ratio, bdi->max_ratio / BDI_RATIO_SCALE)
+static ssize_t max_ratio_fine_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ unsigned int ratio;
+ ssize_t ret;
+
+ ret = kstrtouint(buf, 10, &ratio);
+ if (ret < 0)
+ return ret;
+
+ ret = bdi_set_max_ratio_no_scale(bdi, ratio);
+ if (!ret)
+ ret = count;
+
+ return ret;
+}
+BDI_SHOW(max_ratio_fine, bdi->max_ratio)
+
static ssize_t min_bytes_show(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -297,6 +316,7 @@ static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
&dev_attr_max_ratio.attr,
+ &dev_attr_max_ratio_fine.attr,
&dev_attr_min_bytes.attr,
&dev_attr_max_bytes.attr,
&dev_attr_stable_pages_required.attr,
--
2.30.2
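The two show expressions in this file, bdi->max_ratio / BDI_RATIO_SCALE for max_ratio and bdi->max_ratio for max_ratio_fine, report the same stored value at different granularities. The userspace sketch below models that. The function names are hypothetical, BDI_RATIO_SCALE is assumed to be 10000, and the output format is chosen for illustration only.

```c
#include <stdint.h>
#include <stdio.h>

#define BDI_RATIO_SCALE 10000u	/* assumption: 100% == 1000000 units */

/* What a percent-granularity knob would print: truncated whole percent. */
static int sketch_show_max_ratio(char *buf, size_t len, uint32_t internal)
{
	return snprintf(buf, len, "%u\n",
			(unsigned int)(internal / BDI_RATIO_SCALE));
}

/* What a fine-granularity knob would print: the stored value itself. */
static int sketch_show_max_ratio_fine(char *buf, size_t len, uint32_t internal)
{
	return snprintf(buf, len, "%u\n", (unsigned int)internal);
}
```

Under these assumptions a stored value of 15000 reads as 1 through the percent view and as 15000 through the fine view, and anything below 10000 reads as 0 percent. A limit set through the fine knob can therefore look coarser, or even absent, when inspected through max_ratio.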
* [RFC PATCH v4 17/20] mm: document /sys/class/bdi/<bdi>/max_ratio_fine knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (15 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 16/20] mm: add /sys/class/bdi/<bdi>/max_ratio_fine knob Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 18/20] mm: add bdi_set_min_ratio_no_scale() function Stefan Roesch
` (2 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This documents the new /sys/class/bdi/<bdi>/max_ratio_fine knob.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
Documentation/ABI/testing/sysfs-class-bdi | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-class-bdi b/Documentation/ABI/testing/sysfs-class-bdi
index bec996e29565..34d2e5489c74 100644
--- a/Documentation/ABI/testing/sysfs-class-bdi
+++ b/Documentation/ABI/testing/sysfs-class-bdi
@@ -57,6 +57,19 @@ Description:
(read-write)
+What: /sys/class/bdi/<bdi>/max_ratio_fine
+Date: November 2022
+Contact: Stefan Roesch <shr@devkernel.io>
+Description:
+ Allows limiting a particular device to use not more than the
+ given value of the write-back cache. The value is given as part
+ of 1 million. This is useful in situations where we want to avoid
+ one device taking all or most of the write-back cache. For example
+ in case of an NFS mount that is prone to get stuck, or a FUSE mount
+ which cannot be trusted to play fair.
+
+ (read-write)
+
What: /sys/class/bdi/<bdi>/min_bytes
Date: October 2022
Contact: Stefan Roesch <shr@devkernel.io>
--
2.30.2
* [RFC PATCH v4 18/20] mm: add bdi_set_min_ratio_no_scale() function
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (16 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 17/20] mm: document " Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 19/20] mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 20/20] mm: document " Stefan Roesch
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This introduces bdi_set_min_ratio_no_scale(). It takes the ratio at the
finest granularity, without scaling. This function is used by the new
sysfs knob min_ratio_fine.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
include/linux/backing-dev.h | 1 +
mm/page-writeback.c | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index d9acbb22ff25..fbad4fcd408e 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -109,6 +109,7 @@ u64 bdi_get_min_bytes(struct backing_dev_info *bdi);
u64 bdi_get_max_bytes(struct backing_dev_info *bdi);
int bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
+int bdi_set_min_ratio_no_scale(struct backing_dev_info *bdi, unsigned int min_ratio);
int bdi_set_max_ratio_no_scale(struct backing_dev_info *bdi, unsigned int max_ratio);
int bdi_set_min_bytes(struct backing_dev_info *bdi, u64 min_bytes);
int bdi_set_max_bytes(struct backing_dev_info *bdi, u64 max_bytes);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f44ade72966c..ad608ef2a243 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -690,6 +690,8 @@ static int __bdi_set_min_ratio(struct backing_dev_info *bdi, unsigned int min_ra
unsigned int delta;
int ret = 0;
+ if (min_ratio > 100 * BDI_RATIO_SCALE)
+ return -EINVAL;
min_ratio *= BDI_RATIO_SCALE;
spin_lock_bh(&bdi_lock);
@@ -734,6 +736,11 @@ static int __bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ra
return ret;
}
+int bdi_set_min_ratio_no_scale(struct backing_dev_info *bdi, unsigned int min_ratio)
+{
+ return __bdi_set_min_ratio(bdi, min_ratio);
+}
+
int bdi_set_max_ratio_no_scale(struct backing_dev_info *bdi, unsigned int max_ratio)
{
return __bdi_set_max_ratio(bdi, max_ratio);
--
2.30.2
* [RFC PATCH v4 19/20] mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (17 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 18/20] mm: add bdi_set_min_ratio_no_scale() function Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 20/20] mm: document " Stefan Roesch
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This adds the min_ratio_fine knob. The knob takes its value in parts
per million rather than in percent.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
mm/backing-dev.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 94c2382367cf..a53b9360b72e 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -180,6 +180,25 @@ static ssize_t min_ratio_store(struct device *dev,
}
BDI_SHOW(min_ratio, bdi->min_ratio / BDI_RATIO_SCALE)
+static ssize_t min_ratio_fine_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ struct backing_dev_info *bdi = dev_get_drvdata(dev);
+ unsigned int ratio;
+ ssize_t ret;
+
+ ret = kstrtouint(buf, 10, &ratio);
+ if (ret < 0)
+ return ret;
+
+ ret = bdi_set_min_ratio_no_scale(bdi, ratio);
+ if (!ret)
+ ret = count;
+
+ return ret;
+}
+BDI_SHOW(min_ratio_fine, bdi->min_ratio)
+
static ssize_t max_ratio_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
@@ -315,6 +334,7 @@ static DEVICE_ATTR_RW(strict_limit);
static struct attribute *bdi_dev_attrs[] = {
&dev_attr_read_ahead_kb.attr,
&dev_attr_min_ratio.attr,
+ &dev_attr_min_ratio_fine.attr,
&dev_attr_max_ratio.attr,
&dev_attr_max_ratio_fine.attr,
&dev_attr_min_bytes.attr,
--
2.30.2
* [RFC PATCH v4 20/20] mm: document /sys/class/bdi/<bdi>/min_ratio_fine knob
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
` (18 preceding siblings ...)
2022-11-19 0:52 ` [RFC PATCH v4 19/20] mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob Stefan Roesch
@ 2022-11-19 0:52 ` Stefan Roesch
19 siblings, 0 replies; 21+ messages in thread
From: Stefan Roesch @ 2022-11-19 0:52 UTC (permalink / raw)
To: kernel-team, linux-block, linux-mm; +Cc: shr, axboe, clm, akpm
This documents the new /sys/class/bdi/<bdi>/min_ratio_fine knob.
Signed-off-by: Stefan Roesch <shr@devkernel.io>
---
Documentation/ABI/testing/sysfs-class-bdi | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-class-bdi b/Documentation/ABI/testing/sysfs-class-bdi
index 34d2e5489c74..b4ed0db680cf 100644
--- a/Documentation/ABI/testing/sysfs-class-bdi
+++ b/Documentation/ABI/testing/sysfs-class-bdi
@@ -44,6 +44,21 @@ Description:
(read-write)
+What: /sys/class/bdi/<bdi>/min_ratio_fine
+Date: November 2022
+Contact: Stefan Roesch <shr@devkernel.io>
+Description:
+ Under normal circumstances each device is given a part of the
+ total write-back cache that relates to its current average
+ writeout speed in relation to the other devices.
+
+ The 'min_ratio_fine' parameter allows assigning a minimum reserve
+ of the write-back cache to a particular device. The value is
+ expressed as part of 1 million. For example, this is useful for
+ providing a minimum QoS.
+
+ (read-write)
+
What: /sys/class/bdi/<bdi>/max_ratio
Date: January 2008
Contact: Peter Zijlstra <a.p.zijlstra@chello.nl>
--
2.30.2
end of thread, other threads:[~2022-11-19 2:33 UTC | newest]
Thread overview: 21+ messages
2022-11-19 0:51 [RFC PATCH v4 00/20] mm/block: add bdi sysfs knobs Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 01/20] mm: add bdi_set_strict_limit() function Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 02/20] mm: add knob /sys/class/bdi/<bdi>/strict_limit Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 03/20] mm: document /sys/class/bdi/<bdi>/strict_limit knob Stefan Roesch
2022-11-19 0:51 ` [RFC PATCH v4 04/20] mm: use part per 1000000 for bdi ratios Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 05/20] mm: add bdi_get_max_bytes() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 06/20] mm: split off __bdi_set_max_ratio() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 07/20] mm: add bdi_set_max_bytes() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 08/20] mm: add knob /sys/class/bdi/<bdi>/max_bytes Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 09/20] mm: document /sys/class/bdi/<bdi>/max_bytes knob Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 10/20] mm: add bdi_get_min_bytes() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 11/20] mm: split off __bdi_set_min_ratio() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 12/20] mm: add bdi_set_min_bytes() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 13/20] mm: add /sys/class/bdi/<bdi>/min_bytes knob Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 14/20] mm: document " Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 15/20] mm: add bdi_set_max_ratio_no_scale() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 16/20] mm: add /sys/class/bdi/<bdi>/max_ratio_fine knob Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 17/20] mm: document " Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 18/20] mm: add bdi_set_min_ratio_no_scale() function Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 19/20] mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob Stefan Roesch
2022-11-19 0:52 ` [RFC PATCH v4 20/20] mm: document " Stefan Roesch