* [patch 0/3] mm: bdi: updates
From: Miklos Szeredi @ 2008-02-02 23:01 UTC
To: akpm; +Cc: a.p.zijlstra, linux-kernel, linux-fsdevel, linux-mm
Here are incremental patches against the "export BDI attributes in
sysfs" patchset, addressing the issues identified at the last
submission:
- the read-only attributes are only for debugging
- more consistent naming needed in /sys/class/bdi
- documentation problems
I've also done some testing, and fixed some bugs. Including patches
in -mm can do wonders, even before the kernel containing them is
released :)
Let me know if you prefer a resubmission of the original series with
these changes folded in.
Thanks,
Miklos
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* [patch 1/3] mm: bdi: fix read_ahead_kb_store()
From: Miklos Szeredi @ 2008-02-02 23:01 UTC
To: akpm; +Cc: a.p.zijlstra, linux-kernel, linux-fsdevel, linux-mm
[-- Attachment #1: mm-bdi-fix-read_ahead_kb_store.patch --]
[-- Type: text/plain, Size: 1762 bytes --]
This managed to completely evade testing :(
Fix the return value to be count or -errno. Also bring the function in
line with the other store functions on this object, which have
stricter input checking.
Also fix bdi_set_max_ratio() to actually return an error, instead of
always zero.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
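The input check introduced below can be tried outside the kernel. The
following is only a userspace sketch: sketch_read_ahead_kb_store() and
SKETCH_PAGE_SHIFT are hypothetical names, 4 KiB pages are assumed, and
strtoul() stands in for the kernel's simple_strtoul() (strtoul() also
skips leading whitespace and accepts a sign, so the two are not
strictly equivalent).

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumption for illustration: 4 KiB pages, i.e. PAGE_SHIFT == 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical userspace stand-in for the patched store function.
 * Accepts a decimal number optionally followed by a single newline and
 * returns count on success or -EINVAL on trailing garbage or empty input.
 */
ssize_t sketch_read_ahead_kb_store(unsigned long *ra_pages,
                                   const char *buf, size_t count)
{
    char *end;
    unsigned long read_ahead_kb = strtoul(buf, &end, 10);

    if (*buf && (end[0] == '\0' || (end[0] == '\n' && end[1] == '\0'))) {
        /* Convert kilobytes to pages, as the patch does. */
        *ra_pages = read_ahead_kb >> (SKETCH_PAGE_SHIFT - 10);
        return (ssize_t)count;
    }
    return -EINVAL;
}
```

Under these assumptions, writing "128\n" stores 32 pages and reports
the full count as consumed, while "128abc" is rejected and leaves the
old value untouched.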
Index: linux/mm/backing-dev.c
===================================================================
--- linux.orig/mm/backing-dev.c 2008-02-02 23:21:50.000000000 +0100
+++ linux/mm/backing-dev.c 2008-02-02 23:26:01.000000000 +0100
@@ -16,10 +16,15 @@ static ssize_t read_ahead_kb_store(struc
{
struct backing_dev_info *bdi = dev_get_drvdata(dev);
char *end;
+ unsigned long read_ahead_kb;
+ ssize_t ret = -EINVAL;
- bdi->ra_pages = simple_strtoul(buf, &end, 10) >> (PAGE_SHIFT - 10);
-
- return end - buf;
+ read_ahead_kb = simple_strtoul(buf, &end, 10);
+ if (*buf && (end[0] == '\0' || (end[0] == '\n' && end[1] == '\0'))) {
+ bdi->ra_pages = read_ahead_kb >> (PAGE_SHIFT - 10);
+ ret = count;
+ }
+ return ret;
}
#define K(pages) ((pages) << (PAGE_SHIFT - 10))
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c 2008-02-02 20:51:26.000000000 +0100
+++ linux/mm/page-writeback.c 2008-02-02 23:26:15.000000000 +0100
@@ -288,7 +288,7 @@ int bdi_set_max_ratio(struct backing_dev
}
spin_unlock_irqrestore(&bdi_lock, flags);
- return 0;
+ return ret;
}
EXPORT_SYMBOL(bdi_set_max_ratio);
* [patch 2/3] mm: bdi: use MAJOR:MINOR in /sys/class/bdi
From: Miklos Szeredi @ 2008-02-02 23:01 UTC
To: akpm; +Cc: a.p.zijlstra, linux-kernel, linux-fsdevel, linux-mm
[-- Attachment #1: mm-bdi-use-major-minor-in-sys-class-bdi.patch --]
[-- Type: text/plain, Size: 4952 bytes --]
Uniformly use MAJOR:MINOR in /sys/class/bdi/ for both block devices
and non-block device backed filesystems: FUSE and NFS.
Add symlink for block devices:
/sys/block/<name>/bdi -> /sys/class/bdi/<bdi>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
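With the uniform naming, userspace can derive the BDI directory from a
device number alone. A minimal sketch, assuming glibc's major() and
minor() macros from <sys/sysmacros.h>; sketch_bdi_sysfs_path() is a
hypothetical helper, not part of this patch:

```c
#include <stdio.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format the sysfs directory of the BDI that
 * corresponds to device number dev. Returns 0 on success, -1 if the
 * buffer is too small or formatting fails.
 */
int sketch_bdi_sysfs_path(dev_t dev, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/sys/class/bdi/%u:%u",
                     major(dev), minor(dev));

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a file on NFS or FUSE, the st_dev field returned by stat(2) would
be passed as dev; for a block device, its own device number.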
Index: linux/block/genhd.c
===================================================================
--- linux.orig/block/genhd.c 2008-02-02 22:41:03.000000000 +0100
+++ linux/block/genhd.c 2008-02-02 22:50:03.000000000 +0100
@@ -178,13 +178,17 @@ static int exact_lock(dev_t devt, void *
*/
void add_disk(struct gendisk *disk)
{
+ struct backing_dev_info *bdi;
+
disk->flags |= GENHD_FL_UP;
blk_register_region(MKDEV(disk->major, disk->first_minor),
disk->minors, NULL, exact_match, exact_lock, disk);
register_disk(disk);
blk_register_queue(disk);
- bdi_register(&disk->queue->backing_dev_info, NULL,
- "blk-%s", disk->disk_name);
+
+ bdi = &disk->queue->backing_dev_info;
+ bdi_register_dev(bdi, MKDEV(disk->major, disk->first_minor));
+ sysfs_create_link(&disk->dev.kobj, &bdi->dev->kobj, "bdi");
}
EXPORT_SYMBOL(add_disk);
@@ -192,8 +196,9 @@ EXPORT_SYMBOL(del_gendisk); /* in partit
void unlink_gendisk(struct gendisk *disk)
{
- blk_unregister_queue(disk);
+ sysfs_remove_link(&disk->dev.kobj, "bdi");
bdi_unregister(&disk->queue->backing_dev_info);
+ blk_unregister_queue(disk);
blk_unregister_region(MKDEV(disk->major, disk->first_minor),
disk->minors);
}
Index: linux/include/linux/backing-dev.h
===================================================================
--- linux.orig/include/linux/backing-dev.h 2008-02-02 22:41:03.000000000 +0100
+++ linux/include/linux/backing-dev.h 2008-02-02 22:50:03.000000000 +0100
@@ -62,6 +62,7 @@ void bdi_destroy(struct backing_dev_info
int bdi_register(struct backing_dev_info *bdi, struct device *parent,
const char *fmt, ...);
+int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev);
void bdi_unregister(struct backing_dev_info *bdi);
static inline void __add_bdi_stat(struct backing_dev_info *bdi,
Index: linux/mm/backing-dev.c
===================================================================
--- linux.orig/mm/backing-dev.c 2008-02-02 22:43:36.000000000 +0100
+++ linux/mm/backing-dev.c 2008-02-02 22:50:03.000000000 +0100
@@ -143,6 +143,12 @@ exit:
}
EXPORT_SYMBOL(bdi_register);
+int bdi_register_dev(struct backing_dev_info *bdi, dev_t dev)
+{
+ return bdi_register(bdi, NULL, "%u:%u", MAJOR(dev), MINOR(dev));
+}
+EXPORT_SYMBOL(bdi_register_dev);
+
void bdi_unregister(struct backing_dev_info *bdi)
{
if (bdi->dev) {
Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c 2008-02-02 22:41:03.000000000 +0100
+++ linux/fs/fuse/inode.c 2008-02-02 22:50:03.000000000 +0100
@@ -472,8 +472,7 @@ static struct fuse_conn *new_conn(struct
err = bdi_init(&fc->bdi);
if (err)
goto error_kfree;
- err = bdi_register(&fc->bdi, NULL, "fuse-%u:%u",
- MAJOR(fc->dev), MINOR(fc->dev));
+ err = bdi_register_dev(&fc->bdi, fc->dev);
if (err)
goto error_bdi_destroy;
fc->reqctr = 0;
Index: linux/fs/nfs/super.c
===================================================================
--- linux.orig/fs/nfs/super.c 2008-02-02 22:41:03.000000000 +0100
+++ linux/fs/nfs/super.c 2008-02-02 22:50:03.000000000 +0100
@@ -1477,8 +1477,7 @@ static int nfs_compare_super(struct supe
static int nfs_bdi_register(struct nfs_server *server)
{
- return bdi_register(&server->backing_dev_info, NULL, "nfs-%u:%u",
- MAJOR(server->s_dev), MINOR(server->s_dev));
+ return bdi_register_dev(&server->backing_dev_info, server->s_dev);
}
static int nfs_get_sb(struct file_system_type *fs_type,
Index: linux/Documentation/ABI/testing/sysfs-class-bdi
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-class-bdi 2008-02-02 22:41:03.000000000 +0100
+++ linux/Documentation/ABI/testing/sysfs-class-bdi 2008-02-02 22:50:03.000000000 +0100
@@ -6,17 +6,13 @@ Description:
Provide a place in sysfs for the backing_dev_info object.
This allows us to see and set the various BDI specific variables.
-The <bdi> identifyer can take the following forms:
+The <bdi> identifier can be either of the following:
-blk-NAME
+MAJOR:MINOR
- Block devices, NAME is 'sda', 'loop0', etc...
-
-FSTYPE-MAJOR:MINOR
-
- Non-block device backed filesystems which provide their own
- BDI, such as NFS and FUSE. MAJOR:MINOR is the value of st_dev
- for files on this filesystem.
+ Device number for block devices, or value of st_dev on
+ non-block filesystems which provide their own BDI, such as NFS
+ and FUSE.
default
* [patch 3/3] mm: bdi: move statistics to debugfs
From: Miklos Szeredi @ 2008-02-02 23:01 UTC
To: akpm; +Cc: a.p.zijlstra, linux-kernel, linux-fsdevel, linux-mm
[-- Attachment #1: mm-bdi-move-statistics-to-debugfs.patch --]
[-- Type: text/plain, Size: 7565 bytes --]
Move BDI statistics to debugfs:
/sys/kernel/debug/bdi/<bdi>/stats
Use postcore_initcall() to initialize the sysfs class and debugfs,
because debugfs is initialized in core_initcall().
Update descriptions in ABI documentation.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
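The stats file emits one "Label: value kB" line per counter. A tool
reading it might parse each line roughly as follows; this is a sketch
only, and sketch_parse_bdi_stat() and the 32-byte label buffer are
assumptions made for illustration:

```c
#include <stdio.h>

/*
 * Hypothetical parser for one line of the stats file, for example
 * "BdiWriteback:          128 kB". Stores the label (without the colon)
 * in key and the numeric value in *kb. Returns 0 on success, -1 if the
 * line does not have the "Label: number" shape.
 */
int sketch_parse_bdi_stat(const char *line, char key[32], unsigned long *kb)
{
    return sscanf(line, " %31[^:]: %lu kB", key, kb) == 2 ? 0 : -1;
}
```

Parsing by label rather than by line position keeps such a tool working
if counters are later added or reordered, which seems prudent for a
debugging interface.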
Index: linux/include/linux/backing-dev.h
===================================================================
--- linux.orig/include/linux/backing-dev.h 2008-02-02 23:08:41.000000000 +0100
+++ linux/include/linux/backing-dev.h 2008-02-02 23:08:41.000000000 +0100
@@ -16,6 +16,7 @@
#include <asm/atomic.h>
struct page;
+struct dentry;
/*
* Bits in backing_dev_info.state
@@ -55,6 +56,11 @@ struct backing_dev_info {
unsigned int max_ratio, max_prop_frac;
struct device *dev;
+
+#ifdef CONFIG_DEBUG_FS
+ struct dentry *debug_dir;
+ struct dentry *debug_stats;
+#endif
};
int bdi_init(struct backing_dev_info *bdi);
Index: linux/mm/backing-dev.c
===================================================================
--- linux.orig/mm/backing-dev.c 2008-02-02 23:08:41.000000000 +0100
+++ linux/mm/backing-dev.c 2008-02-02 23:12:47.000000000 +0100
@@ -10,6 +10,80 @@
static struct class *bdi_class;
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+static struct dentry *bdi_debug_root;
+
+static void bdi_debug_init(void)
+{
+ bdi_debug_root = debugfs_create_dir("bdi", NULL);
+}
+
+static int bdi_debug_stats_show(struct seq_file *m, void *v)
+{
+ struct backing_dev_info *bdi = m->private;
+ long background_thresh;
+ long dirty_thresh;
+ long bdi_thresh;
+
+ get_dirty_limits(&background_thresh, &dirty_thresh, &bdi_thresh, bdi);
+
+#define K(x) ((x) << (PAGE_SHIFT - 10))
+ seq_printf(m,
+ "BdiWriteback: %8lu kB\n"
+ "BdiReclaimable: %8lu kB\n"
+ "BdiDirtyThresh: %8lu kB\n"
+ "DirtyThresh: %8lu kB\n"
+ "BackgroundThresh: %8lu kB\n",
+ (unsigned long) K(bdi_stat(bdi, BDI_WRITEBACK)),
+ (unsigned long) K(bdi_stat(bdi, BDI_RECLAIMABLE)),
+ K(bdi_thresh),
+ K(dirty_thresh),
+ K(background_thresh));
+#undef K
+
+ return 0;
+}
+
+static int bdi_debug_stats_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, bdi_debug_stats_show, inode->i_private);
+}
+
+static const struct file_operations bdi_debug_stats_fops = {
+ .open = bdi_debug_stats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static void bdi_debug_register(struct backing_dev_info *bdi, const char *name)
+{
+ bdi->debug_dir = debugfs_create_dir(name, bdi_debug_root);
+ bdi->debug_stats = debugfs_create_file("stats", 0444, bdi->debug_dir,
+ bdi, &bdi_debug_stats_fops);
+}
+
+static void bdi_debug_unregister(struct backing_dev_info *bdi)
+{
+ debugfs_remove(bdi->debug_stats);
+ debugfs_remove(bdi->debug_dir);
+}
+#else
+static inline void bdi_debug_init(void)
+{
+}
+static inline void bdi_debug_register(struct backing_dev_info *bdi,
+ const char *name)
+{
+}
+static inline void bdi_debug_unregister(struct backing_dev_info *bdi)
+{
+}
+#endif
+
static ssize_t read_ahead_kb_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
@@ -40,21 +114,6 @@ static ssize_t name##_show(struct device
BDI_SHOW(read_ahead_kb, K(bdi->ra_pages))
-BDI_SHOW(reclaimable_kb, K(bdi_stat(bdi, BDI_RECLAIMABLE)))
-BDI_SHOW(writeback_kb, K(bdi_stat(bdi, BDI_WRITEBACK)))
-
-static inline unsigned long get_dirty(struct backing_dev_info *bdi, int i)
-{
- unsigned long thresh[3];
-
- get_dirty_limits(&thresh[0], &thresh[1], &thresh[2], bdi);
-
- return thresh[i];
-}
-
-BDI_SHOW(dirty_kb, K(get_dirty(bdi, 1)))
-BDI_SHOW(bdi_dirty_kb, K(get_dirty(bdi, 2)))
-
static ssize_t min_ratio_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
@@ -95,10 +154,6 @@ BDI_SHOW(max_ratio, bdi->max_ratio)
static struct device_attribute bdi_dev_attrs[] = {
__ATTR_RW(read_ahead_kb),
- __ATTR_RO(reclaimable_kb),
- __ATTR_RO(writeback_kb),
- __ATTR_RO(dirty_kb),
- __ATTR_RO(bdi_dirty_kb),
__ATTR_RW(min_ratio),
__ATTR_RW(max_ratio),
__ATTR_NULL,
@@ -108,10 +163,11 @@ static __init int bdi_class_init(void)
{
bdi_class = class_create(THIS_MODULE, "bdi");
bdi_class->dev_attrs = bdi_dev_attrs;
+ bdi_debug_init();
return 0;
}
-core_initcall(bdi_class_init);
+postcore_initcall(bdi_class_init);
int bdi_register(struct backing_dev_info *bdi, struct device *parent,
const char *fmt, ...)
@@ -136,6 +192,7 @@ int bdi_register(struct backing_dev_info
bdi->dev = dev;
dev_set_drvdata(bdi->dev, bdi);
+ bdi_debug_register(bdi, name);
exit:
kfree(name);
@@ -152,6 +209,7 @@ EXPORT_SYMBOL(bdi_register_dev);
void bdi_unregister(struct backing_dev_info *bdi)
{
if (bdi->dev) {
+ bdi_debug_unregister(bdi);
device_unregister(bdi->dev);
bdi->dev = NULL;
}
Index: linux/Documentation/ABI/testing/sysfs-class-bdi
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-class-bdi 2008-02-02 23:08:41.000000000 +0100
+++ linux/Documentation/ABI/testing/sysfs-class-bdi 2008-02-02 23:17:27.000000000 +0100
@@ -3,8 +3,8 @@ Date: January 2008
Contact: Peter Zijlstra <a.p.zijlstra@chello.nl>
Description:
-Provide a place in sysfs for the backing_dev_info object.
-This allows us to see and set the various BDI specific variables.
+Provide a place in sysfs for the backing_dev_info object. This allows
+setting and retrieving various BDI specific variables.
The <bdi> identifier can be either of the following:
@@ -26,34 +26,21 @@ read_ahead_kb (read-write)
Size of the read-ahead window in kilobytes
-reclaimable_kb (read-only)
-
- Reclaimable (dirty or unstable) memory destined for writeback
- to this device
-
-writeback_kb (read-only)
-
- Memory currently under writeback to this device
-
-dirty_kb (read-only)
-
- Global threshold for reclaimable + writeback memory
-
-bdi_dirty_kb (read-only)
-
- Current threshold on this BDI for reclaimable + writeback
- memory
-
min_ratio (read-write)
- Minimal percentage of global dirty threshold allocated to this
- bdi. If the value written to this file would make the the sum
- of all min_ratio values exceed 100, then EINVAL is returned.
- If min_ratio would become larger than the current max_ratio,
- then also EINVAL is returned. The default is zero
+ Under normal circumstances each device is given a part of the
+ total write-back cache that relates to its current average
+ writeout speed in relation to the other devices.
+
+ The 'min_ratio' parameter allows assigning a minimum
+ percentage of the write-back cache to a particular device.
+ For example, this is useful for providing a minimum QoS.
max_ratio (read-write)
- Maximal percentage of global dirty threshold allocated to this
- bdi. If max_ratio would become smaller than the current
- min_ratio, then EINVAL is returned. The default is 100
+ Allows limiting a particular device to use not more than the
+ given percentage of the write-back cache. This is useful in
+ situations where we want to avoid one device taking all or
+ most of the write-back cache. For example in case of an NFS
+ mount that is prone to get stuck, or a FUSE mount which cannot
+ be trusted to play fair.
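
The description removed above spelled out when a write to min_ratio
fails with EINVAL. Those rules can be sketched in userspace as below.
struct sketch_bdi, sketch_set_min_ratio() and the explicit total_min
argument are hypothetical, and the boundary handling follows the
wording of the removed text, not necessarily the kernel code.

```c
#include <errno.h>

/* Hypothetical per-device state; field names mirror the patch. */
struct sketch_bdi {
    unsigned int min_ratio;
    unsigned int max_ratio;
};

/*
 * Sketch of the documented rule: reject a new min_ratio that is larger
 * than the device's max_ratio, or that would push the sum of min_ratio
 * over all devices above 100. *total_min is the current sum and must
 * already include bdi->min_ratio.
 */
int sketch_set_min_ratio(struct sketch_bdi *bdi, unsigned int *total_min,
                         unsigned int min_ratio)
{
    unsigned int new_total;

    if (min_ratio > bdi->max_ratio)
        return -EINVAL;
    new_total = *total_min - bdi->min_ratio + min_ratio;
    if (new_total > 100)
        return -EINVAL;
    *total_min = new_total;
    bdi->min_ratio = min_ratio;
    return 0;
}
```

The sum is updated only after both checks pass, so a rejected write
leaves the device and the global total unchanged.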