From: Dan Streetman <ddstreet@ieee.org>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Seth Jennings <sjenning@redhat.com>,
Minchan Kim <minchan@kernel.org>, Nitin Gupta <ngupta@vflare.org>,
Linux-MM <linux-mm@kvack.org>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Dan Streetman <dan.streetman@canonical.com>
Subject: Re: [PATCH] mm/zswap: use workqueue to destroy pool
Date: Thu, 28 Apr 2016 04:21:53 -0400
Message-ID: <CALZtONDia=pLWZ9nWwbiWKhme8LfiyTjJt4yEX0VzaqRG_=R5g@mail.gmail.com>
In-Reply-To: <20160428014028.GA594@swordfish>

On Wed, Apr 27, 2016 at 9:40 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
> Hello Dan,
>
> On (04/27/16 13:19), Dan Streetman wrote:
> [..]
>> > so in general the patch looks good to me.
>> >
>> > it's either I didn't have enough coffee yet (which is true) or
>> > _IN THEORY_ it creates a tiny race condition; which is hard (and
>> > unlikely) to hit, but still. and the problem is
>> > CONFIG_ZSMALLOC_STAT.
>>
>> Aha, thanks, I hadn't tested with that param enabled. However, the
>> patch doesn't create the race condition, that existed already.
>
> well, agreed. it's not really a zsmalloc race condition, but rather the way
> zsmalloc is managed (deferred destruction either via RCU or scheduled work).
>
>> It fails because the new zswap pool creates a new zpool using
>> zsmalloc, but it can't create the zsmalloc pool because there is
>> already one named 'zswap' so the stat dir can't be created.
>>
>> So...either zswap needs to provide a unique 'name' to each of its
>> zpools, or zsmalloc needs to modify its provided pool name in some way
>> (add a unique suffix maybe). Or both.
>>
>> It seems like zsmalloc should do the checking/modification - or, at
>> the very least, it should have consistent behavior regardless of the
>> CONFIG_ZSMALLOC_STAT setting.
>
> yes, zram guarantees that there won't be any name collisions. and the
> way it's working for zram, zram<ID> corresponds to zsmalloc<ID>.
>
>
> the bigger issue here (and I was thinking at some point of fixing it,
> but then I grepped to see how many API users are in there, and I gave
> up) is that it seems we have no way to check if the dir exists in debugfs.
>
> we call this function
>
> struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
> {
> 	struct dentry *dentry = start_creating(name, parent);
> 	struct inode *inode;
>
> 	if (IS_ERR(dentry))
> 		return NULL;
>
> 	inode = debugfs_get_inode(dentry->d_sb);
> 	if (unlikely(!inode))
> 		return failed_creating(dentry);
>
> 	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
> 	inode->i_op = &simple_dir_inode_operations;
> 	inode->i_fop = &simple_dir_operations;
>
> 	/* directory inodes start off with i_nlink == 2 (for "." entry) */
> 	inc_nlink(inode);
> 	d_instantiate(dentry, inode);
> 	inc_nlink(d_inode(dentry->d_parent));
> 	fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
> 	return end_creating(dentry);
> }
>
> and debugfs _does know_ that the directory exists: ERR_PTR(-EEXIST) is what
> start_creating() returns after its lookup_one_len() call
>
> static struct dentry *start_creating(const char *name, struct dentry *parent)
> {
> 	struct dentry *dentry;
> 	int error;
>
> 	pr_debug("debugfs: creating file '%s'\n",name);
>
> 	if (IS_ERR(parent))
> 		return parent;
>
> 	error = simple_pin_fs(&debug_fs_type, &debugfs_mount,
> 			      &debugfs_mount_count);
> 	if (error)
> 		return ERR_PTR(error);
>
> 	/* If the parent is not specified, we create it in the root.
> 	 * We need the root dentry to do this, which is in the super
> 	 * block. A pointer to that is in the struct vfsmount that we
> 	 * have around.
> 	 */
> 	if (!parent)
> 		parent = debugfs_mount->mnt_root;
>
> 	inode_lock(d_inode(parent));
> 	dentry = lookup_one_len(name, parent, strlen(name));
> 	if (!IS_ERR(dentry) && d_really_is_positive(dentry)) {
> 		dput(dentry);
> 		dentry = ERR_PTR(-EEXIST);
> 	}
>
> 	if (IS_ERR(dentry)) {
> 		inode_unlock(d_inode(parent));
> 		simple_release_fs(&debugfs_mount, &debugfs_mount_count);
> 	}
>
> 	return dentry;
> }
>
> but instead of propagating this error, debugfs_create_dir() swallows it
> and simply returns NULL, so we can't tell the difference between -EEXIST, OOM,
> or anything else. so doing this check in zsmalloc is not so easy.

Yeah, Greg intentionally made the debugfs API opaque, so there's only
a binary created/failed indication.

While I agree zswap should provide unique names, I also think zsmalloc
should not abort if its debugfs content fails to be created. debugfs is
not meant to be a critical part of drivers; it only provides debug
information. I'll send a separate patch to zsmalloc that allows pool
creation to continue even if the debugfs dir/file fails to be created.
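
As a rough userspace illustration of that idea (not the zsmalloc patch
itself): the stats directory is treated as optional, so a failure to
create it is recorded but does not fail pool creation. Every name here
(stat_dir_create, pool_create, struct pool, has_stats) is hypothetical,
and the small name table only stands in for debugfs.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STAT_DIRS 8

/* Hypothetical stand-in for debugfs: names of stat dirs already in use. */
static const char *stat_dirs[MAX_STAT_DIRS];

/*
 * Mimics the opaque debugfs result: true if the dir was created, false on
 * any failure (name collision or no room); the caller cannot tell which.
 * The name pointer is stored as-is, so it must outlive the table entry.
 */
static bool stat_dir_create(const char *name)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_STAT_DIRS; i++) {
		if (!stat_dirs[i]) {
			if (free_slot < 0)
				free_slot = i;
		} else if (strcmp(stat_dirs[i], name) == 0) {
			return false;	/* collision, the -EEXIST case */
		}
	}
	if (free_slot < 0)
		return false;		/* table full */

	stat_dirs[free_slot] = name;
	return true;
}

struct pool {
	const char *name;
	bool has_stats;	/* false if the stat dir could not be created */
};

/* Pool creation fails only on allocation failure, never because of stats. */
static struct pool *pool_create(const char *name)
{
	struct pool *pool = malloc(sizeof(*pool));

	if (!pool)
		return NULL;

	pool->name = name;
	pool->has_stats = stat_dir_create(name);
	return pool;
}
```

With this shape, a second pool created under an already-used name still
comes back usable; it just has no stats directory.
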
>
> /* well, I may be wrong here */
>
>> However, it's easy to change zswap to provide a unique name for each
>> zpool creation, and zsmalloc's primary user (zram) guarantees to
>> provide a unique name for each pool created. So updating zswap is
>> probably best.
>
> if you can do it in zswap, then please do.
>
> -ss
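
For the zswap side, one way to get a unique name per zpool would be to
append a monotonically increasing counter to the "zswap" prefix. This
is only a userspace sketch under that assumption, using C11 atomics;
make_pool_name and pool_count are hypothetical names, and the hex
suffix format is just one possible choice.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical counter: number of pool names handed out so far. */
static atomic_uint pool_count;

/*
 * Writes a name of the form "zswap<hex id>" into buf. Each call gets a
 * distinct id, even when called concurrently, because the increment is
 * atomic. Returns what snprintf() returns.
 */
static int make_pool_name(char *buf, size_t len)
{
	unsigned int id = atomic_fetch_add(&pool_count, 1) + 1;

	return snprintf(buf, len, "zswap%x", id);
}
```

Since an id is never reused, a new pool cannot collide with an old pool
whose destruction is still pending.
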