From: Chuck Lever <chuck.lever@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: Chuck Lever <cel@kernel.org>,
viro@zeniv.linux.org.uk, brauner@kernel.org, hughd@google.com,
akpm@linux-foundation.org, Liam.Howlett@oracle.com,
oliver.sang@intel.com, feng.tang@intel.com,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
maple-tree@lists.infradead.org, linux-mm@kvack.org,
lkp@intel.com
Subject: Re: [PATCH RFC 6/7] libfs: Convert simple directory offsets to use a Maple Tree
Date: Thu, 15 Feb 2024 08:45:33 -0500 [thread overview]
Message-ID: <Zc4VfZ4/ejBEOt6s@tissot.1015granger.net> (raw)
In-Reply-To: <20240215130601.vmafdab57mqbaxrf@quack3>
On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > Test robot reports:
> > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > >
> > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > Feng Tang further clarifies that:
> > > ... the new simple_offset_add()
> > > called by shmem_mknod() brings extra cost related with slab,
> > > specifically the 'radix_tree_node', which cause the regression.
> >
> > Willy's analysis is that, over time, the test workload causes
> > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> >
> > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > hope that Maple Tree's dense node mode will handle this scenario
> > more scalably.
> >
> > In addition, we can widen the directory offset to an unsigned long
> > everywhere.
> >
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>
> OK, but this will need the performance numbers.
Yes, I totally concur. The point of this posting was to get some
early review and start the ball rolling.
Actually we expect roughly the same performance numbers now. "Dense
node" support in Maple Tree is supposed to be the real win, but
I'm not sure it's ready yet.
> Otherwise we have no idea
> whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> 0-day guys are quite helpful.
Oliver and Feng were copied on this series.
> > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > if (!inode || !S_ISDIR(inode->i_mode))
> > return ret;
> >
> > - index = 2;
> > + index = DIR_OFFSET_MIN;
>
> This bit should go into the simple_offset_empty() patch...
>
> > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> >
> > /* In this case, ->private_data is protected by f_pos_lock */
> > file->private_data = NULL;
> > - return vfs_setpos(file, offset, U32_MAX);
> > + return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> ^^^
> Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> quite right? Why not use ULONG_MAX here directly?
I initially changed U32_MAX to ULONG_MAX, but for some reason, the
length checking in vfs_setpos() fails. There is probably a sign
extension thing happening here that I don't understand.
> Otherwise the patch looks good to me.
As always, thank you for your review.
--
Chuck Lever
next prev parent reply other threads:[~2024-02-15 13:45 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-13 21:37 [PATCH RFC 0/7] Use Maple Trees for simple_offset utilities Chuck Lever
2024-02-13 21:37 ` [PATCH RFC 1/7] libfs: Rename "so_ctx" Chuck Lever
2024-02-15 12:42 ` Jan Kara
2024-02-13 21:37 ` [PATCH RFC 2/7] libfs: Define a minimum directory offset Chuck Lever
2024-02-15 12:47 ` Jan Kara
2024-02-13 21:37 ` [PATCH RFC 3/7] libfs: Add simple_offset_empty() Chuck Lever
2024-02-15 12:53 ` Jan Kara
2024-02-13 21:37 ` [PATCH RFC 4/7] maple_tree: Add mtree_alloc_cyclic() Chuck Lever
2024-02-13 21:37 ` [PATCH RFC 5/7] test_maple_tree: testing the cyclic allocation Chuck Lever
2024-02-13 21:38 ` [PATCH RFC 6/7] libfs: Convert simple directory offsets to use a Maple Tree Chuck Lever
2024-02-15 13:06 ` Jan Kara
2024-02-15 13:45 ` Chuck Lever [this message]
2024-02-15 14:02 ` Jan Kara
2024-02-16 15:15 ` Christian Brauner
2024-02-18 2:02 ` Oliver Sang
2024-02-18 15:57 ` Chuck Lever
2024-02-19 6:00 ` Oliver Sang
2024-02-13 21:38 ` [PATCH RFC 7/7] libfs: Re-arrange locking in offset_iterate_dir() Chuck Lever
2024-02-15 13:16 ` Jan Kara
2024-02-15 17:00 ` Liam R. Howlett
2024-02-15 17:16 ` Jan Kara
2024-02-15 21:07 ` Liam R. Howlett
2024-02-16 10:15 ` Jan Kara
2024-02-16 15:57 ` Matthew Wilcox
2024-02-16 16:33 ` Liam R. Howlett
2024-02-19 18:06 ` Jan Kara
2024-02-15 17:40 ` Chuck Lever
2024-02-15 21:08 ` Liam R. Howlett
2024-02-13 21:40 ` [PATCH RFC 0/7] Use Maple Trees for simple_offset utilities Chuck Lever III
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zc4VfZ4/ejBEOt6s@tissot.1015granger.net \
--to=chuck.lever@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=cel@kernel.org \
--cc=feng.tang@intel.com \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lkp@intel.com \
--cc=maple-tree@lists.infradead.org \
--cc=oliver.sang@intel.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox