linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: tavianator@tavianator.com
Cc: cel@kernel.org, akpm@linux-foundation.org, brauner@kernel.org,
	hughd@google.com, jlayton@redhat.com,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	viro@zeniv.linux.org.uk
Subject: Re: [PATCH v7 3/3] shmem: stable directory offsets
Date: Mon, 13 Nov 2023 13:58:30 -0500	[thread overview]
Message-ID: <ZVJx1jvSZjupTyWk@tissot.1015granger.net> (raw)
In-Reply-To: <20231113180616.2831430-1-tavianator@tavianator.com>

On Mon, Nov 13, 2023 at 01:06:16PM -0500, tavianator@tavianator.com wrote:
> On Fri, 30 Jun 2023 at 13:49:03 -0400, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > The current cursor-based directory offset mechanism doesn't work
> > when a tmpfs filesystem is exported via NFS. This is because NFS
> > clients do not open directories. Each server-side READDIR operation
> > has to open the directory, read it, then close it. The cursor state
> > for that directory, being associated strictly with the opened
> > struct file, is thus discarded after each NFS READDIR operation.
> >
> > Directory offsets are cached not only by NFS clients, but also by
> > user space libraries on those clients. Essentially there is no way
> > to invalidate those caches when directory offsets have changed on
> > an NFS server after the offset-to-dentry mapping changes. Thus the
> > whole application stack depends on unchanging directory offsets.
> >
> > The solution we've come up with is to make the directory offset for
> > each file in a tmpfs filesystem stable for the life of the directory
> > entry it represents.
> >
> > shmem_readdir() and shmem_dir_llseek() now use an xarray to map each
> > directory offset (an loff_t integer) to the memory address of a
> > struct dentry.
> 
> I believe this patch is responsible for a tmpfs behaviour change when
> a directory is modified while being read.  The following test program
> 
> #include <dirent.h>
> #include <err.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/stat.h>
> #include <unistd.h>
> 
> int main(int argc, char *argv[]) {
> 	const char *tmp = "/tmp";
> 	if (argc >= 2)
> 		tmp = argv[1];
> 
> 	char *dir_path;
> 	if (asprintf(&dir_path, "%s/foo.XXXXXX", tmp) < 0)
> 		err(EXIT_FAILURE, "asprintf()");
> 
> 	if (!mkdtemp(dir_path))
> 		err(EXIT_FAILURE, "mkdtemp(%s)", dir_path);
> 
> 	char *file_path;
> 	if (asprintf(&file_path, "%s/bar", dir_path) < 0)
> 		err(EXIT_FAILURE, "asprintf()");
> 
> 	if (creat(file_path, 0644) < 0)
> 		err(EXIT_FAILURE, "creat(%s)", file_path);
> 
> 	DIR *dir = opendir(dir_path);
> 	if (!dir)
> 		err(EXIT_FAILURE, "opendir(%s)", dir_path);
> 
> 	struct dirent *de;
> 	while ((de = readdir(dir))) {
> 		printf("readdir(): %s/%s\n", dir_path, de->d_name);
> 		if (de->d_name[0] == '.')
> 			continue;
> 
> 		if (unlink(file_path) != 0)
> 			err(EXIT_FAILURE, "unlink(%s)", file_path);
> 
> 		if (creat(file_path, 0644) < 0)
> 			err(EXIT_FAILURE, "creat(%s)", file_path);
> 	}
> 
> 	return EXIT_SUCCESS;
> }
> 
> when run on Linux 6.5, doesn't print the new directory entry:
> 
> tavianator@graphene $ uname -a
> Linux graphene 6.5.9-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 26 Oct 2023 00:52:20 +0000 x86_64 GNU/Linux
> tavianator@graphene $ gcc -Wall foo.c -o foo
> tavianator@graphene $ ./foo
> readdir(): /tmp/foo.wgmdmm/.
> readdir(): /tmp/foo.wgmdmm/..
> readdir(): /tmp/foo.wgmdmm/bar
> 
> But on Linux 6.6, readdir() never stops:
> 
> tavianator@tachyon $ uname -a
> Linux tachyon 6.6.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 08 Nov 2023 16:05:38 +0000 x86_64 GNU/Linux
> tavianator@tachyon $ gcc foo.c -o foo
> tavianator@tachyon $ ./foo
> readdir(): /tmp/foo.XnIRqj/.
> readdir(): /tmp/foo.XnIRqj/..
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> readdir(): /tmp/foo.XnIRqj/bar
> ...
> foo: creat(/tmp/foo.TTL6Fg/bar): Too many open files
> 
> POSIX says[1]
> 
> > If a file is removed from or added to the directory after the most recent
> > call to opendir() or rewinddir(), whether a subsequent call to readdir()
> > returns an entry for that file is unspecified.
> 
> so this isn't necessarily a *bug*, but I just wanted to point out the
> behaviour change.

I'm betting dollars to donuts that the v6.6 tmpfs behavior doesn't
match the behavior of other filesystems either.

Thanks for the reproducer, I'll look into it.


> I only noticed it because it broke one of my tests in
> bfs[2] (in a non-default build configuration).
> 
> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
> [2]: https://github.com/tavianator/bfs/blob/main/tests/gnu/ignore_readdir_race_notdir.sh

-- 
Chuck Lever


  reply	other threads:[~2023-11-13 18:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-30 17:48 [PATCH v7 0/3] shmemfs " Chuck Lever
2023-06-30 17:48 ` [PATCH v7 1/3] libfs: Add directory operations for stable offsets Chuck Lever
2023-07-03 10:56   ` Christian Brauner
2023-07-03 13:26     ` Chuck Lever III
2023-06-30 17:48 ` [PATCH v7 2/3] shmem: Refactor shmem_symlink() Chuck Lever
2023-06-30 17:49 ` [PATCH v7 3/3] shmem: stable directory offsets Chuck Lever
2023-07-13 13:30   ` kernel test robot
2023-07-17  6:46   ` kernel test robot
2023-07-22 20:33     ` Chuck Lever III
2023-07-25 15:12       ` Chuck Lever III
2023-07-25 15:54         ` Philip Li
2023-07-25 15:59           ` Christian Brauner
2023-07-26  2:55             ` Philip Li
2023-11-13 18:06   ` tavianator
2023-11-13 18:58     ` Chuck Lever [this message]
2023-07-03 15:23 ` [PATCH v7 0/3] shmemfs " Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZVJx1jvSZjupTyWk@tissot.1015granger.net \
    --to=chuck.lever@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=cel@kernel.org \
    --cc=hughd@google.com \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tavianator@tavianator.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox