linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zach Brown <zab@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: axboe@kernel.dk, martin.petersen@oracle.com,
	JBottomley@parallels.com, jmoyer@redhat.com, bcrl@kvack.org,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
	linux-aio@kvack.org, linux-scsi@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 2/6] io: define an interface for IO extensions
Date: Wed, 2 Apr 2014 12:49:47 -0700	[thread overview]
Message-ID: <20140402194947.GJ2394@lenny.home.zabbo.net> (raw)
In-Reply-To: <20140324162244.10848.46322.stgit@birch.djwong.org>

> @@ -916,6 +921,17 @@ void aio_complete(struct kiocb *iocb, long res, long res2)
>  	struct io_event	*ev_page, *event;
>  	unsigned long	flags;
>  	unsigned tail, pos;
> +	int ret;
> +
> +	ret = io_teardown_extensions(iocb);
> +	if (ret) {
> +		if (!res)
> +			res = ret;
> +		else if (!res2)
> +			res2 = ret;
> +		else
> +			pr_err("error %d tearing down aio extensions\n", ret);
> +	}

This ends up trying to copy the kernel's io_extension copy back to
userspace from interrupts, which obviously won't fly.

And to what end?  So that maybe someone can later add an 'extension'
that can fill in some field that's then copied to userspace?  But by
copying the entire argument struct back?

Let's not get ahead of ourselves.  If they're going to try and give
userspace some feedback after IO completion they're going to have to try
a lot harder because they don't have acces to the submitting task
context anymore.  They'd have to pin some reference to a feedback
mechanism in the in-flight io.  I think we'd want that explicit in the
iocb, not hiding off on the other side of this extension interface.

I'd just remove this generic teardown callback path entirely.  If
there's PI state hanging off the iocb tear it down during iocb teardown.

> +struct io_extension_type {
> +	unsigned int type;
> +	unsigned int extension_struct_size;
> +	int (*setup_fn)(struct kiocb *, int is_write);
> +	int (*destroy_fn)(struct kiocb *);
> +};

I'd also get rid of all of this.  More below.

> +static int io_setup_extensions(struct kiocb *req, int is_write,
> +			       struct io_extension __user *ioext)
> +{
> +	struct io_extension_type *iet;
> +	__u64 sz, has;
> +	int ret;
> +
> +	/* Check size of buffer */
> +	if (unlikely(copy_from_user(&sz, &ioext->ie_size, sizeof(sz))))
> +		return -EFAULT;
> +	if (sz > PAGE_SIZE ||
> +	    sz > sizeof(struct io_extension) ||
> +	    sz < IO_EXT_SIZE(ie_has))
> +		return -EINVAL;
> +
> +	/* Check that the buffer's big enough */
> +	if (unlikely(copy_from_user(&has, &ioext->ie_has, sizeof(has))))
> +		return -EFAULT;
> +	ret = io_check_bufsize(has, sz);
> +	if (ret)
> +		return ret;
> +
> +	/* Copy from userland */
> +	req->ki_ioext = kzalloc(sizeof(struct kio_extension), GFP_NOIO);
> +	if (!req->ki_ioext)
> +		return -ENOMEM;
> +
> +	req->ki_ioext->ke_user = ioext;
> +	if (unlikely(copy_from_user(&req->ki_ioext->ke_kern, ioext, sz))) {
> +		ret = -EFAULT;
> +		goto out;
> +	}

(Isn't there some allocate-and-copy-from-userspace helper now? But..)

I don't like the rudundancy of the implicit size requirement by a
field's flag being set being duplicated by the explicit size argument.
What does that give us, exactly?

Our notion of the total size only seems to only matter if we're copying
the entire struct from userspace and I'm don't think we need to do that.

For each argument, we're translating it into some kernel equivalent,
right?  Fields in the iocb  As each of these are initialized I'd just
test the presence bits and __get_user() the userspace arguemnts
directly, or copy_from_user() something slightly more complicated on to
the stack.

That gets rid of us having to care about the size at all.  It stops us
from allocating a kernel copy and pinning it for the duration of the IO.
We'd just be sampling the present userspace arguments as we initialie
the iocb during submission.

> +	/* Try to initialize all the extensions */
> +	has = 0;
> +	for (iet = extensions; iet->type != IO_EXT_INVALID; iet++) {
> +		if (!(req->ki_ioext->ke_kern.ie_has & iet->type))
> +			continue;
> +		ret = iet->setup_fn(req, is_write);
> +		if (ret) {
> +			req->ki_ioext->ke_kern.ie_has = has;
> +			goto out_destroy;
> +		}
> +		has |= iet->type;
> +	}

So instead of doing all this we'd test explicit bits and act
accordingly.  If they're trivial translations between userspace fields
and iocb fields we could just do it inline in this helper that'd be more
like iocb_parse_more_args(iocb, struct __user *ptr).  For more
complicated stuff, like the PI page pinning, it could call out to PI.

> +	user_ext = (struct io_extension __user *)iocb->aio_extension_ptr;

Need a __force there?  Has this been run through sparse?

- z

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2014-04-02 19:49 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-24 16:22 [RFC PATCH DONOTMERGE v2 0/6] userspace PI passthrough via AIO/DIO Darrick J. Wong
2014-03-24 16:22 ` [PATCH 1/6] fs/bio-integrity: remove duplicate code Darrick J. Wong
2014-04-02 19:17   ` Zach Brown
2014-04-02 20:35     ` Darrick J. Wong
2014-03-24 16:22 ` [PATCH 2/6] io: define an interface for IO extensions Darrick J. Wong
2014-04-02 19:22   ` Jeff Moyer
2014-04-02 22:08     ` Darrick J. Wong
2014-04-02 19:49   ` Zach Brown [this message]
2014-04-02 22:28     ` Darrick J. Wong
2014-04-02 22:53       ` Zach Brown
2014-04-02 23:06         ` Darrick J. Wong
2014-03-24 16:22 ` [PATCH 3/6] aio/dio: enable PI passthrough Darrick J. Wong
2014-04-02 20:01   ` Zach Brown
2014-04-02 20:44     ` Darrick J. Wong
2014-04-02 22:33       ` Zach Brown
2014-04-02 22:55         ` Darrick J. Wong
2014-03-24 16:22 ` [PATCH 4/6] PI IO extension: allow user to ask kernel to fill in parts of the protection info Darrick J. Wong
2014-03-24 16:23 ` [PATCH 5/6] PI IO extension: advertise possible userspace flags Darrick J. Wong
2014-03-24 16:23 ` [PATCH 6/6] blk-integrity: refactor various routines Darrick J. Wong
2014-04-02 19:14 ` [RFC PATCH DONOTMERGE v2 0/6] userspace PI passthrough via AIO/DIO Zach Brown
2014-04-02 20:05   ` Zach Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140402194947.GJ2394@lenny.home.zabbo.net \
    --to=zab@redhat.com \
    --cc=JBottomley@parallels.com \
    --cc=axboe@kernel.dk \
    --cc=bcrl@kvack.org \
    --cc=darrick.wong@oracle.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox