From: David Howells <dhowells@redhat.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: dhowells@redhat.com, Al Viro <viro@zeniv.linux.org.uk>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
"Christian Brauner" <christian@brauner.io>,
Matthew Wilcox <willy@infradead.org>,
"Jeff Layton" <jlayton@kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 05/13] iov: Move iterator functions to a header file
Date: Fri, 15 Sep 2023 10:38:14 +0100
Message-ID: <3369774.1694770694@warthog.procyon.org.uk>
In-Reply-To: <445a78b0ff3047fea20d3c8058a5ff6a@AcuMS.aculab.com>
David Laight <David.Laight@ACULAB.COM> wrote:
> > Move the iterator functions to a header file so that other operations that
> > need to scan over an iterator can be added. For instance, the rbd driver
> > could use this to scan a buffer to see if it is all zeros and libceph could
> > use this to generate a crc.
>
> These all look a bit big for being more generally inlined.
>
> I know you want to avoid the indirect call in the normal cases,
> but maybe it would be ok for other uses?

So you'd advocate for something like:

	size_t generic_iterate(struct iov_iter *iter, size_t len, void *priv,
			       void *priv2, iov_ustep_f ustep, iov_step_f step)
	{
		return iterate_and_advance2(iter, len, priv, priv2,
					    ustep, step);
	}
	EXPORT_SYMBOL(generic_iterate);

in lib/iov_iter.c and then call that from the places that want to use it?

I tried benchmarking that (see attached patch - it needs to go on top of my
iov patches). Running the insmod thrice and then filtering out and sorting
the results:
iov_kunit_benchmark_bvec: avg 3174 uS, stddev 68 uS
iov_kunit_benchmark_bvec: avg 3176 uS, stddev 61 uS
iov_kunit_benchmark_bvec: avg 3180 uS, stddev 64 uS
iov_kunit_benchmark_bvec_outofline: avg 3678 uS, stddev 4 uS
iov_kunit_benchmark_bvec_outofline: avg 3678 uS, stddev 5 uS
iov_kunit_benchmark_bvec_outofline: avg 3679 uS, stddev 6 uS
iov_kunit_benchmark_xarray: avg 3560 uS, stddev 5 uS
iov_kunit_benchmark_xarray: avg 3560 uS, stddev 6 uS
iov_kunit_benchmark_xarray: avg 3570 uS, stddev 16 uS
iov_kunit_benchmark_xarray_outofline: avg 4125 uS, stddev 13 uS
iov_kunit_benchmark_xarray_outofline: avg 4125 uS, stddev 2 uS
iov_kunit_benchmark_xarray_outofline: avg 4125 uS, stddev 6 uS
It adds almost 16% overhead:
(gdb) p 4125/3560.0
$2 = 1.1587078651685394
(gdb) p 3678/3174.0
$3 = 1.1587901701323251
I'm guessing a lot of that is due to function pointer mitigations.
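
As a rough illustration of where the indirect call comes from, here is a minimal userspace sketch. It is not kernel code: demo_seg, demo_step_f, demo_iterate(), demo_iterate_outofline() and demo_copy_step() are all hypothetical names, and only the overall shape (an inline walker that takes a step function, plus a noinline wrapper resembling generic_iterate()) mirrors the patch. The noinline attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct demo_seg {
	void *base;
	size_t len;
};

/* Like the kernel step functions, returns the number of bytes NOT processed. */
typedef size_t (*demo_step_f)(void *seg_base, size_t progress, size_t len,
			      void *priv);

/*
 * Inline walker.  When 'step' is a compile-time constant at the call site,
 * the compiler is free to inline it and drop the indirect call.
 */
static inline size_t demo_iterate(const struct demo_seg *segs, size_t nsegs,
				  size_t len, void *priv, demo_step_f step)
{
	size_t progress = 0;

	assert(step != NULL);
	for (size_t i = 0; i < nsegs && progress < len; i++) {
		size_t part = segs[i].len;
		size_t remain;

		if (part > len - progress)
			part = len - progress;
		remain = step(segs[i].base, progress, part, priv);
		progress += part - remain;
		if (remain)
			break;
	}
	return progress;
}

/*
 * Out-of-line wrapper with the same shape as generic_iterate().  Inside
 * this body 'step' is just a run-time pointer, so each segment generally
 * costs an indirect call.
 */
__attribute__((noinline))
size_t demo_iterate_outofline(const struct demo_seg *segs, size_t nsegs,
			      size_t len, void *priv, demo_step_f step)
{
	return demo_iterate(segs, nsegs, len, priv, step);
}

/* Copy from the flat source buffer in 'priv' into one segment. */
static size_t demo_copy_step(void *seg_base, size_t progress, size_t len,
			     void *priv)
{
	memcpy(seg_base, (const char *)priv + progress, len);
	return 0;
}
```

Calling demo_iterate(segs, n, len, src, demo_copy_step) directly lets the compiler see the step function, whereas going through demo_iterate_outofline() hides it behind a pointer, which is the case the _outofline benchmarks above exercise.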
Now, part of the code size expansion can be mitigated by using, say,
iterate_and_advance_kernel() if you know you aren't going to encounter
user-backed iterators, or even using, say, iterate_bvec() if you know you're
only going to see a specific iterator type.
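
The type-specific idea can be sketched in userspace as follows. This is only a model under stated assumptions: toy_iter, toy_walk_flat(), toy_walk_segs(), toy_walk_any() and toy_count_nonzero() are hypothetical names and do not correspond to the real iov_iter layout or helpers. It shows a general entry point that switches on the iterator kind next to a walker for a single kind that a caller can use when it knows what it will be handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's types or APIs. */
enum toy_kind { TOY_FLAT, TOY_SEGS };

struct toy_seg {
	void *base;
	size_t len;
};

struct toy_iter {
	enum toy_kind kind;
	void *flat;			/* TOY_FLAT: one contiguous buffer */
	const struct toy_seg *segs;	/* TOY_SEGS: array of segments */
	size_t nsegs;
};

/* Returns the number of bytes NOT processed, as the kernel step functions do. */
typedef size_t (*toy_step_f)(void *base, size_t progress, size_t len,
			     void *priv);

/* Walker for the flat kind only. */
static inline size_t toy_walk_flat(const struct toy_iter *it, size_t len,
				   void *priv, toy_step_f step)
{
	assert(it->kind == TOY_FLAT);
	return len - step(it->flat, 0, len, priv);
}

/* Walker for the segment-array kind only. */
static inline size_t toy_walk_segs(const struct toy_iter *it, size_t len,
				   void *priv, toy_step_f step)
{
	size_t progress = 0;

	assert(it->kind == TOY_SEGS);
	for (size_t i = 0; i < it->nsegs && progress < len; i++) {
		size_t part = it->segs[i].len;
		size_t remain;

		if (part > len - progress)
			part = len - progress;
		remain = step(it->segs[i].base, progress, part, priv);
		progress += part - remain;
		if (remain)
			break;
	}
	return progress;
}

/*
 * General entry point.  Once inlined, a call site tends to carry the body
 * of every walker, which is where the code size goes.
 */
static inline size_t toy_walk_any(const struct toy_iter *it, size_t len,
				  void *priv, toy_step_f step)
{
	switch (it->kind) {
	case TOY_FLAT:
		return toy_walk_flat(it, len, priv, step);
	case TOY_SEGS:
		return toy_walk_segs(it, len, priv, step);
	}
	return 0;
}

/* Example step: count non-zero bytes into the size_t that 'priv' points at. */
static size_t toy_count_nonzero(void *base, size_t progress, size_t len,
				void *priv)
{
	const unsigned char *p = base;
	size_t *nonzero = priv;

	(void)progress;
	for (size_t i = 0; i < len; i++)
		if (p[i])
			(*nonzero)++;
	return 0;
}
```

A caller that only ever sees the segment-array kind can call toy_walk_segs() directly and pull in one walker body instead of all of them, at the price of no longer handling the other kinds.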
David
---
iov_iter: Benchmark out of line generic iterator
diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
index 2ebb86c041b6..8f562e80473b 100644
--- a/include/linux/iov_iter.h
+++ b/include/linux/iov_iter.h
@@ -293,4 +293,7 @@ size_t iterate_and_advance_kernel(struct iov_iter *iter, size_t len, void *priv,
return progress;
}
+size_t generic_iterate(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+ iov_ustep_f ustep, iov_step_f step);
+
#endif /* _LINUX_IOV_ITER_H */
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 8f7a10c4a295..f9643dd02676 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1684,3 +1684,10 @@ ssize_t iov_iter_extract_pages(struct iov_iter *i,
return -EFAULT;
}
EXPORT_SYMBOL_GPL(iov_iter_extract_pages);
+
+size_t generic_iterate(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+ iov_ustep_f ustep, iov_step_f step)
+{
+ return iterate_and_advance2(iter, len, priv, priv2, ustep, step);
+}
+EXPORT_SYMBOL(generic_iterate);
diff --git a/lib/kunit_iov_iter.c b/lib/kunit_iov_iter.c
index cc9c64663a73..f208516a68c9 100644
--- a/lib/kunit_iov_iter.c
+++ b/lib/kunit_iov_iter.c
@@ -18,6 +18,7 @@
#include <linux/writeback.h>
#include <linux/uio.h>
#include <linux/bvec.h>
+#include <linux/iov_iter.h>
#include <kunit/test.h>
MODULE_DESCRIPTION("iov_iter testing");
@@ -1571,6 +1572,124 @@ static void __init iov_kunit_benchmark_xarray(struct kunit *test)
KUNIT_SUCCEED();
}
+static noinline
+size_t shovel_to_user_iter(void __user *iter_to, size_t progress,
+ size_t len, void *from, void *priv2)
+{
+ if (should_fail_usercopy())
+ return len;
+ if (access_ok(iter_to, len)) {
+ from += progress;
+ instrument_copy_to_user(iter_to, from, len);
+ len = raw_copy_to_user(iter_to, from, len);
+ }
+ return len;
+}
+
+static noinline
+size_t shovel_to_kernel_iter(void *iter_to, size_t progress,
+ size_t len, void *from, void *priv2)
+{
+ memcpy(iter_to, from + progress, len);
+ return 0;
+}
+
+/*
+ * Time copying 256MiB through an ITER_BVEC with an out-of-line copier
+ * function.
+ */
+static void __init iov_kunit_benchmark_bvec_outofline(struct kunit *test)
+{
+ struct iov_iter iter;
+ struct bio_vec *bvec;
+ struct page *page;
+ unsigned int samples[IOV_KUNIT_NR_SAMPLES];
+ ktime_t a, b;
+ ssize_t copied;
+ size_t size = 256 * 1024 * 1024, npages = size / PAGE_SIZE;
+ void *scratch;
+ int i;
+
+ /* Allocate a page and tile it repeatedly in the buffer. */
+ page = alloc_page(GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, page);
+ kunit_add_action_or_reset(test, iov_kunit_free_page, page);
+
+ bvec = kunit_kmalloc_array(test, npages, sizeof(bvec[0]), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, bvec);
+ for (i = 0; i < npages; i++)
+ bvec_set_page(&bvec[i], page, PAGE_SIZE, 0);
+
+ /* Create a single large buffer to copy to/from. */
+ scratch = iov_kunit_create_source(test, npages);
+
+ /* Perform and time a bunch of copies. */
+ kunit_info(test, "Benchmarking generic_iterate() over BVEC:\n");
+ for (i = 0; i < IOV_KUNIT_NR_SAMPLES; i++) {
+ iov_iter_bvec(&iter, ITER_DEST, bvec, npages, size);
+ a = ktime_get_real();
+ copied = generic_iterate(&iter, size, scratch, NULL,
+ shovel_to_user_iter,
+ shovel_to_kernel_iter);
+ b = ktime_get_real();
+ KUNIT_EXPECT_EQ(test, copied, size);
+ samples[i] = ktime_to_us(ktime_sub(b, a));
+ }
+
+ iov_kunit_benchmark_print_stats(test, samples);
+ KUNIT_SUCCEED();
+}
+
+/*
+ * Time copying 256MiB through an ITER_XARRAY with an out-of-line copier
+ * function.
+ */
+static void __init iov_kunit_benchmark_xarray_outofline(struct kunit *test)
+{
+ struct iov_iter iter;
+ struct xarray *xarray;
+ struct page *page;
+ unsigned int samples[IOV_KUNIT_NR_SAMPLES];
+ ktime_t a, b;
+ ssize_t copied;
+ size_t size = 256 * 1024 * 1024, npages = size / PAGE_SIZE;
+ void *scratch;
+ int i;
+
+ /* Allocate a page and tile it repeatedly in the buffer. */
+ page = alloc_page(GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, page);
+ kunit_add_action_or_reset(test, iov_kunit_free_page, page);
+
+ xarray = iov_kunit_create_xarray(test);
+
+ for (i = 0; i < npages; i++) {
+ void *x = xa_store(xarray, i, page, GFP_KERNEL);
+
+ KUNIT_ASSERT_FALSE(test, xa_is_err(x));
+ }
+
+ /* Create a single large buffer to copy to/from. */
+ scratch = iov_kunit_create_source(test, npages);
+
+ /* Perform and time a bunch of copies. */
+ kunit_info(test, "Benchmarking generic_iterate() over XARRAY:\n");
+ for (i = 0; i < IOV_KUNIT_NR_SAMPLES; i++) {
+ iov_iter_xarray(&iter, ITER_DEST, xarray, 0, size);
+ a = ktime_get_real();
+
+ copied = generic_iterate(&iter, size, scratch, NULL,
+ shovel_to_user_iter,
+ shovel_to_kernel_iter);
+ b = ktime_get_real();
+ KUNIT_EXPECT_EQ(test, copied, size);
+ samples[i] = ktime_to_us(ktime_sub(b, a));
+ }
+
+ iov_kunit_benchmark_print_stats(test, samples);
+ KUNIT_SUCCEED();
+}
+
static struct kunit_case __refdata iov_kunit_cases[] = {
KUNIT_CASE(iov_kunit_copy_to_ubuf),
KUNIT_CASE(iov_kunit_copy_from_ubuf),
@@ -1593,6 +1712,8 @@ static struct kunit_case __refdata iov_kunit_cases[] = {
KUNIT_CASE(iov_kunit_benchmark_bvec),
KUNIT_CASE(iov_kunit_benchmark_bvec_split),
KUNIT_CASE(iov_kunit_benchmark_xarray),
+ KUNIT_CASE(iov_kunit_benchmark_bvec_outofline),
+ KUNIT_CASE(iov_kunit_benchmark_xarray_outofline),
{}
};