linux-mm.kvack.org archive mirror
From: Kiryl Shutsemau <kirill@shutemov.name>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	 Luis Chamberlain <mcgrof@kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: Optimizing small reads
Date: Wed, 8 Oct 2025 15:54:05 +0100	[thread overview]
Message-ID: <ik7rut5k6vqpaxatj5q2kowmwd6gchl3iik6xjdokkj5ppy2em@ymsji226hrwp> (raw)
In-Reply-To: <CAHk-=wgbQ-aS3U7gCg=qc9mzoZXaS_o+pKVOLs75_aEn9H_scw@mail.gmail.com>

On Tue, Oct 07, 2025 at 04:30:20PM -0700, Linus Torvalds wrote:
> On Tue, 7 Oct 2025 at 15:54, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So here's the slightly fixed patch that actually does boot - and that
> > I'm running right now. But I wouldn't call it exactly "tested".
> >
> > Caveat patchor.
> 
> From a quick look at profiles, the major issue is that clac/stac is
> very expensive on the machine I'm testing this on, and that makes the
> looping over smaller copies unnecessarily costly.
> 
> And the iov_iter overhead is quite costly too.
> 
> Both would be fixed by instead of just checking the iov_iter_count(),
> we should likely check just the first iov_iter entry, and make sure
> it's a user space iterator.
> 
> Then we'd be able to use the usual - and *much* cheaper -
> user_access_begin/end() and unsafe_copy_to_user() functions, and do
> the iter update at the end outside the loop.
> 
> Anyway, this all feels fairly easily fixable and not some difficult
> fundamental issue, but it just requires being careful and getting the
> small details right. Not difficult, just "care needed".
> 
> But even without that, and in this simplistic form, this should
> *scale* beautifully, because all the overheads are purely CPU-local.
> So it does avoid the whole atomic page reference stuff etc
> synchronization.
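
The single user_access_begin()/user_access_end() pattern described above could look roughly like the sketch below. This is not code from either patch, just a non-compilable illustration of the idea: one stac/clac pair and one iov_iter update for the whole copy, with the folio walk and error handling simplified. The function name and the PAGE_SIZE chunking are made up for the example; iter_is_ubuf(), user_access_begin(), unsafe_copy_to_user() and iov_iter_advance() are the real kernel interfaces.

```c
/* Sketch only: fast path for a plain single-buffer user iterator. */
static ssize_t fast_copy_to_iter(const char *src, size_t total,
				 struct iov_iter *iter)
{
	char __user *dst;
	size_t copied = 0;

	/* Only handle the simple ITER_UBUF case; let the caller fall
	 * back to the generic copy_to_iter() path otherwise. */
	if (!iter_is_ubuf(iter) || iov_iter_count(iter) < total)
		return -EAGAIN;

	dst = iter->ubuf + iter->iov_offset;

	/* One stac/clac pair for the whole loop instead of one per chunk. */
	if (!user_access_begin(dst, total))
		return -EFAULT;
	while (copied < total) {
		size_t chunk = min_t(size_t, total - copied, PAGE_SIZE);

		unsafe_copy_to_user(dst + copied, src + copied, chunk, Efault);
		copied += chunk;
	}
	user_access_end();

	/* Single iterator update outside the loop. */
	iov_iter_advance(iter, copied);
	return copied;

Efault:
	user_access_end();
	return copied ? copied : -EFAULT;
}
```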

I tried to look at the numbers too.

The best-case scenario looks great: 16 threads hammering the same 4k
page with 256-byte reads:

Baseline:	2892MiB/s
Kiryl:		7751MiB/s
Linus:		7787MiB/s

But outside the best case it doesn't look as good: 16 threads reading
from a 512k file with 4k reads:

Baseline:	99.4GiB/s
Kiryl:		40.0GiB/s
Linus:		44.0GiB/s

I have not profiled it yet.

Disabling SMAP (clearcpuid=smap) makes it 45.7GiB/s for my patch and
50.9GiB/s for yours. So the regression cannot be fully attributed to
SMAP.

Other candidates are iov_iter overhead and the multiple xarray lookups.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


Thread overview: 33+ messages
2025-10-03  2:18 Linus Torvalds
2025-10-03  3:32 ` Luis Chamberlain
2025-10-15 21:31   ` Swarna Prabhu
2025-10-03  9:55 ` Kiryl Shutsemau
2025-10-03 16:18   ` Linus Torvalds
2025-10-03 16:40     ` Linus Torvalds
2025-10-03 17:23       ` Kiryl Shutsemau
2025-10-03 17:49         ` Linus Torvalds
2025-10-06 11:44           ` Kiryl Shutsemau
2025-10-06 15:50             ` Linus Torvalds
2025-10-06 18:04               ` Kiryl Shutsemau
2025-10-06 18:14                 ` Linus Torvalds
2025-10-07 21:47                 ` Linus Torvalds
2025-10-07 22:35                   ` Linus Torvalds
2025-10-07 22:54                     ` Linus Torvalds
2025-10-07 23:30                       ` Linus Torvalds
2025-10-08 14:54                         ` Kiryl Shutsemau [this message]
2025-10-08 16:27                           ` Linus Torvalds
2025-10-08 17:03                             ` Linus Torvalds
2025-10-09 16:22                               ` Kiryl Shutsemau
2025-10-09 17:29                                 ` Linus Torvalds
2025-10-10 10:10                                   ` Kiryl Shutsemau
2025-10-10 17:51                                     ` Linus Torvalds
2025-10-13 15:35                                       ` Kiryl Shutsemau
2025-10-13 15:39                                         ` Kiryl Shutsemau
2025-10-13 16:19                                           ` Linus Torvalds
2025-10-14 12:58                                             ` Kiryl Shutsemau
2025-10-14 16:41                                               ` Linus Torvalds
2025-10-13 16:06                                         ` Linus Torvalds
2025-10-13 17:26                                         ` Theodore Ts'o
2025-10-14  3:20                                           ` Theodore Ts'o
2025-10-08 10:28                       ` Kiryl Shutsemau
2025-10-08 16:24                         ` Linus Torvalds
