From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/9] Mitigate a vmap lock contention
Date: Wed, 24 May 2023 10:30:46 +0900
Message-ID: <ZG1oxkEXDCfab8Ga@debian-BULLSEYE-live-builder-AMD64>
In-Reply-To: <ZG0zil5dpXUiuF5q@dread.disaster.area>

On Wed, May 24, 2023 at 07:43:38AM +1000, Dave Chinner wrote:
> On Wed, May 24, 2023 at 03:04:28AM +0900, Hyeonggon Yoo wrote:
> > On Tue, May 23, 2023 at 05:12:30PM +0200, Uladzislau Rezki wrote:
> > > > > 2. Motivation.
> > > > > 
> > > > > - The vmap code does not scale with the number of CPUs and this should be fixed;
> > > > > - XFS folks have complained several times that vmalloc might be contended on
> > > > >   their workloads:
> > > > > 
> > > > > <snip>
> > > > > commit 8dc9384b7d75012856b02ff44c37566a55fc2abf
> > > > > Author: Dave Chinner <dchinner@redhat.com>
> > > > > Date:   Tue Jan 4 17:22:18 2022 -0800
> > > > > 
> > > > >     xfs: reduce kvmalloc overhead for CIL shadow buffers
> > > > >     
> > > > >     Oh, let me count the ways that the kvmalloc API sucks dog eggs.
> > > > >     
> > > > >     The problem is when we are logging lots of large objects, we hit
> > > > >     kvmalloc really damn hard with costly order allocations, and
> > > > >     behaviour utterly sucks:
> > > > 
> > > > Based on the commit, I guess XFS uses vmalloc/kvmalloc because
> > > > it allocates large buffers. How large can they be?
> > > > 
> > > They use kvmalloc(). When the page allocator is not able to serve a
> > > request they fall back to vmalloc. From what I see, the sizes are:
> > > 
> > > from 73728 up to 1048576, i.e. 18 pages up to 256 pages.
> > > 
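As a quick check of the size range quoted above, here is a minimal userspace
sketch of the bytes-to-pages arithmetic. It assumes 4 KiB pages (the thread
does not state the page size), and bytes_to_pages() is a hypothetical helper
written for illustration, not a kernel function.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on a typical x86-64 configuration. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Hypothetical helper: number of pages needed to hold `bytes`, rounded up. */
static unsigned long bytes_to_pages(unsigned long bytes, unsigned long page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

With that assumption, bytes_to_pages(73728, ASSUMED_PAGE_SIZE) gives 18 and
bytes_to_pages(1048576, ASSUMED_PAGE_SIZE) gives 256, which matches the
"18 pages up to 256 pages" figure.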
> > > > > 3. Test
> > > > > 
> > > > > On my AMD Ryzen Threadripper 3970X 32-Core Processor, I get the figures below:
> > > > > 
> > > > >     1-page     1-page-this-patch
> > > > > 1  0.576131   vs   0.555889
> > > > > 2   2.68376   vs    1.07895
> > > > > 3   4.26502   vs    1.01739
> > > > > 4   6.04306   vs    1.28924
> > > > > 5   8.04786   vs    1.57616
> > > > > 6   9.38844   vs    1.78142
> > > > 
> > > > <snip>
> > > > 
> > > > > 29    20.06   vs    3.59869
> > > > > 30  20.4353   vs     3.6991
> > > > > 31  20.9082   vs    3.73028
> > > > > 32  21.0865   vs    3.82904
> > > > > 
> > > > > 1..32 is the number of jobs. The results are in usec and are the vmalloc()/vfree()
> > > > > pair throughput.
> > > > 
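To put the quoted figures in relative terms, the sketch below computes the
ratio of the unpatched time to the patched time for a given job count. It is
plain userspace C; speedup() is a hypothetical helper, and the only inputs
are the numbers from the table above.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many times faster the patched run is,
 * given the per-pair times (usec) without and with the patchset.
 */
static double speedup(double before_usec, double after_usec)
{
    assert(after_usec > 0.0);
    return before_usec / after_usec;
}
```

For example, speedup(21.0865, 3.82904) for 32 jobs comes out at roughly 5.5,
while speedup(0.576131, 0.555889) for a single job is only slightly above 1,
so in this synthetic test the gain shows up as the job count grows.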
> > > > I would be more interested in real numbers than synthetic benchmarks.
> > > > Maybe XFS folks could help by doing profiling similar to commit 8dc9384b7d750
> > > > with and without this patchset?
> > > > 
> > > I added Dave Chinner <david@fromorbit.com> to this thread.
> > 
> > Oh, I missed that, and it would be better to [+Cc linux-xfs]
> > 
> > > But. The contention exists.
> > 
> > I think "theoretically can be contended" doesn't necessarily mean it's actually
> > contended in the real world.
> 
> Did you not read the commit message for the XFS commit documented
> above? vmalloc lock contention most c0ertainly does exist in the
> real world and the profiles in commit 8dc9384b7d75  ("xfs: reduce
> kvmalloc overhead for CIL shadow buffers") document it clearly.
>
> > Also I find it difficult to imagine vmalloc being highly contended because it was
> > historically considered slow and thus discouraged when performance is important.
> 
> Read the above XFS commit.
> 
> We use vmalloc in critical high performance fast paths that cannot
> tolerate high order memory allocation failure. XFS runs this
> fast path millions of times a second, and will call into
> vmalloc() several hundred thousand times a second with machine wide
> concurrency under certain types of workloads.
>
> > IOW vmalloc would not be contended when allocation size is small because we have
> > kmalloc/buddy API, and therefore I wonder which workloads are allocating very large
> > buffers and at the same time allocating very frequently, thus performance-sensitive.
> >
> > I am not against this series, but wondering which workloads would benefit ;)
> 
> Yup, you need to read the XFS commit message. If you understand what
> is in that commit message, then you wouldn't be doubting that
> vmalloc contention is real and that it is used in high performance
> fast paths that are traversed millions of times a second....

Oh, I read the commit, but it seems it slipped my mind while reading. Sorry for such a dumb
question; now I get it, and thank you so much. In any case, I didn't mean to offend;
I should've read and thought more before asking.

>
> > > Apart from that, the per-cpu-KVA allocator can go away if we make it generic instead.
> > 
> > Not sure I understand your point, can you elaborate please?
> > 
> > And I would like to ask some side questions:
> > 
> > 1. Is the vm_[un]map_ram() API still worth keeping with this patchset?
> 
> XFS also uses this interface for mapping multi-page buffers in the
> XFS buffer cache. These are the items that also require the high
> order costly kvmalloc allocations in the transaction commit path
> when they are modified.
> 
> So, yes, we need these mapping interfaces to scale just as well as
> vmalloc itself....

I mean, even before this series, vm_[un]map_ram() caches vmap_blocks
per CPU, but it has a limit on the size that can be cached per CPU.

But now that vmap() itself becomes scalable with this series, I wonder
whether they are still worth keeping. Why not replace them with v[un]map()?
> 
> > 2. How does this patchset deal with 32-bit machines where
> >    vmalloc address space is limited?
> 
> From the XFS side, we just don't care about 32 bit machines at all.
> XFS is aimed at server and HPC environments which have been entirely
> 64 bit for a long, long time now...

But Linux still supports 32 bit machines and is not going to drop
support for them anytime soon so I think there should be at least a way to
disable this feature.

Thanks!

-- 
Hyeonggon Yoo

Doing kernel stuff as a hobby
Undergraduate | Chungnam National University
Dept. Computer Science & Engineering


