linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Omar Sandoval <osandov@osandov.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	Christoph Hellwig <hch@lst.de>,
	linux-mm@kvack.org, kexec@lists.infradead.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Cliff Wickman <cpw@sgi.com>,
	x86@kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore
Date: Thu, 7 Apr 2022 10:32:07 +0800	[thread overview]
Message-ID: <Yk5NJ6CwNcf6yIfe@MiWiFi-R3L-srv> (raw)
In-Reply-To: <Yk31Bs06E1nFTelX@relinquished.localdomain>

On 04/06/22 at 01:16pm, Omar Sandoval wrote:
> On Wed, Apr 06, 2022 at 05:59:53PM +0800, Baoquan He wrote:
> > On 04/06/22 at 11:13am, Uladzislau Rezki wrote:
> > > > On Tue, Apr 05, 2022 at 12:40:31PM -0700, Omar Sandoval wrote:
> > > > > A simple way to "fix" this would be to make set_iounmap_nonlazy() set
> > > > > vmap_lazy_nr to lazy_max_pages() instead of lazy_max_pages() + 1. But, I
> > > > > think it'd be better to get rid of this hack of clobbering vmap_lazy_nr.
> > > > > Instead, this fix makes __copy_oldmem_page() explicitly drain the vmap
> > > > > areas itself.
> > > > 
> > > > This fixes the bug and the interface also is better than what we had
> > > > before.  But a vmap/iounmap_eager would seem even better.  But hey,
> > > > right now it has one caller in always built іn x86 arch code, so maybe
> > > > it isn't worth spending more effort on this.
> > > >
> > > IMHO, it just makes sense to remove it. The set_iounmap_nonlazy() was
> > > added in 2010 year:
> > > 
> > > <snip>
> > > commit 3ee48b6af49cf534ca2f481ecc484b156a41451d
> > > Author: Cliff Wickman <cpw@sgi.com>
> > > Date:   Thu Sep 16 11:44:02 2010 -0500
> > > 
> > >     mm, x86: Saving vmcore with non-lazy freeing of vmas
> > > 
> > >     During the reading of /proc/vmcore the kernel is doing
> > >     ioremap()/iounmap() repeatedly. And the buildup of un-flushed
> > >     vm_area_struct's is causing a great deal of overhead. (rb_next()
> > >     is chewing up most of that time).
> > > 
> > >     This solution is to provide function set_iounmap_nonlazy(). It
> > >     causes a subsequent call to iounmap() to immediately purge the
> > >     vma area (with try_purge_vmap_area_lazy()).
> > > 
> > >     With this patch we have seen the time for writing a 250MB
> > >     compressed dump drop from 71 seconds to 44 seconds.
> > > 
> > >     Signed-off-by: Cliff Wickman <cpw@sgi.com>
> > >     Cc: Andrew Morton <akpm@linux-foundation.org>
> > >     Cc: kexec@lists.infradead.org
> > >     Cc: <stable@kernel.org>
> > >     LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com>
> > >     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> > > <snip>
> > > 
> > > and the reason was the "slow vmap" code, i.e. due to poor performance
> > > they decided to drop the lazily ASAP. Now we have absolutely different
> > > picture when it comes to performance and the vmalloc/vmap code.
> > 
> > I would vote for the current code change, removing it. As pointed out by
> > Christoph, it's only used by x86, may not be so worth to introduce a new
> > interface.
> 
> I did a quick benchmark to see if this optimization is still needed.
> This is on a system with 32GB RAM. I timed
> `dd if=/proc/vmcore of=/dev/null` with 4k and 1M block sizes on 5.17,
> 5.18 with this fix, and 5.18 with the non-lazy cleanup removed entirely.
> It looks like Uladzislau has a point, and this "optimization" actually
> slows things down now:
> 
>   |5.17  |5.18+fix|5.18+removal
> 4k|40.86s|  40.09s|      26.73s
> 1M|24.47s|  23.98s|      21.84s
> 
> I'll send a v2 which removes set_iounmap_nonlazy() entirely.

Hi Omar,

Thanks for the effort on posting patch to fix this and further benchmark
testing.

The removing I said means what you are doing in v1. While from your
testing result, seems removing set_iounmap_nonlazy() directly is better.
I agree that the old optimization was made too long ago, should be not
needed any more. E.g the added purge_vmap_area_root tree will speed up
the searching, adding and removing of the purged vmap_area.

I am wondering if this is a real issue you met, or you just found it by
code inspecting. If it breaks vmcore dumping, we should add 'Fixes' tag
and cc stable.

I am wondering how your vmcore dumping is handled. Asking this because we
usually use makedumpfile utility to filter and dump those kernel data, the
unneeded data, e.g zero-ed pages, unused pages, are all filtered out. While
using makedumpfile, we use mmap which is 4M at one time by default, then
process the content. So the copy_oldmem_page() may only be called during
elfcorehdr and notes reading.

Thanks
Baoquan



      reply	other threads:[~2022-04-07  2:32 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05 19:40 Omar Sandoval
2022-04-05 20:28 ` Uladzislau Rezki
2022-04-06  4:42 ` Christoph Hellwig
2022-04-06  9:13   ` Uladzislau Rezki
2022-04-06  9:59     ` Baoquan He
2022-04-06 20:16       ` Omar Sandoval
2022-04-07  2:32         ` Baoquan He [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yk5NJ6CwNcf6yIfe@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cpw@sgi.com \
    --cc=hch@lst.de \
    --cc=kernel-team@fb.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=osandov@osandov.com \
    --cc=urezki@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox