Date: Thu, 7 Apr 2022 10:32:07 +0800
From: Baoquan He <bhe@redhat.com>
To: Omar Sandoval
Cc: Uladzislau Rezki, Christoph Hellwig, linux-mm@kvack.org,
 kexec@lists.infradead.org, Andrew Morton, Cliff Wickman,
 x86@kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm/vmalloc: fix spinning drain_vmap_work after reading
 from /proc/vmcore
References: <75014514645de97f2d9e087aa3df0880ea311b77.1649187356.git.osandov@fb.com>
 <20220406044244.GA9959@lst.de>

On 04/06/22 at 01:16pm, Omar Sandoval wrote:
> On Wed, Apr 06, 2022 at 05:59:53PM +0800, Baoquan He wrote:
> > On 04/06/22 at 11:13am, Uladzislau Rezki wrote:
> > > > On Tue, Apr 05, 2022 at 12:40:31PM -0700, Omar Sandoval wrote:
> > > > > A simple way to "fix" this would be to make set_iounmap_nonlazy() set
> > > > > vmap_lazy_nr to lazy_max_pages() instead of lazy_max_pages() + 1. But, I
> > > > > think it'd be better to get rid of this hack of clobbering vmap_lazy_nr.
> > > > > Instead, this fix makes __copy_oldmem_page() explicitly drain the vmap
> > > > > areas itself.
> > > >
> > > > This fixes the bug, and the interface also is better than what we had
> > > > before. But a vmap/iounmap_eager would seem even better. But hey,
> > > > right now it has one caller in always-built-in x86 arch code, so maybe
> > > > it isn't worth spending more effort on this.
> > > >
> > > IMHO, it just makes sense to remove it. set_iounmap_nonlazy() was
> > > added back in 2010:
> > >
> > > commit 3ee48b6af49cf534ca2f481ecc484b156a41451d
> > > Author: Cliff Wickman
> > > Date:   Thu Sep 16 11:44:02 2010 -0500
> > >
> > >     mm, x86: Saving vmcore with non-lazy freeing of vmas
> > >
> > >     During the reading of /proc/vmcore the kernel is doing
> > >     ioremap()/iounmap() repeatedly. And the buildup of un-flushed
> > >     vm_area_struct's is causing a great deal of overhead. (rb_next()
> > >     is chewing up most of that time).
> > >
> > >     This solution is to provide function set_iounmap_nonlazy(). It
> > >     causes a subsequent call to iounmap() to immediately purge the
> > >     vma area (with try_purge_vmap_area_lazy()).
> > >
> > >     With this patch we have seen the time for writing a 250MB
> > >     compressed dump drop from 71 seconds to 44 seconds.
> > >
> > >     Signed-off-by: Cliff Wickman
> > >     Cc: Andrew Morton
> > >     Cc: kexec@lists.infradead.org
> > >     Signed-off-by: Ingo Molnar
> > >
> > > and the reason was the "slow vmap" code, i.e. due to poor performance
> > > they decided to drop the laziness ASAP. Now the picture is completely
> > > different when it comes to the performance of the vmalloc/vmap code.
> >
> > I would vote for the current code change, removing it. As pointed out by
> > Christoph, it's only used by x86, so it may not be worth introducing a
> > new interface.
>
> I did a quick benchmark to see if this optimization is still needed.
> This is on a system with 32GB RAM. I timed
> `dd if=/proc/vmcore of=/dev/null` with 4k and 1M block sizes on 5.17,
> on 5.18 with this fix, and on 5.18 with the non-lazy cleanup removed
> entirely. It looks like Uladzislau has a point, and this "optimization"
> actually slows things down now:
>
>   bs|  5.17|5.18+fix|5.18+removal
>   4k|40.86s|  40.09s|      26.73s
>   1M|24.47s|  23.98s|      21.84s
>
> I'll send a v2 which removes set_iounmap_nonlazy() entirely.

Hi Omar,

Thanks for the effort of posting the patch to fix this and for the further
benchmark testing. The removal I suggested means what you are doing in v1.
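
For reference, the hack in question is tiny. If I read the v5.17 tree
correctly, the helper in mm/vmalloc.c is essentially the following (a
sketch from my reading, not a verbatim copy):

    /*
     * Called from the vmcore read path so that the next iounmap()
     * purges the lazily freed vmap areas immediately instead of
     * deferring them.
     */
    void set_iounmap_nonlazy(void)
    {
            /* Pretend the lazy-purge threshold is already exceeded. */
            atomic_long_set(&vmap_lazy_nr, lazy_max_pages() + 1);
    }

Since 5.18 moved the draining into the drain_vmap_work workqueue, leaving
vmap_lazy_nr above lazy_max_pages() like this lets the worker keep
rescheduling itself even when there is nothing left to purge, which is
the spinning your patch fixes.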
From your testing results, though, it seems removing set_iounmap_nonlazy()
entirely is even better. I agree that the old optimization was made long
ago and should not be needed any more; e.g. the purge_vmap_area_root tree
added since then speeds up the searching, adding and removing of purged
vmap_areas.

I am wondering if this is a real issue you met, or if you just found it by
code inspection. If it breaks vmcore dumping, we should add a 'Fixes' tag
and cc stable.

I am also wondering how your vmcore dumping is handled. I ask because we
usually use the makedumpfile utility to filter and dump the kernel data,
so the unneeded data, e.g. zeroed pages and unused pages, are all filtered
out. While using makedumpfile, we read via mmap, 4MB at a time by default,
and then process the content. So copy_oldmem_page() may only be called
while reading the elfcorehdr and notes; see the sketch below.
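For context, the x86 read path being discussed lives in
arch/x86/kernel/crash_dump_64.c and looks roughly like this (a from-memory
sketch, not the exact upstream code), showing the per-page
ioremap()/iounmap() pattern that the 2010 commit tried to speed up:

    /* Sketch of the x86 copy_oldmem_page() path; details may differ. */
    ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
                             unsigned long offset)
    {
            void *vaddr;

            if (!csize)
                    return 0;

            /* Map one page of the old kernel's memory at a time. */
            vaddr = (__force void *)ioremap_cache(pfn << PAGE_SHIFT,
                                                  PAGE_SIZE);
            if (!vaddr)
                    return -ENOMEM;

            memcpy(buf, vaddr + offset, csize);

            /* Pre-fix: force the next iounmap() to purge immediately. */
            set_iounmap_nonlazy();
            iounmap((void __iomem *)vaddr);

            return csize;
    }

With makedumpfile reading via mmap, this copy path is mostly hit only for
the small elfcorehdr and notes reads, so the lazy-purge buildup that the
2010 commit worried about should no longer matter much.

Thanks
Baoquan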