From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 20 Jan 2025 15:48:37 +0100
From: Christian Brauner
To: Mateusz Guzik
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, tavianator@tavianator.com, linux-mm@kvack.org, akpm@linux-foundation.org
Subject: Re: [RESEND PATCH] fs: avoid mmap sem relocks when coredumping with many missing pages
Message-ID:
 <20250120-mutig-umgewandelt-4ced736ffe30@brauner>
References: <20250119103205.2172432-1-mjguzik@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20250119103205.2172432-1-mjguzik@gmail.com>

On Sun, Jan 19, 2025 at 11:32:05AM +0100, Mateusz Guzik wrote:
> Dumping processes with large allocated and mostly not-faulted areas is
> very slow.
>
> Borrowing a test case from Tavian Barnes:
>
> int main(void) {
>     char *mem = mmap(NULL, 1ULL << 40, PROT_READ | PROT_WRITE,
>             MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
>     printf("%p %m\n", mem);
>     if (mem != MAP_FAILED) {
>             mem[0] = 1;
>     }
>     abort();
> }
>
> That's 1TB of almost completely not-populated area.
>
> On my test box it takes 13-14 seconds to dump.
>
> The profile shows:
> -   99.89%     0.00%  a.out
>      entry_SYSCALL_64_after_hwframe
>      do_syscall_64
>      syscall_exit_to_user_mode
>      arch_do_signal_or_restart
>    - get_signal
>       - 99.89% do_coredump
>          - 99.88% elf_core_dump
>             - dump_user_range
>                - 98.12% get_dump_page
>                   - 64.19% __get_user_pages
>                      - 40.92% gup_vma_lookup
>                         - find_vma
>                            - mt_find
>                                 4.21% __rcu_read_lock
>                                 1.33% __rcu_read_unlock
>                      - 3.14% check_vma_flags
>                           0.68% vma_is_secretmem
>                        0.61% __cond_resched
>                        0.60% vma_pgtable_walk_end
>                        0.59% vma_pgtable_walk_begin
>                        0.58% no_page_table
>                   - 15.13% down_read_killable
>                        0.69% __cond_resched
>                     13.84% up_read
>                  0.58% __cond_resched
>
> Almost 29% of the time is spent relocking the mmap semaphore between
> calls to get_dump_page() which find nothing.
>
> Whacking that results in times of 10 seconds (down from 13-14).
>
> While here make the thing killable.
>
> The real problem is the page-sized iteration and the real fix would
> patch it up instead. It is left as an exercise for the mm-familiar
> reader.
>
> Signed-off-by: Mateusz Guzik
> ---

Seems like a good improvement to me. Let's get it tested.
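
For anyone who wants to see how sparse such a mapping is before testing,
here is a minimal userspace sketch. It is not part of the patch. It maps
memory with the same flags as the quoted test case, touches one byte, and
asks mincore() how many pages are resident. The helper name
sparse_resident_pages and the scaled-down size in the note below are
assumptions made for illustration; the sketch assumes Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: map len bytes of anonymous
 * memory with the flags used in the quoted test case, write one byte, and
 * return the number of pages mincore() reports as resident.
 * Returns -1 if the mapping or the query fails.
 */
long sparse_resident_pages(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    long resident = -1;

    char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Fault in the first page only; the rest stays unpopulated. */
    mem[0] = 1;

    /* mincore() fills one status byte per page; bit 0 means resident. */
    unsigned char *vec = malloc(npages);
    if (vec != NULL && mincore(mem, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;
    }

    free(vec);
    munmap(mem, len);
    return resident;
}
```

Calling sparse_resident_pages((size_t)64 << 20) should report only a small
fraction of the mapping as resident (possibly more than one page if
transparent huge pages back the first fault). The non-resident remainder
corresponds to the pages for which get_dump_page() finds nothing in the
profile above.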