Date: Sun, 19 May 2024 12:07:07 +0200
From: Mateusz Guzik
To: Matthew Wilcox
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com, vbabka@suse.cz,
    lstoakes@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: batch unlink_file_vma calls in free_pgd_range
References: <20240518062005.76129-1-mjguzik@gmail.com>

On Sun, May 19, 2024 at 12:31:20AM +0100, Matthew Wilcox wrote:
> On Sat, May 18, 2024 at 08:20:05AM +0200, Mateusz Guzik wrote:
> > Execs of dynamically linked binaries at 20-ish cores are bottlenecked on
> > the i_mmap_rwsem semaphore, while the biggest singular contributor is
> > free_pgd_range inducing the lock acquire back-to-back for all
> > consecutive mappings of a given file.
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index b6bdaa18b9e9..443d0c55df80 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
>
> I do object to this going into mm.h.  mm/internal.h would be better.
>

Noted.

> I haven't reviewed the patch in depth, but I don't have a problem with
> the idea.  I think it's only a stopgap and we really do need a better
> data structure than this.
>

I'll send a v2 after some more reviews pour in.

The above is indeed just a low-hanging-fruit fixup for an unpleasant
situation.

I think the real fix in the long run would be to give the loader the
means to set up its mappings more efficiently. strace /bin/echo shows:

[snip]
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\243\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
fstat(3, {st_mode=S_IFREG|0755, st_size=2125328, ...}) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 2170256, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7dbda8a00000
mmap(0x7dbda8a28000, 1605632, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7dbda8a28000
mmap(0x7dbda8bb0000, 323584, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000) = 0x7dbda8bb0000
mmap(0x7dbda8bff000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1fe000) = 0x7dbda8bff000
mmap(0x7dbda8c05000, 52624, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7dbda8c05000
[/snip]

Hence the 5 mappings. If there were a mechanism to issue all these mmaps
at the same time, there would definitely be savings in total work done,
and not only from taking i_mmap_rwsem once instead of once per mapping.

The mechanism should be versatile enough to replace other back-to-back
mmap uses.
It would be great if, on top of that, the mechanism did not require the
size argument and instead returned an address + size pair. Then the
typical combo of open + fstat + mmap could be shortened.

That said, this is just a quick note; I have no intention of pursuing
anything of the sort. I'll probably submit some other patches to
damage-control the current state without altering any design choices.