Date: Sun, 19 May 2024 00:31:20 +0100
From: Matthew Wilcox
To: Mateusz Guzik
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com, vbabka@suse.cz, lstoakes@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: batch unlink_file_vma calls in free_pgd_range
References: <20240518062005.76129-1-mjguzik@gmail.com>
In-Reply-To: <20240518062005.76129-1-mjguzik@gmail.com>

On Sat, May 18, 2024 at 08:20:05AM +0200, Mateusz Guzik wrote:
> Execs of dynamically linked binaries at 20-ish cores are bottlenecked on
> the i_mmap_rwsem semaphore, while the biggest singular contributor is
> free_pgd_range inducing the lock acquire back-to-back for all
> consecutive mappings of a given file.
>
> Tracing the count of said acquires while building the kernel shows:
> [1, 2)     799579 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [2, 3)          0 |                                                    |
> [3, 4)       3009 |                                                    |
> [4, 5)       3009 |                                                    |
> [5, 6)     326442 |@@@@@@@@@@@@@@@@@@@@@                               |

This makes sense.  A snippet of /proc/self/maps:

7f0a44725000-7f0a4474b000 r--p 00000000 fe:01 100663437                  /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a4474b000-7f0a448a0000 r-xp 00026000 fe:01 100663437                  /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448a0000-7f0a448f4000 r--p 0017b000 fe:01 100663437                  /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448f4000-7f0a448f8000 r--p 001cf000 fe:01 100663437                  /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448f8000-7f0a448fa000 rw-p 001d3000 fe:01 100663437                  /usr/lib/x86_64-linux-gnu/libc.so.6

so we frequently have the same file mmaped five times in a row.

> The lock remains the main bottleneck, I have not looked at other spots
> yet.

You're not the first to report high contention on this lock.
https://lore.kernel.org/all/20240202093407.12536-1-JonasZhou-oc@zhaoxin.com/
for example.

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b6bdaa18b9e9..443d0c55df80 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h

I do object to this going into mm.h.  mm/internal.h would be better.

I haven't reviewed the patch in depth, but I don't have a problem with
the idea.  I think it's only a stopgap and we really do need a better
data structure than this.
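
For anyone who wants to check the "five times in a row" observation on
their own system, here is a rough userspace sketch. It is not kernel
code and not taken from the patch. It groups consecutive file-backed
lines of a maps listing by (device, inode) and compares a simplified
model of one lock acquisition per mapping against one acquisition per
run of the same file. The names backing_file and run_lengths are made
up for illustration, and the "one acquire per run" model is an
assumption based on the quoted patch description only.

```python
from collections import Counter
from itertools import groupby

# The /proc/self/maps snippet quoted in the mail above.
MAPS_SNIPPET = """\
7f0a44725000-7f0a4474b000 r--p 00000000 fe:01 100663437 /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a4474b000-7f0a448a0000 r-xp 00026000 fe:01 100663437 /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448a0000-7f0a448f4000 r--p 0017b000 fe:01 100663437 /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448f4000-7f0a448f8000 r--p 001cf000 fe:01 100663437 /usr/lib/x86_64-linux-gnu/libc.so.6
7f0a448f8000-7f0a448fa000 rw-p 001d3000 fe:01 100663437 /usr/lib/x86_64-linux-gnu/libc.so.6
"""


def backing_file(line):
    """Return (dev, inode) for a file-backed mapping, None otherwise."""
    fields = line.split()
    if len(fields) < 5:
        return None
    dev, inode = fields[3], fields[4]
    # Inode 0 marks anonymous and special mappings ([heap], [stack], ...).
    if inode == "0":
        return None
    return (dev, inode)


def run_lengths(maps_text):
    """Lengths of runs of consecutive mappings backed by the same file."""
    keys = [backing_file(l) for l in maps_text.splitlines() if l.strip()]
    return [len(list(group)) for key, group in groupby(keys) if key is not None]


runs = run_lengths(MAPS_SNIPPET)
histogram = Counter(runs)   # run length -> number of runs of that length
unbatched = sum(runs)       # model: one lock acquire per file-backed mapping
batched = len(runs)         # model: one lock acquire per run of the same file

print("run-length histogram:", dict(histogram))
print("acquires unbatched:", unbatched, "batched:", batched)
```

For the snippet above this finds a single run of five mappings, so the
model gives five acquisitions unbatched against one batched. Passing
the contents of /proc/<pid>/maps to run_lengths() instead gives the
distribution for a real process.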