From: Luis Chamberlain
To: david@redhat.com, patches@lists.linux.dev,
	linux-modules@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, pmladek@suse.com, petr.pavlu@suse.com,
	prarit@redhat.com, torvalds@linux-foundation.org,
	gregkh@linuxfoundation.org, rafael@kernel.org
Cc: christophe.leroy@csgroup.eu, tglx@linutronix.de, peterz@infradead.org,
	song@kernel.org, rppt@kernel.org, dave@stgolabs.net,
	willy@infradead.org, vbabka@suse.cz, mhocko@suse.com,
	dave.hansen@linux.intel.com, colin.i.king@gmail.com,
	jim.cromie@gmail.com, catalin.marinas@arm.com, jbaron@akamai.com,
	rick.p.edgecombe@intel.com, mcgrof@kernel.org
Subject: [RFC 0/2] module: fix virtual memory wasted on finit_module()
Date: Thu, 13 Apr 2023 22:28:38 -0700
Message-Id: <20230414052840.1994456-1-mcgrof@kernel.org>

The graph from my v3 patch series [0], which tries to resolve the virtual
memory bytes lost due to duplicates, says it all:

[ASCII plot lost in extraction: wasted virtual memory (y axis, 0 to 14GB)
against CPU count (x axis, 0 to 300), with one curve for "Before" (*) and
one for "After" (#).]

So we really need to debug to see WTF, because really, WTF.

The first patch tries to answer the question of whether the issue is
module auto-loading being abused. The patch proves that the answer is no,
but it does also help us find *a few* requests which can get a bit of love
to avoid duplicates. My system at least found one. So it adds a debugging
facility to let you do that.

As I was writing the commit log for my first patch series [0] I was noting
that this is it...
and the obvious conclusion is that the culprit is udev issuing requests
per CPU for tons of modules. I didn't feel comfortable in writing that
this is it and we can't really do anything before really trying hard. So
I gave it a good 'ol college try. At first I wondered if we could use
file descriptor hints to just exclude users early on boot before
SYSTEM_RUNNING. I couldn't find much, but if there are some ways to do
that -- then the last patch can be simplified to do just that.

The second patch proves essentially that we can just send -EBUSY to
duplicate requests, at least for duplicate module loads, and the world
doesn't fall apart. It *would* solve the issue. The patch however borrows
tons of the code from the first, and if we're really going to rely on
something like that we may as well share. But I'm hopeful that perhaps
there are some juicier file descriptor tricks we can use to just make a
file mutually exclusive and introduce a new kread which lets
finit_module() use that. The saving grace is that at least all
finit_module() calls *wait*, contrary to request_module() calls, and so
the solution can be much simpler. The end result is 0 wasted virtual
memory bytes.

Any ideas how not to make patch 2 suck as-is?

Yes -- we can also go fix udev, or libkmod, and that's what should be
done. However, it seems silly to not fix if the fix is as trivial as
patch 2 demonstrates.

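For anyone who wants the -EBUSY idea from patch 2 in a few lines: below is
a minimal userspace sketch, not the code from the patch. The names
read_begin() and read_end() and the fixed-size table are hypothetical and
exist only for illustration. It tracks which file names have a read in
flight and hands -EBUSY to any duplicate request until the first one
finishes. It is single-threaded on purpose; anything real would need
locking around the table.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MAX_INFLIGHT 16
#define MAX_NAME 64

/* Hypothetical table of file names whose read is currently in flight. */
static char inflight[MAX_INFLIGHT][MAX_NAME];
static bool used[MAX_INFLIGHT];

/*
 * Claim a file name for reading.
 * Returns 0 if the caller now owns the read, -EBUSY if the same name is
 * already in flight, -ENAMETOOLONG if the name does not fit in a slot,
 * and -ENOMEM if the table is full.
 */
static int read_begin(const char *name)
{
	int free_slot = -1;

	if (strlen(name) >= MAX_NAME)
		return -ENAMETOOLONG;

	for (int i = 0; i < MAX_INFLIGHT; i++) {
		if (used[i] && strcmp(inflight[i], name) == 0)
			return -EBUSY;	/* duplicate request */
		if (!used[i] && free_slot < 0)
			free_slot = i;	/* remember first free slot */
	}
	if (free_slot < 0)
		return -ENOMEM;

	strcpy(inflight[free_slot], name);	/* length checked above */
	used[free_slot] = true;
	return 0;
}

/* Release a name so that later requests for it can proceed again. */
static void read_end(const char *name)
{
	for (int i = 0; i < MAX_INFLIGHT; i++) {
		if (used[i] && strcmp(inflight[i], name) == 0) {
			used[i] = false;
			return;
		}
	}
}
```

A caller would wrap the actual read between read_begin() and read_end()
and simply propagate -EBUSY to the duplicate requester, which is the
behaviour the second patch shows userspace tolerating for module loads.
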
If you want to test / muck with all this you can use my branch
20230413-module-alloc-opts [1]:

[0] https://lkml.kernel.org/r/20230414050836.1984746-1-mcgrof@kernel.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230413-module-alloc-opts

Luis Chamberlain (2):
  module: add debugging auto-load duplicate module support
  kread: avoid duplicates

 fs/kernel_read_file.c    | 150 +++++++++++++++++++++++++
 kernel/module/Kconfig    |  40 +++++++
 kernel/module/Makefile   |   1 +
 kernel/module/dups.c     | 234 +++++++++++++++++++++++++++++++++++++++
 kernel/module/internal.h |  15 +++
 kernel/module/kmod.c     |  23 +++-
 kernel/module/main.c     |   6 +-
 7 files changed, 463 insertions(+), 6 deletions(-)
 create mode 100644 kernel/module/dups.c

-- 
2.39.2