Date: Mon, 17 Apr 2023 15:05:58 -0700
From: Luis Chamberlain
To: Greg KH
Cc: Christoph Hellwig, Kees Cook, david@redhat.com,
	patches@lists.linux.dev, linux-modules@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, pmladek@suse.com,
	petr.pavlu@suse.com, prarit@redhat.com, torvalds@linux-foundation.org,
	rafael@kernel.org, christophe.leroy@csgroup.eu, tglx@linutronix.de,
	peterz@infradead.org, song@kernel.org, rppt@kernel.org,
	dave@stgolabs.net, willy@infradead.org, vbabka@suse.cz,
	mhocko@suse.com, dave.hansen@linux.intel.com, colin.i.king@gmail.com,
	jim.cromie@gmail.com, catalin.marinas@arm.com, jbaron@akamai.com,
	rick.p.edgecombe@intel.com
Subject: Re: [RFC 2/2] kread: avoid duplicates
References: <20230414052840.1994456-1-mcgrof@kernel.org>
	<20230414052840.1994456-3-mcgrof@kernel.org>
	<2023041637-glamorous-appetite-dc12@gregkh>

On Mon, Apr 17, 2023 at 08:05:31AM +0200, Greg KH wrote:
> On Sun, Apr 16, 2023 at 11:46:44AM -0700, Luis Chamberlain wrote:
> > On Sun, Apr 16, 2023 at 02:50:01PM +0200, Greg KH wrote:
> > > On Sat, Apr 15, 2023 at 11:41:28PM -0700, Luis Chamberlain wrote:
> > > > On Sat, Apr 15, 2023 at 11:04:12PM -0700, Christoph Hellwig wrote:
> > > > > On Thu, Apr 13, 2023 at 10:28:40PM -0700, Luis Chamberlain wrote:
> > > > > > With this we run into 0 wasted virtual memory bytes.
> > > > >
> > > > > Avoid what duplicates?
> > > >
> > > > David Hildenbrand had reported that with over 400 CPUs vmap space
> > > > runs out and it seems it was related to module loading. I took a
> > > > look and confirmed it. Module loading ends up requiring in the
> > > > worst case 3 vmalloc allocations, so typically at least twice
> > > > the size of the module size and in the worst case just add
> > > > the decompressed module size:
> > > >
> > > > a) initial kernel_read*() call
> > > > b) optional module decompression
> > > > c) the actual module data copy we will keep
> > > >
> > > > Duplicate module requests that come from userspace end up being thrown
> > > > in the trash bin, as only one module will be allocated. Although there
> > > > are checks for a module prior to requesting a module udev still doesn't
> > > > do the best of a job to avoid that and so we end up with tons of
> > > > duplicate module requests.
> > > > We're talking about gigabytes of vmalloc
> > > > bytes just lost because of this for large systems and megabytes for
> > > > average systems. So for example with just 255 CPUs we can lose about
> > > > 13.58 GiB, and for 8 CPUs about 226.53 MiB.
> > >
> > > How does the memory get "lost"? Shouldn't it be properly freed when the
> > > duplicate module load fails?
> >
> > Yes memory gets freed, but since virtual memory space can be limited it
> > also means you can end up eventually getting to the point -ENOMEMs will
> > happen as you have more CPUs and you cannot use virtual memory for other
> > things during kernel bootup and bootup fails. This is apparently
> > exacerbated with KASAN enabled.
>
> Then why not just rate-limit the module loader in userspace on such
> large systems if that's an issue? No kernel changes needed to do that.

We can certainly just take the stance of punting this as a userspace
problem. I thought it would be good to see what a kernel style of
workaround would look like for us to evaluate.

  Luis
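
As an illustration of the a), b), c) accounting quoted above, here is a
rough back-of-the-envelope sketch in Python. It is not kernel code and it
is not taken from the patch. The function names are hypothetical, the kept
module copy is approximated as the size of the uncompressed image, and all
duplicate requests are assumed to be in flight at the same time, which is
the worst case.

```python
def peak_vmalloc_bytes(file_size, decompressed_size=None):
    """Rough upper bound on vmalloc bytes one load attempt may hold at once.

    a) buffer filled by the initial kernel_read*() call -> file_size
    b) optional decompression buffer                    -> decompressed_size
    c) the module copy that is kept                     -> approximated here
       as the uncompressed image size (an assumption for illustration)
    """
    compressed = decompressed_size is not None
    image_size = decompressed_size if compressed else file_size

    total = file_size + image_size      # a) + c)
    if compressed:
        total += decompressed_size      # b) only exists for compressed modules
    return total


def duplicate_pressure_bytes(file_size, duplicates, decompressed_size=None):
    """Worst-case bytes requested if `duplicates` identical requests overlap.

    Assumes every duplicate gets through all of its allocations before any
    of them is rejected and freed (hypothetical worst case).
    """
    return duplicates * peak_vmalloc_bytes(file_size, decompressed_size)
```

Under these assumptions an uncompressed module costs twice its size per
attempt, and a compressed one costs its file size plus twice its
decompressed size. Eight overlapping requests for an uncompressed 100-byte
module would transiently ask for 8 * 200 = 1600 bytes.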