Date: Fri, 14 Apr 2023 10:25:07 -0700
From: Luis Chamberlain
To: Kees Cook, Christoph Hellwig, david@redhat.com, patches@lists.linux.dev, linux-modules@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, pmladek@suse.com, petr.pavlu@suse.com, prarit@redhat.com, torvalds@linux-foundation.org, gregkh@linuxfoundation.org, rafael@kernel.org
Cc: christophe.leroy@csgroup.eu, tglx@linutronix.de, peterz@infradead.org, song@kernel.org, rppt@kernel.org, dave@stgolabs.net, willy@infradead.org, vbabka@suse.cz, mhocko@suse.com, dave.hansen@linux.intel.com, colin.i.king@gmail.com, jim.cromie@gmail.com, catalin.marinas@arm.com, jbaron@akamai.com, rick.p.edgecombe@intel.com
Subject: Re: [RFC 0/2] module: fix virtual memory wasted on finit_module()
References: <20230414052840.1994456-1-mcgrof@kernel.org>
In-Reply-To: <20230414052840.1994456-1-mcgrof@kernel.org>

On Thu, Apr 13, 2023 at 10:28:38PM -0700, Luis Chamberlain wrote:
> At first I wondered if we could use file descriptor hints to just
> exclude users early on boot before SYSTEM_RUNNING. I couldn't find
> much, but if there are some ways to do that -- then the last patch
> can be simplified to do just that.
> The second patch proves essentially that we can just send -EBUSY to
> duplicate requests, at least for duplicate module loads and the world
> doesn't fall apart. It *would* solve the issue. The patch however
> borrows tons of the code from the first, and if we're really going to
> rely on something like that we may as well share. But I'm hopeful that
> perhaps there are some juicier file descriptor tricks we can use to
> just make a file mutually exclusive and introduce a new kread which
> lets finit_module() use that. The saving grace is that at least all
> finit_module() calls *wait*, contrary to request_module() calls and so
> the solution can be much simpler.
>
> The end result is 0 wasted virtual memory bytes.
>
> Any ideas how not to make patch 2 suck as-is ?

The more I think about it, a file descriptor based approach would have
to loop over all tasks and look for files that match the same name, as
we're not sure if userspace would use the same thread and just fork
requests or what.

This approach is domain specific to internal kernel reads, and I now
think it is more appropriate. Since this is a bootup issue for races
with module loading, what we could do is add, say,
kernel_read_file_from_fd_excl(), which would just call a shared
__kernel_read_file_from_fd(..., bool excl, ...). The existing
kernel_read_file_from_fd() would set excl to false while the new one
sets it to true. Then finit_module() could just use that.

  Luis
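
To make the shape of that split concrete, here is a rough userspace
sketch in C. It is not kernel code and not taken from either patch:
every sketch_* name is hypothetical, the in-flight table keyed on
device and inode is an assumption about how "same file" might be
detected, and there is no locking, which a real implementation would
need. It only shows two thin wrappers around one shared helper that
takes a bool excl, where the exclusive variant gets -EBUSY if the same
file is already being read.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_MAX_INFLIGHT 16

/* Hypothetical table of files with an exclusive read in flight. */
struct sketch_inflight {
	bool used;
	dev_t dev;
	ino_t ino;
};

static struct sketch_inflight sketch_table[SKETCH_MAX_INFLIGHT];

/* Claim the file behind fd; returns -EBUSY if it is already claimed. */
int sketch_claim(int fd)
{
	struct stat st;
	int i, free_slot = -1;

	if (fstat(fd, &st) < 0)
		return -errno;

	for (i = 0; i < SKETCH_MAX_INFLIGHT; i++) {
		if (!sketch_table[i].used) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (sketch_table[i].dev == st.st_dev &&
		    sketch_table[i].ino == st.st_ino)
			return -EBUSY;
	}
	if (free_slot < 0)
		return -ENOMEM;

	sketch_table[free_slot].used = true;
	sketch_table[free_slot].dev = st.st_dev;
	sketch_table[free_slot].ino = st.st_ino;
	return 0;
}

/* Drop the claim on the file behind fd, if there is one. */
void sketch_release(int fd)
{
	struct stat st;
	int i;

	if (fstat(fd, &st) < 0)
		return;

	for (i = 0; i < SKETCH_MAX_INFLIGHT; i++) {
		if (sketch_table[i].used &&
		    sketch_table[i].dev == st.st_dev &&
		    sketch_table[i].ino == st.st_ino) {
			sketch_table[i].used = false;
			return;
		}
	}
}

/* Shared helper: the only difference between callers is excl. */
ssize_t sketch_read_fd_common(int fd, void *buf, size_t len, bool excl)
{
	ssize_t ret;

	if (excl) {
		int err = sketch_claim(fd);

		if (err)
			return err;
	}

	ret = read(fd, buf, len);
	if (ret < 0)
		ret = -errno;

	if (excl)
		sketch_release(fd);
	return ret;
}

/* Existing-style entry point: never exclusive. */
ssize_t sketch_read_fd(int fd, void *buf, size_t len)
{
	return sketch_read_fd_common(fd, buf, len, false);
}

/* New entry point that a finit_module()-like caller would use. */
ssize_t sketch_read_fd_excl(int fd, void *buf, size_t len)
{
	return sketch_read_fd_common(fd, buf, len, true);
}
```

Existing callers of the non-exclusive entry point would see no change
in behaviour; only the caller that opts in to the _excl variant can
get -EBUSY back for a duplicate read.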