linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: david@redhat.com, patches@lists.linux.dev,
	linux-modules@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, pmladek@suse.com,
	petr.pavlu@suse.com, prarit@redhat.com,
	torvalds@linux-foundation.org, gregkh@linuxfoundation.org,
	rafael@kernel.org
Cc: christophe.leroy@csgroup.eu, tglx@linutronix.de,
	peterz@infradead.org, song@kernel.org, rppt@kernel.org,
	dave@stgolabs.net, willy@infradead.org, vbabka@suse.cz,
	mhocko@suse.com, dave.hansen@linux.intel.com,
	colin.i.king@gmail.com, jim.cromie@gmail.com,
	catalin.marinas@arm.com, jbaron@akamai.com,
	rick.p.edgecombe@intel.com, mcgrof@kernel.org
Subject: [RFC 0/2] module: fix virtual memory wasted on finit_module()
Date: Thu, 13 Apr 2023 22:28:38 -0700	[thread overview]
Message-ID: <20230414052840.1994456-1-mcgrof@kernel.org> (raw)

The graph from my v3 patch series [0] which tries to resolve the virtual
memory lost bytes due to duplicates says it all:

         +----------------------------------------------------------------------------+
    14GB |-+          +            +            +           +           *+          +-|
         |                                                          ****              |
         |                                                       ***                  |
         |                                                     **                     |
    12GB |-+                                                 **                     +-|
         |                                                 **                         |
         |                                               **                           |
         |                                             **                             |
         |                                           **                               |
    10GB |-+                                       **                               +-|
         |                                       **                                   |
         |                                     **                                     |
         |                                   **                                       |
     8GB |-+                               **                                       +-|
waste    |                               **                             ###           |
         |                             **                           ####              |
         |                           **                      #######                  |
     6GB |-+                     ****                    ####                       +-|
         |                      *                    ####                             |
         |                     *                 ####                                 |
         |                *****              ####                                     |
     4GB |-+            **               ####                                       +-|
         |            **             ####                                             |
         |          **           ####                                                 |
         |        **         ####                                                     |
     2GB |-+    **      #####                                                       +-|
         |     *    ####                                                              |
         |    * ####                                                   Before ******* |
         |  **##      +            +            +           +           After ####### |
         +----------------------------------------------------------------------------+
         0            50          100          150         200          250          300
                                          CPUs count

So we really need to debug to see WTF, because really, WTF. The first
patch tries to answer the question if the issue is module auto-loading
being abused and that causing the issues. The patch proves that the
answer is no, but it does also help us find *a few* requests which can
get a bit of love to avoid duplicates. My system at least found one. So
it adds a debugging facility to let you do that.

As I was writing the commit log for my first patch series [0] I was noting
that this is it... and the obvious conclusion is that the culprit is udev
issuing requests per CPU for tons of modules. I didn't feel comfortable in
writing that this is it and we can't really do anything before really
trying hard. So I gave it a good 'ol college try. At first I wondered if
we could use file descriptor hints to just exlude users early on boot
before SYSTEM_RUNNING. I couldn't find much, but if there are some ways
to do that -- then the last patch can be simplified to do just that.
The second patch proves essentially that we can just send -EBUSY to
duplicate requests, at least for duplicate module loads and the world
doesn't fall apart. It *would* solve the issue. The patch however
borrows tons of the code from the first, and if we're realy going to
rely on something like that we may as well share. But I'm hopeful that
perhaps there are some jucier file descriptor tricks we can use to
just make a file mutually exlusivive and introduce a new kread which
lets finit_module() use that. The saving grace is that at least all
finit_module() calls *wait*, contray to request_module() calls and so
the solution can be much simpler.

The end result is 0 wasted virtual memory bytes.

Any ideas how not to make patch 2 suck as-is ?

Yes -- we can also go fix udev, or libkmod, and that's what should be
done. However, it seems silly to not fix if the fix is as trivial as
patch 2 demonstrates.

If you want to test / muck with all this you can use my branch
20230413-module-alloc-opts [1]:

[0] https://lkml.kernel.org/r/20230414050836.1984746-1-mcgrof@kernel.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=20230413-module-alloc-opts

Luis Chamberlain (2):
  module: add debugging auto-load duplicate module support
  kread: avoid duplicates

 fs/kernel_read_file.c    | 150 +++++++++++++++++++++++++
 kernel/module/Kconfig    |  40 +++++++
 kernel/module/Makefile   |   1 +
 kernel/module/dups.c     | 234 +++++++++++++++++++++++++++++++++++++++
 kernel/module/internal.h |  15 +++
 kernel/module/kmod.c     |  23 +++-
 kernel/module/main.c     |   6 +-
 7 files changed, 463 insertions(+), 6 deletions(-)
 create mode 100644 kernel/module/dups.c

-- 
2.39.2



             reply	other threads:[~2023-04-14  5:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-14  5:28 Luis Chamberlain [this message]
2023-04-14  5:28 ` [RFC 1/2] module: add debugging auto-load duplicate module support Luis Chamberlain
2023-04-14  5:28 ` [RFC 2/2] kread: avoid duplicates Luis Chamberlain
2023-04-14  6:35   ` Greg KH
2023-04-14 16:35     ` Luis Chamberlain
2023-04-16  6:04   ` Christoph Hellwig
2023-04-16  6:41     ` Luis Chamberlain
2023-04-16 12:50       ` Greg KH
2023-04-16 18:46         ` Luis Chamberlain
2023-04-17  6:05           ` Greg KH
2023-04-17 22:05             ` Luis Chamberlain
2023-04-17 17:33       ` Edgecombe, Rick P
2023-04-17 22:08         ` Luis Chamberlain
2023-04-18 18:46           ` Luis Chamberlain
2023-04-14 17:25 ` [RFC 0/2] module: fix virtual memory wasted on finit_module() Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230414052840.1994456-1-mcgrof@kernel.org \
    --to=mcgrof@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=colin.i.king@gmail.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jbaron@akamai.com \
    --cc=jim.cromie@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=patches@lists.linux.dev \
    --cc=peterz@infradead.org \
    --cc=petr.pavlu@suse.com \
    --cc=pmladek@suse.com \
    --cc=prarit@redhat.com \
    --cc=rafael@kernel.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rppt@kernel.org \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox