From: Luis Chamberlain <mcgrof@kernel.org>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "keescook@chromium.org" <keescook@chromium.org>,
"hch@infradead.org" <hch@infradead.org>,
"prarit@redhat.com" <prarit@redhat.com>,
"rppt@kernel.org" <rppt@kernel.org>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"Torvalds, Linus" <torvalds@linux-foundation.org>,
"willy@infradead.org" <willy@infradead.org>,
"song@kernel.org" <song@kernel.org>,
"patches@lists.linux.dev" <patches@lists.linux.dev>,
"pmladek@suse.com" <pmladek@suse.com>,
"david@redhat.com" <david@redhat.com>,
"colin.i.king@gmail.com" <colin.i.king@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"jim.cromie@gmail.com" <jim.cromie@gmail.com>,
"vbabka@suse.cz" <vbabka@suse.cz>,
"christophe.leroy@csgroup.eu" <christophe.leroy@csgroup.eu>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"jbaron@akamai.com" <jbaron@akamai.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"linux-modules@vger.kernel.org" <linux-modules@vger.kernel.org>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"petr.pavlu@suse.com" <petr.pavlu@suse.com>,
"rafael@kernel.org" <rafael@kernel.org>,
"Hocko, Michal" <mhocko@suse.com>,
"dave@stgolabs.net" <dave@stgolabs.net>
Subject: Re: [RFC 2/2] kread: avoid duplicates
Date: Mon, 17 Apr 2023 15:08:34 -0700
Message-ID: <ZD3DYqYE4DOiJQaS@bombadil.infradead.org>
In-Reply-To: <be5182b65384f6a7667c239134037649a468033d.camel@intel.com>
On Mon, Apr 17, 2023 at 05:33:49PM +0000, Edgecombe, Rick P wrote:
> On Sat, 2023-04-15 at 23:41 -0700, Luis Chamberlain wrote:
> > On Sat, Apr 15, 2023 at 11:04:12PM -0700, Christoph Hellwig wrote:
> > > On Thu, Apr 13, 2023 at 10:28:40PM -0700, Luis Chamberlain wrote:
> > > > With this we run into 0 wasted virtual memory bytes.
> > >
> > > Avoid what duplicates?
> >
> > David Hildenbrand had reported that with over 400 CPUs vmap space
> > runs out, and it seems this was related to module loading. I took a
> > look and confirmed it. Module loading ends up requiring up to three
> > vmalloc allocations, so typically at least twice the module size,
> > and in the worst case add the decompressed module size on top:
> >
> > a) initial kernel_read*() call
> > b) optional module decompression
> > c) the actual module data copy we will keep
> >
> > Duplicate module requests that come from userspace end up being
> > thrown in the trash bin, as only one module will be allocated.
> > Although there are checks for a module prior to requesting a module,
> > udev still doesn't do the best job of avoiding that, and so we end up
> > with tons of duplicate module requests. We're talking about gigabytes
> > of vmalloc bytes just lost because of this for large systems and
> > megabytes for average systems. So for example with just 255 CPUs we
> > can lose about 13.58 GiB, and for 8 CPUs about 226.53 MiB.
> >
> > I have patches to curtail 1/2 of that space by doing a check in
> > kernel before we do the allocation in c) if the module is already
> > present. For a) it is harder because userspace just passes a file
> > descriptor. But since we can get the file path without the vmalloc,
> > this RFC suggests maybe we can add a new kernel_read*() for module
> > loading where it makes sense to have only one read happen at a time.
>
> I'm wondering how difficult it would be to just try to remove the
> vmallocs in (a) and (b) and operate on a list of pages.
Yes, I think it's worth doing that long term, if possible with seq reads.
Luis
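
To illustrate the "only one read happen at a time" idea from the quoted
text, here is a minimal userspace sketch in Python. It is an analogy only,
not the RFC's implementation: the names ExclusiveFileReads, try_begin, end
and read_once are hypothetical, and refusing a duplicate outright (rather
than making it wait) is an assumption. The point it shows is that a
duplicate request keyed by file path can be turned away before any buffer
is allocated for it.

```python
import threading


class ExclusiveFileReads:
    """Track paths that are currently being read so duplicates can be refused.

    Userspace analogy only; these names are hypothetical, not kernel APIs.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = set()

    def try_begin(self, path):
        """Return True if the caller may read `path`, False if a read is in flight."""
        with self._lock:
            if path in self._in_flight:
                return False
            self._in_flight.add(path)
            return True

    def end(self, path):
        """Mark the read of `path` as finished."""
        with self._lock:
            self._in_flight.discard(path)


def read_once(tracker, path, reader):
    """Run reader(path) unless another read of the same path is in progress.

    Returns the data, or None for a refused duplicate (assumption: duplicates
    are rejected before any buffer is allocated on their behalf).
    """
    if not tracker.try_begin(path):
        return None
    try:
        return reader(path)
    finally:
        tracker.end(path)
```

A caller would wrap its file read in read_once() and treat None as "someone
else is already loading this file". Whether a refused duplicate should fail
or wait for the first read to finish is a design choice this sketch does not
settle.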
Thread overview: 15+ messages
2023-04-14 5:28 [RFC 0/2] module: fix virtual memory wasted on finit_module() Luis Chamberlain
2023-04-14 5:28 ` [RFC 1/2] module: add debugging auto-load duplicate module support Luis Chamberlain
2023-04-14 5:28 ` [RFC 2/2] kread: avoid duplicates Luis Chamberlain
2023-04-14 6:35 ` Greg KH
2023-04-14 16:35 ` Luis Chamberlain
2023-04-16 6:04 ` Christoph Hellwig
2023-04-16 6:41 ` Luis Chamberlain
2023-04-16 12:50 ` Greg KH
2023-04-16 18:46 ` Luis Chamberlain
2023-04-17 6:05 ` Greg KH
2023-04-17 22:05 ` Luis Chamberlain
2023-04-17 17:33 ` Edgecombe, Rick P
2023-04-17 22:08 ` Luis Chamberlain [this message]
2023-04-18 18:46 ` Luis Chamberlain
2023-04-14 17:25 ` [RFC 0/2] module: fix virtual memory wasted on finit_module() Luis Chamberlain