From: Mike Rapoport <rppt@kernel.org>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Song Liu <song@kernel.org>, Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>,
linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 1/5] mm: intorduce __GFP_UNMAPPED and unmapped_alloc()
Date: Thu, 18 May 2023 18:23:54 +0300
Message-ID: <20230518152354.GD4967@kernel.org>
In-Reply-To: <ZGWdHC3Jo7tFUC59@moria.home.lan>

On Wed, May 17, 2023 at 11:35:56PM -0400, Kent Overstreet wrote:
> On Wed, Mar 08, 2023 at 11:41:02AM +0200, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> >
> > When set_memory or set_direct_map APIs used to change attribute or
> > permissions for chunks of several pages, the large PMD that maps these
> > pages in the direct map must be split. Fragmenting the direct map in such
> > manner causes TLB pressure and, eventually, performance degradation.
> >
> > To avoid excessive direct map fragmentation, add ability to allocate
> > "unmapped" pages with __GFP_UNMAPPED flag that will cause removal of the
> > allocated pages from the direct map and use a cache of the unmapped pages.
> >
> > This cache is replenished with higher order pages with preference for
> > PMD_SIZE pages when possible so that there will be fewer splits of large
> > pages in the direct map.
> >
> > The cache is implemented as a buddy allocator, so it can serve high order
> > allocations of unmapped pages.
>
> So I'm late to this discussion, I stumbled in because of my own run in
> with executable memory allocation.
>
> I understand that post LSF this patchset seems to not be going anywhere,
> but OTOH there's also been a desire for better executable memory
> allocation; as noted by tglx and elsewhere, there _is_ a definite
> performance impact on page size with kernel text - I've seen numbers in
> the multiple single digit percentage range in the past.
>
> This patchset does seem to me to be roughly the right approach for that,
> and coupled with the slab allocator for sub-page sized allocations it
> seems there's the potential for getting a nice interface that spans the
> full range of allocation sizes, from small bpf/trampoline allocations up
> to modules.
>
> Is this patchset worth reviving/continuing with? Was it really just the
> needed module refactoring that was the blocker?

As I see it, this patchset is only one building block out of three, maybe
four. If we are to repurpose it for code allocations, it should be
something like:

1) allocate a 2M page to fill the cache
2) remove this page from the direct map
3) map the 2M page ROX in the module address space (usually some part of
   the vmalloc address space)
4) allocate a smaller chunk of that page to the actual caller (bpf,
   modules, whatever)
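
The four steps above can be sketched in user space roughly as follows.
This is only a toy model under stated assumptions: struct code_chunk,
chunk_fill(), chunk_alloc() and chunk_release() are hypothetical names and
not kernel APIs. Steps 2 and 3 are represented by flags, because the real
operations (the set_direct_map APIs and a ROX mapping in vmalloc space)
cannot be performed from user space, and step 4 uses a plain bump pointer
rather than a real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE (2UL << 20)	/* one PMD-sized (2M) page on x86-64 */

/* Hypothetical model of one cached 2M page; not a kernel structure. */
struct code_chunk {
	void *base;		/* backing memory standing in for the 2M page */
	bool in_direct_map;	/* cleared by step 2 */
	bool mapped_rox;	/* set by step 3 */
	size_t used;		/* bump pointer consumed by step 4 */
};

/* Steps 1-3: obtain one 2M page and prepare it for code allocations. */
static int chunk_fill(struct code_chunk *c)
{
	/* step 1: allocate a 2M-aligned, 2M-sized region */
	c->base = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
	if (!c->base)
		return -1;

	/* step 2: the kernel would remove the page from the direct map */
	c->in_direct_map = false;
	/* step 3: the kernel would map the page ROX in module space */
	c->mapped_rox = true;
	c->used = 0;
	return 0;
}

/* Step 4: hand out a smaller piece; align must be a power of two. */
static void *chunk_alloc(struct code_chunk *c, size_t size, size_t align)
{
	size_t start = (c->used + align - 1) & ~(align - 1);

	if (!c->mapped_rox || size > CHUNK_SIZE || start > CHUNK_SIZE - size)
		return NULL;	/* chunk not ready or request does not fit */

	c->used = start + size;
	return (char *)c->base + start;
}

/* Return the backing memory; a real cache would remap before freeing. */
static void chunk_release(struct code_chunk *c)
{
	free(c->base);
	c->base = NULL;
	c->mapped_rox = false;
}
```

A caller would run chunk_fill() once and then serve bpf-sized or
module-sized requests with chunk_alloc() until the 2M page is exhausted.
The sketch hands out only one kind of memory from the chunk, which is the
property modules currently lack.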

Right now (3) and (4) won't work for modules because they mix code and data
in a single allocation.

So module refactoring is one blocker. Another blocker is teaching vmalloc
to map the areas for executable memory with 2M pages, and there is probably
something else as well.

I remember there was an attempt to add VM_ALLOW_HUGE_VMAP to
x86::module_alloc(), but it caused problems and was reverted. Sorry, I
could not find the lore link.

There was a related discussion here:
https://lore.kernel.org/all/14D6DBA0-0572-44FB-A566-464B1FF541E0@fb.com/

--
Sincerely yours,
Mike.