From: Luis Chamberlain <mcgrof@kernel.org>
To: Aaron Lu <aaron.lu@intel.com>
Cc: Song Liu <song@kernel.org>,
bpf@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, x86@kernel.org, peterz@infradead.org,
hch@lst.de, rick.p.edgecombe@intel.com, dave.hansen@intel.com,
rppt@kernel.org, zhengjun.xing@linux.intel.com,
kbusch@kernel.org, p.raghav@samsung.com, dave@stgolabs.net,
vbabka@suse.cz, mgorman@suse.de, willy@infradead.org,
torvalds@linux-foundation.org,
Hyeonggon Yoo <42.hyeyoo@gmail.com>
Subject: Re: [PATCH bpf-next v1 RESEND 1/5] vmalloc: introduce vmalloc_exec, vfree_exec, and vcopy_exec
Date: Mon, 7 Nov 2022 09:39:03 -0800 [thread overview]
Message-ID: <Y2lCt7kWG+tsePDL@bombadil.infradead.org> (raw)
In-Reply-To: <Y2ioTodn+mBXdIqp@ziqianlu-desk2>
On Mon, Nov 07, 2022 at 02:40:14PM +0800, Aaron Lu wrote:
> Hello,
>
> On Wed, Nov 02, 2022 at 04:41:59PM -0700, Luis Chamberlain wrote:
>
> ... ...
>
> > I'm under the impression that the real missed, undocumented, major value-add
> > here is that the old "BPF prog pack" strategy helps to reduce the direct map
> > fragmentation caused by heavy use of the eBPF JIT programs and this in
> > turn helps your overall random system performance (regardless of what
> > it is you do). As I see it then the eBPF prog pack is just one strategy to
> > try to mitigate memory fragmentation on the direct map caused by the the eBPF
> > JIT programs, so the "slow down" your team has obvserved should be due to the
> > eventual fragmentation caused on the direct map *while* eBPF programs
> > get heavily used.
> >
> > Mike Rapoport had presented about the Direct map fragmentation problem
> > at Plumbers 2021 [0], and clearly mentioned modules / BPF / ftrace /
> > kprobes as possible sources for this. Then Xing Zhengjun's 2021 performance
> > evaluation on whether using 2M/1G pages aggressively for the kernel direct map
> > help performance [1] ends up generally recommending huge pages. The work by Xing
> > though was about using huge pages *alone*, not using a strategy such as in the
> > "bpf prog pack" to share one 2 MiB huge page for *all* small eBPF programs,
> > and that I think is the real golden nugget here.
>
> I'm interested in how this patchset (further) improves direct map
> fragmentation so would like to evaluate it to see if my previous work to
> merge small mappings back in architecture layer[1] is still necessary.
You gotta apply it to 6.0.5 which had a large change go in for eBPF
which was not present on 6.0.
> Conclusion: I think bpf_prog_pack is very good at reducing direct map
> fragmentation and this patchset can further improve this situation on
> large machines(with huge amount of memory) or with more large bpf progs
> loaded etc.
Fantastic. Thanks for the analysis, so yet another set of metrics which
I'd hope can be applied to this patch set as this effort is generalized.
Now imagine the effort in cosnideration also of modules / ftrace / kprobes.
> Some imperfect things I can think of are(not related to this patchset):
> 1 Once a split happened, it remains happened. This may not be a big deal
> now with bpf_prog_pack and this patchset because the need to allocate a
> new order-9 page and thus cause a potential split should happen much much
> less;
Not sure I follow, are you suggesting a order-9 (512 bytes) allocation would
trigger a split of the reserved say 2 MiB huge page?
> 2 When a new order-9 page has to be allocated, there is no way to tell
> the allocator to allocate this order-9 page from an already splitted PUD
> range to avoid another PUD mapping split;
> 3 As Mike and others have mentioned, there are other users that can also
> cause direct map split.
Hence the effort to generalize.
Luis
next prev parent reply other threads:[~2022-11-07 17:39 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-31 22:25 [PATCH bpf-next v1 RESEND 0/5] vmalloc_exec for modules and BPF programs Song Liu
2022-10-31 22:25 ` [PATCH bpf-next v1 RESEND 1/5] vmalloc: introduce vmalloc_exec, vfree_exec, and vcopy_exec Song Liu
2022-11-02 23:41 ` Luis Chamberlain
2022-11-03 15:51 ` Mike Rapoport
2022-11-03 18:59 ` Luis Chamberlain
2022-11-03 21:19 ` Edgecombe, Rick P
2022-11-03 21:41 ` Song Liu
2022-11-03 23:33 ` Luis Chamberlain
2022-11-04 0:18 ` Luis Chamberlain
2022-11-04 3:29 ` Luis Chamberlain
2022-11-07 6:58 ` Mike Rapoport
2022-11-07 17:26 ` Luis Chamberlain
2022-11-07 6:40 ` Aaron Lu
2022-11-07 17:39 ` Luis Chamberlain [this message]
2022-11-07 18:35 ` Song Liu
2022-11-07 18:30 ` Song Liu
2022-10-31 22:25 ` [PATCH bpf-next v1 RESEND 2/5] x86/alternative: support vmalloc_exec() and vfree_exec() Song Liu
2022-11-02 22:21 ` Edgecombe, Rick P
2022-11-03 21:03 ` Song Liu
2022-10-31 22:25 ` [PATCH bpf-next v1 RESEND 3/5] bpf: use vmalloc_exec for bpf program and bpf dispatcher Song Liu
2022-10-31 22:25 ` [PATCH bpf-next v1 RESEND 4/5] vmalloc: introduce register_text_tail_vm() Song Liu
2022-10-31 22:25 ` [PATCH bpf-next v1 RESEND 5/5] x86: use register_text_tail_vm Song Liu
2022-11-02 22:24 ` Edgecombe, Rick P
2022-11-03 21:04 ` Song Liu
2022-11-01 11:26 ` [PATCH bpf-next v1 RESEND 0/5] vmalloc_exec for modules and BPF programs Christoph Hellwig
2022-11-01 15:10 ` Song Liu
2022-11-02 20:45 ` Luis Chamberlain
2022-11-02 22:29 ` Edgecombe, Rick P
2022-11-03 21:13 ` Song Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y2lCt7kWG+tsePDL@bombadil.infradead.org \
--to=mcgrof@kernel.org \
--cc=42.hyeyoo@gmail.com \
--cc=aaron.lu@intel.com \
--cc=akpm@linux-foundation.org \
--cc=bpf@vger.kernel.org \
--cc=dave.hansen@intel.com \
--cc=dave@stgolabs.net \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=p.raghav@samsung.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=rppt@kernel.org \
--cc=song@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=x86@kernel.org \
--cc=zhengjun.xing@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox