From: Maxwell Bland <mbland@motorola.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>,
Lorenzo Stoakes <lstoakes@gmail.com>
Subject: Re: [PATCH 1/5] mm: allow arch refinement/skip for vmap alloc
Date: Thu, 18 Apr 2024 15:52:48 +0000
Message-ID: <SEZPR03MB6786A4990ADA94C29CE2D77BB40E2@SEZPR03MB6786.apcprd03.prod.outlook.com>
In-Reply-To: <ZiDf9AeFUG22CEU5@pc636>
On Thu, April 18, 2024 at 3:55 AM, Uladzislau Rezki wrote:
> On Tue, Apr 02, 2024 at 03:15:01PM -0500, Maxwell Bland wrote:
> > +extern void insert_vmap_area_augment(struct vmap_area *va, struct rb_node
> > +extern int va_clip(struct rb_root *root, struct list_head *head,
> > +extern struct vmap_area *__find_vmap_area(unsigned long addr,
> To me it looks like you want to make internal functions as public for
> everyone which is not good, imho.
First, thank you for the feedback. I tussled with some of these ideas too while
writing. I will clarify some motivations below and then propose some
alternatives based upon your review.
> arch_skip_va() injections into the search algorithm sounds like a hack and
> might lead(if i do not miss something, need to check closer) to alloc
> failures when we go toward a reserved VA but we are not allowed to allocate
> from.
This is a good insight into the architectural intention here. As is clear, the
underlying goal of this patch is to provide a method for architectures to
enforce their own pseudo-reserved vmalloc regions dynamically.
Given that, would the potential failures you highlight not technically be
legitimate, with the caveat that architectures that implement the interface
become responsible for maintaining only correct and appropriate reservations?
If so, the path forward depends on whether we believe that caveat is
reasonable. I am on the fence about whether giving architectures this freedom
is good, so I think it is reasonable to disallow it; see below.
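As a rough userspace illustration of the failure mode raised above (not code
from the patch): the sketch below models a first-fit search over free blocks
with a skip hook. The names free_block, arch_skip_range and alloc_first_fit,
and the reserved window 0x4000-0x8000, are all hypothetical and chosen only
for this example. When the only block large enough overlaps the reserved
window, the search fails even though free space exists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one free block of address space: [start, end). */
struct free_block {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical reserved window that an architecture wants to keep clear. */
static const unsigned long resv_start = 0x4000;
static const unsigned long resv_end = 0x8000;

/* Stand-in for an arch hook: true if [start, end) overlaps the window. */
static bool arch_skip_range(unsigned long start, unsigned long end)
{
	return start < resv_end && end > resv_start;
}

/*
 * First-fit search. Returns the start of the allocation, or 0 on failure.
 * With use_hook set, any candidate the hook rejects is skipped entirely.
 */
static unsigned long alloc_first_fit(const struct free_block *blocks,
				     size_t n, unsigned long size,
				     bool use_hook)
{
	for (size_t i = 0; i < n; i++) {
		unsigned long start = blocks[i].start;

		if (blocks[i].end - start < size)
			continue;
		if (use_hook && arch_skip_range(start, start + size))
			continue;
		return start;
	}
	return 0;
}
```

With free blocks [0x1000, 0x2000) and [0x4000, 0x8000), a request for 0x2000
bytes succeeds without the hook and fails with it, which is the behaviour an
implementing architecture would have to accept as legitimate.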
> Why do not you allocate just using a specific range from MODULES_ASLR_START
> till VMALLOC_END?
Mark Rutland has indicated that he does not support a large reduction in free
region size in exchange for ensuring pages are not interleaved. That is, a
dedicated range was my initial approach, but it was deemed unfit. Strict
partitioning creates a trade-off between region size and ASLR randomization.
To clarify a secondary point, in case this question was more general: allowing
allocations from the VMALLOC_START to VMALLOC_END region and the
MODULES_ASLR_START to MODULES_ASLR_END region to interleave breaks a key use
case: being able to enforce new PMD-level, coarse-grained protections (e.g.
PXNTable) dynamically.
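To make the granularity argument concrete, here is a small userspace sketch
(not kernel code). It assumes 4 KiB pages, so one PMD entry covers 2 MiB; the
struct mapping type, the SKETCH_* constants and pmd_window_is_data_only are
hypothetical names used only for this example. A table-level no-execute
attribute applies to everything below the entry, so it is only usable when the
whole PMD-sized window holds no executable mapping. A single interleaved code
page is enough to rule it out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages, so a PMD entry spans 2 MiB. */
#define SKETCH_PMD_SHIFT 21
#define SKETCH_PMD_SIZE  (1UL << SKETCH_PMD_SHIFT)

/* Hypothetical record of one mapping: [start, end), executable or not. */
struct mapping {
	unsigned long start;
	unsigned long end;
	bool exec;
};

/*
 * Returns true if the PMD-sized window containing addr holds no executable
 * mapping, i.e. a coarse table-level no-execute attribute could cover it.
 */
static bool pmd_window_is_data_only(const struct mapping *maps, size_t n,
				    unsigned long addr)
{
	unsigned long lo = addr & ~(SKETCH_PMD_SIZE - 1);
	unsigned long hi = lo + SKETCH_PMD_SIZE;

	for (size_t i = 0; i < n; i++) {
		if (maps[i].exec && maps[i].start < hi && maps[i].end > lo)
			return false;
	}
	return true;
}
```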
In case the question is more of a "why are you submitting this in the first
place": non-interleaving simplifies code focused on preventing malicious page
table updates since we do not need to track all updates of PTE-level
descriptors. Verifying individual PTE updates comes at a high (performance,
complexity) cost and happens to lead to hardware-level privilege-checking race
conditions on certain very popular arm64 chipsets.
OK, preamble out of the way:
(1) Would it be OK to export a more generic version of the functions written
in arch/arm64/kernel/vmalloc.c for
https://lore.kernel.org/all/20240416122254.868007168-3-mbland@motorola.com/ ?
That is, move a version of these functions to the main vmalloc.c? This way
these functions are still owned by the right part of the kernel.
(2) Alternatively, the exported functions could effectively be duplicated into
architecture-specific code, going "all in" on the caveat mentioned above:
architectures that choose to implement the interface become responsible for
maintaining a reserved code region.
(3) A different approach is also possible: one that does not skip the
allocation of "bad" VAs but instead dynamically restructures the tree,
potentially by keeping two trees, one for data and one for code.
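For option (3), the separation of bookkeeping can be sketched in userspace as
follows. This is only an illustration under strong simplifying assumptions:
each "tree" is reduced to a bump allocator over a fixed range, whereas the
idea above is to restructure the ranges dynamically. The names va_pool,
va_kind, va_space and pool_alloc are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-kind pool: unallocated space is [next, end). */
struct va_pool {
	unsigned long next;
	unsigned long end;
};

enum va_kind { VA_DATA = 0, VA_CODE = 1 };

/* One pool per kind, so code and data allocations cannot interleave. */
struct va_space {
	struct va_pool pools[2];
};

/* Allocate size bytes from the pool for kind; returns 0 on failure. */
static unsigned long pool_alloc(struct va_space *vs, enum va_kind kind,
				unsigned long size)
{
	struct va_pool *p = &vs->pools[kind];
	unsigned long addr;

	if (size == 0 || size > p->end - p->next)
		return 0;

	addr = p->next;
	p->next += size;
	return addr;
}
```

Because each kind draws only from its own pool, exhausting the data pool does
not push data allocations into the code range; the cost is that the split
between the two has to be chosen, or adjusted, somewhere else.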
Thanks and Regards,
Maxwell Bland