From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com,
catalin.marinas@arm.com, will@kernel.org
Cc: lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, suzuki.poulose@arm.com,
steven.price@arm.com, gshan@redhat.com,
linux-arm-kernel@lists.infradead.org,
yang@os.amperecomputing.com, ryan.roberts@arm.com,
anshuman.khandual@arm.com
Subject: Re: [PATCH v4] arm64: Enable permission change on arm64 kernel block mappings
Date: Fri, 4 Jul 2025 09:41:07 +0530
Message-ID: <a0ed4dee-eac8-4272-9fb2-2b7b62f16455@arm.com>
In-Reply-To: <20250703151441.60325-1-dev.jain@arm.com>
On 03/07/25 8:44 pm, Dev Jain wrote:
> This patch paves the path to enable huge mappings in vmalloc space and
> linear map space by default on arm64. For this we must ensure that we can
> handle any permission games on the kernel (init_mm) pagetable. Currently,
> __change_memory_common() uses apply_to_page_range() which does not support
> changing permissions for block mappings. We attempt to move away from this
> by using the pagewalk API, similar to what riscv does right now; however,
> it is the responsibility of the caller to ensure that we do not pass a
> range overlapping a partial block mapping or cont mapping; in such a case,
> the system must be able to support range splitting.
>
> This patch is tied with Yang Shi's attempt [1] at using huge mappings
> in the linear mapping in case the system supports BBML2, in which case
> we will be able to split the linear mapping if needed without
> break-before-make. Thus, Yang's series, IIUC, will be one such user of my
> patch; suppose we are changing permissions on a range of the linear map
> backed by PMD-hugepages, then the sequence of operations should look
> like the following:
>
> split_range(start);
> split_range(end);
> __change_memory_common(start, end);
>
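To make the ordering above concrete, here is a minimal user-space sketch of the split-then-change sequence. It is only a toy model under stated assumptions: the geometry (4 pages per block, 4 blocks), struct toy_map, toy_split_range() and toy_change_memory() are all hypothetical names invented for illustration, and none of this is the arm64 implementation. Unlike a real page-table walk, the model validates the whole range before touching anything.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy geometry (an assumption, not arm64's): 4 pages per block, 4 blocks. */
#define PAGES_PER_BLOCK 4
#define NR_BLOCKS       4
#define NR_PAGES        (PAGES_PER_BLOCK * NR_BLOCKS)

struct toy_map {
	bool split[NR_BLOCKS];    /* true once a block is mapped page by page */
	bool read_only[NR_PAGES]; /* per-page permission */
};

/*
 * Hypothetical stand-in for split_range(): if the boundary page is not
 * block aligned, split the block that contains it.
 */
static void toy_split_range(struct toy_map *m, size_t page)
{
	if (page < NR_PAGES && page % PAGES_PER_BLOCK != 0)
		m->split[page / PAGES_PER_BLOCK] = true;
}

/*
 * Toy model of the permission change on pages [start, end): refuse a
 * range that covers only part of a block that is still unsplit.
 * Simplification: validation happens up front, then the update.
 */
static int toy_change_memory(struct toy_map *m, size_t start, size_t end,
			     bool ro)
{
	if (start > end || end > NR_PAGES)
		return -EINVAL;
	if (start == end)
		return 0;

	for (size_t b = start / PAGES_PER_BLOCK; b * PAGES_PER_BLOCK < end; b++) {
		size_t lo = b * PAGES_PER_BLOCK;
		size_t hi = lo + PAGES_PER_BLOCK;
		bool partial = start > lo || end < hi;

		if (partial && !m->split[b])
			return -EINVAL;
	}

	for (size_t p = start; p < end; p++)
		m->read_only[p] = ro;
	return 0;
}
```

In this model, changing pages [2, 10) on a freshly zeroed map fails with -EINVAL, while calling toy_split_range() on 2 and on 10 first lets the same call succeed.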
> However, this patch can be used independently of Yang's; since currently
> permission games are being played only on pte mappings (due to
> apply_to_page_range not supporting otherwise), this patch provides the
> mechanism for enabling huge mappings for various kernel mappings
> like linear map and vmalloc.
>
> ---------------------
> Implementation
> ---------------------
>
> arm64 currently changes permissions on vmalloc objects locklessly, via
> apply_to_page_range, whose limitation is to deny changing permissions for
> block mappings. Therefore, we move to the generic pagewalk API, paving
> the path for enabling huge mappings by default on kernel space mappings,
> which leads to more efficient TLB usage. However, the API
> currently enforces the init_mm.mmap_lock to be held. To avoid the
> unnecessary bottleneck of the mmap_lock for our usecase, this patch
> extends this generic API to be used locklessly, so as to retain the
> existing behaviour for changing permissions. Apart from this reason, it is
> noted at [2] that KFENCE can manipulate kernel pgtable entries during
> softirqs. It does this by calling set_memory_valid() -> __change_memory_common().
> This being a non-sleepable context, we cannot take the init_mm mmap lock.
>
> Add comments to highlight the conditions under which we can use the
> lockless variant - no underlying VMA, and the user having exclusive control
> over the range, thus guaranteeing no concurrent access.
>
> We require that the start and end of a given range do not partially overlap
> block mappings, or cont mappings. Return -EINVAL in case a partial block
> mapping is detected in any of the PGD/P4D/PUD/PMD levels; add a
> corresponding comment in update_range_prot() to warn that eliminating
> such a condition is the responsibility of the caller.
>
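One way the partial-block check described above could be expressed is sketched below as a user-space model. The assumptions are mine: TOY_PMD_SIZE is 2 MiB (as with 4K pages), every entry in the walked range is taken to be a block mapping, and all toy_* names are hypothetical. toy_pmd_addr_end() follows the shape of the kernel's pmd_addr_end(); the rest is illustration, not the code in the patch.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_PMD_SIZE (UINT64_C(1) << 21) /* 2 MiB, assuming 4K pages */
#define TOY_PMD_MASK (~(TOY_PMD_SIZE - 1))

/* End of the current block or end of the range, whichever comes first. */
static uint64_t toy_pmd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + TOY_PMD_SIZE) & TOY_PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * Called for a block (leaf) entry: the walker hands us [addr, next).
 * If that is not exactly one block, the caller's range cuts through it.
 */
static int toy_check_block(uint64_t addr, uint64_t next)
{
	return (next - addr != TOY_PMD_SIZE) ? -EINVAL : 0;
}

/* Walk [addr, end), assuming every entry is a block mapping. */
static int toy_walk_blocks(uint64_t addr, uint64_t end)
{
	while (addr < end) {
		uint64_t next = toy_pmd_addr_end(addr, end);
		int ret = toy_check_block(addr, next);

		if (ret)
			return ret;
		addr = next;
	}
	return 0;
}
```

With these definitions, a block-aligned range such as [0x200000, 0x600000) walks cleanly, while a range whose start or end falls inside a block is rejected with -EINVAL.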
> Note that the pte level callback may change permissions for a whole
> contpte block, and that will be done one pte at a time, as opposed to
> an atomic operation for the block mappings. This is fine as any access
> will decode either the old or the new permission until the TLBI.
>
> apply_to_page_range() currently performs all pte level callbacks while in
> lazy mmu mode. Since arm64 can optimize performance by batching barriers
> when modifying kernel pgtables in lazy mmu mode, we would like to continue
> to benefit from this optimisation. Unfortunately walk_kernel_page_table_range()
> does not use lazy mmu mode. However, since the pagewalk framework is not
> allocating any memory, we can safely bracket the whole operation inside
> lazy mmu mode ourselves. Therefore, wrap the call to
> walk_kernel_page_table_range() with the lazy MMU helpers.
>
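The benefit of bracketing the walk in lazy MMU mode can be illustrated with a small counting model. This is a sketch under assumptions, not the arm64 code: struct toy_mmu and the toy_* helpers are hypothetical, and a "barrier" here is just a counter increment. The model only shows the idea that entry updates made inside the lazy section defer their barrier until the section is left.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_mmu {
	bool lazy;                    /* inside a lazy MMU section? */
	bool barrier_pending;         /* an update was made while lazy */
	unsigned int barriers_issued; /* stand-in for real barriers */
};

static void toy_barrier(struct toy_mmu *s)
{
	s->barriers_issued++;
}

static void toy_enter_lazy(struct toy_mmu *s)
{
	s->lazy = true;
}

/* Leaving lazy mode flushes the single deferred barrier, if any. */
static void toy_leave_lazy(struct toy_mmu *s)
{
	if (s->barrier_pending) {
		toy_barrier(s);
		s->barrier_pending = false;
	}
	s->lazy = false;
}

/* Write one entry; issue the barrier now, or defer it when lazy. */
static void toy_set_pte(struct toy_mmu *s, unsigned long *ptep,
			unsigned long val)
{
	*ptep = val;
	if (s->lazy)
		s->barrier_pending = true;
	else
		toy_barrier(s);
}

/* Update n entries, optionally bracketed by the lazy helpers. */
static void toy_update_range(struct toy_mmu *s, unsigned long *ptes,
			     size_t n, unsigned long val, bool use_lazy)
{
	if (use_lazy)
		toy_enter_lazy(s);
	for (size_t i = 0; i < n; i++)
		toy_set_pte(s, &ptes[i], val);
	if (use_lazy)
		toy_leave_lazy(s);
}
```

In this model, updating eight entries issues eight barriers without the bracket and one with it.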
> [1] https://lore.kernel.org/all/20250304222018.615808-1-yang@os.amperecomputing.com/
> [2] https://lore.kernel.org/linux-arm-kernel/89d0ad18-4772-4d8f-ae8a-7c48d26a927e@arm.com/
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>
Forgot to carry:
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>