From: Yang Shi <yang@os.amperecomputing.com>
To: Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	scott@os.amperecomputing.com, cl@gentwo.org
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v7 0/6] arm64: support FEAT_BBM level 2 and large block mapping when rodata=full
Date: Wed, 17 Sep 2025 12:59:52 -0700
Message-ID: <e42e8b11-d86c-4164-9d6b-13cd34045570@os.amperecomputing.com>
In-Reply-To: <3cbda17c-4c1c-4c04-a1fe-bd6ea6714de8@arm.com>



On 9/17/25 12:40 PM, Ryan Roberts wrote:
> On 17/09/2025 20:15, Yang Shi wrote:
>>
>> On 9/17/25 11:58 AM, Ryan Roberts wrote:
>>> On 17/09/2025 18:21, Yang Shi wrote:
>>>> On 9/17/25 9:28 AM, Ryan Roberts wrote:
>>>>> Hi Yang,
>>>>>
>>>>> Sorry for the slow reply; I'm just getting back to this...
>>>>>
>>>>> On 11/09/2025 23:03, Yang Shi wrote:
>>>>>> Hi Ryan & Catalin,
>>>>>>
>>>>>> Any more concerns about this?
>>>>> I've been trying to convince myself of your assertion that all users that
>>>>> set VM_FLUSH_RESET_PERMS also call set_memory_*() for the entire range that
>>>>> was returned by vmalloc. I agree that if that is the contract and everyone
>>>>> is following it, then there is no problem here.
>>>>>
>>>>> But I haven't been able to convince myself...
>>>>>
>>>>> Some examples (these might intersect with examples you previously raised):
>>>>>
>>>>> 1. bpf_dispatcher_change_prog() -> bpf_jit_alloc_exec() -> execmem_alloc() ->
>>>>> sets VM_FLUSH_RESET_PERMS. But I don't see it calling set_memory_*() for
>>>>> rw_image.
>>>> Yes, it doesn't call set_memory_*(). I spotted this in the earlier email. But it
>>>> is actually RW, so it should be ok to miss the call. The later
>>>> set_direct_map_invalid call in vfree() may fail, but the
>>>> set_direct_map_default call will set RW permission back. But I don't think
>>>> it has to use execmem_alloc(); plain vmalloc() should be good enough.
>>>>
>>>>> 2. module_memory_alloc() -> execmem_alloc_rw() -> execmem_alloc() -> sets
>>>>> VM_FLUSH_RESET_PERMS (note that execmem_force_rw() is nop for arm64).
>>>>> set_memory_*() is not called until much later on in module_set_memory().
>>>>> Another error in the meantime could cause the memory to be vfreed before
>>>>> that point.
>>>> IIUC, execmem_alloc_rw() is used to allocate memory for modules' text section
>>>> and data section. The code will set mod->mem[type].is_rox according to the type
>>>> of the section. It is true for text, false for data. Then set_memory_rox() will
>>>> be called later if it is true *after* insns are copied to the memory. So it is
>>>> still RW before that point.
>>>>
>>>>> 3. When set_vm_flush_reset_perms() is called for the range, it is called
>>>>> before set_memory_*(), which might then fail to split prior to vfree.
>>>> Yes, all call sites check the return value and bail out if set_memory_*()
>>>> fails, unless I'm missing something.
>>>>
>>>>> But I guess as long as set_memory_*() is never successfully called for a
>>>>> *sub-range* of the vmalloc'ed region, then for all of the above issues, the
>>>>> memory must still be RW at vfree-time, so this issue should be benign... I
>>>>> think?
>>>> Yes, it is true.
>>> So to summarise, all freshly vmalloc'ed memory starts as RW. set_memory_*() may
>>> only be called if VM_FLUSH_RESET_PERMS has already been set. If set_memory_*()
>>> is called at all, the first call MUST be for the whole range.
>> Whether the default permission is RW or not depends on the type passed in by
>> execmem_alloc(). It is defined by execmem_info in arch/arm64/mm/init.c. For
>> ARM64, module and BPF have PAGE_KERNEL permission (RW) by default, but kprobes
>> is PAGE_KERNEL_ROX (ROX).
> Perhaps I missed it, but as far as I could tell the prot that the arch sets for
> the type only determines the prot that is set for the vmalloc map. It doesn't
> look like the linear map is modified at all... which feels like a bug to me
> since the linear map will be RW while the vmalloc map will be ROX... I guess I
> must have missed something...

Yes, it just sets the permission for the vmalloc area. set_memory_*()
must be called to change the permission for the direct map.

>
>>> If those requirements are all met, then if VM_FLUSH_RESET_PERMS was set but
>>> set_memory_*() was never called, the worst that can happen is for both the
>>> set_direct_map_invalid() and set_direct_map_default() calls to fail due to not
>>> enough memory. But that is safe because the memory was always RW. If
>>> set_memory_*() was called for the whole range and failed, it's the same as if it
>>> was never called. If it was called for the whole range and succeeded, then the
>>> split must have happened already and set_direct_map_invalid() and
>>> set_direct_map_default() will therefore definitely succeed.
>>>
>>> The only way this could be a problem is if someone vmallocs a range then
>>> performs a set_memory_*() on a sub-region without having first done it for the
>>> whole region. But we have not found any evidence that there are any users that
>>> do that.
>> Yes, exactly.
>>
>>> In fact, by that logic, I think alloc_insn_page() must also be safe; it only
>>> allocates 1 page, so if set_memory_*() is subsequently called for it, it must by
>>> definition be covering the whole allocation; 1 page is the smallest amount that
>>> can be protected.
>> Yes, but kprobes default permission is ROX.
>>
>>> So I agree we are safe.
>>>
>>>
>>>>> In summary this all looks horribly fragile. But I *think* it works. It
>>>>> would be good to clean it all up and have some clearly documented rules
>>>>> regardless. But I think that could be a follow up series.
>>>> Yeah, absolutely agreed.
>>>>
>>>>>> Shall we move forward with v8?
>>>>> Yes; Do you want me to post that or would you prefer to do it? I'm happy to do
>>>>> it; there are a few other tidy ups in pageattr.c I want to make which I
>>>>> spotted.
>>>> I actually just had v8 ready in my tree. I removed pageattr_pgd_entry and
>>>> pageattr_pud_entry in pageattr.c and fixed pmd_leaf/pud_leaf as you suggested.
>>>> Is that the cleanup you were planning to do?
>>> I was also going to fix up the comment in change_memory_common() which is now
>>> stale.
>> Oops, I missed that in my v8. Please just comment on v8; I can fix it up later.
> Ahh no biggy. If there is a chance Will will take the series, let's not hold it
> up for a comment.

Yeah, sure, thank you.

Yang

>
>> Thanks,
>> Yang
>>
>>
>>>> And I also rebased it on top of Shijie's series
>>>> (https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=bfbbb0d3215f)
>>>> which has been picked up by Will.
>>>>
>>>>>> We can include the
>>>>>> fix to kprobes in v8 or I can send it separately, either is fine to me.
>>>>> Post it on list, and I'll also incorporate into the series.
>>>> I can include it in v8 series.
>>>>
>>>>>> Hopefully we can make v6.18.
>>>>> It's probably getting a bit late now. Anyway, I'll aim to get v8 out
>>>>> tomorrow or Friday and we will see what Will thinks.
>>>> Thank you. I can post v8 today.
>>> OK great - I'll leave it all to you then - thanks!
>>>
>>>> Thanks,
>>>> Yang
>>>>
>>>>> Thanks,
>>>>> Ryan
>>>>>
>>>>>> Thanks,
>>>>>> Yang
>>>>>>


