From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48400a85-3f0f-4b4c-81aa-0e7d1dc14c9d@arm.com>
Date: Thu, 26 Jun 2025 09:15:00 +0100
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: Re: [PATCH v3 1/2] arm64: pageattr: Use pagewalk API to change
 memory permissions
To: Dev Jain <dev.jain@arm.com>, akpm@linux-foundation.org,
 david@redhat.com, catalin.marinas@arm.com, will@kernel.org
Cc: lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, suzuki.poulose@arm.com,
 steven.price@arm.com, gshan@redhat.com, linux-arm-kernel@lists.infradead.org,
 yang@os.amperecomputing.com, anshuman.khandual@arm.com
References: <20250613134352.65994-1-dev.jain@arm.com>
 <20250613134352.65994-2-dev.jain@arm.com>
 <1bb09534-56bd-4aba-b7e8-dad8bf6e9200@arm.com>
In-Reply-To: <1bb09534-56bd-4aba-b7e8-dad8bf6e9200@arm.com>

On 26/06/2025 06:47, Dev Jain wrote:
> 
> On 13/06/25 7:13 pm, Dev Jain wrote:
>> arm64 currently changes permissions on vmalloc objects locklessly, via
>> apply_to_page_range(), whose limitation is that it cannot change
>> permissions for block mappings. Therefore, move over to the generic
>> pagewalk API, paving the way for enabling huge mappings by default on
>> kernel-space mappings and hence more efficient TLB usage. However, the
>> API currently requires init_mm.mmap_lock to be held. To avoid that
>> unnecessary bottleneck for our use case, this patch extends the generic
>> API so that it can be used locklessly, retaining the existing behaviour
>> for changing permissions. Apart from this reason, it is noted at [1]
>> that KFENCE can manipulate kernel pgtable entries during softirqs, by
>> calling set_memory_valid() -> __change_memory_common(). Since that is
>> a non-sleepable context, we cannot take the init_mm mmap lock.
>>
>> Add comments to highlight the conditions under which we can use the
>> lockless variant: no underlying VMA, and the caller having exclusive
>> control over the range, thus guaranteeing no concurrent access.
>>
>> Since arm64 cannot split kernel live mappings without BBML2, we
>> require that the start and end of a given range lie on block mapping
>> boundaries.
>> Return -EINVAL if a partial block mapping is detected; add a
>> corresponding comment in ___change_memory_common() noting that it is
>> the caller's responsibility to ensure such a condition cannot arise.
>>
>> apply_to_page_range() currently performs all pte-level callbacks while
>> in lazy MMU mode. Since arm64 can optimize performance by batching
>> barriers when modifying kernel pgtables in lazy MMU mode, we would
>> like to continue benefiting from this optimisation. Unfortunately,
>> walk_kernel_page_table_range() does not use lazy MMU mode. However,
>> since the pagewalk framework does not allocate any memory, we can
>> safely bracket the whole operation inside lazy MMU mode ourselves.
>> Therefore, wrap the call to walk_kernel_page_table_range() with the
>> lazy MMU helpers.
>>
>> [1] https://lore.kernel.org/linux-arm-kernel/89d0ad18-4772-4d8f-ae8a-7c48d26a927e@arm.com/
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>>  arch/arm64/mm/pageattr.c | 157 +++++++++++++++++++++++++++++++--------
>>  include/linux/pagewalk.h |   3 +
>>  mm/pagewalk.c            |  26 +++++++
>>  3 files changed, 154 insertions(+), 32 deletions(-)
>>
>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>> index 04d4a8f676db..cfc5279f27a2 100644
>> --- a/arch/arm64/mm/pageattr.c
>> +++ b/arch/arm64/mm/pageattr.c
>> @@ -8,6 +8,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include <linux/pagewalk.h>
>>  
>>  #include 
>>  #include 
>> @@ -20,6 +21,99 @@ struct page_change_data {
>>  	pgprot_t clear_mask;
>>  };
>>  
>> +static ptdesc_t set_pageattr_masks(ptdesc_t val, struct mm_walk *walk)
>> +{
>> +	struct page_change_data *masks = walk->private;
>> +
>> +	val &= ~(pgprot_val(masks->clear_mask));
>> +	val |= (pgprot_val(masks->set_mask));
>> +
>> +	return val;
>> +}
>> +
>> +static int pageattr_pgd_entry(pgd_t *pgd, unsigned long addr,
>> +			      unsigned long next, struct mm_walk *walk)
>> +{
>> +	pgd_t val = pgdp_get(pgd);
>> +
>> +	if (pgd_leaf(val)) {
>> +		if (WARN_ON_ONCE((next - addr) != PGDIR_SIZE))
>> +			return -EINVAL;
>> +		val = __pgd(set_pageattr_masks(pgd_val(val), walk));
>> +		set_pgd(pgd, val);
>> +		walk->action = ACTION_CONTINUE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int pageattr_p4d_entry(p4d_t *p4d, unsigned long addr,
>> +			      unsigned long next, struct mm_walk *walk)
>> +{
>> +	p4d_t val = p4dp_get(p4d);
>> +
>> +	if (p4d_leaf(val)) {
>> +		if (WARN_ON_ONCE((next - addr) != P4D_SIZE))
>> +			return -EINVAL;
>> +		val = __p4d(set_pageattr_masks(p4d_val(val), walk));
>> +		set_p4d(p4d, val);
>> +		walk->action = ACTION_CONTINUE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int pageattr_pud_entry(pud_t *pud, unsigned long addr,
>> +			      unsigned long next, struct mm_walk *walk)
>> +{
>> +	pud_t val = pudp_get(pud);
>> +
>> +	if (pud_leaf(val)) {
>> +		if (WARN_ON_ONCE((next - addr) != PUD_SIZE))
>> +			return -EINVAL;
>> +		val = __pud(set_pageattr_masks(pud_val(val), walk));
>> +		set_pud(pud, val);
>> +		walk->action = ACTION_CONTINUE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int pageattr_pmd_entry(pmd_t *pmd, unsigned long addr,
>> +			      unsigned long next, struct mm_walk *walk)
>> +{
>> +	pmd_t val = pmdp_get(pmd);
>> +
>> +	if (pmd_leaf(val)) {
>> +		if (WARN_ON_ONCE((next - addr) != PMD_SIZE))
>> +			return -EINVAL;
>> +		val = __pmd(set_pageattr_masks(pmd_val(val), walk));
>> +		set_pmd(pmd, val);
>> +		walk->action = ACTION_CONTINUE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int pageattr_pte_entry(pte_t *pte, unsigned long addr,
>> +			      unsigned long next, struct mm_walk *walk)
>> +{
>> +	pte_t val = __ptep_get(pte);
>> +
>> +	val = __pte(set_pageattr_masks(pte_val(val), walk));
>> +	__set_pte(pte, val);
>> +
>> +	return 0;
>> +}
> 
> I was wondering, now that we have vmalloc contpte support,
> do we need to ensure in this pte level callback that
> we don't partially cover a contpte block?

Yes good point!
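
Something like the below might do it (a completely untested sketch on my
part, not code from this patch; the start/end fields added to struct
page_change_data are invented here so that the pte-level callback can
see the bounds of the overall walk):

struct page_change_data {
	pgprot_t set_mask;
	pgprot_t clear_mask;
	unsigned long start;	/* hypothetical: walk range start */
	unsigned long end;	/* hypothetical: walk range end */
};

static int pageattr_pte_entry(pte_t *pte, unsigned long addr,
			      unsigned long next, struct mm_walk *walk)
{
	struct page_change_data *data = walk->private;
	pte_t val = __ptep_get(pte);

	/*
	 * If this pte is part of a contpte block, the whole block must
	 * lie inside the range being changed: repainting only some of
	 * the ptes in a contiguous run would leave the run with
	 * mismatched attributes, and we cannot split the run live
	 * without BBML2.
	 */
	if (pte_cont(val)) {
		unsigned long block = ALIGN_DOWN(addr, CONT_PTE_SIZE);

		if (WARN_ON_ONCE(block < data->start ||
				 block + CONT_PTE_SIZE > data->end))
			return -EINVAL;
	}

	val = __pte(set_pageattr_masks(pte_val(val), walk));
	__set_pte(pte, val);

	return 0;
}

Alternatively, the check could be done up front in
___change_memory_common(), next to the existing block-boundary
handling, before starting the walk at all.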