From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Shuah Khan <shuah@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org, jeffxu@chromium.org,
oliver.sang@intel.com, torvalds@linux-foundation.org,
Michael Ellerman <mpe@ellerman.id.au>,
Kees Cook <kees@kernel.org>
Subject: Re: [PATCH v3 5/7] mseal: Replace can_modify_mm_madv with a vma variant
Date: Wed, 21 Aug 2024 09:41:29 +0100
Message-ID: <7e31d62f-45b2-4b37-a6bb-96b7934a66c2@lucifer.local>
In-Reply-To: <20240817-mseal-depessimize-v3-5-d8d2e037df30@gmail.com>
On Sat, Aug 17, 2024 at 01:18:32AM GMT, Pedro Falcato wrote:
> Replace can_modify_mm_madv() with a single vma variant, and associated
> checks in madvise.
>
> While we're at it, also invert the order of checks in:
> if (unlikely(is_ro_anon(vma) && !can_modify_vma(vma)))
>
> Checking if we can modify the vma itself (through vm_flags) is
> certainly cheaper than is_ro_anon() due to arch_vma_access_permitted()
> looking at e.g. pkeys registers (with extra branches) in some
> architectures.
>
> This patch allows for partial madvise success when finding a sealed VMA,
> which historically has been allowed in Linux.
>
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> ---
> mm/internal.h | 2 --
> mm/madvise.c | 13 +++----------
> mm/mseal.c | 17 ++++-------------
> mm/vma.h | 7 +++++++
> 4 files changed, 14 insertions(+), 25 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index ca422aede342..1db320650539 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1363,8 +1363,6 @@ static inline int can_do_mseal(unsigned long flags)
>
> bool can_modify_mm(struct mm_struct *mm, unsigned long start,
> unsigned long end);
> -bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
> - unsigned long end, int behavior);
> #else
> static inline int can_do_mseal(unsigned long flags)
> {
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 89089d84f8df..4e64770be16c 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1031,6 +1031,9 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
> struct anon_vma_name *anon_name;
> unsigned long new_flags = vma->vm_flags;
>
> + if (unlikely(!can_modify_vma_madv(vma, behavior)))
> + return -EPERM;
> +
> switch (behavior) {
> case MADV_REMOVE:
> return madvise_remove(vma, prev, start, end);
> @@ -1448,15 +1451,6 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
> start = untagged_addr_remote(mm, start);
> end = start + len;
>
> - /*
> - * Check if the address range is sealed for do_madvise().
> - * can_modify_mm_madv assumes we have acquired the lock on MM.
> - */
> - if (unlikely(!can_modify_mm_madv(mm, start, end, behavior))) {
> - error = -EPERM;
> - goto out;
> - }
> -
> blk_start_plug(&plug);
> switch (behavior) {
> case MADV_POPULATE_READ:
> @@ -1470,7 +1464,6 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
> }
> blk_finish_plug(&plug);
>
> -out:
> if (write)
> mmap_write_unlock(mm);
> else
> diff --git a/mm/mseal.c b/mm/mseal.c
> index 2170e2139ca0..fdd1666344fa 100644
> --- a/mm/mseal.c
> +++ b/mm/mseal.c
> @@ -75,24 +75,15 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end)
> }
>
> /*
> - * Check if the vmas of a memory range are allowed to be modified by madvise.
> - * the memory ranger can have a gap (unallocated memory).
> - * return true, if it is allowed.
> + * Check if a vma is allowed to be modified by madvise.
> */
> -bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long end,
> - int behavior)
> +bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior)
> {
> - struct vm_area_struct *vma;
> -
> - VMA_ITERATOR(vmi, mm, start);
> -
> if (!is_madv_discard(behavior))
> return true;
>
> - /* going through each vma to check. */
> - for_each_vma_range(vmi, vma, end)
> - if (unlikely(is_ro_anon(vma) && !can_modify_vma(vma)))
> - return false;
> + if (unlikely(!can_modify_vma(vma) && is_ro_anon(vma)))
> + return false;

Not your fault, but I find it extremely irritating that something this subtle
has literally zero comments.

The rule here is: VMA is mseal()'d + user does not have permission to modify
its pages = a discard is potentially destructive and has to be refused, as per
the original commit message:

	6> Some destructive madvise() behaviors (e.g. MADV_DONTNEED) for anonymous
	memory, when users don't have write permission to the memory. Those
	behaviors can alter region contents by discarding pages, effectively a
	memset(0) for anonymous memory.

For something so invasive, leaving this implied, so that you have to look up
the commit message to understand it, is just... yeah. But again, not your
fault...
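
For reference, a minimal userspace sketch of the rule this check encodes. It
is only a model: struct model_vma, enum model_behavior and every model_*
helper are hypothetical stand-ins, not kernel definitions, and only two
behaviors are treated as discards here, whereas the kernel's
is_madv_discard() covers more.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few VMA properties the check cares about. */
struct model_vma {
	bool sealed;    /* models the VMA having been mseal()'d */
	bool anonymous; /* models a private mapping with no backing file */
	bool writable;  /* models the caller having write access to the pages */
};

/* Hypothetical subset of madvise behaviors, for illustration only. */
enum model_behavior {
	MODEL_MADV_NORMAL,
	MODEL_MADV_DONTNEED,
	MODEL_MADV_FREE,
};

/* Does this behavior throw page contents away? */
static bool model_is_discard(enum model_behavior behavior)
{
	return behavior == MODEL_MADV_DONTNEED || behavior == MODEL_MADV_FREE;
}

/* Anonymous memory the caller cannot write to. */
static bool model_is_ro_anon(const struct model_vma *vma)
{
	return vma->anonymous && !vma->writable;
}

/* The cheap test: a single flag on the VMA. */
static bool model_can_modify_vma(const struct model_vma *vma)
{
	return !vma->sealed;
}

/*
 * Refuse only when all three conditions hold: the behavior discards pages,
 * the VMA is sealed, and it is read-only anonymous memory. The sealed test
 * comes first so the costlier permission test is skipped for unsealed VMAs.
 */
static bool model_can_modify_vma_madv(const struct model_vma *vma,
				      enum model_behavior behavior)
{
	if (!model_is_discard(behavior))
		return true;

	if (!model_can_modify_vma(vma) && model_is_ro_anon(vma))
		return false;

	/* Allow by default. */
	return true;
}
```

In this model a sealed, read-only anonymous VMA rejects MODEL_MADV_DONTNEED
but still accepts MODEL_MADV_NORMAL, and the same discard is allowed once the
VMA is writable, unsealed, or not anonymous.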
>
> /* Allow by default. */
> return true;
> diff --git a/mm/vma.h b/mm/vma.h
> index e979015cc7fc..da31d0f62157 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -380,6 +380,8 @@ static inline bool can_modify_vma(struct vm_area_struct *vma)
> return true;
> }
>
> +bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior);
> +
> #else
>
> static inline bool can_modify_vma(struct vm_area_struct *vma)
> @@ -387,6 +389,11 @@ static inline bool can_modify_vma(struct vm_area_struct *vma)
> return true;
> }
>
> +static inline bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior)
> +{
> + return true;
> +}
> +
> #endif
>
> #endif /* __MM_VMA_H */
>
> --
> 2.46.0
>

I remain baffled that the original implementation tried to do these things
at mm granularity rather than per VMA.
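
To illustrate the difference in granularity, here is a rough userspace model
contrasting a whole-range check done up front with a check done on each VMA
as it is visited. Everything in it is hypothetical (struct model_vma, the
model_* functions, MODEL_EPERM), and it assumes the walk stops at the first
VMA that is refused, which is what gives partial success.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EPERM 1 /* hypothetical error value for this sketch */

/* Hypothetical stand-in: one entry per VMA in the requested range. */
struct model_vma {
	bool blocked;   /* models a VMA for which the discard is refused */
	bool discarded; /* set once the advice has been applied */
};

/* Range granularity: scan everything first, touch nothing on failure. */
static int model_madvise_range_check(struct model_vma *vmas, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (vmas[i].blocked)
			return -MODEL_EPERM;

	for (i = 0; i < n; i++)
		vmas[i].discarded = true;

	return 0;
}

/* VMA granularity: check each VMA when it is visited, stop at a refusal. */
static int model_madvise_per_vma(struct model_vma *vmas, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (vmas[i].blocked)
			return -MODEL_EPERM;
		vmas[i].discarded = true;
	}

	return 0;
}
```

With three VMAs where only the middle one is blocked, both functions return
-MODEL_EPERM, but the per-VMA variant has already applied the advice to the
first VMA, while the range variant leaves all three untouched and pays for an
extra pass over the range even when nothing is blocked.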
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>