From: LEROY Christophe <christophe.leroy2@cs-soprasteria.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Michael Ellerman <mpe@ellerman.id.au>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
"sidhartha.kumar@oracle.com" <sidhartha.kumar@oracle.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Bert Karwatzki <spasswolf@web.de>, Jiri Olsa <olsajiri@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Kees Cook <kees@kernel.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v3 16/16] mm/mmap: Move may_expand_vm() check in mmap_region()
Date: Wed, 10 Jul 2024 12:59:29 +0000 [thread overview]
Message-ID: <7161cef3-49e4-4057-8a66-ec1ad6eb3f9e@cs-soprasteria.com> (raw)
In-Reply-To: <92ce2025-51c3-423a-902e-dbd19d5d7850@lucifer.local>
Le 10/07/2024 à 14:45, Lorenzo Stoakes a écrit :
> On Wed, Jul 10, 2024 at 10:28:01PM GMT, Michael Ellerman wrote:
>> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>>> On Mon, Jul 08, 2024 at 04:43:15PM GMT, Liam R. Howlett wrote:
>>>>
>> ...
>>>> The functionality here has changed
>>>> --- from ---
>>>> may_expand_vm() check
>>>> can_modify_mm() check
>>>> arch_unmap()
>>>> vms_gather_munmap_vmas()
>>>> ...
>>>>
>>>> --- to ---
>>>> can_modify_mm() check
>>>> arch_unmap()
>>>> vms_gather_munmap_vmas()
>>>> may_expand_vm() check
>>>> ...
>>>>
>>>> vms_gather_munmap_vmas() does nothing but figures out what to do later,
>>>> but could use memory and can fail.
>>>>
>>>> The user implications are:
>>>>
>>>> 1. The return type on the error may change to -EPERM from -ENOMEM, if
>>>> you are not allowed to expand and are trying to overwrite mseal()'ed
>>>> VMAs. That seems so very rare that I'm not sure it's worth mentioning.
>>>>
>>>>
>>>> 2. arch_unmap() called prior to may_expand_vm().
>>>> powerpc uses this to set mm->context.vdso = NULL if mm->context.vdso is
>>>> within the unmap range. User implication of this means that an
>>>> application my set the vdso to NULL prior to hitting the -ENOMEM case in
>>>> may_expand_vm() due to the address space limit.
>>>>
>>>> Assuming the removal of the vdso does not cause the application to seg
>>>> fault, then the user visible change is that any vdso call after a failed
>>>> mmap(MAP_FIXED) call would result in a seg fault. The only reason it
>>>> would fail is if the mapping process was attempting to map a large
>>>> enough area over the vdso (which is accounted and in the vma tree,
>>>> afaict) and ran out of memory. Note that this situation could arise
>>>> already since we could run out of memory (not accounting) after the
>>>> arch_unmap() call within the kernel.
>>>>
>>>> The code today can suffer the same fate, but not by the accounting
>>>> failure. It can happen due to failure to allocate a new vma,
>>>> do_vmi_munmap() failure after the arch_unmap() call, or any of the other
>>>> failure scenarios later in the mmap_region() function.
>>>>
>>>> At the very least, this requires an expanded change log.
>>>
>>> Indeed, also (as mentioned on IRC) I feel like we need to look at whether
>>> we _truly_ need this arch_unmap() call for a single, rather antiquated,
>>> architecture.
>>
>> You can call it "niche" or "irrelevant" or "fringe", but "antiquated" is
>> factually wrong :) Power10 came out of the fab just a few years ago at
>> 7nm.
>
> Fair point ;) perhaps we could go with "rarified"? :>)
>
>>
>>> I mean why are they unmapping the VDSO, why is that valid, why does it need
>>> that field to be set to NULL, is it possible to signify that in some other
>>> way etc.?
>>
>> It was originally for CRIU. So a niche workload on a niche architecture.
>>
>> But from the commit that added it, it sounds like CRIU was using mremap,
>> which should be handled these days by vdso_mremap(). So it could be that
>> arch_unmap() is not actually needed for CRIU anymore.
>
> Oh that's interesting!
>
>>
>> Then I guess we have to decide if removing our arch_unmap() would be an
>> ABI break, regardless of whether CRIU needs it or not.
>
> Seems to me like an internal implementation detail that should hopefully
> not result in anything that should have visible ABI impact?
>
> I guess this is something we ought to assess. It would be useful to
> eliminate hooks where we can so we can better control VMA behaviour without
> having to worry about an arch being able to do arbitrary things at
> unexpected times, especially pertinent where we change the order of things.
>
I see you are talking about arch_unmap(). I didn't follow the entire
discussion but we have some related stuff here:
https://github.com/linuxppc/issues/issues/241
If I remember correctly arch_unmap() should have gone away we Dmitry's
series
https://lore.kernel.org/lkml/20210611180242.711399-1-dima@arista.com/#r
but it hasn't been applied yet.
Christophe
next prev parent reply other threads:[~2024-07-10 12:59 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-04 18:27 [PATCH v3 00/16] Avoid MAP_FIXED gap exposure Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 01/16] mm/mmap: Correctly position vma_iterator in __split_vma() Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 02/16] mm/mmap: Introduce abort_munmap_vmas() Liam R. Howlett
2024-07-05 17:02 ` Lorenzo Stoakes
2024-07-05 18:12 ` Liam R. Howlett
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 03/16] mm/mmap: Introduce vmi_complete_munmap_vmas() Liam R. Howlett
2024-07-05 17:39 ` Lorenzo Stoakes
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 04/16] mm/mmap: Extract the gathering of vmas from do_vmi_align_munmap() Liam R. Howlett
2024-07-05 18:01 ` Lorenzo Stoakes
2024-07-05 18:41 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:32 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 05/16] mm/mmap: Introduce vma_munmap_struct for use in munmap operations Liam R. Howlett
2024-07-05 18:39 ` Lorenzo Stoakes
2024-07-05 19:09 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 16:30 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 06/16] mm/mmap: Change munmap to use vma_munmap_struct() for accounting and surrounding vmas Liam R. Howlett
2024-07-05 19:27 ` Lorenzo Stoakes
2024-07-05 19:59 ` Liam R. Howlett
2024-07-10 16:07 ` Suren Baghdasaryan
2024-07-10 17:29 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 07/16] mm/mmap: Extract validate_mm() from vma_complete() Liam R. Howlett
2024-07-05 19:35 ` Lorenzo Stoakes
2024-07-10 16:06 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 08/16] mm/mmap: Inline munmap operation in mmap_region() Liam R. Howlett
2024-07-05 19:39 ` Lorenzo Stoakes
2024-07-05 20:00 ` Liam R. Howlett
2024-07-10 16:15 ` Suren Baghdasaryan
2024-07-10 16:35 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 09/16] mm/mmap: Expand mmap_region() munmap call Liam R. Howlett
2024-07-05 20:06 ` Lorenzo Stoakes
2024-07-05 20:30 ` Liam R. Howlett
2024-07-05 20:36 ` Lorenzo Stoakes
2024-07-08 14:49 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 10/16] mm/mmap: Reposition vma iterator in mmap_region() Liam R. Howlett
2024-07-05 20:18 ` Lorenzo Stoakes
2024-07-05 20:56 ` Liam R. Howlett
2024-07-08 11:08 ` Lorenzo Stoakes
2024-07-08 16:43 ` Liam R. Howlett
2024-07-10 16:48 ` Suren Baghdasaryan
2024-07-10 17:18 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 11/16] mm/mmap: Track start and end of munmap in vma_munmap_struct Liam R. Howlett
2024-07-05 20:27 ` Lorenzo Stoakes
2024-07-08 14:45 ` Liam R. Howlett
2024-07-10 17:14 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 12/16] mm/mmap: Clean up unmap_region() argument list Liam R. Howlett
2024-07-05 20:33 ` Lorenzo Stoakes
2024-07-10 17:14 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 13/16] mm/mmap: Avoid zeroing vma tree in mmap_region() Liam R. Howlett
2024-07-08 12:18 ` Lorenzo Stoakes
2024-07-08 19:10 ` Liam R. Howlett
2024-07-09 14:27 ` Lorenzo Stoakes
2024-07-09 18:43 ` Liam R. Howlett
2024-07-04 18:27 ` [PATCH v3 14/16] mm/mmap: Use PHYS_PFN " Liam R. Howlett
2024-07-08 12:21 ` Lorenzo Stoakes
2024-07-09 18:35 ` Liam R. Howlett
2024-07-09 18:42 ` Lorenzo Stoakes
2024-07-10 17:32 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 15/16] mm/mmap: Use vms accounted pages " Liam R. Howlett
2024-07-08 12:43 ` Lorenzo Stoakes
2024-07-10 17:43 ` Suren Baghdasaryan
2024-07-04 18:27 ` [PATCH v3 16/16] mm/mmap: Move may_expand_vm() check " Liam R. Howlett
2024-07-08 12:52 ` Lorenzo Stoakes
2024-07-08 20:43 ` Liam R. Howlett
2024-07-09 14:42 ` Liam R. Howlett
2024-07-09 14:51 ` Lorenzo Stoakes
2024-07-09 14:52 ` Liam R. Howlett
2024-07-09 18:13 ` Dave Hansen
2024-07-09 14:45 ` Lorenzo Stoakes
2024-07-10 12:28 ` Michael Ellerman
2024-07-10 12:45 ` Lorenzo Stoakes
2024-07-10 12:59 ` LEROY Christophe [this message]
2024-07-10 16:09 ` Liam R. Howlett
2024-07-10 19:27 ` Dmitry Safonov
2024-07-10 21:04 ` LEROY Christophe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7161cef3-49e4-4057-8a66-ec1ad6eb3f9e@cs-soprasteria.com \
--to=christophe.leroy2@cs-soprasteria.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=kees@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mpe@ellerman.id.au \
--cc=olsajiri@gmail.com \
--cc=paulmck@kernel.org \
--cc=sidhartha.kumar@oracle.com \
--cc=spasswolf@web.de \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox