* Re: [PATCH 2/4] powerpc/mm: Handle VDSO unmapping via close() rather than arch_unmap()
[not found] ` <CAHk-=wgTXVMBRuya5J0peujSrtunehRtzk=WVrm6njPhHrpTJw@mail.gmail.com>
@ 2024-08-08 16:15 ` Liam R. Howlett
[not found] ` <CALmYWFtAenAQmUCSrW8Pu6eNYMcfDe9R4f87XgUxaO4gsfzVQg@mail.gmail.com>
1 sibling, 0 replies; 7+ messages in thread
From: Liam R. Howlett @ 2024-08-08 16:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Xu, Michael Ellerman, linux-mm, linuxppc-dev, akpm,
christophe.leroy, jeffxu, linux-kernel, npiggin, oliver.sang,
pedro.falcato, Kees Cook
* Linus Torvalds <torvalds@linux-foundation.org> [240807 23:21]:
> On Wed, 7 Aug 2024 at 16:20, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
...
>
> That said, I don't love how special powerpc is here.
I think more (all?) archs should be doing something like ppc when the
vdso is removed. If someone removes the vdso, then the speedup it
provides should just go away, and the function calls shouldn't try to use
the quick lookup and crash.
I view this as another 'caching of a vma pointer' issue that isn't
cleaned up when the vma goes away.
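To make the stale-pointer concern concrete, here is a minimal userspace sketch. It is a model, not kernel code: struct mm_model, model_vdso_removed() and model_pick_path() are hypothetical names, and vdso_base merely stands in for a cached address such as mm->context.vdso. The idea is that teardown clears the cached address, and every user checks it before taking the fast path.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these are not the kernel's types. */
struct mm_model {
	unsigned long vdso_base;	/* 0 means "no vdso mapped" */
};

enum model_path { PATH_FAST_VDSO, PATH_SLOW_FALLBACK };

/* Called when the vdso mapping goes away: drop the cached address. */
static void model_vdso_removed(struct mm_model *mm)
{
	assert(mm != NULL);
	mm->vdso_base = 0;
}

/* Take the fast path only while the cached address is still valid. */
static enum model_path model_pick_path(const struct mm_model *mm)
{
	assert(mm != NULL);
	return mm->vdso_base ? PATH_FAST_VDSO : PATH_SLOW_FALLBACK;
}
```

With this shape, removing the mapping costs the speedup but cannot leave a dangling address behind.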
>
> What we could do is
>
> - stop calling these things "special mappings", and just admit that
> it's for different vdso mappings and nothing else (for some odd reason
> arm and nios2 calls it a "kuser helper" rather than vdso, but it's the
> exact same thing)
But isn't it a special mapping? We don't allow for merging of the vma,
the mlock handling has some odd behaviour with this vma, and there is
the comment in mm/internal.h's mlock_vma_folio about ignoring these
special vmas in a race.
There is also some other 'special mapping' of vvars too? I haven't
looked deeply into this yet as my investigation was preempted by
vacation.
>
> - don't do this whole indirect function pointer thing with mremap and
> close at all, and just do this all unapologetically and for all
> architectures in the generic VM layer together with "if (vma->vm_start
> == mm->context.vdso)" etc.
>
> that would get rid of the conceptual complexity of having different
> architectures doing different things (and the unnecessary overhead of
> having an indirect function pointer that just points to one single
> thing).
>
> But I think the current "clean up the existing mess" is probably the
> less invasive one over "make the existing mess be explicitly about
> vdso and avoid unnecessary per-architecture differences".
Okay, sure.
>
> If people want to, we can do the unification (and stop pretending the
> "special mappings" could be something else) later.
>
I was planning to use the regular vma vm_ops to jump into the 'special
unmap code' and then do all the checks there. IOW, keep the vma flagged
as VM_SPECIAL and call the special_mapping_whatever() function as a
regular vmops for, say, ->remove_vma() or ->mremap(). Keeping the flag
means all the race avoidance/locking/merging works the same as it does
today.
What I am trying to avoid is another arch_get_unmapped_area() scenario
where a bug exists for a decade in some versions of the cloned code.
Thanks,
Liam
* Re: [PATCH 2/4] powerpc/mm: Handle VDSO unmapping via close() rather than arch_unmap()
[not found] ` <CALmYWFtAenAQmUCSrW8Pu6eNYMcfDe9R4f87XgUxaO4gsfzVQg@mail.gmail.com>
@ 2024-08-08 18:08 ` Liam R. Howlett
2024-08-08 18:36 ` Jeff Xu
0 siblings, 1 reply; 7+ messages in thread
From: Liam R. Howlett @ 2024-08-08 18:08 UTC (permalink / raw)
To: Jeff Xu
Cc: Linus Torvalds, Michael Ellerman, linux-mm, linuxppc-dev, akpm,
christophe.leroy, jeffxu, linux-kernel, npiggin, oliver.sang,
pedro.falcato, Kees Cook
* Jeff Xu <jeffxu@google.com> [240807 23:37]:
> On Wed, Aug 7, 2024 at 8:21 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Wed, 7 Aug 2024 at 16:20, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > Okay, I'm going to try one more time here. You are suggesting to have a
> > > conf flag to leave the vdso pointer unchanged when it is unmapped.
> > > Having the close behind the conf option will not prevent it from being
> > > unmapped or mapped over, so what you are suggesting is have a
> > > configuration option that leaves a pointer, mm->context.vdso, to be
> > > unsafe if it is unmapped if you disable checkpoint restore.
> >
> This is a new point that I didn't realize before, if we are going to handle
> unmap vdso safely, yes, this is a bugfix that should be applied everywhere
> for all arch, without CHECKPOINT_RESTORE config.
>
> Do we need to worry about mmap(fixed) ? which can have the same effect
> as mremap.
Yes, but it should be handled by vm_ops->close() when MAP_FIXED unmaps
the vdso. Note that you cannot MAP_FIXED over half of the vma as the
vm_ops->may_split() is special_mapping_split(), which just returns
-EINVAL.
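The two cases above (a MAP_FIXED mapping that covers the whole vma versus one that covers only part of it) can be sketched as a small userspace model. Everything here is an assumption made for illustration: the struct and function names are hypothetical, cached_base stands in for mm->context.vdso, and a permitted split is deliberately not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vma_model;

/* Hypothetical stand-in for the two vm_ops hooks discussed here. */
struct vm_ops_model {
	void (*close)(struct vma_model *vma);
	int (*may_split)(struct vma_model *vma, unsigned long addr);
};

struct vma_model {
	unsigned long start, end;		/* covers [start, end) */
	const struct vm_ops_model *ops;
	unsigned long *cached_base;		/* stands in for mm->context.vdso */
};

/* close(): forget the cached address once the mapping is gone. */
static void model_close(struct vma_model *vma)
{
	*vma->cached_base = 0;
}

/* may_split(): refuse every split, as special_mapping_split() does. */
static int model_no_split(struct vma_model *vma, unsigned long addr)
{
	(void)vma;
	(void)addr;
	return -EINVAL;
}

static const struct vm_ops_model model_special_ops = {
	.close = model_close,
	.may_split = model_no_split,
};

/* Model of placing a fixed mapping over [start, end). */
static int model_map_fixed_over(struct vma_model *vma,
				unsigned long start, unsigned long end)
{
	assert(vma != NULL && start < end);

	if (end <= vma->start || start >= vma->end)
		return 0;			/* no overlap, nothing to do */

	if (start > vma->start || end < vma->end) {
		/* Partial overlap: the vma would have to be split first. */
		int err = vma->ops->may_split(vma,
				start > vma->start ? start : end);
		if (err)
			return err;
		return -ENOSYS;			/* permitted splits not modelled */
	}

	/* Full overlap: the old vma is removed, so close() runs. */
	vma->ops->close(vma);
	return 0;
}
```

In this model a half-covering request fails with -EINVAL and leaves the cached address alone, while a fully covering request succeeds and clears it.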
Thanks,
Liam
* Re: [PATCH 2/4] powerpc/mm: Handle VDSO unmapping via close() rather than arch_unmap()
2024-08-08 18:08 ` Liam R. Howlett
@ 2024-08-08 18:36 ` Jeff Xu
2024-08-08 18:46 ` Liam R. Howlett
0 siblings, 1 reply; 7+ messages in thread
From: Jeff Xu @ 2024-08-08 18:36 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Linus Torvalds, Michael Ellerman,
linux-mm, linuxppc-dev, akpm, christophe.leroy, jeffxu,
linux-kernel, npiggin, oliver.sang, pedro.falcato, Kees Cook
On Thu, Aug 8, 2024 at 11:08 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@google.com> [240807 23:37]:
> > On Wed, Aug 7, 2024 at 8:21 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > On Wed, 7 Aug 2024 at 16:20, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > Okay, I'm going to try one more time here. You are suggesting to have a
> > > > conf flag to leave the vdso pointer unchanged when it is unmapped.
> > > > Having the close behind the conf option will not prevent it from being
> > > > unmapped or mapped over, so what you are suggesting is have a
> > > > configuration option that leaves a pointer, mm->context.vdso, to be
> > > > unsafe if it is unmapped if you disable checkpoint restore.
> > >
> > This is a new point that I didn't realize before, if we are going to handle
> > unmap vdso safely, yes, this is a bugfix that should be applied everywhere
> > for all arch, without CHECKPOINT_RESTORE config.
> >
> > Do we need to worry about mmap(fixed) ? which can have the same effect
> > as mremap.
>
> Yes, but it should be handled by vm_ops->close() when MAP_FIXED unmaps
> the vdso. Note that you cannot MAP_FIXED over half of the vma as the
> vm_ops->may_split() is special_mapping_split(), which just returns
> -EINVAL.
>
The may_split() failure logic is specific to vm_special_mapping, right?
Do we still need to keep the vm_special_mapping struct if we are going to
treat a special vma as a normal vma?
> Thanks,
> Liam
* Re: [PATCH 2/4] powerpc/mm: Handle VDSO unmapping via close() rather than arch_unmap()
2024-08-08 18:36 ` Jeff Xu
@ 2024-08-08 18:46 ` Liam R. Howlett
2024-08-08 18:52 ` Jeff Xu
0 siblings, 1 reply; 7+ messages in thread
From: Liam R. Howlett @ 2024-08-08 18:46 UTC (permalink / raw)
To: Jeff Xu
Cc: Linus Torvalds, Michael Ellerman, linux-mm, linuxppc-dev, akpm,
christophe.leroy, jeffxu, linux-kernel, npiggin, oliver.sang,
pedro.falcato, Kees Cook
* Jeff Xu <jeffxu@google.com> [240808 14:37]:
> On Thu, Aug 8, 2024 at 11:08 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Jeff Xu <jeffxu@google.com> [240807 23:37]:
> > > On Wed, Aug 7, 2024 at 8:21 PM Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > > >
> > > > On Wed, 7 Aug 2024 at 16:20, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > > >
> > > > > Okay, I'm going to try one more time here. You are suggesting to have a
> > > > > conf flag to leave the vdso pointer unchanged when it is unmapped.
> > > > > Having the close behind the conf option will not prevent it from being
> > > > > unmapped or mapped over, so what you are suggesting is have a
> > > > > configuration option that leaves a pointer, mm->context.vdso, to be
> > > > > unsafe if it is unmapped if you disable checkpoint restore.
> > > >
> > > This is a new point that I didn't realize before, if we are going to handle
> > > unmap vdso safely, yes, this is a bugfix that should be applied everywhere
> > > for all arch, without CHECKPOINT_RESTORE config.
> > >
> > > Do we need to worry about mmap(fixed) ? which can have the same effect
> > > as mremap.
> >
> > Yes, but it should be handled by vm_ops->close() when MAP_FIXED unmaps
> > the vdso. Note that you cannot MAP_FIXED over half of the vma as the
> > vm_ops->may_split() is special_mapping_split(), which just returns
> > -EINVAL.
> >
> The may_split() failure logic is specific to vm_special_mapping, right ?
Not really, it's just what exists in the vm_ops struct for these vmas.
It's called on every vma for every split in __split_vma().
>
> Do we still need to keep vm_special_mapping struct , if we are going to
> treat special vma as normal vma ?
No, just set the vm_ops may_split to something that returns -EINVAL.
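A rough userspace sketch of that point, with hypothetical names throughout (this is a model, not the kernel's __split_vma()): the generic split path consults an optional may_split hook for any vma, so a vma becomes unsplittable simply by installing a hook that returns -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vma_model;

struct vm_ops_model {
	int (*may_split)(struct vma_model *vma, unsigned long addr);
};

struct vma_model {
	unsigned long start, end;		/* covers [start, end) */
	const struct vm_ops_model *ops;		/* may be NULL */
};

/* Hook that vetoes every split. */
static int model_never_split(struct vma_model *vma, unsigned long addr)
{
	(void)vma;
	(void)addr;
	return -EINVAL;
}

static const struct vm_ops_model model_unsplittable_ops = {
	.may_split = model_never_split,
};

/*
 * Split vma at addr. On success vma keeps [start, addr) and *tail
 * receives [addr, end). The hook, if any, is asked first.
 */
static int model_split_vma(struct vma_model *vma, unsigned long addr,
			   struct vma_model *tail)
{
	assert(vma != NULL && tail != NULL);

	if (addr <= vma->start || addr >= vma->end)
		return -EINVAL;

	if (vma->ops && vma->ops->may_split) {
		int err = vma->ops->may_split(vma, addr);

		if (err)
			return err;
	}

	*tail = *vma;
	tail->start = addr;
	vma->end = addr;
	return 0;
}
```

Nothing in the split path needs to know the vma is "special"; the veto comes entirely from the ops table.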
* Re: [PATCH 2/4] powerpc/mm: Handle VDSO unmapping via close() rather than arch_unmap()
2024-08-08 18:46 ` Liam R. Howlett
@ 2024-08-08 18:52 ` Jeff Xu
0 siblings, 0 replies; 7+ messages in thread
From: Jeff Xu @ 2024-08-08 18:52 UTC (permalink / raw)
To: Liam R. Howlett, Jeff Xu, Linus Torvalds, Michael Ellerman,
linux-mm, linuxppc-dev, akpm, christophe.leroy, jeffxu,
linux-kernel, npiggin, oliver.sang, pedro.falcato, Kees Cook
On Thu, Aug 8, 2024 at 11:46 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@google.com> [240808 14:37]:
> > On Thu, Aug 8, 2024 at 11:08 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Jeff Xu <jeffxu@google.com> [240807 23:37]:
> > > > On Wed, Aug 7, 2024 at 8:21 PM Linus Torvalds
> > > > <torvalds@linux-foundation.org> wrote:
> > > > >
> > > > > On Wed, 7 Aug 2024 at 16:20, Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > > > >
> > > > > > Okay, I'm going to try one more time here. You are suggesting to have a
> > > > > > conf flag to leave the vdso pointer unchanged when it is unmapped.
> > > > > > Having the close behind the conf option will not prevent it from being
> > > > > > unmapped or mapped over, so what you are suggesting is have a
> > > > > > configuration option that leaves a pointer, mm->context.vdso, to be
> > > > > > unsafe if it is unmapped if you disable checkpoint restore.
> > > > >
> > > > This is a new point that I didn't realize before, if we are going to handle
> > > > unmap vdso safely, yes, this is a bugfix that should be applied everywhere
> > > > for all arch, without CHECKPOINT_RESTORE config.
> > > >
> > > > Do we need to worry about mmap(fixed) ? which can have the same effect
> > > > as mremap.
> > >
> > > Yes, but it should be handled by vm_ops->close() when MAP_FIXED unmaps
> > > the vdso. Note that you cannot MAP_FIXED over half of the vma as the
> > > vm_ops->may_split() is special_mapping_split(), which just returns
> > > -EINVAL.
> > >
> > The may_split() failure logic is specific to vm_special_mapping, right ?
>
> Not really, it's just what exists for these vmas vm_ops struct. It's
> called on every vma for every split in __split_vma().
>
> >
> > Do we still need to keep vm_special_mapping struct , if we are going to
> > treat special vma as normal vma ?
>
> No, just set the vm_ops may_split to something that returns -EINVAL.
>
OK, that makes sense.
Thanks
* Re: [PATCH 1/4] mm: Add optional close() to struct vm_special_mapping
[not found] ` <shiq5v3jrmyi6ncwke7wgl76ojysgbhrchsk32q4lbx2hadqqc@kzyy2igem256>
@ 2024-08-12 8:22 ` Michael Ellerman
0 siblings, 0 replies; 7+ messages in thread
From: Michael Ellerman @ 2024-08-12 8:22 UTC (permalink / raw)
To: Liam R. Howlett
Cc: linux-mm, linuxppc-dev, torvalds, akpm, christophe.leroy, jeffxu,
jeffxu, linux-kernel, npiggin, oliver.sang, pedro.falcato
"Liam R. Howlett" <Liam.Howlett@oracle.com> writes:
> * Michael Ellerman <mpe@ellerman.id.au> [240807 08:41]:
>> Add an optional close() callback to struct vm_special_mapping. It will
>> be used, by powerpc at least, to handle unmapping of the VDSO.
>>
>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>> include/linux/mm_types.h | 2 ++
>> mm/mmap.c | 3 +++
>> 2 files changed, 5 insertions(+)
>>
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index 485424979254..ef32d87a3adc 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -1313,6 +1313,8 @@ struct vm_special_mapping {
>>
>> int (*mremap)(const struct vm_special_mapping *sm,
>> struct vm_area_struct *new_vma);
>
> nit: missing new line?
Ack.
>> + void (*close)(const struct vm_special_mapping *sm,
>> + struct vm_area_struct *vma);
>> };
>>
>> enum tlb_flush_reason {
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index d0dfc85b209b..24bd6aa9155c 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -3624,6 +3624,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
>> */
>
> The above comment should probably be expanded to explain what this is
> about, or removed.
I expanded it slightly, happy for others to wordsmith it further.
>> static void special_mapping_close(struct vm_area_struct *vma)
>> {
>> + const struct vm_special_mapping *sm = vma->vm_private_data;
>> + if (sm->close)
>> + sm->close(sm, vma);
>
> Right now we have the same sort of situation for mremap calls on
> special: we have a call to the specific vma mremap() function.
> ...
> So, are we missing an opportunity to avoid every arch having the same
> implementation here (that will evolve into random bugs existing in some
> archs for years before someone realises the cloned code wasn't fixed)?
> Do we already have a fix in ppc for the size checking that doesn't exist
> in the other archs in the case of mremap?
I took this as more of a meta comment/rant :)
Yes I agree the implementation should eventually be generic, but this series
is just about moving the existing powerpc behaviour from arch_unmap()
into this hook.
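For readers following along, the optional-hook pattern in the quoted diff can be modelled in userspace roughly as below. The types and names are hypothetical stand-ins rather than the kernel's definitions, and the call counter exists only so the effect is observable.

```c
#include <assert.h>
#include <stddef.h>

struct vma_model;

/* Stand-in for struct vm_special_mapping with its optional close(). */
struct special_mapping_model {
	const char *name;
	void (*close)(const struct special_mapping_model *sm,
		      struct vma_model *vma);
};

struct vma_model {
	void *private_data;	/* points at the special mapping descriptor */
};

static int model_close_calls;	/* counts arch hook invocations */

/* Hypothetical arch hook; a real one would clear its cached vdso address. */
static void model_arch_close(const struct special_mapping_model *sm,
			     struct vma_model *vma)
{
	(void)sm;
	(void)vma;
	model_close_calls++;
}

/* Generic close: forward to the arch hook only if one was provided. */
static void model_special_mapping_close(struct vma_model *vma)
{
	const struct special_mapping_model *sm;

	assert(vma != NULL && vma->private_data != NULL);
	sm = vma->private_data;

	if (sm->close)
		sm->close(sm, vma);
}
```

Architectures that leave close unset pay only the NULL check, which is what keeps the hook optional.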
cheers
* Re: [PATCH 1/4] mm: Add optional close() to struct vm_special_mapping
[not found] ` <1b0e07fb-33fb-4397-b03e-65698601bc70@redhat.com>
@ 2024-08-12 8:23 ` Michael Ellerman
0 siblings, 0 replies; 7+ messages in thread
From: Michael Ellerman @ 2024-08-12 8:23 UTC (permalink / raw)
To: David Hildenbrand, linux-mm
Cc: linuxppc-dev, torvalds, akpm, christophe.leroy, jeffxu, jeffxu,
Liam.Howlett, linux-kernel, npiggin, oliver.sang, pedro.falcato
David Hildenbrand <david@redhat.com> writes:
> On 07.08.24 14:41, Michael Ellerman wrote:
>> Add an optional close() callback to struct vm_special_mapping. It will
>> be used, by powerpc at least, to handle unmapping of the VDSO.
>>
>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>> include/linux/mm_types.h | 2 ++
>> mm/mmap.c | 3 +++
>> 2 files changed, 5 insertions(+)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index d0dfc85b209b..24bd6aa9155c 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -3624,6 +3624,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
>> */
>> static void special_mapping_close(struct vm_area_struct *vma)
>> {
>> + const struct vm_special_mapping *sm = vma->vm_private_data;
>
> I'm old-fashioned, I enjoy an empty line here ;)
Ack.
>> + if (sm->close)
>> + sm->close(sm, vma);
>
> Reviewed-by: David Hildenbrand <david@redhat.com>
Thanks.
cheers