* [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork()
@ 2024-11-13 16:01 Vlastimil Babka
2024-11-13 16:09 ` David Hildenbrand
0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2024-11-13 16:01 UTC (permalink / raw)
To: linux-mm; +Cc: Vlastimil Babka, Peter Xu, David Hildenbrand
When a combination of unfortunate factors occur, we might BUG in fork():
dup_mmap()
copy_page_range()
copy_***_range()
copy_present_pte()
copy_present_page()
page_add_new_anon_rmap()
__page_set_anon_rmap()
BUG_ON(!anon_vma);
The factors are:
- source vma is VM_MIXEDMAP otherwise copy_page_range() would bail out
when !src_vma->anon_vma
- I think this was due to gpfs, but can happen in-tree as well
- is_cow_mapping() is true because VM_MAYWRITE (even though the vma
was a read-only mapping of a .so file)
- MMF_HAS_PINNED is true, thus some actual pinning has happened
- page_maybe_dma_pinned() is true as a false positive, because mapcount
and thus refcount is >1024
That makes us reach page_needs_cow_for_dma() in copy_present_page() and
evaluate it as true and attempt to CoW a file page and hit the BUG_ON()
because we never had a reason to instantiate anon_vma for the source
vma.
AFAICS this was fixed inadvertedly in 5.19 by commit fb3d824d1a46
("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and
page_try_dup_anon_rmap()") or another commit in that series. What caught
my attention is this part of the changelog:
We really only care about pins on anonymous pages, because they are prone
to getting replaced in the COW handler once mapped R/O. For !anon pages
in cow-mappings (!VM_SHARED && VM_MAYWRITE) we shouldn't really care about
that, at least not that I could come up with an example.
And as part of that commit, an PageAnon() test is added in
copy_present_pte().
But the code is already refactored a lot, so this is an attempt at a
minimal fix for LTS kernels by placing the PageAnon() check to
copy_present_page().
Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes")
Cc: Peter Xu <peterx@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Hi, we've seen this in our 5.14 based kernel and it involved the out of
tree gpfs module, but I believe the same thing can happen in LTS's 5.10
and 5.15 without out of tree modules as well. So I'd like your opinion
on this fix before I propose it to stable as a non-standard
version-specific fix (I don't think we'd want to backport fb3d824d1a46
with prerequisities). Thanks.
mm/memory.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/memory.c b/mm/memory.c
index 6d058973a97e..73871bac0e4c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -887,6 +887,10 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
{
struct page *new_page;
+ /* We only care about pins on anonymous pages */
+ if (!PageAnon(page))
+ return 1;
+
/*
* What we want to do is to check whether this page may
* have been pinned by the parent process. If so,
--
2.47.0
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork()
2024-11-13 16:01 [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork() Vlastimil Babka
@ 2024-11-13 16:09 ` David Hildenbrand
2024-11-13 16:18 ` Vlastimil Babka
0 siblings, 1 reply; 5+ messages in thread
From: David Hildenbrand @ 2024-11-13 16:09 UTC (permalink / raw)
To: Vlastimil Babka, linux-mm; +Cc: Peter Xu
On 13.11.24 17:01, Vlastimil Babka wrote:
> When a combination of unfortunate factors occur, we might BUG in fork():
>
> dup_mmap()
> copy_page_range()
> copy_***_range()
> copy_present_pte()
> copy_present_page()
> page_add_new_anon_rmap()
> __page_set_anon_rmap()
> BUG_ON(!anon_vma);
>
> The factors are:
>
> - source vma is VM_MIXEDMAP otherwise copy_page_range() would bail out
> when !src_vma->anon_vma
> - I think this was due to gpfs, but can happen in-tree as well
> - is_cow_mapping() is true because VM_MAYWRITE (even though the vma
> was a read-only mapping of a .so file)
> - MMF_HAS_PINNED is true, thus some actual pinning has happened
> - page_maybe_dma_pinned() is true as a false positive, because mapcount
> and thus refcount is >1024
>
> That makes us reach page_needs_cow_for_dma() in copy_present_page() and
> evaluate it as true and attempt to CoW a file page and hit the BUG_ON()
> because we never had a reason to instantiate anon_vma for the source
> vma.
>
> AFAICS this was fixed inadvertedly in 5.19 by commit fb3d824d1a46
> ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and
> page_try_dup_anon_rmap()") or another commit in that series. What caught
> my attention is this part of the changelog:
>
> We really only care about pins on anonymous pages, because they are prone
> to getting replaced in the COW handler once mapped R/O. For !anon pages
> in cow-mappings (!VM_SHARED && VM_MAYWRITE) we shouldn't really care about
> that, at least not that I could come up with an example.
>
> And as part of that commit, an PageAnon() test is added in
> copy_present_pte().
>
> But the code is already refactored a lot, so this is an attempt at a
> minimal fix for LTS kernels by placing the PageAnon() check to
> copy_present_page().
>
> Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes")
> Cc: Peter Xu <peterx@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> Hi, we've seen this in our 5.14 based kernel and it involved the out of
> tree gpfs module, but I believe the same thing can happen in LTS's 5.10
> and 5.15 without out of tree modules as well. So I'd like your opinion
> on this fix before I propose it to stable as a non-standard
> version-specific fix (I don't think we'd want to backport fb3d824d1a46
> with prerequisities). Thanks.
I recall seeing+discussing this exact patch already a couple years ago :D
Ah, here is the 5.15 version
https://lkml.kernel.org/r/20221028075244.3112566-1-songyuanzheng@huawei.com
And the 5.10 version
https://lore.kernel.org/lkml/20221024094911.3054769-1-songyuanzheng@huawei.com/
... I could have sworn they got applied.
... and in linux-5.10.y I see
commit 935a8b6202101d7f58fe9cd11287f9cec0d8dd32
Author: Yuanzheng Song <songyuanzheng@huawei.com>
Date: Fri Oct 28 03:07:05 2022 +0000
mm/memory: add non-anonymous page check in the copy_present_page()
The vma->anon_vma of the child process may be NULL because
the entire vma does not contain anonymous pages. In this
case, a BUG will occur when the copy_present_page() passes
a copy of a non-anonymous page of that vma to the
page_add_new_anon_rmap() to set up new anonymous rmap.
Maybe you missed that the PageAnon() check is simply a couple of lines
further down in there?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork()
2024-11-13 16:09 ` David Hildenbrand
@ 2024-11-13 16:18 ` Vlastimil Babka
2024-11-13 16:25 ` David Hildenbrand
0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2024-11-13 16:18 UTC (permalink / raw)
To: David Hildenbrand, linux-mm; +Cc: Peter Xu
On 11/13/24 17:09, David Hildenbrand wrote:
>> ---
>> Hi, we've seen this in our 5.14 based kernel and it involved the out of
>> tree gpfs module, but I believe the same thing can happen in LTS's 5.10
>> and 5.15 without out of tree modules as well. So I'd like your opinion
>> on this fix before I propose it to stable as a non-standard
>> version-specific fix (I don't think we'd want to backport fb3d824d1a46
>> with prerequisities). Thanks.
>
> I recall seeing+discussing this exact patch already a couple years ago :D
>
> Ah, here is the 5.15 version
>
> https://lkml.kernel.org/r/20221028075244.3112566-1-songyuanzheng@huawei.com
>
> And the 5.10 version
>
> https://lore.kernel.org/lkml/20221024094911.3054769-1-songyuanzheng@huawei.com/
>
>
> ... I could have sworn they got applied.
>
> ... and in linux-5.10.y I see
>
> commit 935a8b6202101d7f58fe9cd11287f9cec0d8dd32
> Author: Yuanzheng Song <songyuanzheng@huawei.com>
> Date: Fri Oct 28 03:07:05 2022 +0000
>
> mm/memory: add non-anonymous page check in the copy_present_page()
>
> The vma->anon_vma of the child process may be NULL because
> the entire vma does not contain anonymous pages. In this
> case, a BUG will occur when the copy_present_page() passes
> a copy of a non-anonymous page of that vma to the
> page_add_new_anon_rmap() to set up new anonymous rmap.
>
>
>
> Maybe you missed that the PageAnon() check is simply a couple of lines
> further down in there?
No I was only looking at the 5.15 branch so far, and seems it was never
applied there, unlike 5.10. Weird. But thanks for the heads up.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork()
2024-11-13 16:18 ` Vlastimil Babka
@ 2024-11-13 16:25 ` David Hildenbrand
2024-11-13 16:26 ` Vlastimil Babka
0 siblings, 1 reply; 5+ messages in thread
From: David Hildenbrand @ 2024-11-13 16:25 UTC (permalink / raw)
To: Vlastimil Babka, linux-mm; +Cc: Peter Xu
On 13.11.24 17:18, Vlastimil Babka wrote:
> On 11/13/24 17:09, David Hildenbrand wrote:
>>> ---
>>> Hi, we've seen this in our 5.14 based kernel and it involved the out of
>>> tree gpfs module, but I believe the same thing can happen in LTS's 5.10
>>> and 5.15 without out of tree modules as well. So I'd like your opinion
>>> on this fix before I propose it to stable as a non-standard
>>> version-specific fix (I don't think we'd want to backport fb3d824d1a46
>>> with prerequisities). Thanks.
>>
>> I recall seeing+discussing this exact patch already a couple years ago :D
>>
>> Ah, here is the 5.15 version
>>
>> https://lkml.kernel.org/r/20221028075244.3112566-1-songyuanzheng@huawei.com
>>
>> And the 5.10 version
>>
>> https://lore.kernel.org/lkml/20221024094911.3054769-1-songyuanzheng@huawei.com/
>>
>>
>> ... I could have sworn they got applied.
>>
>> ... and in linux-5.10.y I see
>>
>> commit 935a8b6202101d7f58fe9cd11287f9cec0d8dd32
>> Author: Yuanzheng Song <songyuanzheng@huawei.com>
>> Date: Fri Oct 28 03:07:05 2022 +0000
>>
>> mm/memory: add non-anonymous page check in the copy_present_page()
>>
>> The vma->anon_vma of the child process may be NULL because
>> the entire vma does not contain anonymous pages. In this
>> case, a BUG will occur when the copy_present_page() passes
>> a copy of a non-anonymous page of that vma to the
>> page_add_new_anon_rmap() to set up new anonymous rmap.
>>
>>
>>
>> Maybe you missed that the PageAnon() check is simply a couple of lines
>> further down in there?
>
> No I was only looking at the 5.15 branch so far, and seems it was never
> applied there, unlike 5.10. Weird. But thanks for the heads up.
Indeed, looks like it never got applied to 5.15 ...
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork()
2024-11-13 16:25 ` David Hildenbrand
@ 2024-11-13 16:26 ` Vlastimil Babka
0 siblings, 0 replies; 5+ messages in thread
From: Vlastimil Babka @ 2024-11-13 16:26 UTC (permalink / raw)
To: David Hildenbrand, linux-mm; +Cc: Peter Xu
On 11/13/24 17:25, David Hildenbrand wrote:
> On 13.11.24 17:18, Vlastimil Babka wrote:
>> On 11/13/24 17:09, David Hildenbrand wrote:
>>>> ---
>>>> Hi, we've seen this in our 5.14 based kernel and it involved the out of
>>>> tree gpfs module, but I believe the same thing can happen in LTS's 5.10
>>>> and 5.15 without out of tree modules as well. So I'd like your opinion
>>>> on this fix before I propose it to stable as a non-standard
>>>> version-specific fix (I don't think we'd want to backport fb3d824d1a46
>>>> with prerequisities). Thanks.
>>>
>>> I recall seeing+discussing this exact patch already a couple years ago :D
>>>
>>> Ah, here is the 5.15 version
>>>
>>> https://lkml.kernel.org/r/20221028075244.3112566-1-songyuanzheng@huawei.com
>>>
>>> And the 5.10 version
>>>
>>> https://lore.kernel.org/lkml/20221024094911.3054769-1-songyuanzheng@huawei.com/
>>>
>>>
>>> ... I could have sworn they got applied.
>>>
>>> ... and in linux-5.10.y I see
>>>
>>> commit 935a8b6202101d7f58fe9cd11287f9cec0d8dd32
>>> Author: Yuanzheng Song <songyuanzheng@huawei.com>
>>> Date: Fri Oct 28 03:07:05 2022 +0000
>>>
>>> mm/memory: add non-anonymous page check in the copy_present_page()
>>>
>>> The vma->anon_vma of the child process may be NULL because
>>> the entire vma does not contain anonymous pages. In this
>>> case, a BUG will occur when the copy_present_page() passes
>>> a copy of a non-anonymous page of that vma to the
>>> page_add_new_anon_rmap() to set up new anonymous rmap.
>>>
>>>
>>>
>>> Maybe you missed that the PageAnon() check is simply a couple of lines
>>> further down in there?
>>
>> No I was only looking at the 5.15 branch so far, and seems it was never
>> applied there, unlike 5.10. Weird. But thanks for the heads up.
>
> Indeed, looks like it never got applied to 5.15 ...
The author forgot to actually include stable@
will forward it
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-11-13 16:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-13 16:01 [RFC for stable 5.15 and 5.10] mm/memory: only copy anonymous pages during fork() Vlastimil Babka
2024-11-13 16:09 ` David Hildenbrand
2024-11-13 16:18 ` Vlastimil Babka
2024-11-13 16:25 ` David Hildenbrand
2024-11-13 16:26 ` Vlastimil Babka
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox