* [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: Hugh Dickins @ 2024-06-25 5:00 UTC
To: Andrew Morton
Cc: Barry Song, Hugh Dickins, baolin.wang, chrisl, david,
linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb,
v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao
Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
false" has extended folio_add_new_anon_rmap() to use on non-exclusive
folios, already visible to others in swap cache and on LRU.
That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
overwriting concurrent atomic operations on folio->flags, losing bits
added or restoring bits cleared. Since it's only used in this risky
way when folio_test_locked and !folio_test_anon, many such races are
excluded; but, for example, isolations by folio_test_clear_lru() are
vulnerable, and setting or clearing active.
It could just use the atomic folio_set_swapbacked(); but this function
does try to avoid atomics where it can, so use a branch instead: just
avoid setting swapbacked when it is already set, that is good enough.
(Swapbacked is normally stable once set: lazyfree can undo it, but
only later, when found anon in a page table.)
This fixes a lot of instability under compaction and swapping loads:
assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
frees - though I've not worked out what races could lead to the latter.
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/rmap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index df1a43295c85..5394c1178bf1 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
VM_BUG_ON_VMA(address < vma->vm_start ||
address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
- __folio_set_swapbacked(folio);
+
+ if (!folio_test_swapbacked(folio))
+ __folio_set_swapbacked(folio);
__folio_set_anon(folio, vma, address, exclusive);
if (likely(!folio_test_large(folio))) {
--
2.35.3
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: Barry Song @ 2024-06-25 5:55 UTC
To: Hugh Dickins
Cc: Andrew Morton, baolin.wang, chrisl, david, linux-kernel,
linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua,
willy, ying.huang, yosryahmed, yuanshuai, yuzhao
On Tue, Jun 25, 2024 at 5:00 PM Hugh Dickins <hughd@google.com> wrote:
>
> Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
> false" has extended folio_add_new_anon_rmap() to use on non-exclusive
> folios, already visible to others in swap cache and on LRU.
>
> That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
> overwriting concurrent atomic operations on folio->flags, losing bits
> added or restoring bits cleared. Since it's only used in this risky
> way when folio_test_locked and !folio_test_anon, many such races are
> excluded; but, for example, isolations by folio_test_clear_lru() are
> vulnerable, and setting or clearing active.
>
> It could just use the atomic folio_set_swapbacked(); but this function
> does try to avoid atomics where it can, so use a branch instead: just
> avoid setting swapbacked when it is already set, that is good enough.
> (Swapbacked is normally stable once set: lazyfree can undo it, but
> only later, when found anon in a page table.)
>
> This fixes a lot of instability under compaction and swapping loads:
> assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
> frees - though I've not worked out what races could lead to the latter.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Thanks a lot, Hugh. Sorry for my mistake. I guess we should squash this into
patch 1/3 "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio) ==
false"?
Andrew, could you please help to squash this one?
> ---
> mm/rmap.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index df1a43295c85..5394c1178bf1 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> VM_BUG_ON_VMA(address < vma->vm_start ||
> address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
> - __folio_set_swapbacked(folio);
> +
> + if (!folio_test_swapbacked(folio))
> + __folio_set_swapbacked(folio);
> __folio_set_anon(folio, vma, address, exclusive);
>
> if (likely(!folio_test_large(folio))) {
> --
> 2.35.3
>
Thanks
Barry
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: Baolin Wang @ 2024-06-25 7:04 UTC
To: Barry Song, Hugh Dickins
Cc: Andrew Morton, chrisl, david, linux-kernel, linux-mm, mhocko,
ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang,
yosryahmed, yuanshuai, yuzhao
On 2024/6/25 13:55, Barry Song wrote:
> On Tue, Jun 25, 2024 at 5:00 PM Hugh Dickins <hughd@google.com> wrote:
>>
>> Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
>> false" has extended folio_add_new_anon_rmap() to use on non-exclusive
>> folios, already visible to others in swap cache and on LRU.
>>
>> That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
>> overwriting concurrent atomic operations on folio->flags, losing bits
>> added or restoring bits cleared. Since it's only used in this risky
>> way when folio_test_locked and !folio_test_anon, many such races are
>> excluded; but, for example, isolations by folio_test_clear_lru() are
>> vulnerable, and setting or clearing active.
>>
>> It could just use the atomic folio_set_swapbacked(); but this function
>> does try to avoid atomics where it can, so use a branch instead: just
>> avoid setting swapbacked when it is already set, that is good enough.
>> (Swapbacked is normally stable once set: lazyfree can undo it, but
>> only later, when found anon in a page table.)
>>
>> This fixes a lot of instability under compaction and swapping loads:
>> assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
>> frees - though I've not worked out what races could lead to the latter.
>>
>> Signed-off-by: Hugh Dickins <hughd@google.com>
>
> Thanks a lot, Hugh. Sorry for my mistake. I guess we should squash this into
> patch 1/3 "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio) ==
> false"?
> Andrew, could you please help to squash this one?
I hope the commit message written by Hugh can also be squashed into the
original patch, as it is very helpful to me :)
>> ---
>> mm/rmap.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index df1a43295c85..5394c1178bf1 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
>> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
>> VM_BUG_ON_VMA(address < vma->vm_start ||
>> address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
>> - __folio_set_swapbacked(folio);
>> +
>> + if (!folio_test_swapbacked(folio))
>> + __folio_set_swapbacked(folio);
>> __folio_set_anon(folio, vma, address, exclusive);
>>
>> if (likely(!folio_test_large(folio))) {
>> --
>> 2.35.3
>>
>
> Thanks
> Barry
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: David Hildenbrand @ 2024-06-25 7:40 UTC
To: Hugh Dickins, Andrew Morton
Cc: Barry Song, baolin.wang, chrisl, linux-kernel, linux-mm, mhocko,
ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang,
yosryahmed, yuanshuai, yuzhao
On 25.06.24 07:00, Hugh Dickins wrote:
> Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
> false" has extended folio_add_new_anon_rmap() to use on non-exclusive
> folios, already visible to others in swap cache and on LRU.
>
> That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
> overwriting concurrent atomic operations on folio->flags, losing bits
> added or restoring bits cleared. Since it's only used in this risky
> way when folio_test_locked and !folio_test_anon, many such races are
> excluded; but, for example, isolations by folio_test_clear_lru() are
> vulnerable, and setting or clearing active.
>
> It could just use the atomic folio_set_swapbacked(); but this function
> does try to avoid atomics where it can, so use a branch instead: just
> avoid setting swapbacked when it is already set, that is good enough.
> (Swapbacked is normally stable once set: lazyfree can undo it, but
> only later, when found anon in a page table.)
>
> This fixes a lot of instability under compaction and swapping loads:
> assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
> frees - though I've not worked out what races could lead to the latter.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> mm/rmap.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index df1a43295c85..5394c1178bf1 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> VM_BUG_ON_VMA(address < vma->vm_start ||
> address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
> - __folio_set_swapbacked(folio);
> +
> + if (!folio_test_swapbacked(folio))
> + __folio_set_swapbacked(folio);
> __folio_set_anon(folio, vma, address, exclusive);
>
> if (likely(!folio_test_large(folio))) {
LGTM.
I'll point out that it's sufficient for a PFN walker to do a tryget +
trylock to cause trouble.
Fortunately isolate_movable_page() will check __folio_test_movable()
before doing the trylock.
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: Hugh Dickins @ 2024-06-25 19:37 UTC
To: David Hildenbrand
Cc: Hugh Dickins, Andrew Morton, Barry Song, baolin.wang, chrisl,
linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb,
v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao
On Tue, 25 Jun 2024, David Hildenbrand wrote:
> On 25.06.24 07:00, Hugh Dickins wrote:
> > Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
> > false" has extended folio_add_new_anon_rmap() to use on non-exclusive
> > folios, already visible to others in swap cache and on LRU.
> >
> > That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
> > overwriting concurrent atomic operations on folio->flags, losing bits
> > added or restoring bits cleared. Since it's only used in this risky
> > way when folio_test_locked and !folio_test_anon, many such races are
> > excluded; but, for example, isolations by folio_test_clear_lru() are
> > vulnerable, and setting or clearing active.
> >
> > It could just use the atomic folio_set_swapbacked(); but this function
> > does try to avoid atomics where it can, so use a branch instead: just
> > avoid setting swapbacked when it is already set, that is good enough.
> > (Swapbacked is normally stable once set: lazyfree can undo it, but
> > only later, when found anon in a page table.)
> >
> > This fixes a lot of instability under compaction and swapping loads:
> > assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
> > frees - though I've not worked out what races could lead to the latter.
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > ---
> > mm/rmap.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index df1a43295c85..5394c1178bf1 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
> > VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
> > VM_BUG_ON_VMA(address < vma->vm_start ||
> > address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
> > - __folio_set_swapbacked(folio);
> > +
> > + if (!folio_test_swapbacked(folio))
> > + __folio_set_swapbacked(folio);
> > __folio_set_anon(folio, vma, address, exclusive);
> >
> > if (likely(!folio_test_large(folio))) {
>
> LGTM.
>
> I'll point out that it's sufficient for a PFN walker to do a tryget + trylock
> to cause trouble.
That surprises me. I thought a racer's tryget was irrelevant (touching
a different field) and its trylock not a problem, since "we" hold the
folio lock throughout. If my mental model is too naive there, please
explain in more detail: we all need to understand this better.
>
> Fortunately isolate_movable_page() will check __folio_test_movable() before
> doing the trylock.
>
> Reviewed-by: David Hildenbrand <david@redhat.com>
Thanks,
Hugh
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: David Hildenbrand @ 2024-06-25 19:45 UTC
To: Hugh Dickins
Cc: Andrew Morton, Barry Song, baolin.wang, chrisl, linux-kernel,
linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua,
willy, ying.huang, yosryahmed, yuanshuai, yuzhao
On 25.06.24 21:37, Hugh Dickins wrote:
> On Tue, 25 Jun 2024, David Hildenbrand wrote:
>> On 25.06.24 07:00, Hugh Dickins wrote:
>>> Commit "mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==
>>> false" has extended folio_add_new_anon_rmap() to use on non-exclusive
>>> folios, already visible to others in swap cache and on LRU.
>>>
>>> That renders its non-atomic __folio_set_swapbacked() unsafe: it risks
>>> overwriting concurrent atomic operations on folio->flags, losing bits
>>> added or restoring bits cleared. Since it's only used in this risky
>>> way when folio_test_locked and !folio_test_anon, many such races are
>>> excluded; but, for example, isolations by folio_test_clear_lru() are
>>> vulnerable, and setting or clearing active.
>>>
>>> It could just use the atomic folio_set_swapbacked(); but this function
>>> does try to avoid atomics where it can, so use a branch instead: just
>>> avoid setting swapbacked when it is already set, that is good enough.
>>> (Swapbacked is normally stable once set: lazyfree can undo it, but
>>> only later, when found anon in a page table.)
>>>
>>> This fixes a lot of instability under compaction and swapping loads:
>>> assorted "Bad page"s, VM_BUG_ON_FOLIO()s, apparently even page double
>>> frees - though I've not worked out what races could lead to the latter.
>>>
>>> Signed-off-by: Hugh Dickins <hughd@google.com>
>>> ---
>>> mm/rmap.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index df1a43295c85..5394c1178bf1 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -1408,7 +1408,9 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
>>> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
>>> VM_BUG_ON_VMA(address < vma->vm_start ||
>>> address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
>>> - __folio_set_swapbacked(folio);
>>> +
>>> + if (!folio_test_swapbacked(folio))
>>> + __folio_set_swapbacked(folio);
>>> __folio_set_anon(folio, vma, address, exclusive);
>>>
>>> if (likely(!folio_test_large(folio))) {
>>
>> LGTM.
>>
>> I'll point out that it's sufficient for a PFN walker to do a tryget + trylock
>> to cause trouble.
>
> That surprises me. I thought a racer's tryget was irrelevant (touching
> a different field) and its trylock not a problem, since "we" hold the
> folio lock throughout. If my mental model is too naive there, please
> explain in more detail: we all need to understand this better.
Sorry, I was imprecise.
tryget+trylock should indeed not be a problem; tryget+lock would be,
because IIRC folio_wait_bit_common()->folio_set_waiters() would be
messing with folio flags.
--
Cheers,
David / dhildenb
* Re: [PATCH mm-unstable] mm: folio_add_new_anon_rmap() careful __folio_set_swapbacked()
From: Hugh Dickins @ 2024-06-25 20:12 UTC
To: David Hildenbrand
Cc: Hugh Dickins, Andrew Morton, Barry Song, baolin.wang, chrisl,
linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb,
v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao
On Tue, 25 Jun 2024, David Hildenbrand wrote:
> On 25.06.24 21:37, Hugh Dickins wrote:
> > On Tue, 25 Jun 2024, David Hildenbrand wrote:
> >>
> >> I'll point out that it's sufficient for a PFN walker to do a tryget +
> >> trylock to cause trouble.
> >
> > That surprises me. I thought a racer's tryget was irrelevant (touching
> > a different field) and its trylock not a problem, since "we" hold the
> > folio lock throughout. If my mental model is too naive there, please
> > explain in more detail: we all need to understand this better.
>
> Sorry, I was imprecise.
>
> tryget+trylock should indeed not be a problem; tryget+lock would be, because
> IIRC folio_wait_bit_common()->folio_set_waiters() would be messing with folio
> flags.
Interesting observation, thanks. I had imagined that a folio locker was
safe, but I think you're right that (before the fix) this could have erased
its PG_waiters. Typically, I guess something else would come along sooner
or later to lock the folio, and that would succeed in waking up the earlier
one: so probably not an issue that would be detected in testing, but not good.
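To make that concrete, here is a deterministic userspace model of the
missed wakeup (illustration only: invented bit values, with plain loads
and stores standing in for the real kernel paths):
#include <stdio.h>
#define PG_locked     (1UL << 0)
#define PG_waiters    (1UL << 1)
#define PG_swapbacked (1UL << 2)
int main(void)
{
	unsigned long flags = PG_locked;	/* "we" hold the folio lock */
	/* __folio_set_swapbacked() loads the flags word... */
	unsigned long snapshot = flags;
	/*
	 * ...a PFN walker's folio_lock() sees PG_locked and, in
	 * folio_wait_bit_common(), atomically sets PG_waiters
	 * before going to sleep.
	 */
	flags |= PG_waiters;
	/* ...then the stale store erases PG_waiters. */
	flags = snapshot | PG_swapbacked;
	/*
	 * folio_unlock() consults PG_waiters to decide whether anyone
	 * needs waking: it sees the bit clear and wakes nobody.
	 */
	int wake = (flags & PG_waiters) != 0;
	flags &= ~PG_locked;
	printf("wake waiters? %s\n", wake ? "yes" : "no (missed wakeup)");
	return 0;
}
It prints "wake waiters? no (missed wakeup)": the locker sleeps until,
as above, some later locker sets PG_waiters again.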
Hugh